TRENDING NOW: RackNerd’s BLACK FRIDAY! NEW DEALS, NEW LOCATIONS, NEW HARDWARE + 100’s of GIVEAWAYS!

bchot · December 2025

@noob404 said:

@Saragoldfarb said:

@noob404 said:

@Saragoldfarb said:

@noob404 said:

@Saragoldfarb said:

@noob404 said:

@bchot said:
but Kate Beckinsale insists she did nothing either.. very silly.. so often while trying to hide their work, it goes somehow wrong..

The right image looked a bit like if ScarJo had a procedure that went horribly wrong.

And soooo pretty on the left... Like I'm watching a mirror.

Yup, definitely an example of going under the knife gone wrong.

Word

Why don't surgeons recommend them better options? I mean, I am sure these aren't simple procedures that cost only a few hundred dollars.

There are, but not in the US I guess.

I mean, I have heard South Korea is the preferred country for cosmetic procedures for many.

get in there, then get out looking like a Korean..

noob404 · December 2025

@bchot said:

@noob404 said:

@ralf said:

@noob404 said:
@ralf Over the past few months, I researched a few TTS models and here are my findings that might help you.

These are the models I found best for most purposes:
1. https://github.com/index-tts/index-tts - Imo, the best out there, but there is no clear licence information, so I can't be sure if it can be used for commercial projects. One of its best features is that it can very reliably do emotions with weighting. There are several emotions that it supports by default. I tried mailing their team for clarification on the licence terms, but they never replied. So I moved on to the next one I found most suitable for my use case.
2. https://github.com/SWivid/F5-TTS - A very close second. Doesn't have emotion control and such. But I guess it can be trained to do that using multiple voices and specific datasets via the custom toml file they specify on their GitHub. I did try that, but didn't find the emotion control very smooth. Sometimes the voice is just shouting. But cloning is perfect; I'd dare say close to a 100% match.

Training your own voice
Both of the above models, esp. F5, have a training pipeline within the original source code that you can use very easily. For preparation of the dataset, i.e., to record your voice, I recommend https://github.com/rhasspy/piper-recording-studio. It has readymade datasets for many popular accents. You can also add your own, which I do recommend to make sure there is good phoneme coverage. Remember, most of these models use pinyin characters, so phoneme coverage is important.
As for the dataset size, it depends. If you wanna train from scratch, you will need thousands of hours of raw voice data. But if you can work by fine-tuning an already available pre-trained model, I recommend F5_TTS instead of E2_TTS. It might take a bit of trial and error to find what works best for you. If fine-tuning an already available model, you just need an hour of data. It will work even with a few minutes of data, but I do not recommend that.
Also, use a good condenser mic and maybe some denoising software for the best results.

If you have any doubts, let me know. I will try to clarify according to my knowledge. Please understand I am no expert in this matter. Everything I have written above is based on what I researched in the past few months, and I even tested a few other models, like Orpheus, Dia, etc., to find which one would be best for me. Hope this helps someone.
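A rough sketch of how you could check how much audio you have recorded before fine-tuning. It assumes your clips are uncompressed WAV files sitting in one folder. The folder name `my_voice_dataset`, the function names, and the one-hour target are only illustrative (the target is just the rule of thumb above), and none of this is part of F5-TTS or piper-recording-studio.

```python
import wave
from pathlib import Path


def total_wav_seconds(folder):
    """Sum the duration, in seconds, of every .wav file directly inside folder."""
    total = 0.0
    for path in sorted(Path(folder).glob("*.wav")):
        with wave.open(str(path), "rb") as clip:
            # Duration of one clip = number of frames / frames per second.
            total += clip.getnframes() / clip.getframerate()
    return total


def describe_dataset(seconds, target_seconds=3600.0):
    """Compare recorded audio against a target (default: the one-hour rule of thumb)."""
    minutes = seconds / 60.0
    if seconds >= target_seconds:
        return f"{minutes:.1f} min recorded: at or above the target"
    short = (target_seconds - seconds) / 60.0
    return f"{minutes:.1f} min recorded: {short:.1f} min short of the target"


if __name__ == "__main__":
    # "my_voice_dataset" is a hypothetical folder name; point it at your own recordings.
    print(describe_dataset(total_wav_seconds("my_voice_dataset")))
```

This only counts raw duration. It says nothing about phoneme coverage or noise, which still have to be checked by ear or with other tools.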

Oh cool, thanks. I'll take a look at both of those as well. I hadn't found either.

The one that looked the best when I looked was Spark-TTS:
https://huggingface.co/spaces/Mobvoi/Offical-Spark-TTS
https://arxiv.org/pdf/2503.01710
https://sparkaudio.github.io/spark-tts/

Others I found that didn't seem to do cloning of your own voice were:
https://github.com/rhasspy/piper
https://github.com/WhisperSpeech/WhisperSpeech
https://www.reddit.com/r/LocalLLaMA/comments/1f0awd6/best_local_open_source_texttospeech_and/
https://github.com/coqui-ai/TTS
https://mycroft-ai.gitbook.io/docs/mycroft-technologies/mimic-tts
https://github.com/myshell-ai/OpenVoice
https://www.resemble.ai/open-source-voice-cloning-multiple-languages/
https://github.com/yantze/vim-tts
https://github.com/myshell-ai/MeloTTS
https://github.com/RVC-Boss/GPT-SoVITS

Of those, coqui TTS seems to promise voice cloning, but there's nothing to indicate it's actually possible. It also looks like the open source effort has been abandoned.

Well, I would have thought that as well, but there are still AI TTS projects that do see active development. I know you prolly already know this, but make sure the licences for both the models and the app say they are suitable for your purpose. This is why I prefer F5-TTS, 'cause they have a permissive licence.

are you talking about TiTS?

Farthest thing from that

TTS = Text to speech

noob404 · December 2025

@bchot said:

get in there, then get out looking like a Korean..

Oh yes. Glass skin I guess it's called.

bchot · December 2025

@noob404 said:

@bchot said:
roughly 1/20 chance of winning something.. i like those odds.
💜💜💜💙💙💙💚💚💚💛💛💛

How's that though? There are about 1951 unique users right now.

come on, bro. you were just pretending to like math..

@dustinc said: Mark your calendars! Tomorrow @ 8 PM PST, we will be streaming on YouTube and randomly selecting the 90 winners for the main giveaway!

bchot · December 2025

@noob404 said:

Farthest thing from that

TTS = Text to speech

best talk about tits.. TTS = tits to screen..

codemonkeyx · December 2025

Hello, I would like to double my bandwidth.
Invoice #19758606
Thanks!

noob404 · December 2025

@bchot said:

come on, bro. you were just pretending to like math..

@dustinc said: Mark your calendars! Tomorrow @ 8 PM PST, we will be streaming on YouTube and randomly selecting the 90 winners for the main giveaway!

Caught red handed. Pardon me. Given there are gonna be 90 winners, your math is definitely mathing!
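For anyone who wants to check it, a quick sketch of the arithmetic. It assumes all of the roughly 1951 users are entered with equal weight and that nobody can win twice; neither is confirmed in the thread, and `win_chance` is just an illustrative helper name.

```python
def win_chance(winners, entrants):
    """Chance that one given entrant is among the winners of a uniform draw without repeats."""
    if entrants <= 0 or not 0 <= winners <= entrants:
        raise ValueError("need 0 <= winners <= entrants and entrants > 0")
    return winners / entrants


chance = win_chance(90, 1951)
print(f"{chance:.1%}, about 1 in {1 / chance:.1f}")  # prints "4.6%, about 1 in 21.7"
```

So "roughly 1/20" holds up under those assumptions.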

noob404 · December 2025

@bchot said:

best talk about tits.. TTS = tits to screen..

Not on this thread hopefully. Sup? Excited about day after tomorrow's draw?
Also got your daily bonus for today on the December giveaway?

bchot · December 2025

@noob404 said:

Not on this thread hopefully. Sup? Excited about day after tomorrow's draw?
Also got your daily bonus for today on the December giveaway?

yep, got it all sorted.. thanks.

bchot · December 2025

BTW if you're looking for a crazy comedy series, i recommend this

bchot · December 2025

is about dudes back in the era of pirate radio stations, great techno, trance and house music, great vibes, hilarious situations...

noob404 · December 2025

@bchot said:

yep, got it all sorted.. thanks.

I almost missed getting mine today.

noob404 · December 2025

@bchot said:
BTW if you're looking for a crazy comedy series, i recommend this

Isn't this British? I think I saw a few episodes on Comedy Central a year or so ago.

bchot · December 2025

there's also this character - Chabuddy - super amazing entrepreneur and a funny guy

noob404 · December 2025

@bchot said:
is about dudes back in the era of pirate radio stations, great techno, trance and house music, great vibes, hilarious situations...

I guess some of these guys are involved in shady business and such and the comedy revolves around their messups, right?

noob404 · December 2025

@bchot said:
there's also this character - Chabuddy - super amazing entrepreneur and a funny guy

I think there is an issue with the image

bchot · December 2025

@noob404 said:

I guess some of these guys are involved in shady business and such and the comedy revolves around their messups, right?

it's not a business per se. it's a pirate radio station broadcasting from apartment buildings.. it's soo freaking funny.. or at least IMO

bchot · December 2025

@noob404 said:

I think there is an issue with the image

not anymore there isn't

bchot · December 2025

@bchot said:

not anymore there isn't

innit?

noob404 · December 2025

@bchot said:

it's not a business per se. it's a pirate radio station broadcasting from apartment buildings.. it's soo freaking funny.. or at least IMO

Will have to watch it again. I tried the first few episodes but didn't like that brand of comedy for some reason. But, tbf, I only watched the first one or two, and I know it's hard to judge from that.

bchot · December 2025

for me it's time to eat
TTYL

noob404 · December 2025

@bchot said:

not anymore there isn't

Yup, thanks for fixing it. I did see the episode where he was doing some shady stuff in his van with a mannequin or something, iirc.

noob404 · December 2025

@bchot said:
for me it's time to eat
TTYL

Sure, bchot. I will see you later then. Promised myself I would go to bed early today. See you tomorrow then

Good night everyone

ralf · December 2025

@noob404 said:

I mean, I have heard South Korea is the preferred country for cosmetic procedures for many.

Yeah, I was told that actually it's even relatively common for mothers to gift their daughters cosmetic surgery work for their 18th birthday. Kind of nuts.

But it seems that there's a very sexist culture in the workplace. It seems that many roles are more gendered than elsewhere and attractiveness is the highest-ranking criterion for most of the office jobs, more than actually having any qualifications at all. So if you want to get ahead in work, you get some surgery done.

Of course, only 1 girl told me that, could be completely anecdotal and/or BS.

shaoxianbilly · December 2025

Please Double the bandwidth, my order#19678595 thanks

shaoxianbilly · December 2025

IT'S THE SEASON, BLACK FRIDAY HYPE with REAL DEALS. CHECK THEM OUT: https://www.racknerd.com/BlackFriday/

ralf · December 2025

@noob404 said:

@Saragoldfarb said:

@noob404 said:

@Saragoldfarb said:

@noob404 said:

@Saragoldfarb said:

@noob404 said:
Anybody online right now? Only roughly two days remain for the lottery draw.

Yes

Well, we are here almost every time.

Yeah, YOLO!

I am realising that.

It's important.... Gotta have fun. Life's over before you know.

Oh man, it just got serious. I believe I said this earlier as well, iirc: I am just surviving at this point; the time to live my life is long gone, ig.

ralf · December 2025

@allthemtings said:

@noob404 said:

@allthemtings said:

@noob404 said:

@allthemtings said:
@noob404 POV

Sorry, dude. Has been the trend this year. Welcome to the thread

As an avid shitposter myself i respect the grind

What, where though? You mean the megathread? Let me know if my services are needed elsewhere as well.

I am just kidding, going all out this time for the gaming pc and also because this is gonna be my last participation for the social butterfly prolly

Yup on the megathreads, I'm not on your level though

I don't think anybody is quite on that level!

ralf · December 2025

@TrK said:

@noob404 said:

@allthemtings said:

@TrK said:

No one is...

Where have I seen this guy?

On the throne?

The poshest shitter in all of England

ralf · December 2025

@bchot said:

are you talking about TiTS?

Always talking about me, not sure who else
