We're experimenting with FastPitch+HiFi-GAN TTS for Slovene, and the results are quite good. Pronunciation is very good, but certain artefacts are present in the voiced output, so we're investigating other models as well. Of the models available in NeMo, we're currently looking into VITS and RadTTS.
We're looking at VITS because an implementation from another repo, trained on the same dataset, does not show these artefacts, and at RadTTS because Riva gained support for it in 2.10.0 (support for FastPitch+HiFi-GAN has been there for a long time).
Our questions: Are there any plans for Riva to support VITS as well? Are there any plans to implement NaturalSpeech in NeMo (#6746) and to support it in Riva later on? What is the current status of the training scripts for VITS and RadTTS? Are there any other TTS models worth looking into?