Speech Note 4.4.0
Changes:
Linux Desktop
- Flatpak
  - Modular Flatpak package (Base package and Add-ons)
  - NVIDIA CUDA runtime update to version 12.2
  - AMD ROCm runtime update to version 5.6
  - PyTorch update to version 2.1.1
- User Interface
  - Improvements to the model browser
  - Model filtering options
  - Setting option to minimize to the system tray
  - Setting option to enable/disable text in desktop notifications
- Speech to Text
  - Marathi language. New language is enabled with Whisper and Faster Whisper models.
  - New version of Faster Whisper Large model: 'FasterWhisper Large-v3'
  - 'Distil' versions of Faster Whisper models
  - Whisper and Faster Whisper enabled for Chinese-Cantonese language
  - Support for Speex audio codec in 'Transcribe a file'
  - Translate to English option for Whisper and Faster Whisper models
  - More effective GPU acceleration for Whisper models with AMD graphics cards
  - Subtitles generation (SRT format)
  - Support for multiple audio streams in a video file
- Text to Speech
  - Marathi language. New language is enabled with Coqui MMS model.
  - Voice cloning with Coqui XTTS and YourTTS models.
    - Coqui XTTS models are enabled for: Arabic, Brazilian Portuguese, Chinese, Czech, Dutch, English, French, German, Hungarian, Italian, Japanese, Korean, Polish, Russian, Spanish and Turkish.
    - YourTTS model is enabled for: English, French and Brazilian Portuguese.
  - Voice samples creator
  - New voices for Serbian and Uzbek languages (RHVoice model)
  - GPU acceleration for Coqui models with AMD graphics cards (in Flatpak version)
  - Speech synchronized with subtitle timestamps
- Translator
  - New model: Lithuanian to English
  - Option to force text cleaning before translation
  - Text formatting support
  - Translation progress indicator
- Other
  - Setting option to override GPU version (AMD graphics cards)
  - Setting option to limit number of simultaneous CPU threads
  - Setting option to set Python libraries directory (in non-Flatpak version)
Sailfish OS
- Speech to Text
  - Marathi language. New language is enabled with Whisper models.
  - Whisper enabled for Chinese-Cantonese language
  - Support for Speex audio codec in 'Transcribe a file'
  - Support for multiple audio streams in a video file
- Text to Speech
  - New voices for Serbian and Uzbek languages (RHVoice model)
- Translator
  - New model: Lithuanian to English
  - Translation progress indicator