Speech Note 4.4.0

mkiol released this 26 Jan 09:54
· 457 commits to main since this release

Linux Desktop

Changes:

  • Flatpak
    • Modular Flatpak package (Base package and Add-ons)
    • NVIDIA CUDA runtime update to version 12.2
    • AMD ROCm runtime update to version 5.6
    • PyTorch update to version 2.1.1
  • User Interface
    • Improvements to the model browser
    • Model filtering options
    • Setting option to minimize to the system tray
    • Setting option to enable/disable text in desktop notifications
  • Speech to Text
    • Marathi language, enabled with Whisper and Faster Whisper models
    • New version of Faster Whisper Large model: 'FasterWhisper Large-v3'
    • 'Distil' versions of Faster Whisper models
    • Whisper and Faster Whisper enabled for Chinese-Cantonese language
    • Support for Speex audio codec in 'Transcribe a file'
    • Translate to English option for Whisper and Faster Whisper models
    • More effective GPU acceleration for Whisper models with AMD graphics cards
    • Subtitle generation (SRT format)
    • Support for multiple audio streams in a video file
  • Text to Speech
    • Marathi language, enabled with Coqui MMS model
    • Voice cloning with Coqui XTTS and YourTTS models.
      • Coqui XTTS models are enabled for: Arabic, Brazilian Portuguese, Chinese, Czech, Dutch, English, French, German, Hungarian, Italian, Japanese, Korean, Polish, Russian, Spanish and Turkish.
      • YourTTS model is enabled for: English, French and Brazilian Portuguese.
    • Voice samples creator
    • New voices for Serbian and Uzbek languages (RHVoice model)
    • GPU acceleration for Coqui models with AMD graphics cards (in Flatpak version)
    • Speech synchronized with subtitle timestamps
  • Translator
    • New model: Lithuanian to English
    • Option to force text cleaning before translation
    • Text formatting support
    • Translation progress indicator
  • Other
    • Setting option to override GPU version (AMD graphics cards)
    • Setting option to limit the number of simultaneous CPU threads
    • Setting option to set Python libraries directory (in non-Flatpak version)
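
For readers unfamiliar with the SRT format mentioned under Speech to Text: each subtitle cue is a sequence number, a time range written as `HH:MM:SS,mmm --> HH:MM:SS,mmm`, the cue text, and a blank line. The sketch below only illustrates that layout. It is not Speech Note's implementation, and the names `srt_timestamp` and `to_srt`, as well as the `(start, end, text)` segment tuples, are hypothetical.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments):
    """Render (start_seconds, end_seconds, text) tuples as SRT text.

    The tuple layout is an assumption made for this example only.
    """
    cues = []
    for index, (start, end, text) in enumerate(segments, start=1):
        time_range = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        cues.append(f"{index}\n{time_range}\n{text.strip()}\n")
    # Each cue ends with a newline, so joining with "\n" leaves one blank
    # line between consecutive cues.
    return "\n".join(cues)


if __name__ == "__main__":
    print(to_srt([(0.0, 1.5, "Hello"), (1.5, 3.0, "world")]))
```

Keeping the timestamps as integer milliseconds before splitting them avoids floating-point drift in the formatted fields.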

Sailfish OS

  • Speech to Text
    • Marathi language, enabled with Whisper models
    • Whisper enabled for Chinese-Cantonese language
    • Support for Speex audio codec in 'Transcribe a file'
    • Support for multiple audio streams in a video file
  • Text to Speech
    • New voices for Serbian and Uzbek languages (RHVoice model)
  • Translator
    • New model: Lithuanian to English
    • Translation progress indicator