Fix SAPI 4 driver #17599

SaschaCowley · 2025-01-08T05:43:25Z

Link to issue number:

Fixes #17516

Summary of the issue:

After the move to exclusively Windows core audio APIs, the SAPI4 driver stopped working.

Description of user facing changes

The SAPI4 driver works again.

A warning is shown the first time the user uses SAPI4 informing them that it is deprecated.

Description of development approach

Implemented a function to translate between MMDevice Endpoint IDs and WaveOut device IDs, based on this Microsoft code sample.
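
As a rough illustration of the translation step (not the code in this PR), the lookup can be sketched as a loop over WaveOut device IDs that asks each device for its endpoint ID string and compares it with the target. The names `findWaveOutId` and `_winmmGetEndpointId` are hypothetical. The message values are the `DRV_QUERYFUNCTIONINSTANCEID` constants from `mmddk.h`, and the Windows-only helper assumes `waveOutMessage` is called with the device ID in place of a handle, as in Microsoft's sample.

```python
import ctypes
from typing import Callable, Optional

# Driver message values from mmddk.h (DRV_RESERVED is 0x0800).
DRV_QUERYFUNCTIONINSTANCEID = 0x0800 + 17  # 2065
DRV_QUERYFUNCTIONINSTANCEIDSIZE = 0x0800 + 18  # 2066
MMSYSERR_NOERROR = 0


def findWaveOutId(
    targetEndpointId: str,
    deviceCount: int,
    getEndpointId: Callable[[int], Optional[str]],
) -> Optional[int]:
    """Return the WaveOut device ID whose endpoint ID matches, or None."""
    for deviceId in range(deviceCount):
        if getEndpointId(deviceId) == targetEndpointId:
            return deviceId
    return None


def _winmmGetEndpointId(deviceId: int) -> Optional[str]:
    """Hypothetical Windows-only helper: query winmm for a device's endpoint ID."""
    winmm = ctypes.windll.winmm
    # The two DWORD_PTR parameters are pointer-sized, so use c_size_t.
    winmm.waveOutMessage.argtypes = (
        ctypes.c_void_p, ctypes.c_uint, ctypes.c_size_t, ctypes.c_size_t,
    )
    winmm.waveOutMessage.restype = ctypes.c_uint
    # First ask how many bytes the endpoint ID string needs.
    size = ctypes.c_size_t(0)
    res = winmm.waveOutMessage(
        deviceId, DRV_QUERYFUNCTIONINSTANCEIDSIZE, ctypes.addressof(size), 0,
    )
    if res != MMSYSERR_NOERROR:
        return None
    # Then fetch the string itself into a wide-character buffer.
    buf = ctypes.create_unicode_buffer(size.value // ctypes.sizeof(ctypes.c_wchar))
    res = winmm.waveOutMessage(
        deviceId, DRV_QUERYFUNCTIONINSTANCEID, ctypes.addressof(buf), size.value,
    )
    return buf.value if res == MMSYSERR_NOERROR else None
```

On Windows this might be called as `findWaveOutId(endpointId, ctypes.windll.winmm.waveOutGetNumDevs(), _winmmGetEndpointId)`. Passing the query function in as a parameter keeps the matching loop testable without audio hardware.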

Added a config key, speech.hasSapi4WarningBeenShown, which defaults to False.
Added a synthChanged callback that shows a dialog when the synth is set to SAPI4 if this config key is False and this is not a fallback synthesizer.
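
The show-once behaviour described above can be modelled roughly as follows. This is a standalone sketch, not NVDA's code: a plain dictionary stands in for the config, `FakeSynth`, `makeSapi4WarningCallback` and `showDialog` are hypothetical names, and setting the flag immediately after showing the dialog is an assumption about how "first time" is tracked.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class FakeSynth:
    """Hypothetical stand-in for a synth driver; only the name matters here."""
    name: str


def makeSapi4WarningCallback(
    config: dict,
    showDialog: Callable[[], None],
) -> Callable[..., None]:
    """Build a synthChanged-style callback that warns about SAPI4 once."""

    def onSynthChanged(synth: FakeSynth, isFallback: bool = False) -> None:
        # Only warn for SAPI4, and not when it was loaded as a fallback.
        if synth.name != "sapi4" or isFallback:
            return
        # The config key defaults to False; skip if already shown.
        if config["speech"]["hasSapi4WarningBeenShown"]:
            return
        showDialog()
        # Assumption: the flag is set as soon as the dialog has been shown.
        config["speech"]["hasSapi4WarningBeenShown"] = True

    return onSynthChanged
```

Because the flag lives in the config, the warning can reappear wherever the config cannot be saved or where a different profile holds its own copy of the key.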

Testing strategy:

Ran NVDA, and used it with SAPI4. Changed the audio output device to ensure audio was routed as expected.

Known issues with pull request:

The dialog is only shown once, so may be missed by some users.
Other options I have considered include:

Making this a nag dialog that appears, say, once a week or once a month.
Also making a dialog appear whenever the user manually sets their synth to SAPI4.
Adding a new dialog in 2025.4 (or the last release before 2026.1) that warns users that this will be the last release to support SAPI4.
Adding a dialog when updating to 2026.1 that warns users that they will no longer be able to use SAPI4.
Adding a Windows toast notification that appears every time NVDA starts with SAPI4 as the synth.

Code Review Checklist:

Documentation:
- Change log entry
- User Documentation
- Developer / Technical Documentation
- Context sensitive help for GUI changes
Testing:
- Unit tests
- System (end to end) tests
- Manual testing
UX of all users considered:
- Speech
- Braille
- Low Vision
- Different web browsers
- Localization in other languages / culture than English
API is compatible with existing add-ons.
Security precautions taken.

@coderabbitai summary


seanbudd · 2025-01-08T07:18:16Z

Could you confirm what the experience is like in secure mode, both on the secure desktop and with the environment variable set?
I imagine the warning dialog is always read when switching in secure mode, as settings cannot be saved.
This is probably good behaviour, but I'm sure it will be a little obnoxious.

CyrilleB79 · 2025-01-08T08:00:01Z

I guess we will have the same issue as the add-on store warning (#15261), i.e. the warning may be shown several times, at most once per profile.

CyrilleB79 · 2025-01-08T08:05:20Z

Also, in the UG, the SAPI4 paragraph says:

SAPI 4 is an older Microsoft standard for software speech synthesizers. NVDA still supports this for users who already have SAPI 4 synthesizers installed. However, Microsoft no longer support this and needed components are no longer available from Microsoft.

Should we add here that SAPI4 usage in NVDA is deprecated and will be removed in the future?

cary-rowen · 2025-01-08T08:22:36Z

For the user, does this warning dialog have any practical effect? As far as I know, many Chinese users rely on this synthesizer.
If we remove support for Sapi4 in the future, they will just be frustrated and unable to do anything else.

zstanecic · 2025-01-08T10:46:52Z

@cary-rowen
Sorry, but what do you mean when you say users will be frustrated?
I know which synthesizer you are talking about... Let's keep no secrets here. We are talking about IBM ViaVoice with Chinese support.
In Croatia, we had a similar experience. Many people used SAPI4 because of the old WinTalker Voice speech synthesizer.
I warned people wherever I could that SAPI4 will be discontinued, and that a synth which is no longer developed, but is still sold, is not a good thing to use when technology is developing so fast.
That's why I am recommending alternatives to people...
By the way, for Chinese, haven't you used AISound 5?

cary-rowen · 2025-01-08T12:05:52Z

Hi @zstanecic

I'm not complaining, I just thought about what effect this dialog can have on end users, so I made the above comment.
If there's nothing they can do and replacing a different speech synthesizer is the only option, is this dialog still that valuable?

I'd love to talk about the current state of TTS in Chinese, although it's a bit off topic, and it probably deserves a separate discussion.

Regarding the AISound you mentioned, it may be just a temporary solution:

It responds slowly and consumes a lot of CPU resources.
It can only pronounce Chinese characters. Even if it encounters English, it will only use Chinese voice to pronounce it.
The speaking speed is extremely limited. Even if it is set to the fastest speaking speed, it will appear slow to skilled screen reader users.

Regarding IBM ViaVoice TTS:

Fast response.
Supports more Chinese character pronunciations than the IBMTTS add-on.

Regarding Eloquence:
As far as I know, it doesn't support Chinese. So there is nothing to say.

Of course, for me, Vocalizer is my only choice. As for Sapi5 and oneCore they are really slow to respond.

In summary, response speed is important. New technologies and new TTS engines are of course being developed, but as you can see, Microsoft's natural voices are not yet supported. Although they can be used through SAPI5 via the Natural Voice Adapter, this is still limited by its response speed and is not suitable for long-term use.

Hope this will be clearer

zstanecic · 2025-01-08T12:21:25Z

Hi @cary-rowen, it is very interesting to hear the history of Chinese speech synthesizers: what exists, what's popular and what's widely used.
It is not off-topic. I think it will shed more light on the situation we now have with SAPI4.
Well, I don't think the dialog is that valuable.
For example, JAWS for Windows deprecated SAPI4 a long time ago. Nobody really complained about it.
Regarding response and speed: there is an IBM ViaVoice driver, but it is somewhat of a gray zone.
Really, be honest. How many Chinese users have obtained their ViaVoice legally?
To conclude: when it breaks, it will break, and we cannot do anything if this protocol is abandoned.
And this dialog will not add any value, besides the fact that users will be warned about the possible removal.

SaschaCowley · 2025-01-08T23:35:15Z

@cary-rowen thank you for the questions. We have chosen to add this dialog as, while it may be irritating, we believe the experience of this being a surprise to users would be worse. Many users do not regularly read the changelog, which is not always translated into their language, so this is more likely to be noticed.

Regarding TTS support, does eSpeak-ng have support for Chinese languages? I see the following options in NVDA with eSpeak:

Chinese (Mandarin, latin as English)
Chinese (Mandarin, latin as Pinyin)
Chinese (Cantonese)
Chinese (Cantonese, latin as Jyutping)

Notwithstanding users' dislike of the sound of eSpeak (which is of course valid), are these voices unsatisfactory in other ways? For example, in English eSpeak just reads Chinese characters as "Chinese letter", does Chinese eSpeak have proper support for Chinese writing? Do other TTS engines support more Chinese languages?

zstanecic · 2025-01-09T00:17:53Z

@SaschaCowley From my contact with Chinese users, they consider eSpeak something just for fun, not something that is usable for them. eSpeak has very bad tone support.

cary-rowen · 2025-01-09T00:41:24Z

Hi @SaschaCowley @zstanecic

I see.

The only advantage of eSpeak NG in the Chinese environment is fast response.
But its Chinese pronunciation is very bad. We need to concentrate on distinguishing every word, and we will feel listening fatigue after a while.
Sapi4 and Vocalizer are probably the most used by Chinese users.
So we're very much looking forward to #17592 changing that, and if Sapi5 is more responsive, it might lead to more options.

Thanks

SaschaCowley · 2025-01-09T01:09:17Z

@zstanecic @cary-rowen that is very disappointing to hear! It would be wonderful if eSpeak was a viable option for use with Chinese languages, but it really sounds like it isn't. Hopefully the mentioned improvements to SAPI5 responsiveness mean that more options become available to Chinese speakers.

SaschaCowley · 2025-01-09T01:20:39Z

@seanbudd I've changed the dialog to only show when NVDA is not running in minimal mode. That stops it showing on secure screens and in the launcher.

seanbudd · 2025-01-09T02:55:14Z

source/synthDrivers/_sapi4.py

+	Defined in mmddk.h
+	"""
+
+	QUERY__INSTANCE_ID = 2065


Suggested change

-	QUERY__INSTANCE_ID = 2065
+	QUERY_INSTANCE_ID = 2065

seanbudd · 2025-01-09T02:58:55Z

source/speech/speech.py

-synthChanged.register(_sapi4DeprecationWarning)
+if not globalVars.appArgs.minimal:
+	# Don't warn users about SAPI4 deprecation in minimal mode.
+	# This stops the dialog appearing on secure screens or in the launcher.


I think it's important this is announced from the launcher - otherwise we load the security risk without warning. I think it's fine to ignore on secure screens only.
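
To make the difference between the two gating policies concrete, here is a small sketch using plain booleans. The function names are hypothetical, and it assumes (based on the comments above) that both the launcher and secure screens run in minimal mode, while only secure screens are also secure.

```python
def warnsWhenNotMinimal(minimal: bool, secure: bool) -> bool:
    """Gate as in the diff above: no warning whenever NVDA is in minimal mode."""
    return not minimal


def warnsWhenNotSecure(minimal: bool, secure: bool) -> bool:
    """Gate as suggested in this review: only secure screens are skipped."""
    return not secure


# Assumed flag combinations for the three contexts discussed in this thread.
contexts = {
    "installed copy": {"minimal": False, "secure": False},
    "launcher": {"minimal": True, "secure": False},
    "secure screen": {"minimal": True, "secure": True},
}

for name, flags in contexts.items():
    print(name, warnsWhenNotMinimal(**flags), warnsWhenNotSecure(**flags))
```

Under these assumptions the two gates differ only for the launcher, which is the case this comment is concerned with.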

vgjh2005 · 2025-01-09T03:24:41Z

As far as I know, in China, the TTS and vocalizer of SAPI 4 are the most popular speech synthesis engines. SAPI 5 and OneCore have concerning speed and poor audio quality. From a user's perspective, it is unclear what exactly has been upgraded in SAPI 5—slower speed and unclear reading? We also need to consider what benefits removing this feature would bring to NVDA users, or if there are any compelling reasons to do so.

Thanks!

gexgd0419 · 2025-01-09T05:55:50Z

SAPI 4 is not included in recent versions of Windows, and it is no longer supported by Microsoft. SAPI 5 is a built-in component in Windows since Windows XP, but it is also quite old, and I'm not sure if Microsoft is still actively maintaining it.

SAPI 5.3 supports parsing SSML in addition to its own proprietary XML format. But only built-in voices can fully utilize this feature and get most of the information from SSML. Third-party voice developers, unfortunately, can still only use the old interfaces that were designed for the proprietary XML format. Although the SAPI framework automatically converts SSML to a compatible data format for third-party TTS engines, some SSML features that cannot be represented in the proprietary XML format are lost in the conversion, for example, the contour attribute of the <prosody> element. The built-in voices support additional COM interfaces to receive SSML information, but those interfaces are not documented. So even in SAPI 5, not all features are accessible to third-party voice developers.

As for the newer interfaces:

OneCore seems to be a weird "variation" based on SAPI 5. Their registry key structures are similar. You can even copy registry keys to make "OneCore-exclusive" voices usable* via SAPI 5. The problem is that Microsoft provide no documentation or support for third-party OneCore voices, so third-party voice vendors still have to use SAPI.

Azure Speech SDK can only use Microsoft voices. It supports online Azure voices or offline neural/Apollo voices, but it's all from Microsoft.

So, although both SAPI 4 and SAPI 5 are old and have not been actively updated for a while, they are the only speech systems supported not only by many client applications, but also by many third-party voice synthesizers. OneCore and Azure Speech SDK are not open to voice providers.

gexgd0419 · 2025-01-09T06:08:01Z

#17592 makes SAPI 5 voices use WASAPI to improve their responsiveness. I think that a similar approach can be applied to SAPI 4 voices, making SAPI 4 voices use WASAPI as well.

I want to know the performance level of SAPI 4 voices. Are they already good enough? As SAPI 4 is being deprecated, such a fix for SAPI 4 might not be worthwhile.

SaschaCowley added 6 commits January 8, 2025 11:57

Initial implementation of translation between mmDevice endpoint ID strings and WaveOut device IDs.

df45ca7

Slight improvements to typing

16b377d

Fix incorrect config path

3e2794b

Merge branch 'master' into fixSapi4

0bee92c

Merge branch 'master' into fixSapi4

173d75d

Added a warning when SAPI4 is in use.

6bf3f01

Changelog

c7d363b

SaschaCowley marked this pull request as ready for review January 8, 2025 05:47

SaschaCowley requested a review from a team as a code owner January 8, 2025 05:47

SaschaCowley requested a review from seanbudd January 8, 2025 05:47

seanbudd previously approved these changes Jan 8, 2025

seanbudd self-requested a review January 8, 2025 07:03

CyrilleB79 reviewed Jan 8, 2025

seanbudd reviewed Jan 8, 2025

SaschaCowley added 2 commits January 9, 2025 10:20

Update copyright headers

6da072f

Update changes

d0781ba

SaschaCowley added 4 commits January 9, 2025 10:44

Switch to a driver message enum

2af45cf

Move loading winmm into _mmDeviceEndpointIdToWaveOutId

ef06304

Documentation improvements

8c7007a

Add deprecation warning

48b4d6e

Added note about SAPI4's deprecation to the UG

181b36a

Only show warning when not minimal

d277513

SaschaCowley added 2 commits January 9, 2025 12:22

Mark hasSapi4WarningBeenShown as private

306bd86

Make SAPI4 warning translatable

b3bf403

SaschaCowley requested a review from a team as a code owner January 9, 2025 02:13

SaschaCowley requested review from Qchristensen, CyrilleB79 and seanbudd January 9, 2025 02:13

seanbudd reviewed Jan 9, 2025
