How to create hotwords? #1762
Comments
I found this example source code: https://github.com/k2-fsa/sherpa-onnx/blob/master/scripts/text2token.py
So have you fixed it?
I've followed the hotwords generation process using
Is this the correct approach for creating hotwords, especially for Korean language models? What might I be missing in the tokenization or hotwords configuration?
Here's my example command and output of
Full code for
Input File:
Output File:
Hi,
I’m using the following model configuration:
I’ve followed the guide from this page and successfully generated the bpe.vocab file using:
python ./export_bpe_vocab.py --bpe-model ./bpe.model
Now, I’d like to configure the hotwords_ko.txt file to include specific keywords I want to emphasize.
Could you clarify the following:
modeling-unit for Korean is unclear. Should I follow the bpe approach like in the English example? Or is there a different process for languages like Korean?
Thank you for your guidance!