100% training WER on custom data #2107
Replies: 6 comments
-
It is normal to get 100% WER for the first few epochs. How many epochs did you train?
-
Also, there might be a mismatch between the datasets. For example, is the sampling rate in your data the same as it is in the LibriSpeech dataset?
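A quick way to check, assuming your manifests follow the usual NeMo JSON-lines format with an `audio_filepath` field and that the `soundfile` package is installed, is a sketch like this:

```python
# Minimal sketch: report the distribution of sample rates across a NeMo-style
# JSON-lines manifest. QuartzNet15x5Base-En expects 16 kHz audio.
import json
from collections import Counter

import soundfile as sf  # pip install soundfile

def sample_rate_counts(manifest_path):
    counts = Counter()
    with open(manifest_path) as f:
        for line in f:
            path = json.loads(line)["audio_filepath"]
            counts[sf.info(path).samplerate] += 1
    return counts

# "train_manifest.json" is a placeholder path for your training manifest
print(sample_rate_counts("train_manifest.json"))
```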
-
Around 50 epochs; after 50 epochs it crashes.
-
The sample rate is the same, but when I ran a test on less data, around 200 hours, it was decreasing (blue = ~200 hours of data).
-
60s is unnecessarily long; we typically use a max of 16.7s. Btw, by default it may ignore everything longer than that. Also, what is the character set in your transcriptions? For QuartzNet, it is lowercase English letters and the apostrophe ('), with no punctuation or numbers. Try lowering your learning rate and increasing the batch size too.
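To audit the character set, something along these lines over the `text` field of your manifest should do (assuming the standard NeMo JSON-lines manifest format):

```python
# Minimal sketch: flag any characters in the manifest transcripts that fall
# outside QuartzNet's default vocabulary (lowercase a-z, space, apostrophe).
import json
import string
from collections import Counter

ALLOWED = set(string.ascii_lowercase + " '")

def out_of_vocab_chars(manifest_path):
    bad = Counter()
    with open(manifest_path) as f:
        for line in f:
            text = json.loads(line)["text"]
            bad.update(ch for ch in text if ch not in ALLOWED)
    return bad  # e.g. Counter({'.': 120, 'A': 45}) means the transcripts need cleanup

# "train_manifest.json" is a placeholder path for your training manifest
print(out_of_vocab_chars("train_manifest.json"))
```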
-
Several things: disable SpecAugment, reduce weight decay to 0, and see if the model can learn with no regularization first.
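For example, a small helper like this (the function name is just illustrative) could be called inside speech_to_text.py's main(cfg) before the model is built; the exact SpecAugment key names vary between config files, so treat them as assumptions:

```python
from omegaconf import DictConfig

def disable_regularization(cfg: DictConfig) -> None:
    """Zero out weight decay and SpecAugment masks in a NeMo Hydra config.

    Intended to be called before the model is instantiated. The SpecAugment
    key names below are assumptions and depend on the config file in use.
    """
    cfg.model.optim.weight_decay = 0.0  # no L2 regularization for the sanity run
    if "spec_augment" in cfg.model:
        sa = cfg.model.spec_augment
        for key in ("freq_masks", "time_masks", "rect_masks"):
            if key in sa:
                sa[key] = 0  # zero masks effectively disable SpecAugment
```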
-
Hello,
I am fine-tuning the QuartzNet15x5Base-En model on my own dataset, but I am getting a training WER of 100% and the training loss flattens after a few steps, although val_WER and val_loss decrease gradually.
Dataset details:
The training dataset consists of 169,630 files totaling 410.18 hours.
The validation dataset consists of 25,036 files totaling 84.82 hours.
Audio clips are between 5 and 60 seconds long, and I made sure there are no samples without a text transcript.
I'm using the file speech_to_text.py with only a small change: I added the following line before trainer.fit (a rough sketch of the modified script is at the end of this post):
asr_model.restore_from("QuartzNet15x5Base-En.nemo")
Other details are:
trainer.max_epochs=200
model.train_ds.batch_size=16
model.validation_ds.batch_size=16
model.optim.lr=0.00015
+model.validation_ds.num_workers=2
+model.train_ds.num_workers=10
+trainer.precision=16
NeMo version = 1.0.0rc1
Could you please help me troubleshoot where the problem could be?
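For context, the relevant part of the script looks roughly like this (paraphrased from NeMo 1.0.0rc1's examples/asr/speech_to_text.py; details of the unmodified lines may differ slightly from the actual script):

```python
# Rough sketch of the modified examples/asr/speech_to_text.py; the only
# change is the restore_from line added before trainer.fit.
import pytorch_lightning as pl
from omegaconf import DictConfig

import nemo.collections.asr as nemo_asr
from nemo.core.config import hydra_runner
from nemo.utils.exp_manager import exp_manager


@hydra_runner(config_path="conf", config_name="config")
def main(cfg: DictConfig):
    trainer = pl.Trainer(**cfg.trainer)
    exp_manager(trainer, cfg.get("exp_manager", None))
    asr_model = nemo_asr.models.EncDecCTCModel(cfg=cfg.model, trainer=trainer)

    # Added line, as described above
    asr_model.restore_from("QuartzNet15x5Base-En.nemo")

    trainer.fit(asr_model)


if __name__ == "__main__":
    main()
```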