At the moment, I am importing 21007 wav files into simon. I want to increase the amount of training data.
It takes a few minutes until all files are imported.
I hope that it will work as expected.
Edit: No, it only partially worked. The file file:///home/ubuntu/.kde/share/apps/simond/models/default/active/model.dict contains only about 862 words. Why does simon use just this small subset?
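A count like this can be reproduced with a few lines of Python. This is only a rough sketch: it assumes the dictionary is a plain-text file with one entry per line and the word as the first whitespace-separated field (as in HTK-style dictionaries), which I have not verified for model.dict. The function name count_dict_words is made up for this example.

```python
from pathlib import Path


def count_dict_words(path):
    """Count distinct words in a line-oriented pronunciation dictionary.

    Assumption: one entry per line, with the word as the first
    whitespace-separated field. Blank lines are ignored, and a word
    with several pronunciations is counted once.
    """
    words = set()
    text = Path(path).read_text(encoding="utf-8", errors="replace")
    for line in text.splitlines():
        fields = line.split()
        if fields:
            words.add(fields[0])
    return len(words)
```

Calling count_dict_words("/home/ubuntu/.kde/share/apps/simond/models/default/active/model.dict") should then give the number of distinct words, provided the format assumption holds.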
Edit #2: I had selected the “static model” option instead of “user generated model”. I will try again.
Edit #3: Apparently, I imported only a small dictionary. This means that the acoustic model has been trained on 21007 recordings, but the language model contains just 862 words.
Edit #4: I think I found the mistake. The file file:///home/ubuntu/Documents/201008/lexicon-all.xml is not valid XML. I have to fix that, then I will try again.
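To locate the broken spot in a file like this, a quick well-formedness check with Python's standard library may help. It is a minimal sketch: the function name check_well_formed is made up here, and the parser only checks that the XML is well-formed (matching tags, proper escaping), not that it follows whatever lexicon schema simon expects.

```python
import xml.etree.ElementTree as ET


def check_well_formed(path):
    """Return None if the file parses as XML, else (line, column, message).

    Only well-formedness is checked; no schema or DTD validation is done.
    """
    try:
        ET.parse(path)
    except ET.ParseError as err:
        # err.position is a (line, column) tuple pointing at the error.
        line, column = err.position
        return line, column, str(err)
    return None
```

For example, check_well_formed("/home/ubuntu/Documents/201008/lexicon-all.xml") would return the position of the first parse error, or None if the file parses cleanly.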
When I build huge models, I usually import the dictionary into the active dictionary (beware: this will use a LOT of RAM, about 3 GB when synchronizing) and just let simon figure out the rest.
Regards,
Peter