I imported the whole PLS dictionary /home/liberty/200905/voxDE20090209.xml into the active vocabulary. This feature was added to simon a few weeks ago:
“simon can now import dictionaries to the active lexicon.”
You know that my next goal is to hit the 1000-word mark: simon should recognize 1000 words. At the moment, I have major recognition problems, and simon isn’t very responsive. It recognizes, for example, the word “abnahmen”, but when I dictate other words (which are of course part of the active vocabulary and which I have trained successfully), simon doesn’t react. Maybe it has something to do with the confidence score? Or maybe the speech model was changed while I was playing with sam?
Well, the active vocabulary now contains more than 8000 words. When I dictate, simon now recognizes words that I have never trained. And of course, it recognizes the wrong words. So I will have to figure out how to adjust the speech model.
For example, I could record lots of single words with Audacity (not utterances, because I find it difficult to define an appropriate grammar) and then use the Export Multiple... function. I am using Audacity in combination with my external USB sound card. Under Ubuntu, this sound card only works at 22050 hertz, not at 16000 hertz. This is the reason why I am using my onboard sound card when dictating into simon directly (= recognition) or when recording words with simon (= training).
It is a bit complicated to explain. I prefer Audacity for recording because it allows me to record lots of training samples in a short amount of time. So if I record with Audacity in 22050 hertz, I have to resample the wav files with sox. I tested the command from the Sphinx guide. The following command allowed me to transform a 22050 hertz file successfully into 16000 hertz:
$ sox de27-02.wav -r 16000 -c 1 -s de27-02-test.wav
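Since Export Multiple produces many files at once, the same conversion could be applied to a whole directory. The following is only a sketch: the flags mirror the sox command above, while the helper names and the directory names `audacity-export` and `resampled` are hypothetical.

```python
from pathlib import Path


def sox_resample_command(src, dst, rate=16000, channels=1):
    """Build the argument list for the sox call shown above (same flags)."""
    return ["sox", str(src), "-r", str(rate), "-c", str(channels), "-s", str(dst)]


def batch_commands(src_dir, dst_dir, rate=16000):
    """Return one sox command per .wav file in src_dir, sorted by file name.

    The output files keep their names and are placed in dst_dir.
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    return [
        sox_resample_command(wav, dst_dir / wav.name, rate)
        for wav in sorted(src_dir.glob("*.wav"))
    ]


# Usage (assumes sox is installed and the "resampled" directory exists):
#
#   import subprocess
#   for cmd in batch_commands("audacity-export", "resampled"):
#       subprocess.run(cmd, check=True)
```

Building the commands separately from running them makes it easy to print and check them before converting a few hundred files.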
With Audacity, I could record all 8000 words that are now in my active vocabulary, let’s say in packages of 100 words. Two years ago, Audacity allowed me to export only about 30 wav files at a time; otherwise the application would crash. I will have to test the current version of Audacity. This issue has probably been fixed.
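Splitting the word list into packages is simple to script. This is a minimal sketch; the function name `batches` is hypothetical, and the numbered placeholder list only stands in for the real vocabulary.

```python
def batches(words, size=100):
    """Split a list of words into consecutive packages of at most `size` items."""
    return [words[i:i + size] for i in range(0, len(words), size)]


# Placeholder word list standing in for the real 8000-word vocabulary.
vocabulary = [f"word{n}" for n in range(8000)]
packages = batches(vocabulary)
print(len(packages), "packages of", len(packages[0]), "words")
```

With 8000 words and a package size of 100, this gives 80 recording sessions. If Audacity still has trouble with large exports, the `size` argument can be lowered to 30.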
My main concern is that the words of my dictionary are often very similar. Here is an example:
DUTZEND [Dutzend] d U ts @ n t
DUTZEND [Dutzend] d U ts n= t
DUTZENDE [Dutzende] d U ts @ n d @
DUTZENDE [Dutzende] d U ts n= d @
DUTZENDEN [Dutzenden] d U ts @ n d @ n
DUTZENDEN [Dutzenden] d U ts n= d @ n
DUTZENDEN [Dutzenden] d U ts n= d n=
DUTZENDS [Dutzends] d U ts n= ts
Those are eight entries that are very similar. I think that this is very hard to train successfully. I will have to find out how I can achieve my 1000-word goal. Maybe I should reduce the size of the active vocabulary from 8000 words to 1000 words? The result would be that I could use a set of words that aren’t too similar, and thus I would get better recognition results.
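One possible way to pick words that aren’t too similar is to compare their phoneme sequences and keep a word only if it is far enough from every word already kept. The sketch below assumes the entry format shown above (word, bracketed display form, then phonemes); the function names and the threshold of 3 are hypothetical choices, not something simon provides.

```python
def parse_entry(line):
    """Split 'DUTZEND [Dutzend] d U ts @ n t' into (word, tuple of phonemes)."""
    word, _display, *phones = line.split()
    return word, tuple(phones)


def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute or match
        prev = cur
    return prev[-1]


def pick_dissimilar(lines, min_distance=3):
    """Greedily keep words whose pronunciations all differ from every
    already kept pronunciation by at least min_distance phoneme edits."""
    variants_by_word = {}
    for line in lines:
        if line.strip():
            word, phones = parse_entry(line)
            variants_by_word.setdefault(word, []).append(phones)

    kept = {}
    for word, variants in variants_by_word.items():
        too_close = any(
            edit_distance(v, k) < min_distance
            for v in variants
            for kept_variants in kept.values()
            for k in kept_variants
        )
        if not too_close:
            kept[word] = variants
    return list(kept)


entries = """\
DUTZEND [Dutzend] d U ts @ n t
DUTZEND [Dutzend] d U ts n= t
DUTZENDE [Dutzende] d U ts @ n d @
DUTZENDE [Dutzende] d U ts n= d @
DUTZENDEN [Dutzenden] d U ts @ n d @ n
DUTZENDEN [Dutzenden] d U ts n= d @ n
DUTZENDEN [Dutzenden] d U ts n= d n=
DUTZENDS [Dutzends] d U ts n= ts
""".splitlines()

print(pick_dissimilar(entries))  # ['DUTZEND']
```

For the eight entries above, only DUTZEND survives, because every other word has at least one pronunciation within two phoneme edits of it. The greedy pass depends on the order of the dictionary, and comparing every pair gets slow for large lexicons, so this is a starting point rather than a finished tool.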
I could also sort out short words, since short words are harder to recognize than long ones. The trick is to train only long words and to leave out the short ones.
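Sorting out short words could be done by counting phonemes per dictionary entry. The sketch below again assumes the entry format used in the example above; the function name and the minimum of 6 phonemes are hypothetical. A word is dropped as soon as one of its pronunciation variants is too short.

```python
def drop_short_words(lines, min_phones=6):
    """Return the words whose every pronunciation has at least min_phones phonemes.

    Each line is expected to look like 'DUTZEND [Dutzend] d U ts @ n t'.
    """
    shortest = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or incomplete lines
        word, phone_count = parts[0], len(parts) - 2
        shortest[word] = min(shortest.get(word, phone_count), phone_count)
    return [word for word, count in shortest.items() if count >= min_phones]


sample = [
    "DUTZEND [Dutzend] d U ts @ n t",
    "DUTZEND [Dutzend] d U ts n= t",
    "DUTZENDE [Dutzende] d U ts @ n d @",
    "DUTZENDE [Dutzende] d U ts n= d @",
    "DUTZENDS [Dutzends] d U ts n= ts",
]
print(drop_short_words(sample))  # ['DUTZENDE']
```

Counting phonemes rather than letters matches what the recognizer actually has to tell apart; which threshold works well would have to be found by testing.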
I am interested in the following function:
“simon can now import prompts files through the import training data wizard.”
This is a very interesting function for me. I have recorded lots of utterances, and they could be imported into simon. But I have one problem: I didn’t define an appropriate grammar. I could use a grammar that has just one category of words (e.g. all words are marked as nouns, no matter what they really are; they can be adverbs, adjectives, verbs, etc.). So this could be the way to go.
I think that the 1000-word goal could be hit with the present vocabulary of 8000 words. Julius allows dictation with 20,000 words, so 1000 words is a reasonable goal. When I have reached that goal, I will have to think about the following question: How can I hit the 10,000-word mark? First, I need a bigger lexicon. I don’t want to use BOMP, since I would have to write them an email to get it. I prefer to stick to free dictionaries.
Another solution could be that I switch from the German PLS dictionary to the English Voxforge dictionary. I could do the testing in English.