Because I don’t know how to restore my German speech model (with more than 200 words), I want to train simon with the English dictionary VoxForgeDict. To do that, VoxForgeDict first has to be imported into simon as an HTK lexicon.
This dictionary has about 130,000 entries. That should be enough for the English language. The encoding is UTF-8.
Now I can start to train a few words. Let’s start with REGRETTING:
1. Include unused words from the shadow dictionary.
2. Drag & drop the word REGRETTING into the right field.
3. Train selected words.
4. The active vocabulary doesn’t contain the word REGRETTING. I want to add it now. Press the Yes button.
I recorded the word REGRETTING three times and defined a grammar (“Unknown”). I then disconnected simon, restarted ksimond, connected simon again, and pressed the Synchronize button.
simon now recognizes the one word that is part of my active vocabulary: REGRETTING.
VoxForgeDict contains only words written in capital letters. It would be possible to fix that with a simple text editor (though it would be a lot of work to go through the whole dictionary). Does anyone have an idea how the dictionary could be converted to lowercase elegantly?


The active dictionary is not the same as the shadow dictionary. When you add a word, the “correct” case will be fetched from the shadow dictionary but you are free to change it.
So the next time you add a word, simply enter “regretting”. simon will fetch the word and its details from the VoxForge lexicon and thus change the spelling to “REGRETTING”. Just change the spelling again to anything you want (like “regretting”) and continue the wizard. The words will still be recognized as being the same (“regretting” == “REGRETTING” as far as the training data / lexicon is concerned).
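If you would still rather lowercase the lexicon file itself, that can be scripted instead of edited by hand. Below is a minimal Python sketch, assuming each line has the layout WORD [OUTPUT] phonemes, with the bracketed part optional. The function names and the file names in the usage line are made up for illustration. It lowercases only the word and the bracketed output symbol and leaves the phonemes untouched. Special entries (sentence markers, for example), if the file has any, may need to stay as they are, so compare the result with the original before importing it.

```python
import re

# Assumed line layout: WORD, an optional [OUTPUT] symbol, then the phonemes.
ENTRY = re.compile(r"^(\S+)(\s*)(\[[^\]]*\])?(.*)$")


def lowercase_entry(line: str) -> str:
    """Lowercase the word and its bracketed output symbol; keep the phonemes."""
    match = ENTRY.match(line)
    if match is None:
        # Empty or unexpected line: pass it through unchanged.
        return line
    word, gap, output, rest = match.groups()
    return word.lower() + gap + (output.lower() if output else "") + rest


def lowercase_lexicon(src: str, dst: str, encoding: str = "utf-8") -> None:
    """Write a copy of the lexicon at src to dst with lowercased words."""
    with open(src, encoding=encoding) as fin, \
            open(dst, "w", encoding=encoding) as fout:
        for line in fin:
            fout.write(lowercase_entry(line.rstrip("\n")) + "\n")


# Hypothetical file names; adjust to where your copy of the lexicon lives.
# lowercase_lexicon("VoxForgeDict", "VoxForgeDict.lowercase")
```

The script writes to a new file rather than overwriting the original, so the uppercase lexicon stays available if the import of the lowercase copy does not behave as expected.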
Greetings,
Peter