Posts Tagged ‘עִבְרִית’

Disabling Hebrew speech model

Friday, September 11th, 2009

Because I don’t want to use the Hebrew speech model any more, I renamed the following folders:

Old name: /home/liberty/.kde/share/apps/simon/model
New name: /home/liberty/.kde/share/apps/simon/model-20090911-hebrew

Old name: /home/liberty/.kde/share/apps/simond/models
New name: /home/liberty/.kde/share/apps/simond/models-20090911-hebrew
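
If you would rather script the renaming than do it by hand, here is a minimal Python sketch. The function name `archive_dir` and the `-<date>-<label>` suffix pattern are my own choices for illustration, not something simon requires; the only thing taken from above is the naming scheme of the two folders.

```python
from datetime import date
from pathlib import Path


def archive_dir(path: Path, label: str, day: date) -> Path:
    """Rename `path` to `<name>-<YYYYMMDD>-<label>` inside the same parent folder."""
    target = path.with_name(f"{path.name}-{day:%Y%m%d}-{label}")
    if target.exists():
        # Never overwrite an earlier archive by accident.
        raise FileExistsError(f"refusing to overwrite {target}")
    path.rename(target)
    return target


# Example call (not executed here), using the paths from this post:
#   base = Path("/home/liberty/.kde/share/apps")
#   archive_dir(base / "simon" / "model", "hebrew", date(2009, 9, 11))
#   archive_dir(base / "simond" / "models", "hebrew", date(2009, 9, 11))
```

It is probably wise to close simon and stop ksimond before renaming anything.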

Now I can import a different lexicon. Which one should I import next? The German PLS dictionary? Or the Voxforge dictionary (HTK format)? I haven't imported a Dutch dictionary yet. I think there is one available at Voxforge.

Confidence score with Hebrew

Thursday, September 10th, 2009

A few hours ago, I created a sample Hebrew PLS dictionary. It is very short, but it shows the concept.

hebrew

1. I imported the Hebrew PLS dictionary into simon.
2. I dragged each word to the right side for training.
3. The recorded Hebrew words are stored in the folder /home/liberty/.kde/share/apps/simon/model/training.data.
4. After starting ksimond, I pressed the Synchronize button.
5. Then I activated simon.
6. When I dictated several words, simon was not sure which word was the right one. Is it מדפסת, or is it עִבְרִית? Unfortunately, I didn’t get any output in gedit or in Geany. Maybe it has something to do with the right-to-left text direction?

You can see that, thanks to UTF-8, Hebrew shouldn’t be a big problem. I don’t know why the output is missing. But at least the Hebrew words are displayed correctly, so only the last step is missing.
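
To look at the two words from the Unicode side, here is a small Python sketch. It does not explain the missing output; it only shows that both words are ordinary UTF-8 text, and that "right-to-left" is a property of the characters themselves (their bidirectional class), not a separate encoding. The helper name `describe` is made up for this example.

```python
import unicodedata


def describe(word: str) -> dict:
    """Summarise how a word is stored: code points, UTF-8 size, bidi classes."""
    return {
        "code_points": len(word),
        "utf8_bytes": len(word.encode("utf-8")),
        # "R" = right-to-left letter, "NSM" = non-spacing mark (e.g. vowel points)
        "bidi": sorted({unicodedata.bidirectional(ch) for ch in word}),
    }


for word in ("מדפסת", "עִבְרִית"):
    print(describe(word))
```

The second word carries vowel points, which are separate combining characters, so it has more code points than visible letters. If an application compares or types the text letter by letter, that might be one place to look, but this is only a guess.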

If anyone is interested in building a Hebrew PLS dictionary, I suggest you take a look at the Hebrew Voxforge prompts and add the words they contain to the dictionary. You can take my sample Hebrew dictionary and expand it. Later, once you have gained some first experience with simon, you can use the Voxforge prompts for training.
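
As a starting point, here is a rough Python sketch of that workflow: collect the words from prompt lines, then write them out as a PLS lexicon. Several things here are assumptions on my side: that each prompt line starts with a recording ID followed by the words, that the pronunciations are written in IPA, and the function names `collect_words` and `build_pls`. I have not checked whether simon accepts exactly this file, and the pronunciation in the example is only a rough, unverified placeholder.

```python
import xml.etree.ElementTree as ET

PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def collect_words(prompt_lines):
    """Return the sorted unique words, skipping the first field of each line.

    Assumption: the first field is a recording ID, the rest are the words.
    """
    words = set()
    for line in prompt_lines:
        words.update(line.split()[1:])
    return sorted(words)


def build_pls(pronunciations, lang="he"):
    """Serialise a {word: IPA string} mapping as a PLS 1.0 lexicon (XML text)."""
    lexicon = ET.Element(
        "lexicon",
        {"version": "1.0", "xmlns": PLS_NS, "alphabet": "ipa", XML_LANG: lang},
    )
    for word, ipa in sorted(pronunciations.items()):
        lexeme = ET.SubElement(lexicon, "lexeme")
        ET.SubElement(lexeme, "grapheme").text = word
        ET.SubElement(lexeme, "phoneme").text = ipa
    return ET.tostring(lexicon, encoding="unicode")


# Hypothetical prompt lines; the IDs are invented for this example.
prompts = ["he-0001 מדפסת", "he-0002 עִבְרִית מדפסת"]
print(collect_words(prompts))

# Placeholder transcription, NOT a checked pronunciation. Replace it with a real one.
print(build_pls({"עִבְרִית": "ivˈʁit"}))
```

The pronunciations still have to be written by hand (or taken from an existing dictionary); the script only saves you the typing of the XML around them.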