Ralf’s modern Greek speech model

May 15th, 2012 by producer

Some words about the creation of Ralf’s modern Greek speech model.

1. I used Ralf’s Greek dictionary as the source dictionary and took a look at this word list. Then, using the style sheet compare-popular-words.xsl, I created the target dictionary popular-greek-words-dictionary.xml with the following command:

cat 'greek-dictionary.xml.bz2' | bunzip2 -k | saxonb-xslt -ext:on -s:- -xsl:'compare-popular-words.xsl'

I excluded the <lexeme> elements whose <phoneme> elements contain the phoneme [ʎ] or [ɲ]:

not(contains($dictionary-phoneme, 'ʎ')) and not(contains($dictionary-phoneme, 'ɲ'))

I excluded these phonemes because the Simon PLS import process obviously doesn’t “like” these specific phonemes.
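
For illustration, the filter could sit in a style sheet roughly like the following sketch (this is not the actual compare-popular-words.xsl, which additionally compares against the popular-word list; it assumes the dictionary uses the PLS namespace):

<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xpath-default-namespace="http://www.w3.org/2005/01/pronunciation-lexicon">
  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>
  <!-- Copy the lexicon, but keep only lexemes whose phonemes contain neither [ʎ] nor [ɲ]. -->
  <xsl:template match="/lexicon">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:copy-of select="lexeme[not(phoneme[contains(., 'ʎ') or contains(., 'ɲ')])]"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>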

Ralf’s General American speech model 0.1.1

May 11th, 2012 by producer

Download Ralf’s General American speech model 0.1.1 – it contains the words from sections andrew.xml and beverly.xml.

This speech model is growing:
version 0.1: about 1000 words (just section andrew)
version 0.1.1: about 2000 words (sections andrew and beverly)
version 0.1.2: will have about 3000 words

Update: Version 0.1.2 with about 3000 words is now available. Additionally, it contains the words from section clark.

Update 2: Version 0.1.3 with about 4000 words is now available. It additionally contains the words from section diana.

Update 3: Version 0.1.4 with about 5000 words is now available, with additional words from section ethan.

Update 4: Version 0.1.5 with about 6000 words is now available with additional words from section franci.

Update 5: Version 0.1.6 with about 8000 words is now available with additional words from sections gary and hera.

Update 6: Version 0.1.7 with about 10,000 words is now available. Additionally, it contains words from sections isaac and jane.

Update 7: Version 0.1.8 with about 12,000 words contains words from sections kevin and lana.

Update 8: Version 0.1.9 additionally contains the words from sections marc and nadja.

Update 9: Version 0.1.9.1 contains words from sections oliver and piper.

Long vowels and the colon mark

May 11th, 2012 by producer

I just found out that my General American PLS dictionary often contains the plain colon mark : instead of the IPA length mark ː (U+02D0).

Let me give an example:

<lexeme role="">
<grapheme>zorro</grapheme>
<phoneme>zˈɔ:roʊ</phoneme>
</lexeme>

The long o vowel is marked with a plain colon instead of the IPA length mark.
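
If I want to repair this in bulk later, an identity transform should do it. Here is a minimal sketch (it assumes the dictionary uses the PLS namespace; translate() swaps every plain colon inside <phoneme> for the length mark):

<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xpath-default-namespace="http://www.w3.org/2005/01/pronunciation-lexicon">
  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>
  <!-- Identity template: copy everything unchanged. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>
  <!-- Inside <phoneme>, replace the ASCII colon with the IPA length mark ː (U+02D0). -->
  <xsl:template match="phoneme/text()">
    <xsl:value-of select="translate(., ':', 'ː')"/>
  </xsl:template>
</xsl:stylesheet>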

After the deletion of the files …

May 10th, 2012 by producer

Recently, I had the problem that I couldn’t start Simon any more. After the deletion of the files …

/home/ubuntu/.kde/share/config/simonrc
/home/ubuntu/.kde/share/config/speechmodelmanagementrc
/home/ubuntu/.kde/share/config/simonrecognitionrc
/home/ubuntu/.kde/share/config/simonscenariosrc
/home/ubuntu/.kde/share/config/simonsoundrc

… I could start Simon again. I will keep that in mind.

Ralf’s General American speech model 0.1

May 10th, 2012 by producer

Here is how I created Ralf’s General American speech model version 0.1.

1. Schott’s General American dictionary has to be reduced, because I want to use just popular words. Where can I find information about popular words? I found a good source. And what about copyright? I combined several sources, mixed them, and extracted a subset following specific criteria, so I didn’t create a work that could be called derivative. In short, I produced the dictionary Popular words - GA dictionary with the following command:

cat 'general-american-dictionary.xml.bz2' | bunzip2 -k | saxonb-xslt -ext:on -s:- -xsl:'compare-popular-words.xsl'

“Could not initialize scenarios”

May 9th, 2012 by producer

Let me explain my current problem:

1. Menu > Applications > Universal Access > Simon
2. Loading core…
3. Simon displays the following error message:

Could not initialize scenarios and shadow dictionary.

4. Backtrace:

Application: Simon (simon), signal: Segmentation fault
[Current thread is 1 (Thread 0x7f8852c82780 (LWP 4135))]

Thread 2 (Thread 0x7f8836cfb700 (LWP 4136)):
#0 0x00007f884e2ab473 in poll () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007f884c0d5f68 in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#2 0x00007f884c0d6429 in g_main_context_iteration () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#3 0x00007f8851463f3e in QEventDispatcherGlib::processEvents(QFlags) () from /usr/lib/x86_64-linux-gnu/libQtCore.so.4
#4 0x00007f8851437cf2 in QEventLoop::processEvents(QFlags) () from /usr/lib/x86_64-linux-gnu/libQtCore.so.4
#5 0x00007f8851437ef7 in QEventLoop::exec(QFlags) () from /usr/lib/x86_64-linux-gnu/libQtCore.so.4
#6 0x00007f885134f27f in QThread::exec() () from /usr/lib/x86_64-linux-gnu/libQtCore.so.4
#7 0x00007f8851351d05 in ?? () from /usr/lib/x86_64-linux-gnu/libQtCore.so.4
#8 0x00007f884c5a7efc in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#9 0x00007f884e2b759d in clone () from /lib/x86_64-linux-gnu/libc.so.6
#10 0x0000000000000000 in ?? ()

Thread 1 (Thread 0x7f8852c82780 (LWP 4135)):
[KCrash Handler]
#6 0x000000000040f712 in SimonView::updateActionList (this=0x22984d0) at /home/ubuntu/simon/simonsource/simon/src/simonview.cpp:214
#7 0x000000000040f7c0 in SimonView::displayScenarioPrivate (this=0x22984d0, scenario=) at /home/ubuntu/simon/simonsource/simon/src/simonview.cpp:306
#8 0x000000000041001c in SimonView::SimonView (this=0x22984d0, parent=, flags=, __in_chrg=, __vtt_parm=) at /home/ubuntu/simon/simonsource/simon/src/simonview.cpp:175
#9 0x000000000040d997 in main (argc=1, argv=0x7fff7abdf668) at /home/ubuntu/simon/simonsource/simon/src/main.cpp:88

I reported this bug.

5. This bug happened after I deleted the following folders:

/home/ubuntu/.kde/share/apps/simon
/home/ubuntu/.kde/share/apps/simond
/home/ubuntu/.kde/share/apps/sam

6. Simon doesn’t work any more. Even a fresh git clone combined with ./build.sh doesn’t help.

Recognition of untrained words

May 9th, 2012 by producer

I am still using my French scenario, but I deleted the active vocabulary as well as the shadow vocabulary. Then I imported a reduced version of Schott’s General American dictionary into Simon as active vocabulary and pressed Synchronize. And look what happened:

Some words get a recognition rate of 1 even though I have never trained them in English. I have the impression that Simon stores previous recordings somewhere and reuses them for the generation of the speech model. I don’t understand what is happening inside Simon. It is interesting to see that it is possible to recognize some English words even though I have never trained a single English word.

I can answer the following question:

Question for the future: Okay so take all four language and audio models, combine them into one large multilingual model. Will it work?

The answer is: yes, it will work. I didn’t want to test it, but from my experience with Simon I can say that a mixture is possible. Still, I don’t want to mix the different languages; each language should be developed completely independently. So the question is: how can I remove old recordings from the Simon folders so that they won’t be used again for the creation of the next speech model?

I am now taking a look into /home/ubuntu/.kde/share/apps/simon/model/prompts. Obviously, the German prompts are still available. Why is that? Here is one entry in the prompts file:

pʊzəls_35490_2012-05-05_18-49-01 "default" PUZZLES

The pronunciation is in German. And in my vocabulary list in Simon, there is a word puzzles with the SAMPA transcription p V z @ l z – obviously this transcription has been derived from my reduced GA dictionary. Take a look at the corresponding entry:

<lexeme role="">
<grapheme>puzzles</grapheme>
<phoneme>pˈʌzəlz</phoneme>
</lexeme>

The IPA phone [ʌ] has been converted into the SAMPA phone V. That is OK. What is not OK is that the recognition rate for this specific word is displayed as 1. I didn’t train the word in English; I had trained it in German (with the phone [ʊ] instead of [ʌ]).

In short: I don’t want to get a mixture of different languages. How can I delete old prompts / wav recordings from Simon completely so that they won’t be used again?

Let’s take a look into the folder /home/ubuntu/.kde/share/apps/simon/model/training.data. At the moment it contains 38,056 wav files. When will these files be deleted?

Before creating a new scenario / acoustic model, I think I will delete the following folders completely:

/home/ubuntu/.kde/share/apps/simon
/home/ubuntu/.kde/share/apps/simond
/home/ubuntu/.kde/share/apps/sam

I hope that this is sufficient to produce a clean fresh speech model.

Update: I deleted the folders. But now I am not able to start Simon again. So I explicitly don’t recommend deleting these folders!

Ralf’s French speech model

May 9th, 2012 by producer

I want to explain how I create Ralf’s French speech model. This model will have about 1000 words. I will publish the corresponding FLAC files. The pronunciation will be bad even though I studied for 5 months in the French-speaking part of Switzerland (in Lausanne). Let’s get started.

1. First, I want to reduce the size of Ralf’s French dictionary version 0.1.3. I do this in the Linux Mint terminal:

saxonb-xslt -ext:on -s:french-dictionary.xml -xsl:reduce-french.xsl -o:french-reduced.xml

Some remarks about the XSLT style sheet reduce-french.xsl (a rough sketch follows after this list):
- Not every <lexeme> element is selected thanks to the XPath expression [position() mod 130 = 1].
- <lexeme> elements with empty role attribute or with a <phoneme> element that contains the phone [ɲ] are excluded: not(@role='') and not(contains(phoneme, 'ɲ')) – Simon doesn’t render this specific phone correctly.
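
For illustration, the core of such a style sheet could look roughly like this (a sketch along the lines described above, not the published reduce-french.xsl; the PLS namespace and the order of the two filter steps are assumptions):

<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xpath-default-namespace="http://www.w3.org/2005/01/pronunciation-lexicon">
  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>
  <xsl:template match="/lexicon">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <!-- Drop lexemes with an empty role or with a phoneme containing [ɲ],
           then keep only every 130th of the remaining lexemes. -->
      <xsl:copy-of select="lexeme[not(@role = '') and not(phoneme[contains(., 'ɲ')])]
                                 [position() mod 130 = 1]"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>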

2. The following operations are rather complicated, but I will explain them step by step. It would be too time-consuming to record 1000 words with Simon. Because I want to save time, I will use Audacity for the generation of the audio files. But before I can do that, I have to convert the dictionary french-reduced.xml into a phoneme list file (a sketch of such a style sheet follows after the command). This is done in the terminal:

saxonb-xslt -ext:on -s:french-reduced.xml -xsl:lexicon2phonemelist.xsl
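
Conceptually, lexicon2phonemelist.xsl is just a text-output transform. A minimal sketch (not the actual style sheet; it assumes that one line per word – the grapheme followed by its IPA transcription – is enough for reading the words aloud):

<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xpath-default-namespace="http://www.w3.org/2005/01/pronunciation-lexicon">
  <xsl:output method="text" encoding="UTF-8"/>
  <!-- One line per lexeme: the grapheme followed by its (first) IPA transcription. -->
  <xsl:template match="/lexicon">
    <xsl:for-each select="lexeme">
      <xsl:value-of select="concat(grapheme[1], '  [', phoneme[1], ']', '&#10;')"/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>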

3. Now I can record each word with Audacity, speaking the IPA phoneme list into the microphone.

Ralf’s Dutch speech model

May 8th, 2012 by producer

Some details about the creation of Ralf’s Dutch speech model:

1. After pressing the button Manage scenarios, I created a scenario with the name Dutch.

2. Then I pressed the button Open “Dutch”, and imported Ralf’s Dutch dictionary as PLS dictionary.

3. The active vocabulary is empty. Switch to the Shadow vocabulary tab.

Ralf’s Catalan speech model

May 8th, 2012 by producer

This article explains some details about the creation of Ralf’s Catalan speech model.

1. I am preparing a Catalan scenario.

2. Press Open “Catalan”.

3. Press the Import dictionary button.

4. Import the dictionary as shadow dictionary.

5. Select the type of the dictionary. Select PLS lexicon. Press the Next button.

6. Provide the downloaded PLS dictionary. Press the blue folder button.

7. After downloading Ralf’s Catalan dictionary, you can import the dictionary into simon. Select the Catalan dictionary. Press OK.

8. The Catalan dictionary has been selected. Press the Next button.

9. The dictionary has been imported successfully. Press the Finish button.

10. The active vocabulary is empty. Switch to the Shadow vocabulary tab.

11. Scroll down. Select a word from the list.

12. Press the Add to training button. Press Train selected words.

13. Simon now displays a message:

Your vocabulary does not define all words used in this text. These words are missing:
servicis

Do you want to add them now?

Press the Yes button.

14. The Define word window opens. You don’t have to change anything here (you could, but you don’t have to). Simply press the Next button.

15. Please speak a word. The volume will be calibrated by Simon automatically.

16. The volume is now correct. Press the Next button.

17. Now it is time to record the Catalan word servicis. Press the Record button.

18. Simon is recording the word. Press the Record button again. Then press the Next button. Record the word again.

19. The new word has been added. Press the Finish button.

20. Now you know how to train a word with Simon. I recommend that you train 10 words. At least, that’s what I do. You can train any number of words – you can compile a speech model with even one trained word.

21. Additional words have been trained. Now switch to the Active vocabulary tab.

22. Ten Catalan words have been trained. The recognition rate is 2. That means that each word has been recorded two times.

23. Select Actions > Synchronize. Simon is now synchronizing. Wait a few moments. Now select Actions > Activate.

24. I won’t explain the next steps in my blog. At the moment I have the problem that simon is recognizing words that are part of my previous speech model. I have to do some modifications to make it work.

25. Download Ralf’s Catalan speech model, and use it with Simon.

Ralf’s Breton speech model

May 8th, 2012 by producer

Some words about Ralf’s Breton speech model (license: GPLv3 – as always):

I trained the model with 10 Breton words. Each word has been recorded two times.

Download the speech model, and use it with Simon.

Import of my GA dictionary; thoughts

May 7th, 2012 by producer

I want to test whether the import of Schott’s General American dictionary now works better. Here is what I do:

1. Linux Mint terminal:

git pull origin master
./build.sh

2. Create a scenario test-us.
3. Import the dictionary and take a look at the result:

The word gardening looks much better than before, but it isn’t perfect yet. The primary stress has been omitted (which is a good decision; we don’t need the stress marks at the moment, only in the long run, in a few years or so). But what happened to the secondary stress mark? It became a phoneme with the ASCII transcription perc. This isn’t optimal. At least the secondary stress mark is processed by Simon.

Some thoughts about stress marks: we don’t need them right now. This is why Schott’s German dictionary doesn’t contain any stress information. There are long vowels and short vowels; this is sufficient information for dictation.

When I generated Schott’s General American dictionary with eSpeak (I chose eSpeak because I “know” the results; some argue that Festival is better – that might be the case, but I didn’t want to “learn” Festival), I didn’t want to omit the stress marks. I think that in the long run this is valuable information.

So that’s the current situation:
- Schott’s German dictionary doesn’t contain stress marks.
- Schott’s General American dictionary contains stress marks. I don’t know whether they are in the right position according to the IPA specification. But at the moment I don’t care – if the positions were wrong, it would be easy to move the stress marks to the right position with an XSLT style sheet (a sketch follows below).
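
Removing (or, with a smarter replacement, relocating) the stress marks would be a small XSLT job. A minimal sketch that simply strips both marks from the phonemes (PLS namespace assumed):

<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xpath-default-namespace="http://www.w3.org/2005/01/pronunciation-lexicon">
  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>
  <!-- Identity template: copy everything unchanged. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>
  <!-- Strip the primary (ˈ) and secondary (ˌ) stress marks from the phonemes. -->
  <xsl:template match="phoneme/text()">
    <xsl:value-of select="translate(., 'ˈˌ', '')"/>
  </xsl:template>
</xsl:stylesheet>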

Some thoughts:

By the way, I recommend XSLT for dictionary development. It is easy to use and distinguishes clearly between lower case and upper case. It isn’t necessary to install some extra library to make it UTF-8 compliant. And you can publish PLS/IPA dictionaries as .xml files (e.g. take a look at Ralf’s Basque IPA FLAC files – it is a PLS dictionary combined with an XSLT style sheet – you can download this dictionary and import it directly into Simon – the full PLS standard is met).

Why am I sometimes using the name Ralf’s … and sometimes the name Schott’s …? The reason is the following: first drafts are published under my nickname Ralf. As soon as they have evolved, I change the name to Schott’s (which is my real name).

The next step will be to rename Ralf’s German speech model into Schott’s German speech model. Maybe the next version of my speech model will carry the new name. I am not sure yet.

Ralf’s Basque speech model

May 7th, 2012 by producer

Some words about the creation of Ralf’s Basque speech model:

1. I reduced the size of Ralf’s Basque dictionary to 112 words.

2. I recorded 112 words with Audacity.

3. Export the audio files as FLAC. You can listen to Ralf’s Basque IPA FLAC files. By the way, this is the first time in my life that I have spoken Basque – so the pronunciation is very bad. I have never learned a word of Basque in my life. But my very bad pronunciation should be sufficient for the generation of Ralf’s Basque speech model. The underlying concept is always the same for each language.

4. I imported the reduced dictionary. You can get the reduced PLS dictionary from here: http://script.blau.in/basque/dictionary.xml – it is a valid PLS/IPA dictionary that you could import into simon. You don’t have to import the dictionary into simon because Ralf’s Basque speech model itself will contain a scenario file with the already imported Basque words.

5. Take a look at the Active vocabulary. I imported the reduced version of my Basque dictionary as active vocabulary. Each word has the terminal information Unknown. I will have to adjust the Grammar to Unknown.

6. Switch to the Grammar tab. Press the button Add sentence. Add as sentence the word “Unknown”.

7. Enter the new sentence structure. The structure is very simple: “Unknown”.

8. Switch to the Commands tab. Press the button Manage plugins.

9. Press the Add button. It is necessary to activate the dictation plugin.

10. Select Dictation.

11. Append text after result. Press the space bar once. This means that after each recognized word, there will be a space. In my opinion, a trailing space should be the default configuration.

12. Go to Settings > Configure Simon. I want to import FLAC files. It is necessary to tell Simon that I want to import FLAC files (and not wav files).

13. a. Select Recordings.
b. Switch to the Post-Processing tab.
c. Select Apply filters to recordings recorded with simon. OK, my recordings haven’t been recorded with Simon. They have been recorded with Audacity. But never mind.
d. Now it is necessary to add the correct command. It is sox -t flac %1 -t wav % 2 – see step 14 in this article.

14. Now go to the Training tab. Press the button Import training data.

15. You now have to set the path to the training samples. The screen shot is an old one – so don’t be confused by the path names.

16. Now I generate the prompts file with the following command in the Linux Mint terminal:

cat dictionary.xml | saxonb-xslt -ext:on -s:- -xsl:lexicon2prompts.xsl

17. Now I got two error messages. Error message 1: “Could not process soundfiles”
Error message 2: “Could not process “/home/ubuntu/Documents/basque-speech-model/flac/gar.flac” to “/home/ubuntu/.kde/share/apps/simon/model/training.data//gar_0_2012-05-07_12-50-55.wav”. Please check the command:
“sox -t flac /home/ubuntu/Documents/basque-speech-model/flac/gar.flac -t wav % 2”. (Return value: 2)”
Obviously, I made a mistake. Now I repeat step 14 with the following modification: “sox -t flac %1 -t wav %2” (I removed a space).

18. The import of the folder has been completed.

19. I won’t explain the rest of the steps. Please read some of my previous articles. The recognition is working (of course with many recognition errors, but you at least get results). Here is what Simon “typed” into gedit:

aitzin-euskara aitzin-ikusle aitzindari gantzutu garabete saindu saindutegi saino xerratu

This means that the speech model is working. Download Ralf’s Basque speech model, and use it with simon 0.3.80.

Generation of the transcription failed

May 6th, 2012 by producer

I want to use sam for model creation (normally I would say that Simon is sufficient, but unfortunately Simon removes untrained words). I just pressed the Build model button. Then the following error message appeared:

Generation of the transcription failed. Please check if you have correctly specified the paths to mkphones0.led and mkphons1.led. (/mkphones0.led, /mkphones1.led)

It says in the Build log:

ERROR [+5010] InitSource: Cannot open source file /mkphones0.led
ERROR [+1210] ReadScript: Can’t open file /mkphones0.led
FATAL ERROR – Terminating program /usr/local/bin/HLEd

Now I am trying the following in the Linux Mint terminal:

sudo cp /usr/share/kde4/apps/simon/scripts/mkphones0.led /

Then I press the Build model button (sam) again. Now the following error message appears:

Word undefined: default

It says in the sam Build log:

Generating Master Label File…
“/usr/local/bin/HLEd” -A -D -T 1 -l “*” -d “/tmp/kde-ubuntu/sam/internalsamuser/compile//dict” -i “/tmp/kde-ubuntu/sam/internalsamuser/compile//phones0.mlf” “/mkphones0.led” “/tmp/kde-ubuntu/sam/internalsamuser/compile//words.mlf”
/usr/local/bin/HLEd -A -D -T 1 -l * -d /tmp/kde-ubuntu/sam/internalsamuser/compile//dict -i /tmp/kde-ubuntu/sam/internalsamuser/compile//phones0.mlf /mkphones0.led /tmp/kde-ubuntu/sam/internalsamuser/compile//words.mlf

No HTK Configuration Parameters Set

Editing file: ʔaʊ̯sʃʊsmɪtgliːt_12597_2012-05-05_18-42-37.lab

ERROR [+1232] NumParts: Cannot find word default in dictionary
FATAL ERROR – Terminating program /usr/local/bin/HLEd

I take a look into words.mlf:

#!MLF!#
“*/ʔaʊ̯sʃʊsmɪtgliːt_12597_2012-05-05_18-42-37.lab”
“default”
AUSSCHUSSMITGLIED
.
“*/ʔʀøːdəlhaɪ̯ms_29063_2012-05-05_18-47-12.lab”
“default”
RÖDELHEIMS
.
“*/gəlaɪ̯t͡s_17734_2012-05-05_18-44-01.lab”
“default”
GELEITS

Obviously, there is always a “default” entry. I don’t have any clue what to do.

Automatic removal of untrained words

May 6th, 2012 by producer

The recognition is pretty good. Unfortunately, Simon doesn’t allow untrained words. At the moment, I have a scenario with 360,000 German words. All words consist of triphones that are covered by my acoustic model. But only 10% of these words have been trained. I don’t want to train all words; that would be too complicated and too time-consuming. I guess that I will have to use sam for model generation:

You can use sam to not suffer from the automatic removal of untrained words.

What a pity. In my opinion, Simon itself should offer a button to disable the automatic removal of untrained words.

Update: Is there a specific file that I could modify to manipulate the recognition rate from 0 to 1?

“buffer overflow detected”

May 6th, 2012 by producer

I want to test my 360,000-word German speech model, but unfortunately I got an error message. Simon displays the following text:

The recognition reported the following error:
Failed to setup recognition: Julius did not initialize correctly

I will recompile the model, then test again.

Update May 7, 2012: I tested it again – almost the same error message. Obviously, it is difficult to initialize Julius when the active vocabulary contains 360,000 words.

Update 2: It is possible to recognize a dictionary of up to 50,000 words. If the dictionary is bigger, an error message appears (see link above).

Update 3: On my computer (Linux Mint 12), Julius version 4.1.5 is installed. The current version is 4.2.1. Maybe I should install the latest version of Julius.

Update 4: Maybe I should compile Julius from source:

When you want to change some compile-time settings of Julius (ex. vocabulary size limit or input length limit, search algorithm variants, …), you should compile Julius from the source codes.

Is there a vocabulary size limit in the pre-packaged Julius 4.1.5? Should I remove this installation from my computer and compile from source instead? Specific libraries are required to install Julius; probably I already have them installed. Now I try to install Julius:

cd /home/ubuntu/Documents/julius-4.2.1

And now I have a problem:

./configure
creating cache ./config.cache
checking host system type… x86_64-unknown-linux
checking host specific optimization flag… no
checking for gcc… gcc
checking whether the C compiler (gcc ) works… yes
checking whether the C compiler (gcc ) is a cross-compiler… no
checking whether we are using GNU C… yes
checking whether gcc accepts -g… yes
checking how to run the C preprocessor… gcc -E
checking for a BSD compatible install… /usr/bin/install -c
checking for rm… /bin/rm
checking for Cygwin environment… no
checking for mingw32 environment… no
checking for executable suffix… no
updating cache ./config.cache
creating ./config.status
creating Makefile
creating mkbingram/Makefile
creating mkbinhmm/Makefile
creating adinrec/Makefile
creating adintool/Makefile
creating mkss/Makefile
creating generate-ngram/Makefile
creating jclient-perl/Makefile
creating man/Makefile
configuring in mkgshmm
running /bin/sh ./configure --cache-file=.././config.cache --srcdir=.
loading cache .././config.cache
checking for a BSD compatible install… (cached) /usr/bin/install -c
checking for rm… (cached) /bin/rm
checking for perl… /usr/bin/perl
checking for Cygwin environment… (cached) no
checking for mingw32 environment… (cached) no
checking for executable suffix… (cached) no
updating cache .././config.cache
creating ./config.status
creating Makefile
creating mkgshmm
configuring in gramtools
running /bin/sh ./configure --cache-file=.././config.cache --srcdir=.
loading cache .././config.cache
checking host system type… x86_64-unknown-linux
checking host-specific optimization flag… no
checking for gcc… (cached) gcc
checking whether the C compiler (gcc ) works… yes
checking whether the C compiler (gcc ) is a cross-compiler… no
checking whether we are using GNU C… (cached) yes
checking whether gcc accepts -g… (cached) yes
checking how to run the C preprocessor… (cached) gcc -E
checking for a BSD compatible install… (cached) /usr/bin/install -c
checking for Cygwin environment… (cached) no
checking for mingw32 environment… (cached) no
checking for executable suffix… (cached) no
checking host specific optimization flag… skipped
checking for rm… (cached) /bin/rm
checking for perl… (cached) /usr/bin/perl
checking for yywrap in -lfl… no
configure: error: flex library not found! installation terminated
configure: error: ./configure failed for gramtools
ubuntu@ubuntu-MS-7597 ~/Documents/julius-4.2.1 $

It doesn’t seem to work. I should install the flex library. Now I do the following:
sudo apt-get install flex
And now again in the Linux Mint terminal:
./configure
make
sudo make install (I don’t know whether the sudo is required or not)

It didn’t work out. I think that I will stick to the pre-packaged installation.

Missing triphones in German speech model

May 6th, 2012 by producer

When importing the Reduced German dictionary (I didn’t publish this dictionary), I get the message that lots of triphones are missing. I extracted these triphones and put them into the .xml file just-triphones-unique.xml. About 7,000 triphones are missing! This means that I would have to record up to 7,000 German words (each word has to contain at least one of the missing triphones) to make “it” work. What does that mean? If I record 7,000 additional German words, I can generate a speech model that covers 380,000 German words (all words from Schott’s German dictionary).

I will have to transform the phonemes inside the file just-triphones-unique.xml into IPA format. This means that I have to do a conversion from SAMPA to IPA. Then I will have to compare the <phoneme> elements from Schott’s German dictionary with the transformed .xml file. If there is a match, I can output the corresponding <lexeme> element. It is pretty complicated. But it should be the fastest way to get a not too bad result.

Or I could go another way: I could compare the future IPA version of just-triphones-unique.xml with Schott’s German dictionary. If there is a match, the word should be excluded. This means that I can produce a reduced version of my German dictionary with as many words as possible.

There is yet another way: I could extract the missing words and put them into a list. Then I could compare this list with my German dictionary. If there is a match, these specific words will have to be excluded from the German dictionary.

Which way is the easiest one?

Update: I created a list of the missing grapheme elements, missing-graphemes.xml. I will have to write an .xsl style sheet that compares this list with my German dictionary; if a word doesn’t match, it is included in a reduced version of the dictionary.
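
Such a style sheet could look roughly like this (a sketch; it assumes that missing-graphemes.xml is a flat list of <grapheme> elements and that the dictionary uses the PLS namespace):

<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xpath-default-namespace="http://www.w3.org/2005/01/pronunciation-lexicon">
  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>
  <!-- The graphemes of the words whose triphones are not covered (matched in any namespace). -->
  <xsl:variable name="missing"
      select="document('missing-graphemes.xml')//*:grapheme/string()"/>
  <!-- Identity template: copy everything unchanged. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>
  <!-- Keep a lexeme only if its grapheme is not in the missing list. -->
  <xsl:template match="lexeme">
    <xsl:if test="not(grapheme = $missing)">
      <xsl:copy>
        <xsl:apply-templates select="@* | node()"/>
      </xsl:copy>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>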

Import of the German word “Bergbach”

May 6th, 2012 by producer

Let’s take a look into Schott’s German dictionary:

<lexeme role="Substantiv">
<grapheme>Bergbach</grapheme>
<phoneme>bɛʀgbaχ</phoneme>
</lexeme>

After this word has been imported into Simon, the transcription is as follows: [Bergbach] b E R gb a x

You can see that the gb is treated as one single phoneme. But it should be treated as two different phonemes: /g/ and /b/.

Words with two pronunciations

May 6th, 2012 by producer

Here is my current situation: as the base model, I am using the file german-base-model.sbm, which is part of Ralf’s German speech model 0.1.9.3. As the scenario, I am using the scenario file from alpha. I imported Schott’s German dictionary 0.2.8 into Simon and selected Actions > Synchronize. Then the following message appeared:

Specific words aren’t accepted by Simon. These words share a common characteristic.

Let’s take a closer look at a word that causes problems. I selected the word “Abbiegens”. This word does have two pronunciations. Obviously, Simon doesn’t like that.

Let’s take a look into Schott’s German dictionary:

<lexeme role="Substantiv">
<grapheme>Abbiegens</grapheme>
<phoneme>ʔapbiːgəns</phoneme>
<phoneme>ʔapbiːgŋ̩s</phoneme>
</lexeme>

Maybe I should try the following: create a modified version of Schott’s German dictionary that doesn’t contain a second <phoneme> element (a sketch follows below), and import this dictionary into Simon. I want to test the following: recognize 384,099 Standard German words in conjunction with Ralf’s German speech model 0.1.9.3. This speech model has been trained with 36,000 German words. That should be (more or less) sufficient. I don’t need to train every word that is part of the vocabulary; it should be sufficient to cover the triphones.
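
Here is a minimal sketch of such a modification (an identity transform that keeps only the first <phoneme> element of every <lexeme>; the PLS namespace is assumed):

<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xpath-default-namespace="http://www.w3.org/2005/01/pronunciation-lexicon">
  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>
  <!-- Identity template: copy everything unchanged. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>
  <!-- Drop every phoneme that has a preceding phoneme sibling, i.e. keep only the first one. -->
  <xsl:template match="lexeme/phoneme[preceding-sibling::phoneme]"/>
</xsl:stylesheet>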

“Julius did not initialize correctly”

May 5th, 2012 by producer

I just tried to use sam. I clicked on Test model. Then an error message appeared:

It says: “Could not initialize recognition: Julius did not initialize correctly.” Simon was not running at that moment.