Posts Tagged ‘prompts’

Capital letter at beginning of sentence

Sunday, October 11th, 2009

I want to import prompts into simon, for example the German prompts 01. The following problems occur:

1. Convert FLAC to wav. I will do this with the following commands (the output folder has to exist before sox can write into it):

$ mkdir -p converted
$ for f in *.flac; do sox "$f" -t wav -r 16000 -s -c 1 "converted/${f%.flac}.wav"; done

Problem solved.

2. Convert the SSML file into a normal prompts file (HTK format). I will solve this with an XSLT stylesheet. Problem solved.

3. One problem still remains: What should be done with the capital letters at the beginning of each sentence? I don’t want to add the capitalized first word of each of the following sentences (Hast, Ich, Der) to the dictionary:

Hast du mich verstanden?
Ich habe dich jetzt verstehen können.
Der Schmerz wird mit der Zeit nachlassen.

How could this issue be solved? I don’t want to convert the words at the beginning of each sentence into lowercase manually.

Maybe some kind of matching would work: if a word at the beginning of a sentence in a prompts file is capitalized (Hast, Ich, Der), check whether an uncapitalized version of this word is available in the dictionary (hast, ich, der) and, if so, use that version instead.
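The matching idea could be sketched roughly like this in Python. This is only an illustration under assumptions: the dictionary is modelled as a plain set of words, and the function name lowercase_sentence_start is made up for this example; it is not part of simon or sam. The first word is only lowercased when its capitalized form is missing from the dictionary and its lowercase form is present, so nouns that are already in the dictionary (Schmerz) stay untouched.

```python
import re


def lowercase_sentence_start(sentence, dictionary):
    """Lowercase the first word of a sentence if only its lowercase
    form is a known dictionary entry (hypothetical helper)."""
    match = re.match(r"\w+", sentence)
    if match is None:
        # The sentence does not start with a word; leave it alone.
        return sentence
    first = match.group(0)
    lowered = first[0].lower() + first[1:]
    # A capitalized entry in the dictionary is probably a noun: keep it.
    if first not in dictionary and lowered in dictionary:
        return lowered + sentence[match.end():]
    return sentence


if __name__ == "__main__":
    # Assumed toy dictionary; a real one would be loaded from the lexicon.
    dictionary = {"hast", "ich", "der", "Schmerz", "Zeit"}
    for sentence in (
        "Hast du mich verstanden?",
        "Ich habe dich jetzt verstehen können.",
        "Der Schmerz wird mit der Zeit nachlassen.",
    ):
        print(lowercase_sentence_start(sentence, dictionary))
```

Words that exist in both forms (for example a verb and the noun derived from it) remain ambiguous; this sketch leaves them capitalized, so they would still need a manual check.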

I want to train several thousand sentences with simon/sam. These are normal sentences with capitalized first words. How can the first letter of a sentence be lowercased automatically when the word is not a noun?

sam: test prompts

Friday, August 7th, 2009

I checked out revision 891:

liberty@liberty-desktop:~/200907$ svn co https://speech2text.svn.sourceforge.net/svnroot/speech2text/

Then I tried to build simon / sam:

liberty@liberty-desktop:~/200907/speech2text/trunk$ ./build_ubuntu.sh

During the compilation, an error message appeared. I don’t know what went wrong, so I will try again later.

I think that sam will be very useful for testing speech models:

[Screenshot: sam-test]

I opened the file /home/liberty/200907/speech2text/trunk/sam/src/main.ui with Qt Creator. You can see that it is possible to define a path for test prompts (text file) / test prompts base path (corresponding wav files). I will try that with German Voxforge prompts. My goal is to test up to about 100 prompts (utterances) at a time.

Adding prompts from Voxforge

Tuesday, July 21st, 2009

I just added two lines (the last two shown below) to the file /home/liberty/.kde/share/apps/simon/model/prompts:

Herzog_S2_2009-07-19_23-45-26 HERZOG
organisiert_S1_2009-07-20_00-21-32 ORGANISIERT
Flaschen_S2_2009-07-19_18-59-50 FLASCHEN
de27-02 DAS HAUS IST NEU GEBAUT WORDEN
de27-03 DAS WETTER IST SEHR SCHLECHT

Now, I will have to add the corresponding FLAC/wav files. I have to do a conversion from FLAC to wav:

[Screenshot: flac-wav]

1. I have opened the folder /home/liberty/200907/ralfherzog-20071213-de27/flac.
2. The audio files de27-02.flac and de27-03.flac have to be converted into the wav format.
3. The suffix will be .wav.
4. I have to select the wav format.

[Screenshot: move-wav]

5. I moved the two wav files to /media/Hitachi/simon-xp-wav.
6. You can see that this folder contains lots of wav files that have been recorded with simon.

And now I have started simon and ksimond. Let’s see what happens when I press the Synchronize button. There was no error message. But the word Wetter isn’t included in the active word list (it is part of the shadow dictionary).

I guess that I have forgotten to adjust the TrainingDate value.

One question remains open: how would simon know which pronunciation to choose if there were several pronunciations available? It is possible that the answer has been given in a comment on this blog, but I can’t remember the details at the moment.

How to import Voxforge models / prompts

Tuesday, July 21st, 2009

There is some interesting information in the Voxforge forum (subsequent quotes are from the Voxforge forum unless marked otherwise):

“I am assuming that since it uses HTK format acoustic models, you should be able to just replace the hmmdefs, macros and tiedlist files with VoxForge’s versions of these files.”

I had the same thought, but I need to know the exact details. Here they are:

“replacing the model files in ~/.kde/share/apps/simond/models/<your user>/active with the voxforge model files will work”

It would be possible to add wav files:

“You can of course add samples to them but make sure you place them in the configured samples folder so they will be found during the compiling of the model.”

So this means that I could add wav files from Voxforge to the following location on my computer:

[Screenshot: training-path]

And I would have to adjust the prompts file:

“The prompts file is located at ~/.kde/share/apps/simon/model/prompts.”

Here are a few lines from my prompts file (location: /home/liberty/.kde/share/apps/simon/model/prompts):

Computer_S2_2009-07-18_16-53-22 COMPUTER
Grenzen_S2_2009-07-19_20-41-13 GRENZEN
Geschenken_S1_2009-07-19_20-04-16 GESCHENKEN
Technologie_S1_2009-07-19_10-58-49 TECHNOLOGIE
Gewichten_S2_2009-07-19_20-38-22 GEWICHTEN

And I would have

“to manually update the “TrainingsDate” value to the current date/time in the file: ~/.kde/share/apps/simon/model/modelsrcrc”

On my computer, the file /home/liberty/.kde/share/apps/simon/model/modelsrcrc has the following content:

GrammarDate=2009,7,13,16,43,50
LanguageDescriptionDate=2009,5,5,19,16,34
TrainingDate=2009,7,20,22,51,26
WordListDate=2009,7,20,22,49,38

Let’s take a closer look at TrainingDate:

year,month,day, = 2009,7,20,
hour (24 hour format),minute,second = 22,51,26
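To avoid typing the value by hand, the date could be generated with a few lines of Python. This is a minimal sketch, assuming that the six fields are written without zero padding (as the 7 for July in the file above suggests) and that the file consists of simple key=value lines. The function names format_model_date and set_training_date are invented for this example.

```python
from datetime import datetime


def format_model_date(moment):
    """Format a datetime as year,month,day,hour,minute,second
    without zero padding, e.g. 2009,7,20,22,51,26."""
    parts = (moment.year, moment.month, moment.day,
             moment.hour, moment.minute, moment.second)
    return ",".join(str(part) for part in parts)


def set_training_date(config_text, moment):
    """Return the config text with the TrainingDate line replaced;
    all other lines are passed through unchanged."""
    lines = []
    for line in config_text.splitlines():
        if line.startswith("TrainingDate="):
            line = "TrainingDate=" + format_model_date(moment)
        lines.append(line)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = (
        "GrammarDate=2009,7,13,16,43,50\n"
        "TrainingDate=2009,7,20,22,51,26\n"
    )
    # Prints the sample with TrainingDate set to the current date/time.
    print(set_training_date(sample, datetime.now()), end="")
```

The sketch works on a string rather than on the real modelsrcrc, so it can be tried out safely before touching the actual file.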

I think that it would be best if I tried to import a few German wav files from Voxforge (16 kHz) into /media/Hitachi/simon-xp-wav. I have just downloaded ralfherzog-20071213-de27.tgz (6.2 MB). It contains FLAC files. They should be converted to wav files before/while inserting them into the folder /media/Hitachi/simon-xp-wav. And I would have to add the content of PROMPTS (from ralfherzog-20071213-de27.tgz) to the file /home/liberty/.kde/share/apps/simon/model/prompts.
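Adding the PROMPTS content could look roughly like the following Python sketch. It assumes that each line in the Voxforge PROMPTS file starts with a path-like sample ID (something like ralfherzog-20071213-de27/mfc/de27-02) followed by the transcription, and that the simon prompts file only wants the bare file name in front of the text, as in the lines I quoted from my own prompts file. I have not verified the PROMPTS layout for every Voxforge package, and the function names are made up for this example.

```python
def to_simon_prompt(line):
    """Turn one PROMPTS line into '<file name> <transcription>'.
    Any directory part in front of the sample name is dropped."""
    sample, _, text = line.strip().partition(" ")
    name = sample.rsplit("/", 1)[-1]
    return f"{name} {text.strip()}"


def convert_prompts(lines):
    """Convert all non-empty PROMPTS lines."""
    return [to_simon_prompt(line) for line in lines if line.strip()]


if __name__ == "__main__":
    # Assumed example input, modelled on the de27 package.
    voxforge_lines = [
        "ralfherzog-20071213-de27/mfc/de27-02 DAS HAUS IST NEU GEBAUT WORDEN",
        "ralfherzog-20071213-de27/mfc/de27-03 DAS WETTER IST SEHR SCHLECHT",
    ]
    for prompt in convert_prompts(voxforge_lines):
        print(prompt)
```

The converted lines would then be appended to the prompts file, and the wav files with the same base names placed in the samples folder.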

I think that the concept is as follows: simon manages the wav files, the prompts, and TrainingDate. It ‘gives’ them (via TCP/IP) to simond, which generates hmmdefs and tiedlist.

My goal is to use simon/simond for the model generation. And I want to import wav files with the corresponding prompts from Voxforge.