Archive for the ‘Ubuntu’ Category

Ralf’s Interlingua dictionary

Tuesday, January 10th, 2012

This article explains how I create the dictionary and what the imported result looks like in simon.

A. Creation of the PLS dictionary:

1. Get the spelling dictionary.
2. The license is GPL. The file README_en.txt says:

This spell check dictionary for Interlingua is licensed under GPL. [...] This hyphenation rules for Interlingua are licensed under GPL.

This means that I can use this spelling dictionary as a source.
3. Extract dict-ia-2010-11-29.oxt.
4. The ISO 639-1 language code is ia.
5. I will probably use this table for the grapheme-to-phoneme conversion.

6. Check the encoding of ia_iso.aff and ia_iso.dic. Both files are encoded in ISO 8859-1. It is probably best to convert both files to UTF-8:
iconv -f ISO-8859-1 -t UTF-8 < ia_iso.dic > interlingua-utf8.dic
iconv -f ISO-8859-1 -t UTF-8 < ia_iso.aff > interlingua-utf8.aff

Change the first line of interlingua-utf8.aff to SET UTF-8. Both files use CRLF line endings (Windows style). I don’t know whether the unmunch command can handle this, so I try it:

ubuntu@ubuntu:~/Documents/2011-II/Interlingua$ unmunch interlingua-utf8.dic interlingua-utf8.aff > interlingua-wordlist

It worked. The CRLF line endings appear only in the source files; the target file uses plain LF (Unix style). The word list contains a lot of duplicate entries. I think that an .xsl script will remove them later.

7. Add lexicon tags at the beginning and the end of interlingua-wordlist.

8. Create XML file:

ubuntu@ubuntu:~/Documents/2011-II/Interlingua$ saxonb-xslt -s:interlingua-wordlist -xsl:'http://spirit.blau.in/simon/files/2010/04/create-xml-file.xsl' -o:interlingua.xml

9. Create PLS dictionary:

ubuntu@ubuntu:~/Documents/2011-II/Interlingua$ saxonb-xslt -s:interlingua.xml -xsl:'improve-interlingua.xsl' -o:interlingua-dictionary.xml
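
The encoding work in step 6 could probably be scripted. The following is only a sketch under assumptions: the function names to_utf8_unix, set_utf8_header and dedupe_wordlist are made up for illustration, and the SET line is assumed to be the first line of the .aff file, as described above. Stripping the CR characters is optional, because unmunch coped with CRLF anyway.

```shell
#!/bin/sh
# Convert one Hunspell file from ISO-8859-1 to UTF-8 and drop the CR
# of every CRLF line ending. $1 = input file; result goes to stdout.
to_utf8_unix() {
    iconv -f ISO-8859-1 -t UTF-8 < "$1" | tr -d '\r'
}

# Replace the first line of an .aff file (read from stdin) with SET UTF-8.
# Assumption: the first line is the SET line.
set_utf8_header() {
    sed '1s/.*/SET UTF-8/'
}

# Remove duplicate entries from a word list read from stdin.
# Note: this sorts the list, so the original order is lost.
dedupe_wordlist() {
    LC_ALL=C sort -u
}

# Possible use with the file names from this article:
#   to_utf8_unix ia_iso.dic > interlingua-utf8.dic
#   to_utf8_unix ia_iso.aff | set_utf8_header > interlingua-utf8.aff
#   unmunch interlingua-utf8.dic interlingua-utf8.aff | dedupe_wordlist > interlingua-wordlist
```

Removing the duplicates at this point is a design choice of the sketch. In the workflow above, the duplicates are left for the .xsl script.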

B. Download the dictionary. Import it into simon.

The left column contains the words. The pronunciation column contains the corresponding SAMPA transcriptions. The Category column contains just “Unknown” entries.

Now you know how I created the dictionary and what the result looks like in simon.

Ralf’s Arabic dictionary

Tuesday, January 10th, 2012

This article explains the creation of an Arabic PLS dictionary and what the result looks like in simon.

A. Creation of the dictionary:

1. Get the Arabic spelling dictionary.
2. Check the license. The archive dict_ar-3.0.oxt contains a file named COPYING (in the docs folder). It says:

GPL 2.0/LGPL 2.1/MPL 1.1 tri-license

This means that I can use this tri-licensed spelling dictionary as the source for my future GPLv3 PLS dictionary.

3. Now I have to extract dict_ar-3.0.oxt.
4. Let’s try the unmunch command inside the Ubuntu terminal:

ubuntu@ubuntu:~/Documents/2011-II/Arabic$ unmunch ar.dic ar.aff > arabic

It failed. I wasn’t able to unmunch the word list.
5. I have to remove all numbers from ar.dic. This can be done with the sed command:

sed 's/[0-9]*//g' ar.dic > arabic-without-numbers

6. Remove the slash (“/”) from arabic-without-numbers with Geany.
7. Add lexicon tags at the beginning and the end of the file.
8. Ubuntu terminal:

ubuntu@ubuntu:~/Documents/2011-II/Arabic$ saxonb-xslt -s:arabic-without-numbers -xsl:'http://spirit.blau.in/simon/files/2010/04/create-xml-file.xsl' -o:arabic.xml

9. The ISO 639-1 language code is ar.
10. Maybe I will use this table for the grapheme-to-phoneme conversion.
11. Ubuntu terminal:

ubuntu@ubuntu:~/Documents/2011-II/Arabic$ saxonb-xslt -s:arabic.xml -xsl:'improve-arabic.xsl' -o:arabic-dictionary.xml

I have to remove the number sign (“#”) from arabic.xml with Geany.
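
Steps 5 and 6 could probably be combined into a single sed call, so that Geany is not needed for the slashes. This is a sketch, not the command I actually ran; the function name strip_flags is made up for illustration. Dropping the empty lines is an addition: the first line of a .dic file holds only the word count and is empty once the digits are gone.

```shell
#!/bin/sh
# Clean a Hunspell .dic file read from stdin:
#   1. remove all digits (same expression as in step 5),
#   2. remove all slashes (what step 6 does by hand),
#   3. delete lines that are empty afterwards (an addition of this sketch).
strip_flags() {
    sed -e 's/[0-9]*//g' -e 's|/||g' -e '/^$/d'
}

# Possible use with the file names from this article:
#   strip_flags < ar.dic > arabic-without-numbers
```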

B. Download the dictionary. Import it into simon.

The left column contains 457089 Arabic words. The pronunciation column contains the corresponding SAMPA transcriptions. The third column shows only “Unknown” because the PLS dictionary contains no role attributes.
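
The word count and the missing role attributes can also be checked without simon. A minimal sketch, assuming that every entry of the PLS file is a lexeme element and that a role, if present, is written as an attribute of that element; the function names are made up for illustration.

```shell
#!/bin/sh
# Count the lexeme elements in a PLS dictionary. $1 = file name.
count_lexemes() {
    grep -o '<lexeme' "$1" | wc -l | tr -d ' '
}

# Count the lexeme elements that carry a role attribute. $1 = file name.
count_roles() {
    grep -o '<lexeme[^>]* role=' "$1" | wc -l | tr -d ' '
}

# Possible use with the file name from this article:
#   count_lexemes arabic-dictionary.xml
#   count_roles arabic-dictionary.xml
```

If count_roles prints 0, simon has nothing to show in the third column except “Unknown”.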

Now you know how I created the dictionary and what the result looks like in simon.

Ralf’s Hebrew dictionary

Tuesday, January 10th, 2012

In 2009, I made some initial tests with Hebrew. Now it is time to develop a Hebrew PLS dictionary that is much bigger than the sample dictionary from 2009 (which I have deleted). This article explains how I create the dictionary and what the result looks like when imported into simon.

A. Creation of the dictionary:

1. Get the Hebrew spelling dictionary from OpenOffice.org.
2. The license is GPL. There is a copyright notice inside the file he_IL.aff.

3. I tried to unmunch the dictionary in the Ubuntu terminal, but unfortunately I failed:

ubuntu@ubuntu:~/Documents/2011-II/Hebrew$ unmunch he_IL.dic he_IL.aff > hebrew-test

4. The source file he_IL.dic contains a lot of numbers. I remove them with the Ubuntu terminal:

ubuntu@ubuntu:~/Documents/2011-II/Hebrew$ sed 's/[0-9]*//g' he_IL.dic > hebrew-without-numbers

With Geany, I remove the commas (“,”) and slashes (“/”) that are still left in the file hebrew-without-numbers. Now I have a clean word list with about 43,000 Hebrew words.

5. Add lexicon tags at the beginning and the end of hebrew-without-numbers.
6. Ubuntu terminal:

ubuntu@ubuntu:~/Documents/2011-II/Hebrew$ saxonb-xslt -s:hebrew-without-numbers -xsl:'http://spirit.blau.in/simon/files/2010/04/create-xml-file.xsl' -o:hebrew.xml

7. The ISO 639-1 language code is he.
8. I need a table for the grapheme-to-phoneme conversion. Maybe I will use this table. There are several tables available at Wikipedia, and I am not sure which one I should use. I have an idea: as far as I know, Yiddish and Hebrew share the same alphabet, so I could try the Yiddish style sheet improve-yiddish.xsl:

ubuntu@ubuntu:~/Documents/2011-II/Hebrew$ saxonb-xslt -s:hebrew.xml -xsl:'/home/ubuntu/Documents/2011-II/Yiddish/dictionaries/improve-yiddish.xsl' -o:hebrew-dictionary.xml

The result is that most Hebrew letters have been converted into IPA. Only one Hebrew letter hasn’t been converted: [א]. I add this phone to an .xsl style sheet named improve-hebrew.xsl. Now I try it again:

ubuntu@ubuntu:~/Documents/2011-II/Hebrew$ saxonb-xslt -s:hebrew.xml -xsl:'improve-hebrew.xsl' -o:hebrew-dictionary.xml

The result is not so good. Maybe I should adjust the grapheme-to-phoneme conversion rules to modern standard Israeli Hebrew. Or is this not necessary? I think that for a first draft I can use the Yiddish transformation rules.
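
Finding letters that a style sheet has missed, like the א above, does not have to be done by eye. The following sketch counts every Hebrew letter that is still present in a file. It is meant for a file that should no longer contain Hebrew text at all, for example a copy of the output reduced to its phoneme lines; how to cut out those lines depends on the layout of the file and is not shown. The function name and the variable HEBREW_LETTERS are made up for illustration.

```shell
#!/bin/sh
# The 27 letter forms of the Hebrew alphabet (22 letters plus 5 final forms).
HEBREW_LETTERS="א ב ג ד ה ו ז ח ט י כ ך ל מ ם נ ן ס ע פ ף צ ץ ק ר ש ת"

# Print every Hebrew letter that occurs in the file $1, together with
# the number of occurrences. Prints nothing if no Hebrew letter is left.
leftover_letters() {
    for letter in $HEBREW_LETTERS; do
        count=$(grep -a -o -F -e "$letter" "$1" | wc -l | tr -d ' ')
        if [ "$count" -gt 0 ]; then
            printf '%s %s\n' "$letter" "$count"
        fi
    done
}

# Possible use (phonemes-only.txt is a hypothetical file name):
#   leftover_letters phonemes-only.txt
```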

B. Download the dictionary. Import it into simon as shadow dictionary.

Take a look at the result: The left column contains 43933 Hebrew words. The pronunciation column contains the corresponding SAMPA transcriptions. The category column is unused (to be more exact, it displays just Unknown) because the source PLS dictionary contains no role attributes.

Now you know how I created the dictionary and what the result looks like in simon. This dictionary uses more or less Yiddish pronunciation because I was too lazy to adjust it to modern standard Israeli Hebrew. It shouldn’t be a problem to adjust the style sheet improve-hebrew.xsl so that the phoneme results get better.

Ralf’s Yiddish dictionary

Tuesday, January 3rd, 2012

This article explains some details about the creation of the dictionary and what the result looks like in simon.

A. How I create Ralf's Yiddish dictionary:

1. Get the spelling dictionary.
2. The license is GPLv3.
3. Extract jidysz.net.ooo.spellchecker.oxt.
4. Ubuntu terminal:
cd /home/ubuntu/Documents/2011-II/Yiddish/dictionaries
sudo apt-get install hunspell-tools
unmunch yi.dic yi.aff > yiddish-wordlist

5. Add <lexicon> at the beginning of yiddish-wordlist. Add </lexicon> at the end of this file.
6. Generate an .xml document with lexicon, lexeme and grapheme elements:

ubuntu@ubuntu:~/Documents/2011-II/Yiddish/dictionaries$ saxonb-xslt -s:yiddish-wordlist -xsl:'http://spirit.blau.in/simon/files/2010/04/create-xml-file.xsl' -o:yiddish.xml

7. ISO 639-1 language code is yi.
8. I think I will use this table as source for the grapheme to phoneme mapping.
9. Ubuntu terminal:

ubuntu@ubuntu:~/Documents/2011-II/Yiddish/dictionaries$ saxonb-xslt -s:yiddish.xml -xsl:'improve-yiddish.xsl' -o:yiddish-dictionary.xml
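
Step 5 can be done in the terminal instead of in an editor. A minimal sketch: the function name wrap_lexicon is made up, and the escaping of &, < and > is an addition of mine, on the assumption that a word containing one of these characters would otherwise make the file invalid as XML input for saxonb-xslt.

```shell
#!/bin/sh
# Wrap a word list in <lexicon> ... </lexicon> and escape the characters
# that have a special meaning in XML. $1 = word list; result goes to stdout.
wrap_lexicon() {
    echo '<lexicon>'
    sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g' "$1"
    echo '</lexicon>'
}

# Possible use (yiddish-wordlist.xml is a hypothetical output name):
#   wrap_lexicon yiddish-wordlist > yiddish-wordlist.xml
```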

B. Download the dictionary, and import it into simon.

Take a look at the result. The left column contains the Yiddish words. This dictionary contains 99980 words. The right column contains the corresponding SAMPA transcription.
Yiddish is written in the Hebrew alphabet. The Hebrew alphabet is written from right to left. Obviously, the corresponding SAMPA transcriptions are written from left to right. This means that the phoneme order should be fine.

There are a lot of other PLS dictionaries available. Find the PLS dictionary that suits your language.

Import of your Grammar

Monday, January 2nd, 2012

In this post I want to write some words about the Grammar / Import function of simon 0.3. Here is what I do:

1. Import Schott’s German dictionary as active dictionary into simon.

2. Open the Grammar tab. Press the Import button.

3. Simon starts a wizard. Press the Next button.

4. Check the option Also import unknown sentences. I don’t know whether this is a good decision, so let’s give it a try.
This is interesting: “words with more than one terminal” – is it now possible to use more than one entry for the role attribute? The current version of Schott’s German dictionary employs just one entry for each role attribute. The PLS standard allows more entries.
Please download and extract Schott’s German utterances. The archive 15000-german-utterances.zip contains a plain text file with more than 15000 utterances. I am the author of these utterances, and I have licensed them under the GPLv3. They are designed to be used together with Schott’s German dictionary (formerly Ralf’s German dictionary). Every word in the utterances should also be in the dictionary. I am 99% sure that this is the case, but I can’t guarantee it. If some words are missing from the dictionary, please let me know, and I will include them in the next version.
You can import my German utterances using simon’s Import Text option (copy & paste).

5. The import has been completed. There are a lot of lines that contain the Unknown terminal. It would probably have been better if I had not checked the option Also import unknown sentences in step 4.

6. Because simon no longer reacted, I forced it to quit. I tried to start simon several times, but it wouldn’t start. Now several simon processes are displayed with the status zombie or sleeping. I was able to end these processes. But at the moment, it seems to be impossible to start simon again (each attempt creates a new zombie / sleeping process).
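
The leftover processes can also be listed in the terminal. This is a sketch under assumptions: the function name simon_processes is made up, and I assume that the relevant command names contain the string simon (simon, simond, ksimond).

```shell
#!/bin/sh
# Keep only the processes whose command name contains "simon".
# Input on stdin, one process per line: PID STATE COMMAND
simon_processes() {
    awk '$3 ~ /simon/ { print $1, $2, $3 }'
}

# Possible use (the "=" signs suppress the header line of ps):
#   ps -eo pid=,stat=,comm= | simon_processes
```

A state that starts with Z marks a zombie, and S marks a sleeping process. A zombie cannot be ended with kill; it disappears once its parent process collects it or exits.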

Conclusion: I don’t recommend checking the option Also import unknown sentences. I tried the Grammar / Import function before without checking this option. Simon reacted normally, and everything seemed to be fine.

German speech model ‘deef’

Monday, August 1st, 2011

Visit Schott’s German IPA FLAC files (section: deef) [source 1] or Voxforge [source 2]. Get the corresponding speech model [object], and import it into simon 0.3. Watch my video about this speech model:

These are the words that were recognized in the video:

Randschicht Randproblem Randprobleme Randproblemen Randproblems Randpunkt Randsportart Randstellung Randträger Rechenvorrichtung Randwerbung Randwinkel Randträger Randzone Rangabzeichen Rangabzeichens Rangelei Rangfolgen Rangliste Ranglisten Rangordnung Rangordnungen Ranguns Rangstufe Rangstufen Rappe Ratte Rapport Rapporte Rapporten Rapports Rapsfeld Rapsfelder Rapsfeldern Rapsfelds Rasenfläche Rassenkämpfen Rasenloch Rasenplätze Rasenplatz Rasenplätze Rasenplätzen Rasensport Rasenstück Rasentraktor Rasentrimmer Raserei Rasereien Rasierapparat Rasierapparaten Rasierapparats Rasierer Rasierklinge Rasierpinsel Rasierschaum Rasierseife Rasierwasser Rasierzeug Rasmussen Raspe Raspeln Rassenhass Rassenhasses Rassenkampf Rassenkonflikt Rassenkrawall Rassenkunde Rassenkämpfe Rassenkämpfen Rassenmischung Rassenmischungen Rassenproblem Rassenprobleme Rassentrennung Rassenunruhen Rassepferd Rastalocken Rastblech Rastdorn Rastdorne Rasterbild Rasterdecke Rasterdruck Rasterelektronenmikroskop Rasterfahndung Rasterfahndungen Rastergestaltung Rastermaße Rastermaßen Rastermaßes Rastern Rasterpapier Rasterpunkt Rasterpunktabfühlung Rasterpunktlesen Rasters Rasterung Rasterweite Rasterweiten Rastkappe Rastkontakt Rastlappen Rastlosigkeit Rastmechanismus Rastmoment Rastmontage Rastplatz Rastplätze Rastplätzen Rastpunkt Raststätten Rasur Ratifikationen Ratifikationsurkunde Ratifikationsurkunden Ratifizierens Ratifizierung Ratifizierungen Ratifizierungsdebatte Rechnerbaugruppe Ratingen Ration Rationalisieren Rationalisierens Rationalisierung Rationalisierungen Rationalisierungsdruck Rationalismus Rationalist Rationalisten Rationalität Rationen Rationierens Rationierung Rationierungen Ratlosigkeit Rasanz Ratsamkeit Ratsbeschluss Ratsbeschlusses Ratschlag Ratschlags Ratspräsidenten Ratspräsident Ratsmitgliedern Ratsmitglieder Ratsmitglied Ratsherren Ratsche Ratssitzung Ratssitzungen Ratstisch Ratsvorsitz Ratsvorsitzende Randverbinder Rattenfleckfieber Rattenfänger Rattengift Rattenhaus 
Rattenkönig Rattenloch Rattenloches Rattenplage Rattenschwanz Rattermarken Ratzinger Rauchkammer Rauch Rauchabzug Rauchbombe Rauchens Rauchen Raucher Raucherinsel Rauchern Rauchgas Raucherzimmer Raucherzone Rauchers Rauchfahne Rauchfang Rauchfass Rauchfleisch Rauchgas Rauchgasfilter Rauchgasfühler Rauchgaskanal Rauchgaswäsche Rauchgaszug Rauchgenuss Rauchgenusses Rauchglas Rauchigkeit Rauchkammer Rauchmaschine Rauchmelder Rauchpilz Rauchschwaden Rauchsäule Rauchsäulen Rauchverbot Rauchvergiftung Rauchverzehrer Rauchverzicht Rauchvorhang Rauchwand braven Raudi Rauchpilz Raufbold Rauferei Raufhandel Rauflustigkeit Raufrost Rauheiten Rauheit Rauheitsbeiwert Rauheitsbestimmung Rauhut Raum Raumabtrennung Raumakustik Raumangst Raumanzug Raumaufteilung Raumaufteilungen Raumausstatter Raumbedarfs Raumbelastung Ratsmitgliedern Raumbereich Raumbereichen Raumbereichs Raumbuch Raumfahrtagentur Raumfahrtbehörde Raumtransport Raumtransport Raumfahrtkonzern Raumfahrtkonzerns Raumfahrtprogramm Raumfahrtsparte Raumfahrttechnik Raumfahrzeuge Raumfahrzeuge Raumflug Raumflugs Raumforscher Raumforschung Raumfähre Raumgeräuschpegel Raumgeschwindigkeit Raumgestaltung Raumgewicht Raumgruppe Raumhöhe Rauminhalt Raumkapsel Raumkelle Raumklima Raumladung Raumlufttechnik Raummangel Raummaß Raumordnung Raumordnungsverfahrens Raumpfleger Raumpflegerin Raumplanung Raumprogramm Raumrichtung Raumschiff Raumschiffen Raumschiffes Raumthermostat Raumverhältnis Raumverhältnisse Raumverhältnissen Raumverhältnisses Raumwinkel Raumzeitalter Raute Rauchwaren Raupenantrieb Raupenantrieben Raupenbagger Rauschebart Rauschens Rauschgift Rauschgiftdezernat Rauschgifte Rauschgiftsucht Rauschgiftverbot Rauschgold Rauschgoldengel Rauschmittel Rauschpegel Rauschpegels Rauschunterdrückung Rauschuntergrund Rauschuntergrunds Rausschmeißer Rauschgifts Rausschmisse Raupentrieb Rauchzeichen Randwähler Ravennas Reagenzglases Reagenzgläser Reagenzpapier Reaktionsmöglichkeiten

A lot of words were recognized correctly. That is not a bad result.

sudo make uninstall

Monday, August 1st, 2011

A few minutes ago in my Ubuntu terminal:

ubuntu@ubuntu:~/Documents/2011-II/speech2text/build$ sudo make uninstall

Then I downloaded simon_0.3.0-1ubuntu8_amd64.deb and installed it with Ubuntu Software Center.

There is no PPA for Ubuntu 11.04

Monday, August 1st, 2011

There is no PPA for Ubuntu 11.04. How is it possible to install an old PPA (Maverick)? What is the command that I have to enter to install an old PPA?

Export test result with sam

Monday, August 1st, 2011

Yesterday, I installed Qwt 6, and then built simon 0.3.60. It was difficult, but in the end it worked out fine. And look, sam now offers an Export test result button (top right of the screenshot):

I want to export the following information: Filename, Expected result, Actual result, Recognition rate (below 50%). The resulting document should be a simple text file (or XML file or whatever). Is this possible with the current Export test result function of simon?

ppa.launchpad.net – where is natty?

Sunday, July 31st, 2011

This is what I did a few minutes ago:

ubuntu@ubuntu:~/Documents/2011-II/speech2text$ sudo add-apt-repository ppa:grasch-simon-listens/simon

Then I typed into the Ubuntu terminal:

ubuntu@ubuntu:~/Documents/2011-II/speech2text$ sudo apt-get update

And then the following message appeared:

[...] W: Failed to fetch http://ppa.launchpad.net/grasch-simon-listens/simon/ubuntu/dists/natty/main/source/Sources 404 Not Found

W: Failed to fetch http://ppa.launchpad.net/grasch-simon-listens/simon/ubuntu/dists/natty/main/binary-amd64/Packages 404 Not Found

Then I took a look at http://ppa.launchpad.net/grasch-simon-listens/simon/ubuntu/dists/. Obviously there isn’t a directory called natty.
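
The 404 errors follow from the way the address of the package index is put together: the release code name is part of the path. The sketch below rebuilds the address from the warning above. The function name ppa_packages_url is made up, and the path layout is simply copied from the error message.

```shell
#!/bin/sh
# Build the address of the binary package index of a PPA.
# $1 = PPA owner, $2 = PPA name, $3 = release code name, $4 = architecture
ppa_packages_url() {
    printf 'http://ppa.launchpad.net/%s/%s/ubuntu/dists/%s/main/binary-%s/Packages\n' \
        "$1" "$2" "$3" "$4"
}

# Possible check for the running release (needs network access; prints
# the HTTP status code, 404 if the release has no directory):
#   curl -s -o /dev/null -w '%{http_code}\n' \
#       "$(ppa_packages_url grasch-simon-listens simon "$(lsb_release -sc)" amd64)"
```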

I want to get the newest version of simon/sam. Where can I find it?
--> It is not here. This version is from October 2010.
--> At ppa.launchpad.net/grasch-simon-listens/ I find only versions made for lucid and maverick.
--> I wasn’t successful with git because a problem with Qwt 6 came up.

I want to see whether the newest version of sam has an ‘Export test results’ button.

Remove the package “simon”

Sunday, July 31st, 2011

I want to get the newest simon version via git. Here is what I do:

1. Ubuntu terminal:

sudo apt-get install git

2. Ubuntu terminal:

ubuntu@ubuntu:~/Documents/2011-II$ git clone git://speech2text.git.sourceforge.net/gitroot/speech2text/speech2text

3. System > Administration > Synaptic Package Manager:
Remove the package “simon” (Mark for Complete Removal).
simon is no longer visible in Synaptic. So obviously, it has been completely removed.

4. Ubuntu terminal:

ubuntu@ubuntu:~/Documents/2011-II/speech2text$ git pull origin master
From git://speech2text.git.sourceforge.net/gitroot/speech2text/speech2text
* branch master -> FETCH_HEAD
Already up-to-date.
ubuntu@ubuntu:~/Documents/2011-II/speech2text$

5. Ubuntu terminal:

ubuntu@ubuntu:~/Documents/2011-II/speech2text$ ./build_ubuntu.sh
-- The C compiler identification is GNU
-- The CXX compiler identification is GNU
-- Check for working C compiler: /usr/bin/gcc
-- Check for working C compiler: /usr/bin/gcc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
CMake Error at cmake/FindZLIB.cmake:25 (MESSAGE):
Could not find ZLIB
Call Stack (most recent call first):
julius/libsent/CMakeLists.txt:3 (find_package)

-- Configuring incomplete, errors occurred!
touch: cannot touch `./julius/gramtools/mkdfa/mkfa-1.44-flex/*': No such file or directory
ubuntu@ubuntu:~/Documents/2011-II/speech2text$

6. Question: What do I have to do to get simon going from the git repository?

Edit 19.15: I am trying the following:

sudo apt-get install git-core build-essential cmake bison flex gettext gettext-kde kdeartwork \
kdelibs5-dev libxtst-dev libqt4-sql-sqlite qtmobility-dev libphonon-dev libattica-dev libattica0 zlib1g-dev \
portaudio19-dev

Edit 19.25:

ubuntu@ubuntu:~/Documents/2011-II/speech2text$ ./build_ubuntu.sh
-- Found Portaudio: /usr/lib/libportaudio.so
-- Found ZLIB: /usr/lib/x86_64-linux-gnu/libz.so
-- Found Pthreads: /usr/lib/x86_64-linux-gnu/libpthread.so
-- Looking for Q_WS_X11
-- Looking for Q_WS_X11 - found
-- Looking for Q_WS_WIN
-- Looking for Q_WS_WIN - not found.
-- Looking for Q_WS_QWS
-- Looking for Q_WS_QWS - not found.
-- Looking for Q_WS_MAC
-- Looking for Q_WS_MAC - not found.
-- Found Qt-Version 4.7.2 (using /usr/bin/qmake)
-- Looking for XOpenDisplay in /usr/lib/x86_64-linux-gnu/libX11.so;/usr/lib/x86_64-linux-gnu/libXext.so;/usr/lib/x86_64-linux-gnu/libXau.so;/usr/lib/x86_64-linux-gnu/libXdmcp.so
-- Looking for XOpenDisplay in /usr/lib/x86_64-linux-gnu/libX11.so;/usr/lib/x86_64-linux-gnu/libXext.so;/usr/lib/x86_64-linux-gnu/libXau.so;/usr/lib/x86_64-linux-gnu/libXdmcp.so - found
-- Looking for gethostbyname
-- Looking for gethostbyname - found
-- Looking for connect
-- Looking for connect - found
-- Looking for remove
-- Looking for remove - found
-- Looking for shmat
-- Looking for shmat - found
-- Found X11: /usr/lib/x86_64-linux-gnu/libX11.so
-- Looking for include files CMAKE_HAVE_PTHREAD_H
-- Looking for include files CMAKE_HAVE_PTHREAD_H - found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Looking for _POSIX_TIMERS
-- Looking for _POSIX_TIMERS - found
-- Found Automoc4: /usr/bin/automoc4
-- Found Perl: /usr/bin/perl
-- Found Phonon: /usr/include
-- Performing Test _OFFT_IS_64BIT
-- Performing Test _OFFT_IS_64BIT - Success
-- Performing Test HAVE_FPIE_SUPPORT
-- Performing Test HAVE_FPIE_SUPPORT - Success
-- Performing Test __KDE_HAVE_W_OVERLOADED_VIRTUAL
-- Performing Test __KDE_HAVE_W_OVERLOADED_VIRTUAL - Success
-- Performing Test __KDE_HAVE_GCC_VISIBILITY
-- Performing Test __KDE_HAVE_GCC_VISIBILITY - Success
-- Found KDE 4.6 include dir: /usr/include
-- Found KDE 4.6 library dir: /usr/lib
-- Found the KDE4 kconfig_compiler preprocessor: /usr/bin/kconfig_compiler
-- Found automoc4: /usr/bin/automoc4
-- Found Qt-Version 4.7.2 (using /usr/bin/qmake)
-- Found X11: /usr/lib/x86_64-linux-gnu/libX11.so
[...]
-- Found libsamplerate: /usr/lib/libsamplerate.so
-- Found ALSA: /usr/lib/libasound.so
-- Enabling resample support
[...]
-- Enabling simon scenario support.
[...]
-- Could NOT find KdepimLibs (missing: KdepimLibs_CONFIG) (Required is at least version "4.5.60")
[...]
CMake Error at cmake/FindQwt6.cmake:101 (MESSAGE):
Could not find Qwt 6.x
Call Stack (most recent call first):
sam/src/CMakeLists.txt:1 (find_package)

-- Configuring incomplete, errors occurred!
make: *** No targets specified and no makefile found. Stop.
ubuntu@ubuntu:~/Documents/2011-II/speech2text$

And what now? What should I do now?
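
In such a long output the decisive lines are easy to miss. A small filter could pull them out. This is a sketch: the function name cmake_missing is made up, and the search string is taken from the three messages above (ZLIB, KdepimLibs, Qwt 6.x).

```shell
#!/bin/sh
# Print only the lines of a CMake log (read from stdin) that report a
# missing dependency. The match ignores case, so it finds both
# "Could not find" and "Could NOT find".
cmake_missing() {
    grep -i 'could not find'
}

# Possible use (2>&1 is needed because CMake writes errors to stderr):
#   ./build_ubuntu.sh 2>&1 | cmake_missing
```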

German speech model ‘deea’

Saturday, July 30th, 2011

Visit Schott’s German IPA FLAC files (section: deea) (source 1) or Voxforge (source 2). Download the German speech model ‘deea’ (object). Watch my video about this speech model:


There are a lot of recognition errors that need to be fixed.


German speech model ‘dedv’

Thursday, July 28th, 2011

Here is how I create the German speech model ‘dedv’:

1. Open the file de-dv. It contains a list of 1000 words (only the phonetic transcriptions).

2. Start Audacity.

3. Now I read every word in the list. Between each word, there is a pause of 1-2 seconds. Later, Audacity will find the pauses automatically.

4. Mark the whole recording with a double-click.
5. Then select Analyze > Sound Finder…

6. Set the Label starting point to 1.0. Set the Label ending point to 1.0. Why? Because simon has to know the amount of background noise.

7. Let’s eliminate the error at position 237. There was some noise (above 26 dB, I think) at position 237, and not a word. Mark the area with the mouse. Then press the Silence button (see top right of the picture).

8. Let’s have a look at the text file de-dv, and at the audio file. The text file ends with line 841. The Audacity audio file ends with number 841. Both files correspond to each other.

9. Select Audacity > File > Export Labels… The position (starting point and ending point) of each label will be exported into a simple text file. Export the labels to a file named labels.txt.

10. Open labels.txt with Geany. The file labels.txt ends at line 841. The first number of each line indicates the label starting point. The second number indicates the label ending point. The third number indicates the label itself.
You can see in the picture that de-dv is open, too. Both files – labels.txt and de-dv – have a length of exactly 841 lines.

11. Geany > Search > Replace.
Search for: \t\w+$ (\t means tab; \w means an alphanumeric character; + means “one or more of the preceding element”; $ means end of line)
Replace with: nothing (leave the field empty)
Don’t forget to mark Use regular expressions.
This procedure removes the third number from each line.

12. You can see that the third column has been removed thanks to the regular expression procedure.

13. Now it is time to merge both files: labels.txt should be merged with de-dv. This is done via the paste command in the Ubuntu terminal:

ubuntu@ubuntu:~$ paste /media/104d991d-2062-40d7-89f6-ddde3cb5b781/home/ubuntu/Documents/2011-i/german-0.2.5/object/split/dedv/labels.txt /media/104d991d-2062-40d7-89f6-ddde3cb5b781/home/ubuntu/Documents/2011-i/german-0.2.5/object/split/de-dv > /media/104d991d-2062-40d7-89f6-ddde3cb5b781/home/ubuntu/Documents/2011-i/german-0.2.5/object/split/dedv/pasted.txt

The resulting document is named pasted.txt.

14. You can see that the document pasted.txt has a third column: The labels are the phonetic transcriptions!

15. Now let’s go back to Audacity > File > Import > Labels… Take a look at the result. Each label is a phonetic transcription of the corresponding recording.

16. Audacity > File > Export Multiple…
Export format: FLAC files
Export location: /media/104d991d-2062-40d7-89f6-ddde3cb5b781/home/ubuntu/Documents/2011-i/german-0.2.5/object/split/dedv/flac-dedv
Split files based on: Labels
Name files: Using Label/Track Name
Press the Export button.

17. Now you know how I create the FLAC files that are part of Schott’s German IPA FLAC files.

18. Let’s generate a PLS dictionary that contains about 841 entries. This is done in the Ubuntu terminal:

ubuntu@ubuntu:~$ cat /media/104d991d-2062-40d7-89f6-ddde3cb5b781/home/ubuntu/Documents/2011-i/german-0.2.5/german-0.2.7.xml | saxonb-xslt -ext:on -s:- -xsl:/media/104d991d-2062-40d7-89f6-ddde3cb5b781/home/ubuntu/Documents/2011-i/combine-0.2.4/compare.xsl

The result is a PLS dictionary at the following location: file:///media/104d991d-2062-40d7-89f6-ddde3cb5b781/home/ubuntu/Documents/2011-i/german-0.2.5/object/split/dedv/lexicon-dedv.xml

19. Now I need a prompts file. This is generated, too, via Ubuntu terminal:

ubuntu@ubuntu:~$ saxonb-xslt -ext:on -s:/media/104d991d-2062-40d7-89f6-ddde3cb5b781/home/ubuntu/Documents/2011-i/german-0.2.5/german-0.2.7.xml -xsl:'/media/104d991d-2062-40d7-89f6-ddde3cb5b781/home/ubuntu/Documents/2011-i/combine-0.2.4/lexicon2prompts.xsl' -o:'/home/ubuntu/Documents/dummy.xml'

20. Now it is time to upload the package file:///media/104d991d-2062-40d7-89f6-ddde3cb5b781/home/ubuntu/Documents/2011-i/german-0.2.5/object/split/dedv/german-ipa-flac-files-dedv-20110727.tar.bz2 to Voxforge.

21. Delete file:///home/ubuntu/.kde/share/apps/simon and file:///home/ubuntu/.kde/share/apps/simond.

22. Start simon.

23. I am skipping the next steps. Please read my article German speech model ‘dedq’ to get more details.

24. And now it is time to watch the video about this speech model:

The following words were recognized in the video:

Orakelspruchs Orangensaft Orangensaftes Orangenschale Orangenschalenstruktur Orangenscheibe Orangensekt Orangerie Orangerien Orangerücken Oranienburger Oranienburgs Orchester Orchesterbegleitung Orchesterkanzel Orchestergraben Orchesterbesoldung Orchestermitglieder Orchestermusik Orchestermusiker Orchesterprobe Orchesterraum Orchester Ordensgelübde Ordensschwester Ordensschwestern Orderpapier Ordination Ordnungsbehörde Ordnungsbehörden Ordnungsmacht Ordnungssystem Orffs Organbank Organe Organell Organelle Organells Organhandel Organigramm Organigramme Organigrammen Organigramms Organik Organisationsabteilung Organisationsaufgabe Organisationsaufgaben Organisationsausschuss Organisationsbegabung Organisationseinheit Organisationserfahrung Organisationsfachmann Organisationsform Organisationsformen Organisationsgabe Organisationskomitee Organisationslösung Organisationslösungen Organisationsmethoden Organisationsplan Organisationsplanung Organisationspsychologie Organisationsreform Organisationsstruktur Organisationsteam Organisator Organisatoren Organisators Organisierung Organismus Organist Organistin Organographie Orgelbauer Orchesters Orgelbauers Orgelklang Orgelklangs Orgelkonzert Orgelkonzerte Orgelmusik Orgel Orgelpfeife Orgelton Orgelwerke Orientbrücken Orienthandel Orientierungskrise Orientierungskrisen Orientierungspunkte Orientierungsstufe Ornithologie Origami Originalantwortschein Originalantwortscheine Organigrammen Organstreit Originalausgabe Originalausgaben Originalbeleg Organstreit Originaldiskette Originalersatzteil Originalfassung Originalgehäuse Orgelkonzerte Originalität Organizismus Originalprüfunterlagen Organells Orderscheck Originalschecks Organhandel Organstreitverfahren Originalversion Originalverpackung Originalversion Organ Organstreit Orkanen Organstreit Orkanschadens Orkantiefs Orkantiefs Orlando Orlandos Orléans Ornamentband Ornamentbands Ornamentbänder Ornamentbändern Ornamente Ornaments Orographie Orographien Orpheus Ortbeton Orte 
Orten Ortens Ortgang Ortgangbrett Orthodoxie Orthodoxien Ostgeschäft Orthographie Orthographiefehler Orthographiefehlern Orthographiefehlers Orthografien Orthographie Ortholexikon Ortholexikons Orthonormalbasis Orthopäde Orthopäden Orthopädie Orthopädien Ortleb Ortolf Ostkirche Ortsbehörden Ortsbesichtigung Ortsbezeichnung Ortsbild Ortschaftsrats Ortschaftsräte Ortschaftsräten Ortsdurchfahrten Ortsfremde Ortsfremden Ortsgebühr Ortsgespräche Ortsgesprächen Ortsgrammatik Ortsgruppe Ortsgruppenleiter Ortskirchen Ostkredite Ortskrankenkassen Orchestern Ortsmitte Ortsname Ortsnamen Ortsnetz Ortsnetze Ortsnetzen Ortsnetzes Ortleb Ortssendern Ortsteil Ortsteilen Ortsteils Orchester Ortsvektoren Ortsverbandes Ortsverzeichnis Ortsveränderung Ortsvorsitzende Ortsvorsteher Ostdeutschland Ortszulage Ortungsgeräte Ortung Ostblock Ostblockes Osteolyse Ostfriesentee Ostallgäu Ostdeutschlands Ost-Berliner Ost-Berlins Ost-SPD Ostwestfale Ost-West-Konflikt Ostafrika Ostafrikas Ostalgie Ostasien Ostbahnhof Ost-Berlins Ostbesuche Ostbesuchen Ostbewohner Ostblock Ostblockländer Ostblockländern Ostblockreisen Ostblockstaaten Ostbündnis Ostbündnisse Ostbündnissen Ostbündnisses Ostdeutschland Ostelbien Ostfront Ostens Osteoporose Osteoporosen Ozeanriesen Osteuropäer Osteuropas Osteuropäern Osteuropäers Ostexport Ostexports Ostfalen Ostfildern Ostfilderns Ostgebiete Ostgeschäft Ostprovinz Ostpreußen Ostpreußens Ostteil Ostsee Ostseebäder Ostseehandel Ostseeheilbad Ostseeinsel Oszillator Oszillatoren Ostwirtschaft Ostsektor Ovation Ovationen

You can see that there are a lot of recognition errors.

25. Now you know how I created the German speech model ‘dedv’, and how well (or badly) it performs when used for recognition.

Fixing the problem with “setxkbmap de”

Wednesday, July 27th, 2011

Because I have problems with the German special characters like ö and ü, I am trying the following:

Type into the terminal “setxkbmap de” – and then start simon and ksimond.

By the way, my simon version is 0.3.0-1ubuntu8. I installed it a few months ago using this approach.

Yes, the German special characters are now displayed correctly. I just dictated: “Mühlingen Müllern Mörike Mörtelgerüche Mühelosigkeit” – great, it is working now.
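The two steps above can be wrapped in a small launcher function. This is only a minimal sketch: the function name start_simon_with_layout is made up for illustration, and it assumes that the programs are available on the PATH under the names simon and ksimond.

```shell
#!/bin/sh
# Hypothetical helper (the name is illustrative): set the X keyboard layout,
# then start simon and ksimond.
start_simon_with_layout() {
    layout="${1:-de}"
    # setxkbmap switches the keyboard layout of the running X session.
    setxkbmap "$layout" || return 1
    # Assumption: both programs are on PATH under these names.
    simon &
    ksimond &
}

# Example call:
# start_simon_with_layout de
```

Setting the layout first matters here, because the programs are only started once the layout has been switched.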

German speech model ‘dedq’

Tuesday, July 26th, 2011

This article shows (A.) how I create the German speech model ‘dedq’, and (B.) how I dictate using this speech model.

A. Creation of the German speech model ‘dedq’

1. Delete the directories /home/ubuntu/.kde/share/apps/simon and /home/ubuntu/.kde/share/apps/simond.

2. Start simon.

3. Click the Vocabulary button. Press the Import Dictionary button. Select Target: Active Dictionary. Type of dictionary: PLS dictionary. The location of the PLS dictionary on my computer is:
/media/104d991d-2062-40d7-89f6-ddde3cb5b781/home/ubuntu/Documents/2011-i/german-0.2.5/object/split/dedq/german-ipa-flac-files-dedq-20110726/lexicon-dedq.xml
You can find this dictionary file at VoxForge (37 MB; you can extract the dictionary file). An easier way is to download the file dedq.xml (right click; Save Link as). The file dedq.xml is a valid PLS dictionary that you can import into simon.

4. Press the Grammar button. Add sentence: Adjektiv. Add sentence: Substantiv. Add sentence: Zahlwort.

5. Click the Training button. Import training data.
Import prompts:
- Prompts: /media/104d991d-2062-40d7-89f6-ddde3cb5b781/home/ubuntu/Documents/2011-i/german-0.2.5/object/split/dedq/german-ipa-flac-files-dedq-20110726/prompts-dedq
- Base directory: /media/104d991d-2062-40d7-89f6-ddde3cb5b781/home/ubuntu/Documents/2011-i/german-0.2.5/object/split/dedq/german-ipa-flac-files-dedq-20110726/flac-dedq

You can get both files from VoxForge (see link above). Please keep in mind that post-processing has to be enabled.

Go to Settings > Configure simon… > Recordings > Post Processing. There you can see the post-processing command that makes sox convert the FLAC files to WAV format.

6. Press the Commands button. Manage Plug-ins > Add > Dictation > Dictation > Append text after result: ” ” (enter just a single space, then press the OK button).

7. Start ksimond. In simon, press the Connect button. simon now starts compiling the speech model. Let’s dictate a few words: “M;nchsfisch Mnchskopf Mhe Mllerin Mllerthal Mllschlucker”. The German ö and ü aren’t displayed. Press the Activated button to stop the recognition.

8. Now let’s copy the files of the base model:

a. Copy hmmdefs:

cp /tmp/kde-ubuntu/simond/default/compile/hmm24/hmmdefs /media/104d991d-2062-40d7-89f6-ddde3cb5b781/home/ubuntu/Documents/2011-i/german-0.2.5/object/split/dedq/german-speech-model-dedq/hmmdefs-dedq

b. Copy macros:

cp /tmp/kde-ubuntu/simond/default/compile/hmm24/macros /media/104d991d-2062-40d7-89f6-ddde3cb5b781/home/ubuntu/Documents/2011-i/german-0.2.5/object/split/dedq/german-speech-model-dedq/macros-dedq

c. Copy stats:

cp /tmp/kde-ubuntu/simond/default/compile/stats /media/104d991d-2062-40d7-89f6-ddde3cb5b781/home/ubuntu/Documents/2011-i/german-0.2.5/object/split/dedq/german-speech-model-dedq/stats-dedq

d. Copy tiedlist:

cp /tmp/kde-ubuntu/simond/default/compile/tiedlist /media/104d991d-2062-40d7-89f6-ddde3cb5b781/home/ubuntu/Documents/2011-i/german-0.2.5/object/split/dedq/german-speech-model-dedq/tiedlist-dedq

9. It is now time to export the scenario file. Select the option Export to file.

10. You can download the German speech model ‘dedq’.
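The post-processing from step 5 is configured inside simon, and the exact command is not quoted in this article. As a rough sketch of the same kind of conversion done by hand: sox converts a file when it is given an input name and an output name, and it picks the formats from the file extensions. The function name flac_to_wav and the two directory arguments are made up for illustration, and the sketch assumes a sox build with FLAC support.

```shell
# Hypothetical helper: convert every FLAC file in a directory to WAV with sox.
flac_to_wav() {
    src_dir="$1"   # directory that contains the *.flac files
    out_dir="$2"   # directory that receives the *.wav files
    mkdir -p "$out_dir" || return 1
    for f in "$src_dir"/*.flac; do
        # If nothing matches, the pattern stays literal; skip it.
        [ -e "$f" ] || continue
        base="$(basename "$f" .flac)"
        # sox infers the input and output formats from the extensions.
        sox "$f" "$out_dir/$base.wav" || return 1
    done
    return 0
}

# Example call (paths are placeholders):
# flac_to_wav ./flac-dedq ./wav-dedq
```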

You now know how I created the German speech model ‘dedq’.
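The four cp commands in step 8 differ only in the file name, so they can be written as one loop. This is a minimal sketch, not the way I actually ran it: the function name copy_base_model and its three arguments are made up for illustration, while the relative paths (hmm24/hmmdefs, hmm24/macros, stats, tiedlist) are the ones from step 8.

```shell
# Hypothetical helper: copy the four base model files out of simond's
# compile directory and append a model suffix to each file name.
copy_base_model() {
    compile_dir="$1"   # e.g. /tmp/kde-ubuntu/simond/default/compile
    target_dir="$2"    # directory that receives the renamed copies
    suffix="$3"        # e.g. dedq
    mkdir -p "$target_dir" || return 1
    # hmmdefs and macros live in hmm24/, stats and tiedlist one level above.
    for rel in hmm24/hmmdefs hmm24/macros stats tiedlist; do
        name="$(basename "$rel")"
        cp "$compile_dir/$rel" "$target_dir/${name}-${suffix}" || return 1
    done
}

# Example call with the paths from step 8:
# copy_base_model /tmp/kde-ubuntu/simond/default/compile \
#     /media/104d991d-2062-40d7-89f6-ddde3cb5b781/home/ubuntu/Documents/2011-i/german-0.2.5/object/split/dedq/german-speech-model-dedq \
#     dedq
```

The function stops at the first file that cannot be copied, so an incomplete compile directory does not produce a half-filled model directory without any warning.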

B. The following video demonstrates the dictation / recognition process. Unfortunately, YouTube limits the length to 15 minutes. Take a look at the German speech model ‘dedq’ in action:

These are the words that were recognized in the video:

M;nchengladbach Mnchkloster Mnchsfisch Mnchskopf Mnchskutte Mnchstum Mnsheim Mrderbande Mrtel Mrtelbrett Mrtelgeruch Mckenlarven Mckenschutz Mckenschwarm Mckenspray Mckenstich Mdigkeit Mgeln Mggelturm Mhelosigkeit Mhen Mhle Mhlen Mhlenanordnung Mhlenbedampfung Mhlenbetrieb Mhlendrehvorrichtung Mhleneinsatzdiagramm Mhlenausfallgut Mhlheim Mhlhausen Mhlingen Mhlrad Mhlteich Mlleimer Mllerin Mllers Mllerstrae Mllhaufen Mllpresse Mllhalde Mllhalden Mllhaufen Mllkasten Mllkippe Mllverbrennung Mllverbrennungsanlage Mllzerkleinerer Mnder Mndigkeit Mndungsbremse Mndungsdelta Mndungsdeltas Mndungsfeuer Mndungsfeuerdmpfer Mnsterbau Mnsterbaus Mnsterbauverein Mnsterbauvereins Mnstereifel Mnzer Mnze Mnzenberg Mnzwert Mrrhe Mtze Mtzen Mtzenich Mtzenmacher Mtzenschirm NEUNZEHNHUNDERT Mhlendrehvorrichtung NEUNZEHNHUNDERTACHTZIG Nabelbinde Nabelbruch Nabeln Nabenabdeckung Nabenabstand Nabenbremse Nabendynamo Nachbarplaneten Nachbarstdte Nachbehandelns Nachfolgekandidaten Nachfolgemodelle Nachfolgewert Nachforderungsmanagement Nachforschens Nachfrageinflation Nachfrageintensitt Nachfragelcke Nachfragestruktur Nachhilfekurse Nachhilfekursen Nachladung Nachladungen Nachmittagsschicht Nachrichtensendung Nachrcker Nachrstens Nachschlagebuch Nachschlagewerk Nachschlagewerke Nachschlagewerken Nachschlagewerks Nachschlssel Nachsetzen Nachspeise Nachspeisung Nachspur Nachwuchsausbildung Nachwuchsarbeit Nachwuchsbereich Nachwuchself Nachwuchsfrage Nachwuchskraft Nachwuchsfrderung Nachwuchskrften Nachwuchslufer Nachwuchsmannschaft Nachwuchsschwimmers Nachwuchsspieler Nachwuchsspieler Nachwuchstalent Nachwuchsteam Nachzgler Nachzglers Nachzndung Nadelhlse Nahhandel Nahhandels Nahrungsbestandteil Nahrungsmittelknappheit Nahrungsmittelkonserve Nahrungsmittelvergiftung Nahrungsreserven Nahrungsvorrte Naivitt Naivling Naivlinge Naivlingen Naivlings Namensaktien Namensaufruf Namensobligation Narkoseschwester Namensnderung Nastassia Natalia Natalie Natangen Natascha Nathan Nation Nationalarmee 
Nationalbank Nationalcharakter Nationalchina Nationaldemokratische Nationaleinkommen Nationalelf Nationalfarben Nationalfeiertag Nationalfeind Nationalflagge Nationalfriedhof Nationalgalerie Nationalgetrnk Nationalheld Nationalhelden Nationalheldin Nationalhymne Nationalinstitut Nationalismus Nationalisten Nationalitt Nationalitten Nationalittenkampf Nationalkongress Nationalkommunist Nationalkasse Nationalkonvent Nationalmannschaft Nationalmannschaften Nationalmuseum Nationalpartei Nationaltracht Nationalspiel Nationalspielerin Nationalspielerinnen Nationalspielers Nationalsport Nationalsprache Nationalsynode Nationalteam Nationaltheater Nationaltracht Nationaltruppen Nationalverband Nationalversammlung Nationalversammlungen Nationalwerksttten Natomitglieder Natrium Natriumbikarbonat Natriumnitrit Natriumnitrat Natriumkarbonat Natriumhydroxid Natriumphosphat Natriums Natriumsilikat Natriumstearat Natriumsulfat Naturalabgabe Naturalabgaben Naturaleinkommen Naturaleinkommens Naturalersatz Naturalpacht Naturalwirtschaft Naturanlagen Naturbedingung Naturereignis Naturerlebnis Naturfreund Naturgeschmack Naturgesetz Naturgesetze Naturgesetzen Naturgesetzes Naturgewalt Naturgre Naturgren Naturheilkunde Naturheilkundige Naturkosmetik Naturkautschuk Naturkunde Naturlehrpfad Naturmenschen Naturphilosophie Naturprodukt Naturprozess Naturprozesse Naturprozessen Naturprozesses Naturraum Naturrecht Naturreich Naturschauspiel Naturschilderung Naturschilderungen Naturschutz Naturschutzbund Naturschutzes Naturschutzexperte Naturschutzgebiet Naturschutzgebiete Naturschutzgesetz Naturschutzpark Naturschutzparks Naturschutzstelle Naturschutzbund Naturschden Naturschtze Naturschnheit Naturschtzer Naturschtzern Naturseide Naturstein Natursteinpflasterbelag Natursteinpflasterbelags Natursteinpflasterbelge Natursteinpflasterbelgen Natursteinverkleidung Naturstrand Naturzement Nauheim Nauheims Naumburg Naumburgs naturwissenschaftliche

German speech model ‘dedl’

Monday, July 25th, 2011

Import the German speech model ‘dedl’ into simon. You can find the corresponding source files in section dedl, or at VoxForge.

I have recorded a video:


In the video, the following words were recognized (some of them incorrectly):

Merlan Memory Memoranden Memorandums Memorialbild Memorialbilder
Memorialbildern Memorialbilds Memorialquelle Memorialquellen Memory
Menschenfeindes Menarche Mendel Mendels Mendelssohn Menschwerdens (more…)

simon via Synaptic Package Manager

Wednesday, April 6th, 2011

A few minutes ago, I did the following:
1. Ubuntu terminal:

sudo add-apt-repository ppa:grasch-simon-listens/simon
sudo apt-get update

2. Installed simon via Synaptic Package Manager. Obviously, it is working.

German speech model ‘friedrich’

Friday, February 18th, 2011

You can import the German speech model 'friedrich' (0.5 MB, GPLv3) into simon. The package contains all necessary files: hmmdefs-friedrich, tiedlist-friedrich, macros-friedrich, stats-friedrich, and of course the scenario file scenario-friedrich.xml.

Edit: Download the video German speech model ‘friedrich’ (40 min., 47 MB, WMV, link will become invalid soon)

Edit: These words are recognized in the video:

ARBEITSAUFNAHME Aachener Rachens Einsparens Aalen Racheschach RAF
Abarbeiten Barockzeit Abarbeitungszyklus Abartung Abbauvermgen
Abbauvertrag Abberufens Abberufungen Abbestellungen Abbiegevorgang
Abbiegevorgangs Abbiegevorgnge Abbiegevorgngen Abbindebehandlung
Abbindebereich Abbindebeschleuniger Abbindebeschleunigung Abbindedauer
Abbitte Abbindung Abbindezeit Abbindewasser Abbindeverhalten (more…)

Latin speech model ‘xaf’

Tuesday, February 15th, 2011

A few months ago, I published the Latin speech model ‘xaa’. You can find the corresponding audio files at VoxForge, too.

Please download the Latin speech model 'xaf', which contains words from section xaf.

Do you know how to import the Latin speech model 'xaf' into simon? No? Then read this article, and take a close look at this screenshot:

Short explanation: the Latin speech model 'xaf' contains the files hmmdefs-xaf, tiedlist-xaf, macros-xaf, and stats-xaf. You have to set the correct paths to these files.

Then you have to use the Manage scenarios function.
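Before setting the paths in simon, it can help to confirm that all four model files are really there. This is a minimal sketch: the function name check_model_files and its two arguments are made up for illustration, and the file names follow the pattern listed above (hmmdefs-xaf, tiedlist-xaf, macros-xaf, stats-xaf).

```shell
# Hypothetical helper: report which of the four model files are missing.
# Returns 0 if all files exist, 1 otherwise.
check_model_files() {
    model_dir="$1"   # directory where the model was unpacked
    suffix="$2"      # e.g. xaf
    missing=0
    for name in hmmdefs tiedlist macros stats; do
        if [ ! -f "$model_dir/${name}-${suffix}" ]; then
            echo "missing: ${name}-${suffix}"
            missing=1
        fi
    done
    return "$missing"
}

# Example call (the directory is a placeholder):
# check_model_files ./latin-speech-model-xaf xaf
```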

I hope that someone out there will try to import the Latin speech model 'xaf' into simon. My recognition results were poor. But never mind, these are just the first steps.

Ralf’s German speech model 0.1.9.2

Wednesday, December 1st, 2010

You can download Ralf’s German speech model version 0.1.9.2. It contains 36,000 German words (from the sections alpha, bravo, charlie, diego, echo, and friedrich).

Unfortunately, the recognition rate is 0% on my computer. I don’t know why this is the case.