Posts Tagged ‘Karmic Koala’

Installing branches/no-scenarios

Sunday, January 17th, 2010

The branches/no-scenarios branch might be a good choice, because I failed to install revision 1116. Here is what I do:

$ cd Documents/201001
$ svn co https://speech2text.svn.sourceforge.net/svnroot/speech2text/branches/no-scenarios/ simon-no-scenarios
am3msi@am3msi-desktop:~/Documents/201001/simon-no-scenarios$ ./build_ubuntu.sh

Apparently, it worked:

Issue "simon" to start it.
am3msi@am3msi-desktop:~/Documents/201001/simon-no-scenarios$ simon

Yes, everything is fine. Now I can use simon on my new AM3 computer.

Tutorial: how to install under Ubuntu

Friday, January 8th, 2010

This tutorial explains how to install simon under Ubuntu, and how to import Ralf’s Portuguese dictionary.

1. Download simon.
2. Double-click on simon-0.2-Linux_i386.deb:

[Image: ubuntu-deb]

3. Press Install Package:

[Image: install-speechrecognition]

4. Enter the password that you chose during your Ubuntu installation:

[Image: administrative-rights]

Press the OK button.

5. The installation has finished. The package simon-0.2-Linux_i386.deb has been installed:

[Image: installation-finished]

Press the Close button.

6. Select Applications > Universal Access > simon:

[Image: universal-access]

7. Take a look at the simon main window:

[Image: press-wordlist]

Press the Wordlist button.

8. The Wordlist tab has opened:

[Image: import-dictionary]

Press the Import Dictionary button.

9. You have to select the type of the dictionary:

[Image: select-dictionary-type]

Choose PLS Lexicon, then press the Next button.

Note for the simon development team: it would be nice if simon offered a list of the 27 available PLS dictionaries at this point.

10. You can now import one of my 27 PLS dictionaries. In the sidebar of testing simon, you can find a PLS dictionary that you can import:

[Image: sidebar-pls]

Right-click on Ralf’s Portuguese dictionary, then select Save Link As...

11. The dictionary with the name portuguese-dictionary.xml.bz2 will be saved:

[Image: save-portuguese-dictionary]

It will be saved in the Downloads folder. Press the Save button.

12. It is time to import Ralf's Portuguese dictionary that you have just downloaded:

[Image: select-pls-file]

Please press the File button to point simon to the downloaded PLS dictionary.

13. Select the Downloads folder:

[Image: select-downloads-folder]

14. Select portuguese-dictionary.xml.bz2:

[Image: select-portuguese-dictionary]

On my computer, I didn’t have to press the OK button.

15. simon displays the path to the PLS dictionary:

[Image: path-to-dictionary]

Note for the simon development team: it is pretty complicated to first download and then import the dictionary. It would be nice if the wizard offered automatic download directly from the internet.

My guess is that the average user begins to lose interest in simon at this point of the installation, because he has already invested about 20 minutes of his precious time. It is getting annoying. Don’t annoy the user! Offer automatic PLS dictionary import directly from the internet.

The automatic BOMP import is a great thing, but not everybody is a native German speaker. At the moment, I am offering 27 different languages. An automatic import would make simon much more interesting for a lot of people. E.g. Portuguese is spoken by 200 million native speakers. Recently, someone showed interest in Portuguese at VoxForge. You can imagine that most people don’t have a clue where to start or what is necessary. Helping people from many different countries would be easy: just add an automatic import function to the wizard.

Press the Next button.

16. simon is now processing the lexicon:

[Image: processing-lexicon]

What does that mean? It means that simon converts the dictionary from PLS format into HTK format. This process works fine for Ralf's German dictionary, but it is not yet optimized for the other PLS dictionaries. If you are a native speaker of European Portuguese, you can edit Ralf's Portuguese dictionary with a simple text editor.

17. Ralf's Portuguese dictionary has been imported:

[Image: imported-successfully]

Press the Finish button.

18. The dictionary is now available:

[Image: portuguese-shadow]

a. Select Include unused words from the shadow lexicon.
b. Use the scroll bar to get an impression of the Portuguese dictionary.
c. First column: word. Second column: corresponding pronunciation.

19. Let’s finish here. Now you know how to install simon under Ubuntu, and how to import Ralf's Portuguese dictionary into simon.

More steps are necessary to make it work:
- install HTK;
- record a few training samples;
- define a grammar;
- start ksimond (PDF).

Take a look at the simon handbook to find out more about simon.
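
To give an impression of what the conversion in step 16 involves, here is a rough Python sketch. It is not simon's actual import code. It assumes that the PLS file uses the standard W3C PLS namespace, that every lexeme has plain grapheme and phoneme elements, and that the phonemes are separated by spaces. The function name pls_to_htk_lines and the sample lexicon are made up for illustration; the sample word and its SAMPA transcription are the ones from the segmentation fault post further down this page.

```python
import xml.etree.ElementTree as ET

# Namespace of the W3C Pronunciation Lexicon Specification (PLS) 1.0.
PLS_NS = "{http://www.w3.org/2005/01/pronunciation-lexicon}"

# Hypothetical one-entry lexicon; a real dictionary has many lexemes.
SAMPLE = """<lexicon version="1.0"
    xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
    alphabet="x-sampa" xml:lang="de">
  <lexeme>
    <grapheme>überflutender</grapheme>
    <phoneme>y: b ah f l u: t @ n d ah</phoneme>
  </lexeme>
</lexicon>"""


def pls_to_htk_lines(pls_xml: str) -> list[str]:
    """Turn PLS lexemes into HTK-style 'WORD phone phone ...' lines."""
    root = ET.fromstring(pls_xml)
    lines = []
    for lexeme in root.iter(PLS_NS + "lexeme"):
        words = [g.text.strip()
                 for g in lexeme.findall(PLS_NS + "grapheme") if g.text]
        prons = [p.text.strip()
                 for p in lexeme.findall(PLS_NS + "phoneme") if p.text]
        # One output line per (spelling, pronunciation) pair.
        for word in words:
            for pron in prons:
                # Collapse any run of whitespace between phonemes.
                lines.append(f"{word} {' '.join(pron.split())}")
    # Sorted by code point for stable output; HTK tools generally
    # expect a sorted dictionary, but the exact order they want is
    # not checked here.
    return sorted(lines)
```

Calling pls_to_htk_lines(SAMPLE) returns a list with one line: the word followed by its phonemes. The downloaded file is bz2-compressed, so it has to be decompressed before it can be parsed like this.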

Try to install revision 1112 on 32-bit Ubuntu

Wednesday, January 6th, 2010

A few days ago, my Socket 939 computer (64-bit), which has simon revision 1090 installed, stopped working. Because of that, I want to try to install the current revision 1112 on a 32-bit Ubuntu 9.10 computer. This is what I do (following these and these instructions):
1. $ sudo apt-get install subversion build-essential cmake bison flex gettext gettext-kde kdeartwork kdelibs5-dev portaudio19-dev libxtst-dev libqt4-sql-sqlite libqt4-phonon-dev kdelibs4c2a
I don’t know how to install libattica. I will try it without libattica.
2. $ cd Documents/201001
3. $ svn co https://speech2text.svn.sourceforge.net/svnroot/speech2text/trunk simonsource
4. Checked out revision 1112.
5. $ cd simonsource
6. $ ./build_ubuntu.sh
It didn’t work out:

[Image: revision-1112]

I hope that the simon developers will fix this issue.

Why am I interested in the development version? Because I want to adjust my workflow for speech model development. simon/sam should increase my productivity. I want to publish a speech model that works (more or less) out of the box. But first, I have to build one. This future speech model can be used by people who have a voice similar to mine (I am not planning to build a speaker-independent speech model; I want to offer only a speaker-dependent speech model with my own voice). To achieve this goal, I need the development version.

A lot of people will be interested in simon if it works for dictation out of the box (without the need to install HTK first and without the need to record training samples). So my goal is to develop a speech model with more than 200 German words (maybe up to 1000) that the user can import into simon and use directly for dictation. Of course, the recognition rate would probably be pretty low. But the important thing is that it would attract more people, who might then invest some time to submit German speech to VoxForge.

So it has to work. The installation has to work. And the recognition has to work (even if the recognition rate is very low). The average user won’t invest more than 20 minutes of his precious time when trying simon for speech recognition. He wants the computer to recognize his voice. Either speech recognition works within 20 minutes, or the average user will lose interest in simon.

To make simon successful, it is necessary to offer speech model packs for the following languages: German (use Ralf's German dictionary, PLS format), English (use VoxForgeDict, HTK format), French (I have found a dictionary on the internet in Sphinx format), and Spanish (I have found one at VoxForge, I think; maybe someone is willing to edit Ralf's Spanish dictionary, which should be possible because the Spanish pronunciation rules are fairly regular).

I would try to prepare speech models for these four major Western languages if simon/sam works well enough (especially the import/export functionality has to work). So my personal focus is on the following languages:

German – 105 million native speakers
English – 350 million native speakers
French – 110 million native speakers
Spanish – 329 million native speakers

German + English + French + Spanish = 894 million native speakers

These languages are pretty similar. simon should offer automatic dictionary import for these languages (German is already covered by Hadifix import).

It is necessary to make it as easy as possible for the average user. The German Hadifix dictionary (probably the best German dictionary currently available) can be imported automatically into simon. Why not extend this import function to other dictionaries? E.g. a future version of simon could download / import
- Ralf's German dictionary (advantage: GPLv3 – it is no problem to use this dictionary for the development of GPLv3 speech models).
- VoxForgeDict (advantage: probably very good phoneme quality).

This concept could be extended to other dictionaries (Ralf’s Spanish dictionary, Ralf’s French dictionary, …).

An automatic import function for my PLS dictionaries (I am offering most languages that eSpeak can handle) would make simon more attractive. Think about it: a user on the Indian subcontinent whose native language is Tamil (66 million native speakers) could install simon and choose Ralf's Tamil dictionary for automatic import. Of course, at the moment this dictionary contains eSpeak characters, so the imported phonemes aren’t usable. But that is OK for the beginning, because this problem can be fixed later.

You understand why help from native speakers is needed: I don’t speak a word of Tamil. But here is what the simon developers could do: offer an automatic dictionary import function for 27 languages. These are the paths to the dictionaries:

  1. http://script.blau.in/afrikaans-dictionary.xml.bz2
  2. http://script.blau.in/catalan-dictionary.xml.bz2
  3. http://script.blau.in/croatian-dictionary.xml.bz2
  4. http://script.blau.in/czech-dictionary.xml.bz2
  5. http://script.blau.in/dutch-dictionary.xml.bz2
  6. http://script.blau.in/english-dictionary.xml.bz2
  7. http://script.blau.in/esperanto-dictionary.xml.bz2
  8. http://script.blau.in/french-dictionary.xml.bz2
  9. http://script.blau.in/german-dictionary.xml.bz2
  10. http://script.blau.in/greek-dictionary.xml.bz2
  11. http://script.blau.in/hindi-dictionary.xml.bz2
  12. http://script.blau.in/icelandic-dictionary.xml.bz2
  13. http://script.blau.in/italian-dictionary.xml.bz2
  14. http://script.blau.in/kurdish-dictionary.xml.bz2
  15. http://script.blau.in/latin-dictionary.xml.bz2
  16. http://script.blau.in/latvian-dictionary.xml.bz2
  17. http://script.blau.in/norwegian-dictionary.xml.bz2
  18. http://script.blau.in/polish-dictionary.xml.bz2
  19. http://script.blau.in/portuguese-dictionary.xml.bz2
  20. http://script.blau.in/romanian-dictionary.xml.bz2
  21. http://script.blau.in/russian-dictionary.xml.bz2
  22. http://script.blau.in/slovak-dictionary.xml.bz2
  23. http://script.blau.in/spanish-dictionary.xml.bz2
  24. http://script.blau.in/swahili-dictionary.xml.bz2
  25. http://script.blau.in/swedish-dictionary.xml.bz2
  26. http://script.blau.in/tamil-dictionary.xml.bz2
  27. http://script.blau.in/vietnamese-dictionary.xml.bz2

You can see: 27 PLS dictionaries are available. All PLS dictionaries are GPLv3 (I got most word lists from OpenOffice.org spelling dictionaries – obviously, most of the word lists are GPL; some are not GPL – I didn’t use them). So there is no licensing problem. You can be sure that there is no copyright infringement because OpenOffice.org is a very good source. I didn’t use word lists without an explicit GPL license.

An example: Welsh is offered by eSpeak, but I didn’t find a GPL word list with Welsh words. So I didn’t build a dictionary for this language. I took a look into the license file (included in the Welsh spelling dictionary):

# This dictionary is covered by the GNU General Public License,
[...]
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
[...]
# 3. All modifications to the source code must be clearly marked as
# such. Binary redistributions based on modified source code
# must be clearly marked as modified versions in the documentation
# and/or other materials provided with the distribution.
#
# 4. The name of [...] may not be
# used to endorse or promote products derived from this software
# without specific prior written permission.

You can see that this license seems to be a mixed license: GPL plus modifications. I can’t work with such a license, so I couldn’t use this Welsh word list for the development of a Welsh PLS dictionary. As you can see, I check the licenses very carefully when building my dictionaries. I only use sources that clearly use the GPL without any modification.

I am using only one license for my dictionaries: GPL (almost all dictionaries are GPLv3; maybe a very old version of my German PLS dictionary is GPLv2). The concept is easy to understand: only GPL. One license. No difficult dual-licensing or triple-licensing scheme. VoxForge collects speech under the GPL. My dictionaries are GPL. Easy, isn’t it?

simon should try to enter the market for these 27 languages. Why not offer automatic import for these 27 dictionaries? OK, eSpeak phonemes aren’t working right now. But that can be fixed later with the help of native speakers (or by adjusting the simon import process to eSpeak phonemes).

So my wish is an automatic import function for these PLS dictionaries. This makes it easier for interested people to become involved.

simon should try to gain market share. At the moment, the user has to do two steps:
1. Download a dictionary from the internet (he has to know where to find such a dictionary, but most users don’t have a clue).
2. Import this dictionary into simon.

These two steps could be combined into one single step: automatic download & import via a wizard. 27 PLS dictionaries could be offered at the moment. I don’t have a bandwidth limit, so it would be OK to get the dictionary directly from script.blau.in.
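
To show how little is needed for the download half of such a wizard, here is a rough Python sketch. It only assumes that the files keep the naming pattern of the list above (language name plus -dictionary.xml.bz2). The names LANGUAGES, dictionary_url and fetch_dictionary are made up for this example and have nothing to do with simon's source code.

```python
import bz2
import urllib.request

BASE_URL = "http://script.blau.in"

# The 27 languages from the list above.
LANGUAGES = [
    "afrikaans", "catalan", "croatian", "czech", "dutch", "english",
    "esperanto", "french", "german", "greek", "hindi", "icelandic",
    "italian", "kurdish", "latin", "latvian", "norwegian", "polish",
    "portuguese", "romanian", "russian", "slovak", "spanish", "swahili",
    "swedish", "tamil", "vietnamese",
]


def dictionary_url(language: str) -> str:
    """Build the download URL from the naming pattern of the list."""
    if language not in LANGUAGES:
        raise ValueError(f"no PLS dictionary listed for {language!r}")
    return f"{BASE_URL}/{language}-dictionary.xml.bz2"


def fetch_dictionary(language: str, timeout: float = 30.0) -> bytes:
    """Download one dictionary and return the decompressed PLS XML."""
    url = dictionary_url(language)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        compressed = response.read()
    # The files are bz2-compressed, so unpack them before the import.
    return bz2.decompress(compressed)
```

A wizard could show LANGUAGES as a selection list and call fetch_dictionary("portuguese") when the user presses Next. The result would then go into the same PLS import that is used for local files today.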

Marketing is the big deficit of open source ASR projects. An automatic dictionary import directly from the internet for 27 languages would be good for marketing.

Video: Recognize 200 German words

Sunday, December 27th, 2009

Download the video: Recognize 200 German words under Ubuntu (30 MB; 13 minutes; the video will be replaced from time to time as soon as I have trained more words).

100 % of the words were recognized correctly. All words that I am dictating in this video are included in Ralf's German dictionary.

This video proves:
- simon works well under Ubuntu 9.10 (64-bit);
- Ralf's German dictionary allows good recognition results;
- a 100 % error-free result is possible (I didn’t expect such a good result).

Each word in this dictation video has been trained 3 times.

Segmentation fault

Wednesday, December 23rd, 2009

A few minutes ago, simon crashed. Here is the report. I don’t know yet how to create useful crash reports. I will figure that out later.

At least I can say that simon was able to recognize the word überflutender (SAMPA: y: b ah f l u: t @ n d ah). So it works. I am not sure which sound card is being used (I used two microphones at the same time: one on the on-board sound card and one on a USB sound card).

How can I update to the current svn version?

Tuesday, December 15th, 2009

I want to update to the current simon svn version. Here is what I do:
1. Type the following command into the terminal: $ sudo apt-get install subversion build-essential cmake bison flex gettext gettext-kde kdeartwork kdelibs5-dev portaudio19-dev libxtst-dev libqt4-sql-sqlite libqt4-phonon-dev
2. liberty@liberty-desktop:~/200908$ svn co https://speech2text.svn.sourceforge.net/svnroot/speech2text/
Checked out revision 1088.
3. Unfortunately, compiling wasn’t successful:
liberty@liberty-desktop:~/200908/speech2text/trunk$ ./build_ubuntu.sh