A few days ago, my Socket 939 computer (64-bit), which had simon revision 1090 installed on it, stopped working. Because of that, I want to try to install the current revision 1112 on a 32-bit Ubuntu 9.10 computer. This is what I did (following these and these instructions):
1. $ sudo apt-get install subversion build-essential cmake bison flex gettext gettext-kde kdeartwork kdelibs5-dev portaudio19-dev libxtst-dev libqt4-sql-sqlite libqt4-phonon-dev kdelibs4c2a
I don’t know how to install libattica. I will try it without libattica.
2. $ cd Documents/201001
3. $ svn co https://speech2text.svn.sourceforge.net/svnroot/speech2text/trunk simonsource
4. Checked out revision 1112.
5. $ cd simonsource
6. $ ./build_ubuntu.sh
It didn’t work out:
I hope that the simon developers will fix this issue.
Why am I interested in the development version? Because I want to adjust my workflow for speech model development. simon/sam should increase my productivity. I want to publish a speech model that works (more or less) out of the box. But first, I have to build one. This future speech model can be used by people whose voice is similar to mine (I am not planning to build a speaker-independent speech model; I want to offer only a speaker-dependent speech model with my own voice). To achieve this goal, I need the development version.
A lot of people will be interested in simon if it works for dictation out of the box (without the need to install HTK first; without the need to record a few training samples). So my goal is to develop a speech model with more than 200 German words (maybe up to 1000 German words) that the user can import into simon and use directly for dictation. Of course, the recognition rate would probably be pretty low. But the important thing is that it would attract more people, who might then invest some time to submit German speech to VoxForge.
So it has to work. The installation has to work. And the recognition has to work (even if the recognition rate is very low). The average user won’t invest more than 20 minutes of his precious time when trying simon for speech recognition. He wants the computer to recognize his voice. Either speech recognition works within 20 minutes, or the average user will lose interest in simon.
To make simon successful, it is necessary to offer speech model packs for the following languages: German (use Ralf's German dictionary, PLS format), English (use VoxForgeDict, HTK format), French (I have found a dictionary on the internet in Sphinx format), and Spanish (I think I have found one at VoxForge; alternatively, someone may be willing to edit Ralf's Spanish dictionary, which should be possible because the Spanish pronunciation rules are fairly regular).
I would try to prepare speech models for these four major Western languages if simon/sam works well enough (the import/export functionality in particular has to work). So my personal focus is on the following languages:
German – 105 million native speakers
English – 350 million native speakers
French – 110 million native speakers
Spanish – 329 million native speakers
German + English + French + Spanish = 894 million native speakers
These languages are pretty similar. simon should offer automatic dictionary import for these languages (German is already covered by Hadifix import).
It is necessary to make it as easy as possible for the average user. The German Hadifix dictionary (probably the best German dictionary currently available) can be imported automatically into simon. Why not extend this import function to other dictionaries? E.g. a future version of simon could download / import
- Ralf's German dictionary (advantage: GPLv3 – it is no problem to use this dictionary for the development of GPLv3 speech models).
- VoxForgeDict (advantage: probably very good phoneme quality).
This concept could be extended to other dictionaries (Ralf’s Spanish dictionary, Ralf’s French dictionary, …).
An automatic import function for my PLS dictionaries (I am offering most languages that eSpeak can handle) would make simon more attractive. Think about it: a user on the Indian subcontinent whose native language is Tamil (66 million native speakers) could do the following: install simon, and choose Ralf's Tamil dictionary for automatic import into simon. Of course, at the moment this dictionary contains eSpeak characters, so the imported phonemes aren’t usable. But that is OK for a start, because this problem can be fixed later.
You understand why help from native speakers is needed. I don’t speak a word of Tamil. But what the simon developers could do is offer an automatic dictionary import function for 27 languages. These are the paths to the dictionaries:
- http://script.blau.in/afrikaans-dictionary.xml.bz2
- http://script.blau.in/catalan-dictionary.xml.bz2
- http://script.blau.in/croatian-dictionary.xml.bz2
- http://script.blau.in/czech-dictionary.xml.bz2
- http://script.blau.in/dutch-dictionary.xml.bz2
- http://script.blau.in/english-dictionary.xml.bz2
- http://script.blau.in/esperanto-dictionary.xml.bz2
- http://script.blau.in/french-dictionary.xml.bz2
- http://script.blau.in/german-dictionary.xml.bz2
- http://script.blau.in/greek-dictionary.xml.bz2
- http://script.blau.in/hindi-dictionary.xml.bz2
- http://script.blau.in/icelandic-dictionary.xml.bz2
- http://script.blau.in/italian-dictionary.xml.bz2
- http://script.blau.in/kurdish-dictionary.xml.bz2
- http://script.blau.in/latin-dictionary.xml.bz2
- http://script.blau.in/latvian-dictionary.xml.bz2
- http://script.blau.in/norwegian-dictionary.xml.bz2
- http://script.blau.in/polish-dictionary.xml.bz2
- http://script.blau.in/portuguese-dictionary.xml.bz2
- http://script.blau.in/romanian-dictionary.xml.bz2
- http://script.blau.in/russian-dictionary.xml.bz2
- http://script.blau.in/slovak-dictionary.xml.bz2
- http://script.blau.in/spanish-dictionary.xml.bz2
- http://script.blau.in/swahili-dictionary.xml.bz2
- http://script.blau.in/swedish-dictionary.xml.bz2
- http://script.blau.in/tamil-dictionary.xml.bz2
- http://script.blau.in/vietnamese-dictionary.xml.bz2
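All of the paths above follow the same naming pattern, so an import wizard would only need the language name to find the right file. Here is a minimal sketch in Python of how that lookup could work. The names `LANGUAGES`, `BASE_URL` and `dictionary_url` are my own illustration and not part of simon.

```python
# Languages exactly as they appear in the list of paths above.
LANGUAGES = [
    "afrikaans", "catalan", "croatian", "czech", "dutch", "english",
    "esperanto", "french", "german", "greek", "hindi", "icelandic",
    "italian", "kurdish", "latin", "latvian", "norwegian", "polish",
    "portuguese", "romanian", "russian", "slovak", "spanish", "swahili",
    "swedish", "tamil", "vietnamese",
]

BASE_URL = "http://script.blau.in"


def dictionary_url(language):
    """Return the download path for one language (hypothetical helper)."""
    name = language.strip().lower()
    if name not in LANGUAGES:
        raise ValueError("no PLS dictionary listed for %r" % language)
    return "%s/%s-dictionary.xml.bz2" % (BASE_URL, name)


if __name__ == "__main__":
    # Print every path so the output can be compared with the list above.
    for lang in LANGUAGES:
        print(dictionary_url(lang))
```

A language without a dictionary (Welsh, for example) raises an error instead of producing a path that doesn't exist.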
You can see: 27 PLS dictionaries are available. All PLS dictionaries are GPLv3 (I got most word lists from OpenOffice.org spelling dictionaries – obviously, most of the word lists are GPL; some are not GPL – I didn’t use them). So there is no licensing problem. You can be sure that there is no copyright infringement because OpenOffice.org is a very good source. I didn’t use word lists without an explicit GPL license.
An example: Welsh is offered by eSpeak, but I didn’t find a GPL word list with Welsh words. So I didn’t build a dictionary for this language. I took a look into the license file (included in the Welsh spelling dictionary):
# This dictionary is covered by the GNU General Public License,
[...]
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
[...]
# 3. All modifications to the source code must be clearly marked as
# such. Binary redistributions based on modified source code
# must be clearly marked as modified versions in the documentation
# and/or other materials provided with the distribution.
#
# 4. The name of [...] may not be
# used to endorse or promote products derived from this software
# without specific prior written permission.
You can see that this license seems to be a mixed license: GPL plus additional conditions. I can’t work with such a license, so I couldn’t use this Welsh word list for the development of a Welsh PLS dictionary. As you can see, I check the license very carefully when building my dictionaries. I only use sources that clearly use the GPL without any modification.
I am using only one license for my dictionaries: GPL (almost all dictionaries are GPLv3, maybe a very old version of my German PLS dictionary is GPLv2). The concept is easy to understand: Only GPL. One license. No difficult dual-licensing or triple-licensing scheme. Voxforge collects speech under the GPL. My dictionaries are GPL. Easy, isn’t it?
simon should try to enter the market for these 27 languages. Why not offer automatic import for these 27 dictionaries? OK, eSpeak phonemes aren’t working right now. But that can be fixed later, either with the help of native speakers or by adjusting the simon import process to handle eSpeak phonemes.
So my wish is an automatic import function for these PLS dictionaries. This makes it easier for interested people to become involved.
simon should try to gain market share. At the moment, the user has to do two steps:
1. Download a dictionary from the internet (he has to know where to find such a dictionary, but most users don’t have a clue).
2. Import this dictionary into simon.
These two steps could be combined into one single step: automatic download & import via a wizard. 27 PLS dictionaries could be offered at the moment. I don’t have a bandwidth limit, so it would be OK to get the dictionary directly from script.blau.in.
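To show how little is needed for the combined step, here is a rough Python sketch of the download and unpack part. It is not simon code. The function names are made up, and I am assuming the files use the standard PLS layout: `lexeme` elements that contain `grapheme` and `phoneme` children. The sketch compares tag names without their XML namespace, so it doesn't matter whether a file declares the PLS namespace or not.

```python
import bz2
import urllib.request
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an optional '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def fetch_dictionary(url):
    """Download one compressed dictionary and return the raw .bz2 bytes."""
    with urllib.request.urlopen(url, timeout=60) as response:
        return response.read()


def parse_pls(compressed):
    """Decompress a .xml.bz2 PLS file and map each word to its phoneme strings."""
    root = ET.fromstring(bz2.decompress(compressed))
    entries = {}
    for element in root.iter():
        if local_name(element.tag) != "lexeme":
            continue
        graphemes = []
        phonemes = []
        for child in element:
            text = (child.text or "").strip()
            if not text:
                continue
            if local_name(child.tag) == "grapheme":
                graphemes.append(text)
            elif local_name(child.tag) == "phoneme":
                phonemes.append(text)
        for word in graphemes:
            entries.setdefault(word, []).extend(phonemes)
    return entries


def import_dictionary(url):
    """One-step 'download & import': fetch the file, then parse it."""
    return parse_pls(fetch_dictionary(url))
```

Calling `import_dictionary("http://script.blau.in/german-dictionary.xml.bz2")` would return a Python dictionary of words and their pronunciations. A real wizard inside simon would then have to hand these entries to simon's own import code, which I haven't looked at.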
Marketing is the big deficit of open source ASR projects. An automatic dictionary import directly from the internet for 27 languages would be good for marketing.



