Posts Tagged ‘Latvian’

Ralf’s Latvian speech model

Wednesday, May 16th, 2012

A few notes on creating Ralf’s Latvian speech model:

1. Get Ralf’s Latvian dictionary.
2. Create a Latvian scenario.
3. Import the dictionary into simon as a shadow dictionary.

4. Now I want to train 10 Latvian words. Press the Train selected words button.

5. simon now asks:

Your vocabulary does not define all words used in this text. These words are missing:
jāpārkurc, Lauciene, laukumains, olimpiete, olimpiāde, piekliegt, satriecās, uzpletām, žūpība, žūžas

Do you want to add them now?

Press the Yes button.
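
The check behind this dialog can be pictured as a comparison between the words selected for training and the active vocabulary. The following is a minimal Python sketch of that idea, not simon's actual implementation; the function name `missing_words` and the sample data are made up for illustration.

```python
def missing_words(training_words, vocabulary):
    """Return the training words that the vocabulary does not define.

    Hypothetical helper for illustration; simon's real check may differ.
    Order of first appearance is kept and duplicates are reported once.
    """
    known = set(vocabulary)
    seen = set()
    missing = []
    for word in training_words:
        if word not in known and word not in seen:
            seen.add(word)
            missing.append(word)
    return missing


# Sample data: an empty active vocabulary, as in a freshly created scenario.
active_vocabulary = []
selected = ["Lauciene", "laukumains", "olimpiete", "piekliegt"]
print(missing_words(selected, active_vocabulary))
```

With an empty active vocabulary, every selected word is reported as missing, which matches the situation in the dialog above.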


Ralf’s Latvian dictionary 0.1.1

Monday, May 17th, 2010

How I improved Ralf's Latvian dictionary:

1. Take a look at the Latvian pronunciation.

2. The language code is lv. Edit espeak2ipa.xsl; the section relevant to Latvian begins with matches(/lexicon/@xml:lang, 'lv').

3. Convert eSpeak phonemes into IPA phonemes:

$ bunzip2 -c '/media/5f6432a3-9a68-45ee-b4b7-11f3b009825a/home/am3msi/Documents/200911/latvian/latvian-dictionary.xml.bz2' | saxonb-xslt -ext:on -s:- -xsl:'/home/ubuntu/Documents/201005/dict-phonemes-espeak2ipa/espeak2ipa.xsl'

4. Download Ralf's Latvian dictionary, and import it into simon. Take a look at the imported PLS dictionary:

The word column lists 154740 Latvian words. The pronunciation column contains the corresponding SAMPA transcriptions.

5. Is there a native speaker who wants to improve Ralf's Latvian dictionary? Go ahead: the dictionary is licensed under GPLv3.
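
To inspect a PLS dictionary outside simon, for example before editing it, the word and pronunciation pairs can be read with Python's standard library. This is a minimal sketch under stated assumptions: the two-entry sample is made up, its phoneme strings are unchecked placeholders, and the `alphabet` value is assumed rather than taken from the real file. Only the namespace and the element names (lexicon, lexeme, grapheme, phoneme) come from the PLS 1.0 specification.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C Pronunciation Lexicon Specification 1.0.
PLS_NS = {"pls": "http://www.w3.org/2005/01/pronunciation-lexicon"}

# Made-up sample; the phoneme strings are placeholders, not checked
# pronunciations, and alphabet="ipa" is an assumption about the file.
SAMPLE = """
<lexicon version="1.0" xml:lang="lv" alphabet="ipa"
         xmlns="http://www.w3.org/2005/01/pronunciation-lexicon">
  <lexeme><grapheme>olimpiete</grapheme><phoneme>placeholder-1</phoneme></lexeme>
  <lexeme><grapheme>piekliegt</grapheme><phoneme>placeholder-2</phoneme></lexeme>
</lexicon>
""".strip()


def read_lexemes(pls_text):
    """Return (grapheme, phoneme) pairs from a PLS document given as text."""
    root = ET.fromstring(pls_text)
    pairs = []
    for lexeme in root.findall("pls:lexeme", PLS_NS):
        grapheme = lexeme.findtext("pls:grapheme", default="", namespaces=PLS_NS)
        phoneme = lexeme.findtext("pls:phoneme", default="", namespaces=PLS_NS)
        pairs.append((grapheme, phoneme))
    return pairs


pairs = read_lexemes(SAMPLE)
print(len(pairs), "lexemes")
for word, pronunciation in pairs:
    print(word, pronunciation)
```

For the compressed dictionary file, `bz2.open(path)` from the standard library returns a file object that `ET.parse` accepts directly.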

Import 150,000 Latvian words

Monday, November 9th, 2009

You can now import Ralf's Latvian dictionary (version 0.1; GPLv3) into simon. Unfortunately, training with this dictionary is currently almost impossible: the phoneme elements contain eSpeak ASCII characters, not IPA characters.
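
To make the difference concrete: eSpeak writes phonemes as ASCII mnemonics, where for example `S` stands for IPA ʃ and `:` marks a long sound. A conversion can be sketched in Python as a greedy longest-match lookup. The table below is a small illustrative subset of general eSpeak mnemonics, not the mapping used in espeak2ipa.xsl and not specific to Latvian; the function name and the sample input are made up.

```python
# Illustrative subset of eSpeak mnemonics and their IPA counterparts.
# This is NOT the mapping table from espeak2ipa.xsl.
ESPEAK_TO_IPA = {
    "tS": "tʃ",
    "dZ": "dʒ",
    "S": "ʃ",
    "Z": "ʒ",
    "N": "ŋ",
    ":": "ː",   # length mark
    "'": "ˈ",   # primary stress
    ",": "ˌ",   # secondary stress
}


def espeak_to_ipa(mnemonics):
    """Convert by greedy longest match; unknown characters pass through."""
    keys = sorted(ESPEAK_TO_IPA, key=len, reverse=True)
    out = []
    i = 0
    while i < len(mnemonics):
        for key in keys:
            if mnemonics.startswith(key, i):
                out.append(ESPEAK_TO_IPA[key])
                i += len(key)
                break
        else:
            # No table entry starts here: keep the character unchanged.
            out.append(mnemonics[i])
            i += 1
    return "".join(out)


# Made-up input string, not a real dictionary entry.
print(espeak_to_ipa("'Su:"))
```

Trying the longest keys first ensures that a two-character mnemonic such as `tS` is matched as a unit before the single-character entry for `S` is considered.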