Ralf’s French dictionary 0.1.2 released

A few minutes ago, I uploaded Ralf's French dictionary version 0.1.2 (license: GPLv3). Download the dictionary and import it into simon as a PLS dictionary. I applied the following changes to the dictionary:

1. I added an empty role attribute to each <lexeme> element. At the moment, Ralf's French dictionary doesn’t contain any terminal information (noun, verb, adjective). It is possible to add terminal information to this dictionary with a simple text editor.

2. The previous version of Ralf's French dictionary contained about 60,000 duplicate <lexeme> elements. I removed them by grouping the <lexeme> elements on their grapheme with the following XSLT 2.0 instruction:

<xsl:for-each-group select="lexeme" group-by="grapheme">

You can find this line in the stylesheet improve-french-dictionary.xsl (license: GPLv3). I ran this stylesheet with Saxon-B in an Ubuntu terminal to generate version 0.1.2:

am3msi@am3msi-desktop:~/Documents/201004/french-dictionary$ saxonb-xslt -ext:on -s:french-dictionary.xml -xsl:improve-french-dictionary.xsl -o:french-dictionary-0.1.2.xml
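For readers without an XSLT 2.0 processor, the effect of grouping by grapheme can be sketched in Python. This is not the stylesheet I used, only a rough equivalent under some assumptions: the dictionary uses the standard PLS namespace, each <lexeme> is a direct child of the root element, and only the first <lexeme> for each grapheme is kept. The function name drop_duplicate_lexemes and the SAMPLE data are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: the dictionary uses the standard W3C PLS namespace.
PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"

def drop_duplicate_lexemes(root):
    """Keep the first <lexeme> per grapheme; return how many were removed."""
    seen = set()
    removed = 0
    # Iterate over a copy of the list because we remove children while looping.
    for lexeme in list(root.findall(f"{{{PLS_NS}}}lexeme")):
        # Text of the first <grapheme> child (None if the element is missing).
        grapheme = lexeme.findtext(f"{{{PLS_NS}}}grapheme")
        if grapheme in seen:
            root.remove(lexeme)
            removed += 1
        else:
            seen.add(grapheme)
    return removed

# Tiny hypothetical sample, not taken from the real dictionary.
SAMPLE = """<lexicon xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
         version="1.0" alphabet="ipa" xml:lang="fr">
  <lexeme role=""><grapheme>chat</grapheme><phoneme>ʃa</phoneme></lexeme>
  <lexeme role=""><grapheme>chien</grapheme><phoneme>ʃjɛ̃</phoneme></lexeme>
  <lexeme role=""><grapheme>chat</grapheme><phoneme>ʃa</phoneme></lexeme>
</lexicon>"""

if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    print(drop_duplicate_lexemes(root), "duplicate lexeme(s) removed")
```

Note that this sketch treats two entries as duplicates whenever their graphemes match, even if their phonemes differ, so homographs with different pronunciations would need a key that also includes the phoneme.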

3. The language tag is now correct: xml:lang="fr"

Unfortunately, the French phonemes ɔ̃, ɛ̃, ɑ̃ and ɥ aren’t yet transcribed into the correct SAMPA phonemes during the simon import process. This shouldn’t be hard to fix. Once it has been fixed, Ralf's French dictionary should be usable for training French words. Remember: the dictionary contains more than 300,000 French <lexeme> elements. It would be nice if a native speaker from France took a closer look at Ralf's French dictionary and suggested improvements.
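To illustrate the kind of fix that is needed, here is a minimal Python sketch that rewrites just these four IPA symbols. It is not simon's import code. The target symbols follow the usual French SAMPA conventions (o~, e~, a~ and H), and whether simon expects exactly these strings is an assumption on my part. The names SPECIAL_PHONEMES and transcribe_special_phonemes are hypothetical.

```python
# Assumed mapping from IPA to French SAMPA for the four problem phonemes.
# The nasal vowels are a base vowel followed by U+0303 (combining tilde).
SPECIAL_PHONEMES = {
    "\u0254\u0303": "o~",  # nasal open o, as in "bon"
    "\u025b\u0303": "e~",  # nasal open e, as in "vin"
    "\u0251\u0303": "a~",  # nasal a, as in "blanc"
    "\u0265": "H",         # labial-palatal glide, as in "huit"
}

def transcribe_special_phonemes(ipa):
    """Replace the four IPA symbols above; leave every other character as-is."""
    for symbol, sampa in SPECIAL_PHONEMES.items():
        ipa = ipa.replace(symbol, sampa)
    return ipa

if __name__ == "__main__":
    # "b" + nasal open o: only the nasal vowel is rewritten.
    print(transcribe_special_phonemes("b\u0254\u0303"))
```

A complete importer would also have to map every other IPA symbol; this sketch passes them through unchanged.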
