What happens with French phonemes?

I want to import Ralf's French dictionary (version 0.1.1; November 03, 2009), and see what could be improved.

1. I found a small error. It says in the XML file:

<lexicon version="1.0" alphabet="ipa" xml:lang="de">

Well, the language tag shouldn’t be de. It should be fr.
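
A quick way to catch this kind of mistake before importing is to read the language tag programmatically. The following is only a minimal sketch using Python's standard library; the function name lexicon_language and the one-line sample string are made up for illustration and are not part of simon or of the dictionary.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# ElementTree exposes the predefined "xml:" prefix under this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def lexicon_language(pls_text: str) -> Optional[str]:
    """Return the xml:lang value of the root element, or None if it is absent."""
    root = ET.fromstring(pls_text)
    return root.get(XML_LANG)


# Hypothetical one-line sample modelled on the header quoted above.
sample = '<lexicon version="1.0" alphabet="ipa" xml:lang="de"></lexicon>'
lang = lexicon_language(sample)
if lang != "fr":
    print("unexpected language tag:", lang)
```

For a French dictionary, anything other than fr in that attribute would be worth reporting.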

2. What happens with the French phonemes when I import the dictionary? Here are a few examples with French phonemes:

2.a.

<lexeme>
<grapheme>dissolussions</grapheme>
<phoneme>disɔlysjɔ̃</phoneme>
</lexeme>

2.b.

<lexeme>
<grapheme>spiritains</grapheme>
<phoneme>spiʀitɛ̃</phoneme>
</lexeme>

2.c.

<lexeme>
<grapheme>transportable</grapheme>
<phoneme>tʀɑ̃spoʀtabl</phoneme>
</lexeme>

2.d.

<lexeme>
<grapheme>habituiez</grapheme>
<phoneme>abitɥie</phoneme>
</lexeme>

2.e.

<lexeme>
<grapheme>mademoiselle</grapheme>
<phoneme>madəmwazɛl</phoneme>
</lexeme>

I want to know what happens with these specific <phoneme> elements when I import Ralf's French dictionary into simon:

disɔlysjɔ̃
spiʀitɛ̃
tʀɑ̃spoʀtabl
abitɥie
madəmwazɛl

Here are the results:

2.a. disɔlysjɔ̃

dissolussions

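The SAMPA result is wrong: d i s O l y s j O n a s
The nasal vowel ɔ̃ apparently ends up as O followed by the stray letters n a s instead of a single phoneme. A future version of simon should import the French phoneme ɔ̃ correctly. This phoneme even occurs in German words of French origin, such as Balkon and Ballon.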

2.b. spiʀitɛ̃

spiritains

The SAMPA result is wrong, too: s p i R i t E n a s
The simon PLS import process should be adjusted so that the French phoneme ɛ̃ is represented as a single nasal vowel rather than as E n a s.

2.c. tʀɑ̃spoʀtabl

transportable

The SAMPA result is wrong: t R A n a s s p o R t a b l
Here the nasal vowel ɑ̃ has become A n a s. This is the third phoneme import error that should be corrected. Let’s take a look at the next French word:

2.d. abitɥie

habituiez

Is the SAMPA result abitHie acceptable? No, it isn’t: there should be a space between each phoneme. During the import, the French IPA phoneme ɥ was converted into the SAMPA phoneme H. H is the usual SAMPA symbol for ɥ, so the mapping itself looks right, but the spaces between the phonemes are missing. So this is the fourth error that should be fixed. Let’s take a look at the next word:

2.e. madəmwazɛl

mademoiselle

The SAMPA phonemes are correct: m a d @ m w a z E l. The w phoneme has been imported correctly by simon.

Conclusion: The simon import process should be adjusted so that all phonemes in the French IPA transcriptions disɔlysjɔ̃, spiʀitɛ̃, tʀɑ̃spoʀtabl, and abitɥie are converted correctly into SAMPA. As soon as these phoneme import errors are corrected, I will think about a video that demonstrates that simon recognises French words.
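
To make the expected behaviour concrete, here is a minimal sketch of the conversion in Python. It is not simon's actual import code. It assumes that a nasal vowel is written X-SAMPA style as the base vowel plus ~ (for example O~), which may differ from the symbols simon's acoustic model expects, and the mapping table is a hypothetical partial one that covers only the five example words above.

```python
import unicodedata

# Hypothetical, partial IPA -> SAMPA table: only the symbols used in the
# five example words. Characters not listed here are passed through as-is.
IPA_TO_SAMPA = {
    "\u0254": "O",  # ɔ
    "\u025b": "E",  # ɛ
    "\u0251": "A",  # ɑ
    "\u0259": "@",  # ə
    "\u0280": "R",  # ʀ
    "\u0265": "H",  # ɥ
}

COMBINING_TILDE = "\u0303"  # nasalisation mark that follows the vowel


def ipa_to_sampa(ipa: str) -> str:
    """Convert an IPA string to space-separated SAMPA phonemes.

    Assumption: nasalisation is written as "~" attached to the preceding
    vowel, so a vowel plus combining tilde stays a single phoneme.
    """
    tokens = []
    # NFD makes sure any precomposed nasal letter is split into
    # base letter + combining tilde before the loop looks at it.
    for ch in unicodedata.normalize("NFD", ipa):
        if ch == COMBINING_TILDE:
            if not tokens:
                raise ValueError("tilde without a preceding vowel")
            tokens[-1] += "~"
        else:
            tokens.append(IPA_TO_SAMPA.get(ch, ch))
    return " ".join(tokens)


for word in ["disɔlysjɔ̃", "spiʀitɛ̃", "tʀɑ̃spoʀtabl", "abitɥie", "madəmwazɛl"]:
    print(ipa_to_sampa(word))
# d i s O l y s j O~
# s p i R i t E~
# t R A~ s p o R t a b l
# a b i t H i e
# m a d @ m w a z E l
```

The key design choice is to attach the combining tilde to the vowel before it instead of treating it as a character of its own, and to join the phonemes with spaces only at the very end.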

I demonstrated in this 30 MB video that simon recognizes 200 German words (Ralf's German dictionary). It should be possible to get a similar result with Ralf's French dictionary. But first, the import errors need to be fixed.

One Response to “What happens with French phonemes?”

  1. Peter Grasch says:

    Hi!

    Please open a bug at the simon bug tracker on sourceforge (http://sourceforge.net/tracker/?group_id=190872&atid=935103) stating all the errors and your proposed changes in more detail (what should be transcribed as what).

    Thank you.

    Greetings,
    Peter