Let’s take a look at Ralf's German dictionary:
<lexeme>
<grapheme>vereinbarten</grapheme>
<phoneme>fɛʀaɪ̯nbaʀtən</phoneme>
<phoneme>fɛʀaɪ̯nbaʀtn̩</phoneme>
<phoneme>fɛʀʔaɪ̯nbaʀtən</phoneme>
</lexeme>
The last phoneme element contains a Knacklaut (glottal stop, ʔ). When the dictionary is imported into simon, the Knacklaut is omitted. Instead, simon displays: f E R aI n b a R t @ n
Is this OK, or should the simon import process be adjusted? Here are a few thoughts:
1. Wiktionary transcribes the Knacklaut (U+0294):
Theater /[teˈʔaːtɐ]/, beantworten /[bəˈʔantvɔʁtn̩]/
2. Wikipedia gives the following information:
“the glottal stop occurs in many varieties of the standard language. In the Duden grammar, however, it is represented by a vertical bar [|], and elsewhere in the Duden by an apostrophe.”
The Knacklaut occurs often in German:
“In most varieties of German, a glottal stop appears in the following cases:
* Before a word-initial vowel, for example Acht [ˈʔaxt], der Alte [deːr ˈʔaltə].
* Before vowel-initial word stems in compound words, for example beachten [bəˈʔaxtən], Spiegelei [ˈʃpiːɡəlˌʔaɪ].”
3. I may prepare a Swiss German dictionary (the OpenOffice.org spelling dictionaries are a very good source for that). I will take a look at the de-CH dictionary (probably GPL). For Swiss German it might be necessary to filter out the Knacklaut:
“In Swiss Standard German, the glottal stop often does not occur.”
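Such a filter could be sketched in Python as follows. This is only an illustration: it assumes the lexicon layout shown at the top of this post (lexeme, grapheme and phoneme elements), and the function name strip_knacklaut is made up. Variants that become identical once the ʔ is removed are dropped.

```python
import xml.etree.ElementTree as ET

GLOTTAL_STOP = "\u0294"  # ʔ

def strip_knacklaut(lexeme):
    """Remove ʔ from every <phoneme> of a <lexeme>; drop duplicates that result."""
    seen = set()
    # Iterate over a copy of the list because elements may be removed.
    for phoneme in list(lexeme.findall("phoneme")):
        text = (phoneme.text or "").replace(GLOTTAL_STOP, "")
        if text in seen:
            lexeme.remove(phoneme)  # same pronunciation is already listed
        else:
            seen.add(text)
            phoneme.text = text
    return lexeme

# The lexeme from the top of this post.
SAMPLE = """<lexeme>
<grapheme>vereinbarten</grapheme>
<phoneme>fɛʀaɪ̯nbaʀtən</phoneme>
<phoneme>fɛʀaɪ̯nbaʀtn̩</phoneme>
<phoneme>fɛʀʔaɪ̯nbaʀtən</phoneme>
</lexeme>"""

lexeme = strip_knacklaut(ET.fromstring(SAMPLE))
variants = [p.text for p in lexeme.findall("phoneme")]
print(len(variants), "variants remain")
```

For the sample entry, the third variant collapses into the first one, so only the variants without Knacklaut are left.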
4. Is the Knacklaut a phoneme?
“In most phonological analyses of German, the glottal stop is not regarded as a phoneme in its own right but as a phonetic boundary marker before a vowel onset, because it does not occur in all varieties of Standard German.”
It is something in between.
5. eSpeak produces symbols for the Knacklaut. Timo’s espeak2Phones.pl script maps them to Q:
'_!' => 'Q', '_|' => 'Q'
Q serves as the ASCII symbol for the Knacklaut. I have implemented the same mapping in my XSLT stylesheet:
<xsl:variable name="sierra" select="replace($sierra, '_!', 'Q')"/>
<xsl:variable name="sierra" select="replace($sierra, '_\|', 'Q')"/>
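For readers without an XSLT 2.0 processor, the same two replacements can be sketched in Python. The function name espeak_glottal_to_q and the input string are made up for illustration; only the two eSpeak markers '_!' and '_|' and the target symbol Q come from the mappings above.

```python
import re

def espeak_glottal_to_q(phonemes):
    """Replace the eSpeak glottal-stop markers '_!' and '_|' with 'Q'."""
    # Inside a character class, '|' is a literal character, not alternation.
    return re.sub(r"_[!|]", "Q", phonemes)

# Made-up input, not real eSpeak output.
print(espeak_glottal_to_q("x_!y_|z"))  # xQyQz
```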
6. The Dictionary Acquisition Project displays the IPA symbol ʔ for the Knacklaut when you enter Q or ? (question-mark).
7. The proprietary Hadifix Bomp dictionary probably makes use of the Knacklaut as well:
'?a:l|
The question-mark indicates the Knacklaut.
Conclusion: I would say that the Knacklaut is a phone that simon should import. Q could be used as the SAMPA symbol for the Knacklaut. The result could look like this: f E R Q aI n b a R t @ n
Using ? (question mark) as the SAMPA symbol for the Knacklaut would probably cause problems with HTK.
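To make the proposal concrete, here is a minimal Python sketch of such a conversion. It is not simon's actual import code. The mapping table covers only the symbols of the example word and is read off the simon output quoted above, Q for ʔ is the proposal of this post, and the names ipa_to_sampa and keep_knacklaut are hypothetical.

```python
GLOTTAL_STOP = "\u0294"       # ʔ
NON_SYLLABIC_MARK = "\u032f"  # combining mark under the ɪ of the diphthong

# IPA -> SAMPA, limited to the symbols of "vereinbarten" (an assumption).
IPA_TO_SAMPA = {
    "aɪ": "aI",
    "f": "f", "ɛ": "E", "ʀ": "R", "n": "n",
    "b": "b", "a": "a", "t": "t", "ə": "@",
    GLOTTAL_STOP: "Q",
}

def ipa_to_sampa(ipa, keep_knacklaut=True):
    """Convert an IPA string to space-separated SAMPA symbols."""
    ipa = ipa.replace(NON_SYLLABIC_MARK, "")
    symbols = []
    i = 0
    while i < len(ipa):
        # Try two characters first so the diphthong wins over a single 'a'.
        for length in (2, 1):
            chunk = ipa[i:i + length]
            if chunk in IPA_TO_SAMPA:
                break
        else:
            raise ValueError(f"no SAMPA mapping for {ipa[i]!r}")
        if chunk != GLOTTAL_STOP or keep_knacklaut:
            symbols.append(IPA_TO_SAMPA[chunk])
        i += len(chunk)
    return " ".join(symbols)

word = "fɛʀʔaɪ̯nbaʀtən"
print(ipa_to_sampa(word))                        # f E R Q aI n b a R t @ n
print(ipa_to_sampa(word, keep_knacklaut=False))  # f E R aI n b a R t @ n
```

With keep_knacklaut=False the sketch reproduces what simon currently displays, so the two variants can be compared side by side.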
Tags: fɛʀʔaɪ̯nbaʀtən, U+0294, ʔ
This is very interesting. I hadn’t thought about it in quite some time, because the dictionary import and phoneme set decisions were made quite a while ago, and I didn’t put much thought into them back then, figuring I’d just change it later.
Maybe I should run some recognition tests to see whether the system improves when the “?” phoneme is retained…
Greetings,
Peter
Yes, some recognition tests would be fine.