This is what I am currently doing: I checked out revision 1040.
I want to import the 01 prompts into simon. First, I had to convert the 40 FLAC files into WAV files (16 kHz, signed, mono) with the following command:
liberty@liberty-desktop:~/200910/editing-ralfherzog/01$ for f in *.flac; do sox "$f" -t wav -r 16000 -s -c 1 "subfolder/${f%.flac}.wav"; done
Then I transformed the file http://script.blau.in/german/01/prompts.xml into an (almost) HTK-compatible format with the following command:
liberty@liberty-desktop:~/200910/editing-ralfherzog/01$ saxonb-xslt -ext:on -o:PROMPTS01 -xsl:transform-ssml-prompts.xsl -s:prompts.xml
The stylesheet transform-ssml-prompts.xsl has the following content:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- 20091013; license: GPL -->
  <xsl:output method="text"/>
  <xsl:template match="speak">
    <xsl:for-each select="audio">
      <xsl:value-of select="replace(@src, 'flac','wav')"/>
      <xsl:text> </xsl:text>
      <xsl:value-of select="."/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
I think I found an error: I forgot to capitalize the prompts with the XPath function upper-case(). I will have to correct the stylesheet, presumably by applying upper-case() to the transcription text (the second value-of). That is probably the reason why the Import Trainingsdata function didn’t work.
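As a stopgap until the stylesheet is corrected, the transcriptions in the already generated file could also be upper-cased after the fact. This is only a minimal sketch, assuming every line of PROMPTS01 has the form "file name, space, transcription". The function name uppercase_prompts is made up for illustration. Depending on the awk implementation, toupper() may only handle ASCII and leave umlauts in lower case, which upper-case() in the stylesheet would not.

```shell
# Hypothetical helper: reads prompt lines on stdin and writes them to stdout
# with the first field (the file name) unchanged and all other words upper-cased.
uppercase_prompts() {
  awk '{
    out = $1                      # keep the file name as it is
    for (i = 2; i <= NF; i++)     # upper-case every word of the transcription
      out = out " " toupper($i)
    print out
  }'
}
```

It could be called as uppercase_prompts < PROMPTS01 > PROMPTS01.upper, where PROMPTS01.upper is just an example name for the output file. Runs of whitespace inside a transcription are collapsed to single spaces.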


