<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>testing simon &#187; sam</title>
	<atom:link href="http://spirit.blau.in/simon/tag/sam/feed/" rel="self" type="application/rss+xml" />
	<link>http://spirit.blau.in/simon</link>
	<description>my first steps with the simon speech recognition software</description>
	<lastBuildDate>Tue, 10 Jan 2012 14:59:26 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Export test result with sam</title>
		<link>http://spirit.blau.in/simon/2011/08/01/export-test-result-with-sam/</link>
		<comments>http://spirit.blau.in/simon/2011/08/01/export-test-result-with-sam/#comments</comments>
		<pubDate>Mon, 01 Aug 2011 07:31:30 +0000</pubDate>
		<dc:creator>producer</dc:creator>
				<category><![CDATA[Ubuntu]]></category>
		<category><![CDATA[sam]]></category>

		<guid isPermaLink="false">http://spirit.blau.in/simon/?p=5600</guid>
		<description><![CDATA[Yesterday, I installed Qwt 6, and then built simon 0.3.60. It was difficult, but in the end it worked out fine. And look, sam now offers an Export test result button (top right of the screenshot): I want to export the following information: Filename, Expected result, Actual result, Recognition rate (below 50%). The resulting [...]]]></description>
			<content:encoded><![CDATA[<p>Yesterday, I <a href="http://spirit.blau.in/simon/2011/07/31/remove-the-package-simon/#comment-392">installed Qwt 6</a>, and then built <strong>simon 0.3.60</strong>. It was difficult, but in the end it worked out fine. And look, sam now offers an <code>Export test result</code> button (top right of the screenshot):</p>
<p><a href="http://spirit.blau.in/simon/files/2011/08/export-test-result.jpg"><img src="http://spirit.blau.in/simon/files/2011/08/export-test-result-300x188.jpg" alt="" title="export-test-result" width="300" height="188" class="alignleft size-medium wp-image-5601" /></a>I want to export the following information: Filename, Expected result, Actual result, Recognition rate (below 50%). The resulting document should be a simple text file (or XML file or whatever). Is this possible with the current <code>Export test result</code> function of simon?</p>
<div style="clear:both"></div>
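<p>To make clearer what kind of export I have in mind, here is a minimal Python sketch. It is not part of sam or simon: the field names (<code>filename</code>, <code>expected</code>, <code>actual</code>, <code>rate</code>), the function name <code>export_low_rates</code> and the sample rows are hypothetical, and I simply assume that the test results are available as a list of records.</p>

```python
import csv
import io

# Hypothetical test results; sam's real internal format is not known here.
SAMPLE_RESULTS = [
    {"filename": "a.wav", "expected": "DIN", "actual": "DIE", "rate": 11.9569},
    {"filename": "b.wav", "expected": "DIVERGIERST", "actual": "DIVERGIERST", "rate": 100.0},
]

FIELDS = ["filename", "expected", "actual", "rate"]


def export_low_rates(results, out, threshold=50.0):
    """Write all results whose rate is below threshold as CSV to `out`.

    Returns the number of rows written (not counting the header).
    """
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    written = 0
    for row in results:
        if threshold > row["rate"]:
            writer.writerow({name: row[name] for name in FIELDS})
            written += 1
    return written


if __name__ == "__main__":
    buffer = io.StringIO()
    count = export_low_rates(SAMPLE_RESULTS, buffer)
    print(count, "row(s) exported")
    print(buffer.getvalue())
```

<p>A plain CSV file like this would already be enough for me, because it can be opened in any text editor or spreadsheet.</p>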
]]></content:encoded>
			<wfw:commentRss>http://spirit.blau.in/simon/2011/08/01/export-test-result-with-sam/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>sam: sorting should be improved</title>
		<link>http://spirit.blau.in/simon/2010/08/29/sam-sorting-should-be-improved/</link>
		<comments>http://spirit.blau.in/simon/2010/08/29/sam-sorting-should-be-improved/#comments</comments>
		<pubDate>Sun, 29 Aug 2010 08:14:14 +0000</pubDate>
		<dc:creator>producer</dc:creator>
				<category><![CDATA[Ubuntu]]></category>
		<category><![CDATA[sam]]></category>

		<guid isPermaLink="false">http://spirit.blau.in/simon/?p=4986</guid>
		<description><![CDATA[A few minutes ago, I tested section ‘xea’ against all. The overall recognition rate was about 95%. Take a closer look at the right column: The word DIVERGIERST has a recognition rate of 100%. The word DIN has a recognition rate of 11.9569%. You can see that the sorting is not correct. Btw, it would [...]]]></description>
			<content:encoded><![CDATA[<p>A few minutes ago, I <a href="http://spirit.blau.in/simon/2010/08/29/sam-test-section-xea-against-all/">tested section ‘xea’ against all</a>. The overall recognition rate was about 95%.</p>
<p><a href="http://spirit.blau.in/simon/files/2010/08/recognition-rate.png"><img src="http://spirit.blau.in/simon/files/2010/08/recognition-rate-300x271.png" alt="recognition-rate" width="300" height="271" class="alignleft size-medium wp-image-4987" /></a>Take a closer look at the right column: The word <code>DIVERGIERST</code> has a recognition rate of 100%. The word <code>DIN</code> has a recognition rate of 11.9569%. You can see that the <strong>sorting is not correct</strong>.</p>
<div style="clear:both"></div>
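<p>One possible explanation (this is only my assumption, I have not checked the sam source code) is that the column is sorted as text and not as numbers. The following Python sketch shows the difference; the values <code>95.5</code> and <code>9.5</code> are made up for illustration.</p>

```python
# Recognition rates as they might be stored in the table, as text.
rates_as_text = ["100", "11.9569", "95.5", "9.5"]

# Text is compared character by character, so "100" sorts before "11.9569".
text_order = sorted(rates_as_text)

# Converting to float first gives the order a user would expect.
numeric_order = sorted(rates_as_text, key=float)

print(text_order)
print(numeric_order)
```

<p>If sam really sorts the column as text, sorting by the numeric value instead would put <code>DIN</code> (11.9569%) before <code>DIVERGIERST</code> (100%).</p>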
<p>Btw, it would be interesting to export the recognition results into a text file. What about adding an <code>Export test results</code> button? </p>
]]></content:encoded>
			<wfw:commentRss>http://spirit.blau.in/simon/2010/08/29/sam-sorting-should-be-improved/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>sam: test section &#8216;xea&#8217; against all</title>
		<link>http://spirit.blau.in/simon/2010/08/29/sam-test-section-xea-against-all/</link>
		<comments>http://spirit.blau.in/simon/2010/08/29/sam-test-section-xea-against-all/#comments</comments>
		<pubDate>Sun, 29 Aug 2010 07:16:35 +0000</pubDate>
		<dc:creator>producer</dc:creator>
				<category><![CDATA[Ubuntu]]></category>
		<category><![CDATA[sam]]></category>

		<guid isPermaLink="false">http://spirit.blau.in/simon/?p=4973</guid>
		<description><![CDATA[At the moment, I am testing with sam the speech model xea (contains about 800 German words) against Ralf&#8217;s German speech model 0.1.6 (contains about 25000 German words). The file:///home/ubuntu/Documents/201008/sam-0.1.6/test-xea-against-all.sam has the following content: /home/ubuntu/.kde/share/apps/simon/model/hmmdefs /home/ubuntu/.kde/share/apps/simon/model/tiedlist /home/ubuntu/.kde/share/apps/simon/model/model.dict /home/ubuntu/.kde/share/apps/simon/model/model.dfa /home/ubuntu/Documents/201008/wav-all /home/ubuntu/Documents/201006/audacity/xea-folder/wav-xea /home/ubuntu/Documents/201008/sam-0.1.6/lexicon /home/ubuntu/Documents/201008/sam-0.1.6/model.grammar /home/ubuntu/Documents/201008/sam-0.1.6/simple.voca /home/ubuntu/Documents/201008/prompts-wav.txt /home/ubuntu/Documents/201006/audacity/xea-folder/prompts-xea /usr/share/kde4/apps/simon/model/tree1.hed /usr/share/kde4/apps/simon/model/wav_config 16000 /usr/share/kde4/apps/simond/default.jconf 2 Question: what does the 2 [...]]]></description>
			<content:encoded><![CDATA[<p>At the moment, I am testing with sam the <a href="http://spirit.blau.in/simon/tag/xea/">speech model xea</a> (contains about 800 German words) against <a href="http://spirit.blau.in/simon/2010/08/28/ralfs-german-speech-model-0-1-6/">Ralf&#8217;s German speech model 0.1.6</a> (contains about 25000 German words). The <code>file:///home/ubuntu/Documents/201008/sam-0.1.6/<strong>test-xea-against-all.sam</strong></code> has the following content:</p>
<blockquote><p><code>/home/ubuntu/.kde/share/apps/simon/model/hmmdefs<br />
/home/ubuntu/.kde/share/apps/simon/model/tiedlist<br />
/home/ubuntu/.kde/share/apps/simon/model/model.dict<br />
/home/ubuntu/.kde/share/apps/simon/model/model.dfa<br />
/home/ubuntu/Documents/201008/wav-all<br />
/home/ubuntu/Documents/201006/audacity/xea-folder/wav-xea<br />
/home/ubuntu/Documents/201008/sam-0.1.6/lexicon<br />
/home/ubuntu/Documents/201008/sam-0.1.6/model.grammar<br />
/home/ubuntu/Documents/201008/sam-0.1.6/simple.voca<br />
/home/ubuntu/Documents/201008/prompts-wav.txt<br />
/home/ubuntu/Documents/201006/audacity/xea-folder/prompts-xea<br />
/usr/share/kde4/apps/simon/model/tree1.hed<br />
/usr/share/kde4/apps/simon/model/wav_config<br />
16000<br />
/usr/share/kde4/apps/simond/default.jconf<br />
2</code></p></blockquote>
<p>Question: what does the <code>2</code> in the last line mean?</p>
<p>By the way, I used the <code>Serialize scenarios</code> button (input <code>file:///home/ubuntu/Documents/201007/german-speech-model-0.1.6/<strong>scenario-0.1.6.xml</strong></code>) to generate the following files:<br />
<code>file:///home/ubuntu/Documents/201008/sam-0.1.6/lexicon<br />
file:///home/ubuntu/Documents/201008/sam-0.1.6/model.grammar<br />
file:///home/ubuntu/Documents/201008/sam-0.1.6/simple.voca</code></p>
<p>I think that it is a good idea to test a subset (e.g. <a href="http://script.blau.in/german-ipa/xea.xml">xea</a>) against the superset (xaa+xab+xac+&#8230;+xpw).</p>
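<p>Before running such a test, it might be useful to check that the subset really is contained in the superset. Here is a rough Python sketch, assuming that each line of a prompts file consists of a sample name followed by the spoken words. The function names and the sample lines are hypothetical.</p>

```python
def words_in_prompts(lines):
    """Collect the words from prompts lines of the assumed form 'sample WORD ...'."""
    words = set()
    for line in lines:
        parts = line.split()
        # parts[0] is taken to be the sample name; the rest are the spoken words.
        words.update(parts[1:])
    return words


def missing_from_superset(subset_lines, superset_lines):
    """Return the words of the subset that the superset does not contain, sorted."""
    return sorted(words_in_prompts(subset_lines) - words_in_prompts(superset_lines))


if __name__ == "__main__":
    # Hypothetical prompts content, not taken from the real files.
    subset = ["xea_001 DIN", "xea_002 DIVERGIERST"]
    superset = ["all_001 DIN", "all_002 DIVERGIERST", "all_003 WORTBREITE"]
    print(missing_from_superset(subset, superset))
```

<p>An empty list means that every word of the subset also occurs in the superset.</p>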
]]></content:encoded>
			<wfw:commentRss>http://spirit.blau.in/simon/2010/08/29/sam-test-section-xea-against-all/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>sam handbook</title>
		<link>http://spirit.blau.in/simon/2010/08/12/sam-handbook/</link>
		<comments>http://spirit.blau.in/simon/2010/08/12/sam-handbook/#comments</comments>
		<pubDate>Thu, 12 Aug 2010 15:03:34 +0000</pubDate>
		<dc:creator>producer</dc:creator>
				<category><![CDATA[Ubuntu]]></category>
		<category><![CDATA[sam]]></category>

		<guid isPermaLink="false">http://spirit.blau.in/simon/?p=4767</guid>
		<description><![CDATA[During the last couple of minutes, I took a look at the sam handbook: &#8220;You can sort the files by each column simply by clicking on the column header. This way it is very easy to find bad samples by sorting by recognition rate.&#8221; This function worked partially on my computer (when using a previous [...]]]></description>
			<content:encoded><![CDATA[<p>During the last couple of minutes, I took a look at the <a href="http://speech2text.git.sourceforge.net/git/gitweb.cgi?p=speech2text/speech2text;a=commitdiff;h=efc327bf0973069f1216a17b7436f6b4e4e7af70">sam handbook</a>: </p>
<blockquote><p>&#8220;You can sort the files by each column simply by clicking on the column header. This way it is very easy to find bad samples by sorting by recognition rate.&#8221;</p></blockquote>
<p>This function worked only partially on my computer (when using a previous development version of sam): some words were sorted from the lowest recognition rate to the highest, but not all. I hope that this issue will be fixed. </p>
<p>What about a function that displays all words that are below a specific recognition rate (e.g. <em>&#8220;display all words that are below 90% recognition rate&#8221;</em>)? I need a tool that helps me to sort out the bad <a href="http://script.blau.in/german-ipa/xaa.xml">audio files</a> as efficiently as possible. I hope that a future version of sam will do the job.</p>
]]></content:encoded>
			<wfw:commentRss>http://spirit.blau.in/simon/2010/08/12/sam-handbook/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>HMM-definitions, Tiedlist, Dict, DFA</title>
		<link>http://spirit.blau.in/simon/2010/04/24/hmm-definitions-tiedlist-dict-dfa/</link>
		<comments>http://spirit.blau.in/simon/2010/04/24/hmm-definitions-tiedlist-dict-dfa/#comments</comments>
		<pubDate>Fri, 23 Apr 2010 22:30:22 +0000</pubDate>
		<dc:creator>producer</dc:creator>
				<category><![CDATA[General]]></category>
		<category><![CDATA[sam]]></category>

		<guid isPermaLink="false">http://spirit.blau.in/simon/?p=3109</guid>
		<description><![CDATA[When I open Applications &#62; Universal Access &#62; sam &#62; Input &#38; output files, I see the following four options in the Output files area: (a) HMM-definitions (b) Tiedlist (c) Dict (d) DFA When I am looking into simon, I can see the following options: (a) Hmm Definition (b) Tiedlist (c) Macros (d) Stats I [...]]]></description>
			<content:encoded><![CDATA[<p>When I open <code>Applications &gt; Universal Access &gt; sam &gt; Input &amp; output files</code>, I see the following four options in the <code>Output files</code> area:</p>
<p>(a) <code>HMM-definitions</code><br />
(b) <code>Tiedlist</code><br />
(c) <code>Dict</code><br />
(d) <code>DFA</code></p>
<p>When I am looking into simon, I can see the following options:</p>
<p><a href="http://spirit.blau.in/simon/files/2010/04/static-model.png"><img class="alignnone size-medium wp-image-2841" src="http://spirit.blau.in/simon/files/2010/04/static-model-300x139.png" alt="static-model" width="300" height="139" /></a></p>
<p>(a) <code>Hmm Definition</code><br />
(b) <code>Tiedlist</code><br />
(c) <code>Macros</code><br />
(d) <code>Stats</code></p>
<p>I find it confusing:<br />
- (a) and (b) are obviously the same: sam can output the (a) <code>Hmm Definition</code>; simon can use the (a) <code>Hmm Definition</code> as input. sam can output (b) <code>Tiedlist</code>; simon can use the (b) <code>Tiedlist</code> that sam produced as input. So far so good.<br />
- But what about (c) and (d)? <strong>Is (c) Dict the same as (c) Macros? Is (d) DFA the same as (d) Stats? </strong></p>
<p>Why does <a href="http://simon-listens.blogspot.com/2009/08/sam.html">sam</a> produce (c) <code>Dict</code> and (d) <code>DFA</code> as output files? <strong>Is it possible to use these sam output files as simon input files?</strong></p>
<p>Some clarification would be helpful.</p>
]]></content:encoded>
			<wfw:commentRss>http://spirit.blau.in/simon/2010/04/24/hmm-definitions-tiedlist-dict-dfa/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>git pull origin master; Serialize scenarios</title>
		<link>http://spirit.blau.in/simon/2010/02/16/git-pull-origin-master-serialize-scenarios/</link>
		<comments>http://spirit.blau.in/simon/2010/02/16/git-pull-origin-master-serialize-scenarios/#comments</comments>
		<pubDate>Tue, 16 Feb 2010 17:40:58 +0000</pubDate>
		<dc:creator>producer</dc:creator>
				<category><![CDATA[Ubuntu]]></category>
		<category><![CDATA[build_ubuntu.sh]]></category>
		<category><![CDATA[node10]]></category>
		<category><![CDATA[sam]]></category>

		<guid isPermaLink="false">http://spirit.blau.in/simon/?p=2477</guid>
		<description><![CDATA[A few minutes ago, I built the simon development version: $ cd Documents/201001/speech2text $ git pull origin master $ ./build_ubuntu.sh It is working. I would like to know how sam &#62; Input &#38; output files &#62; Serialize scenarios &#124; Serialize prompts works. Is there a tutorial available?]]></description>
			<content:encoded><![CDATA[<p>A few minutes ago, I built the simon development version:</p>
<p><code>$ cd Documents/201001/speech2text<br />
$ git pull origin master<br />
$ ./build_ubuntu.sh</code></p>
<p>It is working. </p>
<p>I would like to know how <code>sam</code> &gt; <code>Input &amp; output files</code> &gt; <code>Serialize scenarios</code> | <code>Serialize prompts</code> works. Is there a tutorial available? </p>
]]></content:encoded>
			<wfw:commentRss>http://spirit.blau.in/simon/2010/02/16/git-pull-origin-master-serialize-scenarios/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>How can I export a sam speech model?</title>
		<link>http://spirit.blau.in/simon/2010/02/03/how-can-i-export-a-sam-speech-model/</link>
		<comments>http://spirit.blau.in/simon/2010/02/03/how-can-i-export-a-sam-speech-model/#comments</comments>
		<pubDate>Wed, 03 Feb 2010 14:11:33 +0000</pubDate>
		<dc:creator>producer</dc:creator>
				<category><![CDATA[Ubuntu]]></category>
		<category><![CDATA[node12]]></category>
		<category><![CDATA[sam]]></category>

		<guid isPermaLink="false">http://spirit.blau.in/simon/?p=2446</guid>
		<description><![CDATA[I tried sam again after I had problems a couple of days ago (I used the same paths; I just had to fix the paths to the test files). I used the Build model and the Test model button in conjunction with my German backup files (about 200 German words can be recognized with these [...]]]></description>
			<content:encoded><![CDATA[<p>I tried sam again after <a href="http://spirit.blau.in/simon/2010/01/26/sam-couldnt-open-prompts-file/">I had problems a couple of days ago</a> (I used the same paths; I just had to fix the paths to the test files). I used the <code>Build model</code> and the <code>Test model</code> button in conjunction with my German backup files (about 200 German words can be recognized with these files when everything is configured correctly). It worked.</p>
<p>My question is: how can I import the model that I have built/tested with sam into simon?</p>
<p>simon offers to <code>Manage scenarios</code>: <code>Import</code> and <a href="http://spirit.blau.in/simon/2010/02/03/clear-button-improve-phoneme/#comment-263"><code>Export</code></a> are offered. I tried the <code>Export</code> button. This created an XML file:</p>
<blockquote><p>&lt;!DOCTYPE scenario&gt;<br />
&lt;scenario version=&quot;1&quot; icon=&quot;simon&quot; name=&quot;General&quot; lastModified=&quot;2010-02-03T14:06:50&quot;&gt;<br />
&lt;simonCompatibility&gt;<br />
&lt;minimumVersion&gt;<br />
&lt;version&gt;0.2.82&lt;/version&gt;<br />
&lt;/minimumVersion&gt;<br />
&lt;maximumVersion/&gt;<br />
&lt;/simonCompatibility&gt;<br />
&lt;authors&gt;<br />
&lt;author&gt;<br />
&lt;name&gt;Anybody&lt;/name&gt;<br />
&lt;contact&gt;no@mail&lt;/contact&gt;<br />
&lt;/author&gt;<br />
&lt;/authors&gt;<br />
&lt;licence&gt;GPL&lt;/licence&gt;<br />
&lt;vocabulary/&gt;<br />
&lt;grammar&gt;<br />
&lt;structure&gt;Unknown&lt;/structure&gt;<br />
&lt;/grammar&gt;<br />
&lt;actions/&gt;<br />
&lt;trainingtexts/&gt;<br />
&lt;/scenario&gt;</p></blockquote>
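<p>What I notice in this export is that several sections are empty. A small Python sketch can list them. It rebuilds a cut-down copy of the scenario with the standard <code>xml.etree.ElementTree</code> module; the function name <code>empty_sections</code> is hypothetical and has nothing to do with simon itself.</p>

```python
import xml.etree.ElementTree as ET


def empty_sections(root):
    """Return the names of direct child elements that have no children and no text."""
    return [
        child.tag
        for child in root
        if len(child) == 0 and not (child.text or "").strip()
    ]


if __name__ == "__main__":
    # Rebuild a cut-down version of the exported scenario shown above.
    scenario = ET.Element("scenario", name="General")
    ET.SubElement(scenario, "licence").text = "GPL"
    ET.SubElement(scenario, "vocabulary")
    grammar = ET.SubElement(scenario, "grammar")
    ET.SubElement(grammar, "structure").text = "Unknown"
    ET.SubElement(scenario, "actions")
    ET.SubElement(scenario, "trainingtexts")
    print(empty_sections(scenario))
```

<p>For this cut-down copy, the sketch reports <code>vocabulary</code>, <code>actions</code> and <code>trainingtexts</code> as empty, so the exported scenario apparently contains no words and no training texts.</p>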
<p>I would like to be able to <strong>import my sam speech model into simon</strong>. How can I do this?</p>
<p>What can I import into simon?<br />
- I can press the <code>Import Dictionary</code> button to import an active dictionary and/or a shadow dictionary.<br />
- I can switch to the <code>Grammar</code> tab and press the <code>Import</code> button. I haven&#8217;t tried this function yet. I am not too interested because currently I don&#8217;t need a grammar function (200 German words could be recognized without a grammar &#8211; only the terminal <code>Unknown</code> was used; 1000 German words should be possible without grammar, I hope). Of course, if I want to <a href="http://script.blau.in/20091228-kdeshareappssimon-model.tar.gz">restore my German speech model</a> (53 MB), it is necessary to restore the grammar, too. So this <code>Import</code> [Grammar] button might be useful.<br />
- In the <code>Training</code> tab, I can press the <code>Import Trainingsdata</code> button. This should import the <code>prompts</code> file and the corresponding <code>wav</code> files (stored in the <code>training.data</code> folder).</p>
<p>When I take a look at <code>sam</code> &gt; <code>Static model</code>, I can see fields for <code>Base macros</code> and <code>Base stats</code>. Where are these files from my own German 200 words speech model located? When I download the <a href="http://www.repository.voxforge1.org/downloads/Nightly_Builds/AcousticModel-2010-02-03/">English acoustic model from Voxforge</a> (<code>HTK_AcousticModel-2010-02-03_16kHz_16bit_MFCC_O_D.tgz</code>), I can see the following files:</p>
<p><a href="http://spirit.blau.in/simon/files/2010/02/macros-stats.png"><img src="http://spirit.blau.in/simon/files/2010/02/macros-stats-300x247.png" alt="macros-stats" width="300" height="247" class="alignnone size-medium wp-image-2456" /></a></p>
<p>1. <code>macros</code> &#8211; this file is probably usable with <code>sam</code> &gt; <code>Input &amp; output files</code> &gt; <code>Static model</code> &gt; <code>Base macros</code>.<br />
2. <code>stats</code> &#8211; probably usable with <code>sam</code> &gt; <code>Input &amp; output files</code> &gt; <code>Static model</code> &gt; <strong><code>Base stats</code></strong>.</p>
<p>When I build my own German speech model with sam, where are these files &#8211; <code>macros</code> and <code>stats</code> &#8211; located? Where can I find them? I assume that I need them if I want to restore my speech model for use with simon, but I am not sure.</p>
<p>This is what I want: to restore my 200-word German speech model (my current problem). Then I want to add more words to this speech model. I am planning to add about 10 words per day on average to my German speech model. It should grow continuously. And if something goes wrong, I want to be able to restore from my backup file because I don&#8217;t want to start from scratch again and again.</p>
<p>The <code>Manage scenarios</code> &gt; <code>Import</code> and <code>Export</code> buttons might be of help in the future.</p>
<p>I already said it earlier, and I say it again because it is important: the user doesn&#8217;t want to lose his own work (= wav recordings that were made with simon). It should be possible to back up (=export) and to restore (=import) all files that are necessary to build a working speech model. </p>
<p>For me, it is OK to specify each specific path like it is possible with sam. But in the end, it has to work with simon.</p>
<p>I want to fine-tune my German speech model with sam. In particular, I want to sort out <code>wav</code> files that have a low recognition rate with sam. After I have fine-tuned my German speech model with sam, I want to use it with simon (=import it into simon). How is this possible?</p>
]]></content:encoded>
			<wfw:commentRss>http://spirit.blau.in/simon/2010/02/03/how-can-i-export-a-sam-speech-model/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>sam: Couldn&#8217;t open prompts file</title>
		<link>http://spirit.blau.in/simon/2010/01/26/sam-couldnt-open-prompts-file/</link>
		<comments>http://spirit.blau.in/simon/2010/01/26/sam-couldnt-open-prompts-file/#comments</comments>
		<pubDate>Tue, 26 Jan 2010 11:46:57 +0000</pubDate>
		<dc:creator>producer</dc:creator>
				<category><![CDATA[Ubuntu]]></category>
		<category><![CDATA[default.jconf]]></category>
		<category><![CDATA[git]]></category>
		<category><![CDATA[sam]]></category>
		<category><![CDATA[training.data]]></category>

		<guid isPermaLink="false">http://spirit.blau.in/simon/?p=2398</guid>
		<description><![CDATA[I want to use my German backup folder with sam. I have to choose the specific paths to the specific backup files: I am using the following path for the jconf file (I had to look into this blog post): /usr/share/kde4/apps/simond/default.jconf This is the current content of the file /home/am3msi/Documents/201001/model/20100126-try-to-restore-german.sam: /home/am3msi/Documents/201001/model/hmmdefs /home/am3msi/Documents/201001/model/tiedlist /home/am3msi/Documents/201001/model/model.dict /home/am3msi/Documents/201001/model/model.dfa /home/am3msi/Documents/201001/model/training.data/ [...]]]></description>
			<content:encoded><![CDATA[<p>I want to use <a href="http://spirit.blau.in/simon/2009/12/29/how-can-i-import-the-backup-folder/">my German backup folder</a> with sam. I have to choose the specific paths to the specific backup files:</p>
<p><a href="http://spirit.blau.in/simon/files/2010/01/user-generated.png"><img src="http://spirit.blau.in/simon/files/2010/01/user-generated-300x164.png" alt="user-generated" width="300" height="164" class="alignnone size-medium wp-image-2399" /></a></p>
<p>I am using the following path for the jconf file (I had to <a href="http://spirit.blau.in/simon/2009/12/29/find-bad-wav-files-with-sam/">look into this blog post</a>): <code>/usr/share/kde4/apps/simond/default.jconf</code></p>
<p>This is the current content of the file <code>/home/am3msi/Documents/201001/model/20100126-try-to-restore-german.sam</code>:</p>
<blockquote><p>/home/am3msi/Documents/201001/model/hmmdefs<br />
/home/am3msi/Documents/201001/model/tiedlist<br />
/home/am3msi/Documents/201001/model/model.dict<br />
/home/am3msi/Documents/201001/model/model.dfa<br />
/home/am3msi/Documents/201001/model/training.data/<br />
/usr/share/kde4/apps/simond/default.jconf<br />
/home/am3msi/Documents/201001/model/lexicon<br />
/home/am3msi/Documents/201001/model/model.grammar<br />
/home/am3msi/Documents/201001/model/model.voca<br />
/home/am3msi/Documents/201001/model/prompts<br />
/home/am3msi/Documents/201001/model/training.data/<br />
/home/am3msi/Documents/201001/model/tree1.hed<br />
/home/am3msi/Documents/201001/model/wav_config<br />
16000<br />
/home/am3msi/Documents/201001/model/prompts
</p></blockquote>
<p>Now, I click the <code>Build model</code> button.</p>
<p><a href="http://spirit.blau.in/simon/files/2010/01/build-log.png"><img src="http://spirit.blau.in/simon/files/2010/01/build-log-300x209.png" alt="build-log" width="300" height="209" class="alignnone size-medium wp-image-2405" /></a></p>
<p>1. I pressed the <code>Build model</code> button.<br />
2. The <code>Build log</code> indicates that it worked out. Great. I assume that the previously existing files<br />
<code>/home/am3msi/Documents/201001/model/hmmdefs<br />
/home/am3msi/Documents/201001/model/tiedlist<br />
/home/am3msi/Documents/201001/model/model.dict<br />
/home/am3msi/Documents/201001/model/model.dfa</code><br />
have been replaced by new ones (probably with identical content).</p>
<p>Now I want to test the model. So I press the <code>Test model</code> button. sam displays an error message:</p>
<blockquote><p>Couldn&#8217;t open prompts file for reading: /home/am3msi/Documents/201001/model/training.data/</p></blockquote>
<p>Why is that? What went wrong? Let&#8217;s take a look at the paths to the test files:<br />
<code>/home/am3msi/Documents/201001/model/training.data/<br />
/usr/share/kde4/apps/simond/default.jconf<br />
/home/am3msi/Documents/201001/model/prompts</code></p>
<p>The paths are correct. I tried the following: I copied the <code>prompts</code> file to the <code>training.data</code> folder. But this didn&#8217;t solve my problem.</p>
<p>My guess is that there is a bug in sam. At least, it is possible to build a speech model with sam (from my German backup files). That is a good start. That means that my German wav recordings, my dictionary, and my prompts aren&#8217;t lost.</p>
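<p>The error message names a folder (<code>training.data/</code>) where a prompts file is expected, so one thing worth checking is whether each configured path is a file or a folder. Here is a small Python sketch for that. The function name <code>classify_paths</code> is hypothetical, the sketch assumes absolute Unix paths, and it makes no assumption about what each line of the <code>.sam</code> file means.</p>

```python
import os
import tempfile


def classify_paths(lines):
    """Return (path, kind) pairs; kind is 'file', 'directory', or 'missing'."""
    report = []
    for line in lines:
        path = line.strip()
        if not path.startswith("/"):
            continue  # skip blank lines and plain values such as 16000
        if os.path.isdir(path):
            kind = "directory"
        elif os.path.isfile(path):
            kind = "file"
        else:
            kind = "missing"
        report.append((path, kind))
    return report


if __name__ == "__main__":
    # Demo with a temporary folder instead of my real backup paths.
    with tempfile.TemporaryDirectory() as folder:
        prompts = os.path.join(folder, "prompts")
        with open(prompts, "w", encoding="utf-8") as handle:
            handle.write("sample WORD\n")
        demo = [folder + "/", prompts, "16000", folder + "/nothing-here"]
        for path, kind in classify_paths(demo):
            print(kind.ljust(9), path)
```

<p>Feeding the lines of a <code>.sam</code> file into such a check would show at a glance whether a field that needs a file points to a folder instead.</p>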
<p>My next step will be to take a closer look at simon. I will try to use my German backup files with simon. They worked with sam (only the <code>Test model</code> function failed, but the <code>Build model</code> function obviously worked). And I hope that they will work with simon, too.</p>
]]></content:encoded>
			<wfw:commentRss>http://spirit.blau.in/simon/2010/01/26/sam-couldnt-open-prompts-file/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Find bad wav files with sam</title>
		<link>http://spirit.blau.in/simon/2009/12/29/find-bad-wav-files-with-sam/</link>
		<comments>http://spirit.blau.in/simon/2009/12/29/find-bad-wav-files-with-sam/#comments</comments>
		<pubDate>Tue, 29 Dec 2009 04:26:13 +0000</pubDate>
		<dc:creator>producer</dc:creator>
				<category><![CDATA[Ubuntu]]></category>
		<category><![CDATA[jconf]]></category>
		<category><![CDATA[sam]]></category>

		<guid isPermaLink="false">http://spirit.blau.in/simon/?p=2080</guid>
		<description><![CDATA[A. I am testing sam. I am not sure about the jconf file (error message: &#8220;Couldn't open julius jconf file: "".&#8221;). Where is it located? When I search my computer, I find adin.jconf, default.jconf, Sample.jconf (each configuration file is located at a different location). Which one should I choose? 1. I tested sam with [...]]]></description>
			<content:encoded><![CDATA[<p>A. I am testing sam. I am not sure about the <code>jconf</code> file (error message: &#8220;<code>Couldn't open julius jconf file: "".</code>&#8221;). Where is it located? When I search my computer, I find <code>adin.jconf</code>, <code>default.jconf</code>, <code>Sample.jconf</code> (each configuration file is in a different location). Which one should I choose? </p>
<p>1. I tested sam with <code>Sample.jconf</code>, and it seemed to work. But to be honest: Isn&#8217;t this just a dummy file? Every line begins with a <code>#</code> (number sign).</p>
<p>2. <code>adin.jconf</code> is very short, and everything is commented out.</p>
<p>3. <code>default.jconf</code> is probably the correct choice because some lines are valid, e.g.:</p>
<blockquote><p><code>[...]<br />
-h hmmdefs<br />
[...]<br />
-hlist tiedlist<br />
[...]<br />
-penalty1 5.0		# first pass<br />
-penalty2 20.0		# second pass<br />
[...]</code></p></blockquote>
<p>Location of this file: <code>/usr/share/kde4/apps/simond/default.jconf</code><br />
I think that I will use sam with this file.</p>
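<p>To compare the three candidates, it helps to strip the comments and look at what is left. Here is a minimal Python sketch. It treats everything after a <code>#</code> as a comment, which matches the excerpt above but ignores quoting or escaping rules that Julius may have. The function name <code>active_lines</code> is hypothetical.</p>

```python
def active_lines(lines):
    """Return the settings that remain after removing comments and blank lines."""
    active = []
    for line in lines:
        # Everything after '#' is treated as a comment (a simplification).
        setting = line.split("#", 1)[0].strip()
        if setting:
            active.append(setting)
    return active


if __name__ == "__main__":
    sample = [
        "# this line is only a comment",
        "-h hmmdefs",
        "-hlist tiedlist",
        "-penalty1 5.0\t\t# first pass",
        "-penalty2 20.0\t\t# second pass",
    ]
    for setting in active_lines(sample):
        print(setting)
```

<p>A file such as <code>Sample.jconf</code>, where every line starts with <code>#</code>, would produce an empty list here, while <code>default.jconf</code> would show its real settings.</p>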
<p>B. I assume that <code>sam &gt; Build model</code> is the same as <code>simon &gt; Synchronize</code> because the button is the same (green circular arrow). </p>
<p>This means that I can use simon for recording new words, and synchronize the speech model. In my last video, all words (more than 200 words) were <a href="http://spirit.blau.in/simon/2009/12/27/video-recognize-200-german-words/">recognized correctly</a>. How can I eliminate wav files that are not so good? I need an efficient way to fix future problems. sam seems to fill the gap. With simon, I can record words, and synchronize. With sam, I can find out which words are below 100% confidence score. These words can be edited with sam (very nice feature). </p>
<p>So, simon is good for the main work. Fixing bad wav files can be done with sam. E.g. I found a wav file with a confidence score of about 89%. I edited this wav file with sam (sam offers the option to re-record a wav file). I will see whether this improves the confidence score after rebuilding (<code>sam &gt; Build model</code>), and testing the model (<code>sam &gt; Test model</code>).</p>
<p>C. It seems that <code>sam</code> is working without <code>ksimond</code>. </p>
<p>D. I don&#8217;t know yet how to handle wav files with multiple words. With sam, the confidence score of these wav files is 0 %. I used simon to add the grammar structure &#8220;<code>Unknown Unknown Unknown Unknown</code>&#8221;, but it still didn&#8217;t work. But now I have found one sample with a confidence score of 50%:</p>
<blockquote><p><code>Result 6 of 10<br />
=====================<br />
Sentence: verlangsamende verlangsamendem verlangsamenden verlangsamendes<br />
SAMPA:<br />
Raw SAMPA:<br />
Average Confidence: 50<br />
Confidence Scores: 1.72309e-12 100 4.8642e-05 100</code></p></blockquote>
<p>So I can say: it seems to work when I add the sentence structure &#8220;<code>Unknown Unknown Unknown Unknown</code>&#8221;.</p>
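<p>The value of 50 fits the four confidence scores if the average confidence is simply their arithmetic mean. That is my assumption; I have not verified how sam computes it. A quick check in Python, with a hypothetical function name:</p>

```python
def average_confidence(scores):
    """Arithmetic mean of the per-word confidence scores."""
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # The four scores from the result shown above.
    scores = [1.72309e-12, 100, 4.8642e-05, 100]
    print(round(average_confidence(scores), 4))
```

<p>Two of the four words score 100 and the other two are practically 0, so the mean lands at about 50.</p>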
<p>E. The word <code>deutschfeindlichen</code> has a confidence score of 90.0133 %:</p>
<blockquote><p>Result 1 of 3<br />
=====================<br />
Sentence: deutschfeindlichen<br />
SAMPA:<br />
Raw SAMPA:<br />
Average Confidence: 90.0133<br />
Confidence Scores: 90.0133 </p>
<p>Result 2 of 3<br />
=====================<br />
Sentence: deutschfeindlichem<br />
SAMPA:<br />
Raw SAMPA:<br />
Average Confidence: 9.98674<br />
Confidence Scores: 9.98674 </p>
<p>Result 3 of 3<br />
=====================<br />
Sentence: deutschfeindliche<br />
SAMPA:<br />
Raw SAMPA:<br />
Average Confidence: 8.29019e-17<br />
Confidence Scores: 8.29019e-17</p></blockquote>
<p>So the alternatives <code>deutschfeindlichem</code> and <code>deutschfeindliche</code> have a much lower confidence score. This is fine. I can see how well it works internally: in the video (see link above), you can see 100 % perfect results, but internally, it is just about 90 % for <code>deutschfeindlichen</code>.</p>
<p>Here is another example, <code>Wortbreite</code>:</p>
<blockquote><p>Result 1 of 2<br />
=====================<br />
Sentence: Wortbreite<br />
SAMPA:<br />
Raw SAMPA:<br />
Average Confidence: 0.0408209<br />
Confidence Scores: 0.0408209 </p>
<p>Result 2 of 2<br />
=====================<br />
Sentence: Wortbreiten<br />
SAMPA:<br />
Raw SAMPA:<br />
Average Confidence: 99.9592<br />
Confidence Scores: 99.9592 </p></blockquote>
<p>It should have recognized <code>Wortbreite</code>, but it recognized the word <code>Wortbreiten</code> with a confidence score of more than 99 %. So this match is wrong.</p>
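<p>Such wrong matches could be found automatically if the result text is parsed. The following Python sketch reads text in the layout shown above, picks the sentence with the highest average confidence and compares it with the expected word. The function names are hypothetical, and the sketch only relies on the <code>Sentence:</code> and <code>Average Confidence:</code> lines.</p>

```python
def parse_results(text):
    """Return (sentence, average_confidence) pairs from sam-style result text."""
    results = []
    sentence = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Sentence:"):
            sentence = line[len("Sentence:"):].strip()
        elif line.startswith("Average Confidence:") and sentence is not None:
            value = float(line[len("Average Confidence:"):])
            results.append((sentence, value))
            sentence = None
    return results


def best_match(text):
    """Return the sentence with the highest average confidence, or None."""
    results = parse_results(text)
    if not results:
        return None
    return max(results, key=lambda pair: pair[1])[0]


if __name__ == "__main__":
    log = """Result 1 of 2
Sentence: Wortbreite
Average Confidence: 0.0408209
Result 2 of 2
Sentence: Wortbreiten
Average Confidence: 99.9592
"""
    expected = "Wortbreite"
    found = best_match(log)
    print(found, "(correct)" if found == expected else "(wrong, expected " + expected + ")")
```

<p>For the <code>Wortbreite</code> sample, the best match is <code>Wortbreiten</code>, so this recording would be flagged as wrong.</p>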
<p>F. Conclusion: I hope you got an impression of sam. sam seems to be a great tool for speech model development.  </p>
]]></content:encoded>
			<wfw:commentRss>http://spirit.blau.in/simon/2009/12/29/find-bad-wav-files-with-sam/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>&#8220;sam can now test models&#8221;</title>
		<link>http://spirit.blau.in/simon/2009/09/09/sam-can-now-test-models/</link>
		<comments>http://spirit.blau.in/simon/2009/09/09/sam-can-now-test-models/#comments</comments>
		<pubDate>Wed, 09 Sep 2009 02:09:10 +0000</pubDate>
		<dc:creator>producer</dc:creator>
				<category><![CDATA[Ubuntu]]></category>
		<category><![CDATA[dictionary]]></category>
		<category><![CDATA[sam]]></category>

		<guid isPermaLink="false">http://spirit.blau.in/simon/?p=1365</guid>
		<description><![CDATA[Interesting: &#8220;sam can now test models&#8221;. This means that I can get an acoustic model from VoxForge, and test this model with sam. Well, I want to build my own acoustic models with just my own voice. I hope that I can use sam for the building / testing process. By the way, currently I [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://speech2text.svn.sourceforge.net/viewvc/speech2text/trunk/sam/src/accuracydisplay.cpp?view=log">Interesting</a>: &#8220;sam can now test models&#8221;. This means that I can <a href="http://www.repository.voxforge1.org/downloads/Nightly_Builds/AcousticModel-2009-09-08/">get an acoustic model from VoxForge</a>, and test this model with sam. Well, I want to build my own acoustic models with just my own voice. I hope that I can use sam for the <a href="http://spirit.blau.in/simon/files/2009/09/test.png">building / testing process</a>. </p>
<p>By the way, currently I am building an additional German pronunciation dictionary that someday will contain the missing words from <a href="http://script.blau.in/xml/corpus_all.xml">my German audio files</a>. When I have the missing words in the dictionary, I hope that the testing process with sam will give me good results.</p>
]]></content:encoded>
			<wfw:commentRss>http://spirit.blau.in/simon/2009/09/09/sam-can-now-test-models/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

