Experimental conditions for continuous speech recognition

The continuous mixture HMM was trained on 2,635 word utterances. The Ergodic HMM was trained on 8,475 sentences of the ATR Dialog Database in the text-open experiments, and on the same 8,475 sentences plus 38 test sentences in the text-closed experiments. In this way, both the text-closed and the text-open experiments could be performed using the same test data. The experimental conditions are summarized in Table 2.
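The difference between the two conditions lies only in the text used to train the Ergodic HMM. The following is a minimal sketch of that split, not the authors' tooling: the function name `build_training_sets` and the placeholder strings are hypothetical, and only the sentence counts come from the text.

```python
def build_training_sets(dialog_sentences, test_sentences):
    """Return the Ergodic HMM training text for each condition.

    text_open:   dialog sentences only (test text is unseen).
    text_closed: dialog sentences plus the test sentences.
    """
    text_open = list(dialog_sentences)
    text_closed = list(dialog_sentences) + list(test_sentences)
    return {"text_open": text_open, "text_closed": text_closed}


# Placeholder strings stand in for the real sentences (assumption).
dialog = [f"dialog-{i}" for i in range(8475)]
test = [f"test-{i}" for i in range(38)]

sets = build_training_sets(dialog, test)
print(len(sets["text_open"]), len(sets["text_closed"]))  # 8475 8513
```

Because the test utterances are identical in both conditions, any difference in recognition results can be attributed to whether the test text was seen during Ergodic HMM training.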

**Table 2:** Experimental conditions for continuous speech recognition using Ergodic HMM

| Item | Setting |
|---|---|
| Algorithm | continuous mixture HMM + beam search + Ergodic HMM |
| Mixture count | max. 14 (valid for each syllable) |
| Model structure | 3-state 4-loop left-to-right model |
| Acoustic parameters | 16th order LPC cepstrum + power + $\Delta$ power + 16th order $\Delta$ cepstrum |
| Frame window | 20 ms |
| Frame period | 5 ms |
| Training speech | word utterances (2,635 words) |
| Phone categories | 52 syllables |
| Vocabulary | 435 |
| Beam width | 4,096 |
| Duration control | none |
| Test sentences | 261 sentences; same speaker |
| Speaking style | read speech |
| Speech content | international conference task |
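
As an illustration of the front-end settings in Table 2, the sketch below computes the size of the acoustic parameter set and the number of analysis frames for an utterance of a given length. It is not the authors' implementation: the class name `FrontEnd` is hypothetical, it assumes the four parameter groups are counted together as one set per frame, and it assumes that only full 20 ms windows are counted (no padding at the utterance end).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrontEnd:
    cepstrum_order: int = 16  # 16th order LPC cepstrum (Table 2)
    window_ms: int = 20       # frame window (Table 2)
    period_ms: int = 5        # frame period (Table 2)

    @property
    def feature_dim(self) -> int:
        # cepstrum + power + delta power + delta cepstrum
        return self.cepstrum_order + 1 + 1 + self.cepstrum_order

    def num_frames(self, duration_ms: int) -> int:
        # Assumption: count only windows that fit entirely in the signal.
        if duration_ms < self.window_ms:
            return 0
        return 1 + (duration_ms - self.window_ms) // self.period_ms


fe = FrontEnd()
print(fe.feature_dim)       # 34
print(fe.num_frames(1000))  # 197
```

With a 5 ms period and a 20 ms window, consecutive frames overlap by 15 ms, so a one-second utterance yields roughly 200 frames of 34 parameters each under these assumptions.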