
Experimental conditions for continuous speech recognition

The continuous mixture HMM was trained on 2,635 word utterances. The Ergodic HMM was trained on 8,475 sentences of the ATR Dialog Database in the text-open experiments, and on the same 8,475 sentences plus 38 test sentences in the text-closed experiments. This made it possible to run both the text-closed and the text-open experiments on the same test data. The experimental conditions are summarized in Table 2.
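The difference between the two Ergodic HMM training sets can be illustrated with a minimal sketch. The function name `build_training_sets` and the placeholder sentence IDs are hypothetical, and the step that filters test sentences out of the text-open set is an assumption about how "text-open" is enforced, not something the text states.

```python
def build_training_sets(corpus_sentences, test_sentences):
    """Return (text_open, text_closed) training lists for the Ergodic HMM.

    Hypothetical helper: sentences are represented by placeholder strings,
    not by the actual ATR Dialog Database text.
    """
    test_set = set(test_sentences)
    # Text-open: the training text must not contain any test sentence
    # (assumed to be enforced by filtering).
    text_open = [s for s in corpus_sentences if s not in test_set]
    # Text-closed: the same training text plus the test sentences.
    text_closed = text_open + list(test_sentences)
    return text_open, text_closed


# Placeholder data with the sentence counts given in the text.
corpus = [f"corpus-{i}" for i in range(8475)]
test = [f"test-{i}" for i in range(38)]

text_open, text_closed = build_training_sets(corpus, test)
print(len(text_open), len(text_closed))  # 8475 8513
```

Because the same test sentences are scored in both cases, any difference in recognition rate between the two conditions can be attributed to the training text rather than to the test data.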

Table 2: Experimental conditions for continuous speech recognition using Ergodic HMM
Algorithm             continuous mixture HMM + beam search + Ergodic HMM
Mixture count         max. 14 (valid for each syllable)
Number of states      3-state 4-loop left-to-right model
Acoustic parameters   16th order LPC cepstrum + power + Δ power + 16th order Δ cepstrum
Frame window          20 ms
Frame period          5 ms
Training speech       word speech (2,635 words)
Phone categories      52 syllables
Vocabulary            435
Beam width            4,096
Duration control      none
Test sentences        261 sentences; same speaker
Speaking style        read speech
Speech content        international conference task
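As a rough illustration of the front-end rows of Table 2, the sketch below computes the feature vector size and the number of analysis frames for an utterance. The class `FrontEnd` and its method names are hypothetical. Two assumptions are made: a 16th order cepstrum contributes 16 coefficients, and only full 20 ms windows are counted. The sampling rate is not given in the text, so everything is expressed in milliseconds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrontEnd:
    """Hypothetical container for the acoustic analysis settings in Table 2."""
    cepstrum_order: int = 16  # 16th order LPC cepstrum
    window_ms: int = 20       # frame window
    period_ms: int = 5        # frame period (shift)

    def feature_dim(self) -> int:
        # cepstrum + power + delta power + delta cepstrum
        # (assumes one coefficient per cepstral order)
        return self.cepstrum_order + 1 + 1 + self.cepstrum_order

    def num_frames(self, duration_ms: int) -> int:
        # Count only windows that fit entirely inside the utterance
        # (an assumption; padding schemes would give a different count).
        if duration_ms < self.window_ms:
            return 0
        return (duration_ms - self.window_ms) // self.period_ms + 1


fe = FrontEnd()
print(fe.feature_dim())     # 34
print(fe.num_frames(1000))  # 197
```

With a 20 ms window and a 5 ms period, consecutive frames overlap by 15 ms, so a one-second utterance yields roughly 200 feature vectors of 34 dimensions each under these assumptions.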

Jin'ichi Murakami, October 2, 2001