Experiments were performed for speaker-dependent (SD) and speaker-independent (SI) sentence speech recognition. For comparison, a word bigram model experiment was also carried out under the same conditions.
The test data were spoken by a male broadcast announcer. The training text used to estimate the word trigram probabilities consisted of about 15,000 sentences (190,000 words) from the ATR Dialog Database and an additional 261 test speech sentences. The task perplexity was 4.0 for the word trigram. The word HMMs were built by concatenating phone HMMs. These experimental conditions are summarized in Table 2.
Algorithm | continuous mixture HMM + beam search + word trigram models (One-pass DP) |
Phone model | 4-state 3-loop left-to-right model |
Acoustic parameters | 16th order LPC cepstrum + power + power + 16th order cepstrum |
Frame period | 5 ms |
Training data (SD) | 2620-word utterance |
Training data (SI) | 12 male speakers, 736-word utterance |
Vocabulary | 1,567 |
Beam width | 4,096 |
Test data | 261 sentences |
Speaking style | read |
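
The task perplexity quoted above can be read as the geometric mean of the inverse word probabilities assigned by the trigram model. The text does not say how the trigram probabilities were smoothed or exactly how the perplexity was computed, so the following is only a minimal sketch under stated assumptions: add-alpha smoothing, sentence-boundary padding with `<s>` and `</s>`, and the hypothetical helper names `train_trigram` and `perplexity`.

```python
import math
from collections import Counter


def _pad(words):
    # Two start symbols give the first word a full trigram context;
    # the end symbol is predicted like any other word.
    return ["<s>", "<s>"] + list(words) + ["</s>"]


def train_trigram(sentences, vocab_size, alpha=1.0):
    """Return prob(w, u, v) = P(w | u, v) estimated from trigram counts.

    Add-alpha smoothing is an assumption of this sketch, not the
    estimator described in the text. vocab_size should count "</s>".
    """
    tri, ctx = Counter(), Counter()
    for words in sentences:
        p = _pad(words)
        for i in range(2, len(p)):
            tri[(p[i - 2], p[i - 1], p[i])] += 1
            ctx[(p[i - 2], p[i - 1])] += 1

    def prob(w, u, v):
        return (tri[(u, v, w)] + alpha) / (ctx[(u, v)] + alpha * vocab_size)

    return prob


def perplexity(sentences, prob):
    """Perplexity = 2 ** (-(1/N) * sum of log2 P(w_i | w_{i-2}, w_{i-1}))."""
    log_sum, n = 0.0, 0
    for words in sentences:
        p = _pad(words)
        for i in range(2, len(p)):
            log_sum += math.log2(prob(p[i], p[i - 2], p[i - 1]))
            n += 1
    return 2 ** (-log_sum / n)
```

Under this definition, a model that assigns probability 1/4 to every word has a perplexity of exactly 4, so a task perplexity of 4.0 corresponds to an average of about four equally likely word choices at each position. Because the training text here includes the test sentences, the measured value is expected to be lower than it would be on unseen text.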