In this section, we describe a continuous speech recognition algorithm that uses word trigram models and is based on Viterbi search. Well-known algorithms for continuous speech recognition include two-level DP matching, level building, and Viterbi search (one-pass DP). Among them, the Viterbi search (one-pass DP) algorithm is well suited to Markov models combined with a language model such as a bigram or trigram. To compute the Viterbi path to the n-th recognized word at time t, we need to know the Viterbi paths emerging from each word candidate at time t-1. For the trigram algorithm, however, we need to know, for each previous word candidate, not only the Viterbi paths emerging from words at time t-1 but also the most likely paths passing through all possible word pair combinations.
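To make the extra bookkeeping concrete, the short sketch below counts the path hypotheses that must be kept at a word boundary. It assumes a hypothetical three-word vocabulary; the names `bigram_contexts` and `trigram_contexts` are illustrative and not part of the original algorithm.

```python
from itertools import product

vocabulary = ["a", "b", "c"]  # hypothetical toy vocabulary

# Without trigram history: one surviving path per word candidate.
bigram_contexts = list(vocabulary)

# With a trigram model: one surviving path per (previous word, current word) pair.
trigram_contexts = list(product(vocabulary, repeat=2))

print(len(bigram_contexts), len(trigram_contexts))
```

The number of hypotheses grows from the vocabulary size to its square, which is why the search has to be indexed by word pairs.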
We define the word being uttered at time t and the word uttered immediately before it. The likelihood of the most likely path for each such word pair can be calculated recursively using the following algorithm.
[ Definition ]
the number of states of each word
the transition probability within a word from one state to another
the symbol output probability within a word, at a given state, for the observation vector at a given frame
the trigram probability of a word given that the two preceding words have appeared
the vocabulary
the number of input frames
[ Initialization ]
execute step 1 under
1)
means sentence head
[ Viterbi search ]
execute step 2 and step 6 for
2) execute step 3 for
3) execute step 4 for
4) execute step 5 for
5)
[ Calculate word boundaries ]
6) execute step 7 for
7) execute step 8 for
8)
if then
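The initialization, Viterbi search, and word-boundary steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' exact formulation: scores are kept in the log domain, each path score is indexed by a (previous word, current word, state) triple, a word is entered at its first state and left from its last state, exit transition probabilities are not modelled, and the word history is carried along each path instead of using back-pointers. The names `trigram_viterbi`, `log_a`, `log_b`, `log_tri`, and the `HEAD` marker are hypothetical and chosen only for illustration.

```python
import math

NEG_INF = float("-inf")
HEAD = "<s>"  # assumed marker for the sentence head


def trigram_viterbi(words, log_a, log_b, log_tri, n_frames):
    """One-pass trigram Viterbi search (illustrative sketch).

    words    : vocabulary, a list of word labels
    log_a    : log_a[w][i][j], log transition probability in word w, state i -> j
    log_b    : log_b[w][j][t], log output probability in word w, state j, frame t
    log_tri  : callable (u, v, w) -> log P(w | u, v)
    n_frames : number of input frames
    Returns (best log score, best word sequence as a tuple).
    """
    n_states = {w: len(log_a[w]) for w in words}
    contexts = [HEAD] + list(words)

    def empty_table():
        # One cell per (previous word v, current word w, state j).
        # Each cell holds (log score, word history of the best path).
        return {(v, w): [(NEG_INF, ())] * n_states[w]
                for v in contexts for w in words}

    # Initialization: every word may start the sentence in its first state.
    g = empty_table()
    for w in words:
        g[(HEAD, w)][0] = (log_tri(HEAD, HEAD, w) + log_b[w][0][0], (w,))

    for t in range(1, n_frames):
        new = empty_table()
        # Viterbi search inside each word, kept separate per previous word.
        for (v, w), cells in g.items():
            for j in range(n_states[w]):
                score, hist = max(
                    ((cells[i][0] + log_a[w][i][j], cells[i][1])
                     for i in range(n_states[w])),
                    key=lambda c: c[0])
                if score > NEG_INF:
                    new[(v, w)][j] = (score + log_b[w][j][t], hist)
        # Word boundaries: leave word v (preceded by u) and enter word w,
        # keeping only the best u for each (v, w) pair.
        for v in words:
            last = n_states[v] - 1
            for w in words:
                for u in contexts:
                    score, hist = g[(u, v)][last]
                    cand = score + log_tri(u, v, w) + log_b[w][0][t]
                    if cand > new[(v, w)][0][0]:
                        new[(v, w)][0] = (cand, hist + (w,))
        g = new

    # The sentence must end in the last state of some word.
    return max((g[(v, w)][n_states[w] - 1] for v in contexts for w in words),
               key=lambda c: c[0])


# Toy usage: two one-state words; frames 0-1 favour "a", frames 2-3 favour "b".
stay = math.log(0.9)
log_a = {"a": [[stay]], "b": [[stay]]}
log_b = {"a": [[0.0, 0.0, -10.0, -10.0]], "b": [[-10.0, -10.0, 0.0, 0.0]]}
uniform = lambda u, v, w: math.log(0.5)  # hypothetical flat trigram
score, sentence = trigram_viterbi(["a", "b"], log_a, log_b, uniform, 4)
print(sentence)
```

Carrying the word history in each cell keeps the sketch short; a practical decoder would more likely store back-pointers and prune unlikely word pairs, since the table grows with the square of the vocabulary size.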