next up previous
次へ: Speaker Feature and Long 上へ: Unknown-Multiple Signal Source Clustering 戻る: Evaluation Method of Speaker

Experimental results


Fig.4 shows the result of experiment, where the LPC cepstrum is calculated with the window length is 21.3ms. The average classification rate is given by the average for 8 sets of speech data for experiments 1 and 2. In experiment 3, 16 different initial models were tried for each of the 8 sets of speech samples. Therefore, the rate is the average of 128 trials.

図 4: Relationship between number of iterations and classification rate
\begin{figure}\begin{center}
\fbox{\epsfig{file=PS/figure1.epsf,height=50mm}}%% ,width=70mm}}
\end{center}\end{figure}

The following observations can be obtained from this figure.

  1. The average speaker classification rates of the Viterbi and the forward algorithm differ little. However, in experiment1 , ( true values are given to all parameters), forward decoding gives 94.0 %, while Viterbi decoding gives only 48.3 %.

  2. In experiment 2 (the true value is given only to the symbol output probability $\mbox{\boldmath$B$}^{(0)}$ ), the average classification rate is about 75%. However, the average classification rate decreases with training.

  3. In experiment 3 ( $\mbox{\boldmath$B$}^{(0)}$ is set at random ), the average classification rate is low, being 30% to 35%. This value is not improved even if the amount of training is increased.

    The reason seems to be as follows. In this experiment, the LPC analysis condition is the same as used for speech recognition [2]. Thus the parameters have a phoneme feature, but not a speaker feature.




Subsections
next up previous
次へ: Speaker Feature and Long 上へ: Unknown-Multiple Signal Source Clustering 戻る: Evaluation Method of Speaker
Jin'ichi Murakami 平成13年1月19日