Evaluation Method of Speaker Classification Rate



Even when the optimal state sequence has been obtained by forward decoding or Viterbi decoding, the correspondence between the resulting classes and the actual speakers is still unknown. Therefore, we calculated the classification rate with the following expression.

$$ R = \frac{1}{T} \max_{\sigma} \sum_{t=1}^{T} d\bigl(\tau(\boldsymbol{x}_t), \sigma(S_t)\bigr) \qquad (1) $$

In the above, $\tau$ is the optimal state sequence, so $\tau(\boldsymbol{x}_t)$ is the class assigned to $\boldsymbol{x}_t$. $\sigma$ is an arbitrary permutation of $(1,2,\ldots, N)$, and $S_t$ is the correct speaker of the utterance. $d(\cdot,\cdot)$ is the variable that takes the value 1 if its two arguments agree, and 0 otherwise.

In this study, the classification numbers are mapped to the speakers through $\sigma$: the correct classification rate is calculated for each permutation, and the maximum is defined as the classification rate. Consequently, in the case of 4 speakers, $4! = 24$ permutations are examined.
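The search over permutations can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used in the study: the function name `classification_rate`, the argument names, and the use of 0-based integer labels for both classes and speakers are assumptions made for the example.

```python
from itertools import permutations


def classification_rate(decoded, correct, n_speakers):
    """Evaluate Eq. (1) by brute force over all permutations.

    decoded    -- class label tau(x_t) for each t (assumed 0-based integers)
    correct    -- correct speaker S_t for each t (assumed 0-based integers)
    n_speakers -- N, the number of speakers (and of classes)

    Returns (R, sigma), where sigma[s] is the class matched to speaker s.
    """
    if len(decoded) != len(correct) or len(decoded) == 0:
        raise ValueError("decoded and correct must be non-empty and equally long")

    best_rate, best_sigma = -1.0, None
    # N! candidate assignments, e.g. 4! = 24 for 4 speakers.
    for sigma in permutations(range(n_speakers)):
        # d(tau(x_t), sigma(S_t)) is 1 when the two labels agree, 0 otherwise.
        matches = sum(1 for tau_t, s_t in zip(decoded, correct)
                      if tau_t == sigma[s_t])
        rate = matches / len(decoded)
        if rate > best_rate:
            best_rate, best_sigma = rate, sigma
    return best_rate, best_sigma


# Hypothetical labels for 8 utterances and 4 speakers.
decoded = [2, 2, 0, 0, 1, 1, 3, 3]
correct = [0, 0, 1, 1, 2, 2, 3, 0]
rate, sigma = classification_rate(decoded, correct, 4)
print(rate, sigma)  # 0.875 (2, 0, 1, 3)
```

Enumerating every permutation costs $N!$ evaluations, which is cheap for 4 speakers but grows quickly with $N$.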

Jin'ichi Murakami, January 19, 2001