Next: Conclusion Up: Unknown-Multiple Signal Source Clustering Previous: Selection of initial model

Discussions


This paper described the problem of identifying the utterances of multiple speakers as an example of the unknown-multiple signal source clustering problem. However, several problems remain unsolved, as described below.

  1. Evaluation of classification rate

    A method is needed for evaluating the speaker classification rate when the number of speakers is large. In this paper, all possible correspondences between the estimated categories and the actual speakers were examined, and the highest resulting value was taken as the classification rate. However, the number of such correspondences is the factorial of the number of speakers, so exhaustive evaluation quickly becomes impractical as the number of speakers grows. A faster evaluation method is therefore necessary.

  2. Time resolution

    Because the LPC analysis is performed frame by frame, a speaker transition sometimes occurred within a single frame. In such a frame, the speaker cannot be determined uniquely. In other words, the time resolution of the speaker classification depends on the window length of the LPC frame. This problem must be studied and resolved.

  3. Estimation of the number of categories $N$

    In this experiment, the number of speakers (the number of categories) was set to 4. That is, the number of speakers was assumed to be known a priori. A technique is needed that can estimate the number of speakers from the data.
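
The exhaustive evaluation described in item 1 can be sketched as follows. This is only a minimal illustration, assuming that each frame carries a true speaker index and an estimated category index, both in the range 0 to N-1. The function name `classification_rate` and the label format are hypothetical and are not taken from the paper.

```python
from itertools import permutations


def classification_rate(true_labels, cluster_labels, n_categories):
    """Return the best frame-level agreement over all category-to-speaker mappings.

    Assumption: labels are integers in range(n_categories) and the label
    lists are non-empty and of equal length.
    """
    # counts[c][s] = number of frames assigned to category c that belong to speaker s
    counts = [[0] * n_categories for _ in range(n_categories)]
    for s, c in zip(true_labels, cluster_labels):
        counts[c][s] += 1

    best = 0
    # Try every one-to-one mapping "category c -> speaker perm[c]".
    # There are N! such mappings, which is the cost noted in item 1.
    for perm in permutations(range(n_categories)):
        correct = sum(counts[c][perm[c]] for c in range(n_categories))
        best = max(best, correct)
    return best / len(true_labels)


# Hypothetical example with 4 speakers and 8 frames.
true_labels = [0, 0, 1, 1, 2, 2, 3, 3]
cluster_labels = [1, 1, 0, 0, 3, 3, 2, 0]
print(classification_rate(true_labels, cluster_labels, 4))  # 0.875
```

With 4 categories the loop visits 4! = 24 mappings, but the count grows factorially with N. One possible way to avoid the enumeration, not used in the paper, is to treat the same count matrix as an assignment problem, for example with `scipy.optimize.linear_sum_assignment(counts, maximize=True)`.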



Jin'ichi Murakami, January 19, 2001 (Heisei 13)