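In each table, the denominator in parentheses is the total number of evaluation utterances, and the numerator is the number of crosstalk utterances that were recognized. The speaker row (the header row of each table) gives the speaker combination of the crosstalk speech used.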
Speakers | mau+ftk | mau+fyn | mms+ftk | mms+fyn | Average |
MFCC, Diagonal | 47% (47/100) | 44% (44/100) | 41% (41/100) | 48% (48/100) | 45% (180/400) |
MFCC, Full | 57% (57/100) | 62% (62/100) | 54% (54/100) | 50% (50/100) | 56% (223/400) |
FBANK, Diagonal | 27% (27/100) | 40% (40/100) | 29% (29/100) | 38% (38/100) | 34% (134/400) |
FBANK, Full | 44% (44/100) | 48% (48/100) | 50% (50/100) | 44% (44/100) | 47% (186/400) |
MELSPEC, Diagonal | 8% (8/100) | 8% (8/100) | 8% (8/100) | 17% (17/100) | 10% (41/400) |
Speakers | mau+ftk | mau+fyn | mms+ftk | mms+fyn | Average |
MFCC, Diagonal | 46% (46/100) | 55% (55/100) | 34% (34/100) | 44% (44/100) | 45% (179/400) |
MFCC, Full | 52% (52/100) | 49% (49/100) | 38% (38/100) | 42% (42/100) | 45% (181/400) |
FBANK, Diagonal | 28% (28/100) | 51% (51/100) | 30% (30/100) | 40% (40/100) | 37% (149/400) |
FBANK, Full | 41% (41/100) | 40% (40/100) | 34% (34/100) | 35% (35/100) | 38% (150/400) |
MELSPEC, Diagonal | 30% (30/100) | 45% (45/100) | 24% (24/100) | 32% (32/100) | 33% (131/400) |
Speakers | mau+ftk | mau+fyn | mms+ftk | mms+fyn | Average |
MFCC, Diagonal | 9% (9/100) | 16% (16/100) | 7% (7/100) | 7% (7/100) | 10% (39/400) |
FBANK, Diagonal | 1% (1/100) | 1% (1/100) | 1% (1/100) | 2% (2/100) | 1% (5/400) |
MELSPEC, Diagonal | 1% (1/100) | 1% (1/100) | 1% (1/100) | 1% (1/100) | 1% (4/400) |
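As a reading aid, the following minimal Python sketch shows how the averages in the tables relate to the per-pair counts: the correct counts of the four speaker pairs are pooled and divided by the pooled total (400). The function names `accuracy_percent` and `summarize` are hypothetical and not from the original work. The half-up rounding is an assumption inferred from the tables themselves (for example, 186/400 = 46.5% is listed as 47%).

```python
def accuracy_percent(correct, total):
    """Recognition rate in percent, rounded half up to an integer.

    Integer arithmetic is used on purpose: Python's built-in round()
    rounds halves to the nearest even number, so round(46.5) gives 46,
    whereas the tables list 186/400 as 47%.
    """
    return (200 * correct + total) // (2 * total)


def summarize(counts, per_pair_total=100):
    """Return per-pair rates, pooled correct count, pooled total, pooled rate.

    counts: number of recognized utterances for each speaker pair.
    per_pair_total: evaluation utterances per pair (100 in the tables).
    """
    per_pair = [accuracy_percent(c, per_pair_total) for c in counts]
    pooled_correct = sum(counts)
    pooled_total = per_pair_total * len(counts)
    return per_pair, pooled_correct, pooled_total, accuracy_percent(
        pooled_correct, pooled_total
    )


# MFCC / Full entry of the first table: mau+ftk, mau+fyn, mms+ftk, mms+fyn
per_pair, correct, total, avg = summarize([57, 62, 54, 50])
print(f"{avg}% ({correct}/{total})")  # prints "56% (223/400)"
```

Because every speaker pair has the same number of evaluation utterances, the pooled rate equals the mean of the four per-pair rates before rounding.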
The experiments yielded the following findings.