
Compound/Complex Sentences

We used 10,000 English sentences in this experiment. Of these, 823 sentences matched the English-Japanese translation patterns: 408 in the A-rank, 31 in the B-rank, 16 in the C-rank, and 368 in the D-rank. We compared the proposed method with the baseline (Moses) for each rank. The automatic evaluation results are listed in Table XVI.
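The counts above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch in Python, using only the numbers reported in this section; the variable names are illustrative and are not taken from the experiment's own tooling.

```python
# Counts reported in this section (names are illustrative assumptions).
total_input = 10_000          # English sentences used in the experiment
rank_counts = {"A": 408, "B": 31, "C": 16, "D": 368}

# Number of sentences that matched a translation pattern.
matched = sum(rank_counts.values())

# Fraction of the input that matched any pattern.
match_rate = matched / total_input

# Share of each rank among the matched sentences.
rank_share = {rank: n / matched for rank, n in rank_counts.items()}

print(f"matched: {matched} ({match_rate:.2%} of input)")
for rank, share in rank_share.items():
    print(f"{rank}-rank: {rank_counts[rank]} ({share:.1%} of matched)")
```

The four ranks sum to the 823 matched sentences, which is 8.23% of the input. The A-rank and D-rank together account for roughly 94% of the matched sentences, so the B-rank and C-rank scores rest on comparatively few sentences.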

Table XVI: Automatic Evaluation Results
\begin{tabular}{|l|c|c|c|c|}
\hline
All rank (823) & 0.3630 & 5.6752 & 0.3563 & 5.6218 \\ \hline
\end{tabular}

Table XVI shows that the proposed method achieved higher BLEU and NIST scores than the baseline (Moses) for the A-rank, B-rank, and C-rank sentences, but not for the D-rank sentences.
