Comparison of Automatic Evaluation and Human Evaluation

The human evaluation showed that the proposed method worked well for the simple sentences and for the A-rank compound/complex sentences. In contrast, the BLEU scores of both the proposed method and the baseline (Moses) were low. We attribute this discrepancy to a limitation of the automatic evaluation rather than to the translations themselves.
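One way such a gap can arise is sketched below. BLEU is built on clipped n-gram precision against a reference, so a translation that a human would accept can still score low when its wording or word order differs from the reference. This is a minimal illustration only: the function names and the two English sentences are hypothetical and are not taken from our test data, and the sketch omits the brevity penalty and the geometric mean over n-gram orders used in full BLEU.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(candidate, reference, n):
    """Clipped n-gram precision of a candidate against a single reference."""
    cand = ngrams(candidate, n)
    ref = ngrams(reference, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    # Each candidate n-gram is credited at most as often as it occurs in the reference.
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return matched / total


# Hypothetical sentences: the candidate is an acceptable paraphrase of the reference.
reference = "he went to the station by bus".split()
paraphrase = "he took a bus to the station".split()

p1 = clipped_precision(paraphrase, reference, 1)
p2 = clipped_precision(paraphrase, reference, 2)
print(f"unigram precision: {p1:.3f}")
print(f"bigram precision:  {p2:.3f}")
```

In this toy case the paraphrase matches 5 of its 7 unigrams but only 2 of its 6 bigrams, so the higher-order precisions that dominate BLEU drop sharply even though the meaning is preserved.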
