
Human Evaluation Results

Table 4 summarizes the human evaluation results for our machine translation system on the JE and EJ tasks. In this table, ``Proposed'' indicates our proposed system, ``adequacy'' indicates the average adequacy, and ``acceptability'' indicates the average acceptability. The values in ``()'' give our system's rank among all entered systems. For example, our system ranked 7th out of 19 systems for average adequacy in the JE task.
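As an illustration of how the rank notation can be read, the following is a minimal Python sketch. The function names `parse_rank` and `fraction_outranked` are hypothetical helpers introduced here, not part of the evaluation tooling; the rank cells are copied from the JE rows of Table 4.

```python
from fractions import Fraction


def parse_rank(cell: str) -> tuple[int, int]:
    """Parse a '(rank/total)' cell such as '(7/19)' into (rank, total)."""
    rank, total = cell.strip("()").split("/")
    return int(rank), int(total)


def fraction_outranked(cell: str) -> Fraction:
    """Fraction of the *other* systems that are ranked below this one."""
    rank, total = parse_rank(cell)
    return Fraction(total - rank, total - 1)


# Rank cells for the JE task, in the column order of Table 4.
je_ranks = ["(7/19)", "(8/14)", "(9/14)"]

for cell in je_ranks:
    rank, total = parse_rank(cell)
    print(f"{cell}: rank {rank} of {total}, ahead of {total - rank} other systems")
```

A rank of 1 gives a fraction of 1 and the last rank gives 0, so the value is only a rough way to compare standings across columns with different numbers of entered systems.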


Table 4: Results of Human Evaluation

System      Task  Adequacy  Acceptability
                            pairwise comparison score  (tie)
Proposed    JE    2.73      0.4604                     0.3312
(RBMT+SMT)        (7/19)    (8/14)                     (9/14)
Proposed    EJ    2.6       0.4318                     0.2992
(RBMT+SMT)        (9/17)    (8/11)                     (5/11)

As these results show, our method was effective. Although its BLEU score was worse than those of other systems, its human evaluation results were relatively good compared with other systems.


January 18, 2012