Table 4 summarizes the human evaluation results for our machine translation system on the JE and EJ tasks. In this table, ``Proposed'' indicates our proposed system, ``adequacy'' indicates the average adequacy, and ``acceptability'' indicates the average acceptability. The values in parentheses give our system's rank among all entered systems. For example, our system placed 7th out of 19 systems for average adequacy in the JE task.
system | task | adequacy | acceptability: pairwise comparison score | acceptability: pairwise comparison score (tie)
Proposed (RBMT+SMT) | JE | 2.73 (7/19) | 0.4604 (8/14) | 0.3312 (9/14)
Proposed (RBMT+SMT) | EJ | 2.6 (9/17) | 0.4318 (8/11) | 0.2992 (5/11)
As these results show, our method was effective under human evaluation. Although its BLEU score was worse than those of other systems, its human evaluation results were good compared to other systems.