To improve these results, we tried using all of the training data (2,920,434 sentences). In these experiments we did not use reordering models, and we did not optimize the parameters with MERT.
Table 4 shows the results of these experiments. As the table shows, the proposed method was effective in terms of BLEU: the score rose from 0.2229 to 0.2924 on Intrinsic-JE and from 0.3232 to 0.3276 on Intrinsic-EJ. These BLEU scores were in the best group at NTCIR-8.
Method | Task | BLEU | NIST | METEOR |
Proposed | Intrinsic-JE | 0.2924 | 7.2904 | 0.6216 |
Baseline (moses) | Intrinsic-JE | 0.2229 | 6.1266 | 0.5842 |
Proposed | Intrinsic-EJ | 0.3276 | 7.5638 | - |
Baseline (moses) | Intrinsic-EJ | 0.3232 | 7.2663 | - |
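
The size of the improvement over the baseline can be checked directly from the Table 4 values. The following is a minimal sketch, not part of the original experiments: the dictionary layout and the helper name `gains` are our own assumptions for illustration, and METEOR is left out because it is not reported for Intrinsic-EJ.

```python
# Scores copied from Table 4 (BLEU and NIST only; METEOR is not reported for EJ).
# The data layout and the function name `gains` are illustrative assumptions.
scores = {
    "Intrinsic-JE": {
        "proposed": {"bleu": 0.2924, "nist": 7.2904},
        "baseline": {"bleu": 0.2229, "nist": 6.1266},
    },
    "Intrinsic-EJ": {
        "proposed": {"bleu": 0.3276, "nist": 7.5638},
        "baseline": {"bleu": 0.3232, "nist": 7.2663},
    },
}


def gains(task_scores):
    """Return {metric: (absolute_gain, relative_gain_percent)} over the baseline."""
    out = {}
    for metric, base in task_scores["baseline"].items():
        prop = task_scores["proposed"][metric]
        out[metric] = (prop - base, 100.0 * (prop - base) / base)
    return out


for task, task_scores in scores.items():
    for metric, (abs_gain, rel_gain) in gains(task_scores).items():
        print(f"{task} {metric.upper()}: {abs_gain:+.4f} ({rel_gain:.1f}% relative)")
```

Under these assumptions, the BLEU gain is about 0.0695 points (roughly 31% relative) on Intrinsic-JE and about 0.0044 points (roughly 1.4% relative) on Intrinsic-EJ, so the improvement is much larger in the JE direction.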