In our experiments, the translations assigned A and B ranks were of excellent quality, comparable to that of human translations. However, many sentences that did not match any translation pattern were left untranslated, and the D-rank results were of poor quality. Therefore, we used a trial rule-based machine translation system to translate the non-translated sentences and the D-rank sentences. Finally, we submitted these outputs to the NTCIR-10 organizers as our translation results (TORRI).
According to the NTCIR-10 organizers, our system ranked 5th among 19 systems in adequacy, 2nd among 9 systems in acceptability, 8th among 30 systems in RIBES, and 22nd among 30 systems in BLEU.
These results indicate that our system performed well in the human evaluations, whereas its scores in the automatic evaluations were relatively poor.