Problem of phrase-based SMT (Word reordering)

As stated in the introduction, phrase-based SMT has become very popular. However, it still has many serious problems. One is translation performance. For Japanese-English translation, a rule-based machine translation system outperforms phrase-based SMT [4]. About 3,000,000 Japanese-English parallel sentences are available for patent translation [4]. Even when these parallel sentences are used, the performance of phrase-based SMT is lower than that of a rule-based machine translation system.

We consider that this is caused by the reordering model. Normally, an N-gram model is used as the language model. However, this model captures only local, not global, information. To overcome this problem, a reordering model is normally used. However, this model is not very effective for Japanese-English translation. In our opinion, the word reordering it models is also local, not global, information. There are many reordering models; for example, "msd-bidirectional-fe" is normally used. However, we think that word reordering is related to grammar, especially case grammar [2]. Therefore, we believe it is not a statistical phenomenon.
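
To make the locality argument concrete, the following is a minimal sketch of how a monotone/swap/discontinuous (msd) orientation can be assigned to each phrase. It is an illustration only, not the code of any SMT toolkit: the function name msd_orientations, the span representation, and the toy Japanese-English segmentation are assumptions introduced here, and only the orientation relative to the previous phrase is computed (a bidirectional model would also look at the following phrase).

```python
def msd_orientations(source_spans):
    """Classify each phrase as monotone, swap, or discontinuous.

    source_spans: list of (start, end) inclusive source-word indices,
    listed in the order the phrases appear on the target side.
    (Hypothetical helper for illustration; not a toolkit API.)
    """
    orientations = []
    # Virtual sentence-start phrase that ends just before source index 0.
    prev_start, prev_end = -1, -1
    for start, end in source_spans:
        if start == prev_end + 1:
            # Source phrase directly follows the previous one.
            orientations.append("monotone")
        elif end == prev_start - 1:
            # Source phrase directly precedes the previous one.
            orientations.append("swap")
        else:
            # Anything else, however far away, gets the same label.
            orientations.append("discontinuous")
        prev_start, prev_end = start, end
    return orientations


# Toy example (assumed segmentation): source "kare wa / hon wo / yonda",
# target order "he / read / a book".
spans_in_target_order = [(0, 1), (4, 4), (2, 3)]
print(msd_orientations(spans_in_target_order))
# ['monotone', 'discontinuous', 'swap']
```

In this sketch each label depends only on the immediately preceding phrase, and every long-distance movement collapses into the single class "discontinuous". This is one way to see why such a model carries local rather than global information about verb-final to verb-medial reordering.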

Jin'ichi Murakami 2013-06-26