Machine translation systems have been studied for a long time. The first generation was the rule-based translation method, which was developed over the course of many years. This method used translation rules written by hand. Thus, if an input sentence completely matched a rule, the output sentence had the best quality. However, because natural language contains a great variety of expressions, this technology had very limited coverage. In addition, the main problems were that the cost of writing the rules was too high and that maintaining them was difficult.
Recently, statistical machine translation has become very popular. This method is based on statistics and is easy to build if a parallel corpus exists. Many versions of statistical machine translation models are available. An early statistical machine translation approach was based on the IBM models 1-5 [1]. These models are based on individual words, and thus a ``null word'' model is needed. However, this ``null word'' model sometimes causes very serious problems, especially in decoding. Thus, recent statistical machine translation systems usually use phrase-based models. A phrase-based statistical machine translation model consists of a translation model and a language model. The phrase table is the translation model for phrase-based SMT; it consists of Japanese phrases, the corresponding English phrases, and their probabilities. A word n-gram model is used as the language model.
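The interaction between the two components can be sketched as follows. This is a minimal toy illustration of the noisy-channel scoring used in phrase-based SMT, not the actual decoder: the phrase table, the bigram probabilities, and the example phrases are all invented for illustration.

```python
import math

# Hypothetical toy phrase table: source phrase -> possible (target phrase, P(tgt|src)).
# All entries are illustrative, not taken from a real system.
PHRASE_TABLE = {
    "watashi wa": [("i", 0.7), ("as for me", 0.3)],
    "hon o": [("a book", 0.6), ("the book", 0.4)],
    "yomu": [("read", 0.8), ("reads", 0.2)],
}

# Toy bigram language model: P(word | previous word), also illustrative.
BIGRAM_LM = {
    ("<s>", "i"): 0.5, ("i", "read"): 0.4, ("read", "a"): 0.3,
    ("a", "book"): 0.5, ("book", "</s>"): 0.6,
}

def lm_logprob(words, floor=1e-4):
    """Bigram log-probability of a target sentence; unseen bigrams get a
    small floor value as a crude stand-in for smoothing."""
    score = 0.0
    for prev, cur in zip(["<s>"] + words, words + ["</s>"]):
        score += math.log(BIGRAM_LM.get((prev, cur), floor))
    return score

def score_hypothesis(pairs):
    """pairs: (source phrase, target phrase, P(tgt|src)) in TARGET order.
    Returns log of translation-model score times language-model score."""
    tm = sum(math.log(p) for _, _, p in pairs)
    words = [w for _, tgt, _ in pairs for w in tgt.split()]
    return tm + lm_logprob(words)

# Two candidate translations of "watashi wa hon o yomu"; reordering is
# given by hand here, since decoding search is beyond this sketch.
h1 = [("watashi wa", "i", 0.7), ("yomu", "read", 0.8), ("hon o", "a book", 0.6)]
h2 = [("watashi wa", "i", 0.7), ("yomu", "reads", 0.2), ("hon o", "the book", 0.4)]
print(score_hypothesis(h1) > score_hypothesis(h2))  # the fluent candidate wins
```

A real decoder such as Moses searches over all segmentations, phrase choices, and reorderings; this sketch only shows how the two model scores combine for fixed candidates.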
However, some problems arise with phrase-based statistical machine translation. One problem is as follows. Normally, an n-gram model is used as the language model. However, this model captures only local language information and does not contain grammatical information.
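This locality problem can be demonstrated with a toy bigram model. In the sketch below (the corpus and sentences are invented for illustration), the model prefers a sentence with a subject-verb agreement error, because "cat was" is a frequent adjacent pair while the true subject "dogs" is too far away for a bigram to see.

```python
from collections import Counter

# Tiny invented corpus; an n-gram model learns only adjacent-word statistics.
corpus = [
    "the cat was hungry",
    "the cat was asleep",
    "the dogs were hungry",
    "the dogs near the gate were loud",
]

bigrams = Counter()
unigrams = Counter()
for sent in corpus:
    words = ["<s>"] + sent.split() + ["</s>"]
    unigrams.update(words[:-1])
    bigrams.update(zip(words, words[1:]))

def bigram_prob(prev, cur, alpha=0.1, vocab=20):
    # Add-alpha smoothing so unseen bigrams keep a small nonzero probability.
    return (bigrams[(prev, cur)] + alpha) / (unigrams[prev] + alpha * vocab)

def sentence_prob(sent):
    words = ["<s>"] + sent.split() + ["</s>"]
    p = 1.0
    for prev, cur in zip(words, words[1:]):
        p *= bigram_prob(prev, cur)
    return p

good = "the dogs near the cat were hungry"  # grammatical: dogs ... were
bad = "the dogs near the cat was hungry"    # agreement error: dogs ... was
print(sentence_prob(bad) > sentence_prob(good))  # True: the model prefers the error
```

The bigram "cat was" appears twice in the corpus while "cat were" never does, so the ungrammatical sentence scores higher even though its subject does not agree with its verb. This is exactly the kind of error that grammatical knowledge in a rule-based system can avoid.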
Our system is a two-stage machine translation system. The first stage is Japanese-English rule-based machine translation. In this stage, we obtain "ENGLISH" sentences from Japanese sentences. We aim to produce "ENGLISH" sentences that are generally grammatically correct. However, these "ENGLISH" sentences have a low level of naturalness because they are obtained using rule-based machine translation. In the second stage, we use a normal statistical machine translation system. This stage performs "ENGLISH"-to-English machine translation. With this stage, we aim to revise the outputs of the first stage to improve their naturalness and fluency.
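The two stages compose as follows. This is only a structural sketch: `rbmt_translate` and `smt_translate` are hypothetical placeholders for the real rule-based engine and the Moses-based second stage, and the toy rule and phrase entries are invented.

```python
def rbmt_translate(japanese: str) -> str:
    """Stage 1: rule-based Japanese -> "ENGLISH" (grammatical but unnatural).
    A real system applies hand-written transfer rules; this is a placeholder."""
    toy_rules = {"watashi wa hon o yomu": "i read book"}  # illustrative only
    return toy_rules.get(japanese, japanese)

def smt_translate(pseudo_english: str) -> str:
    """Stage 2: statistical "ENGLISH" -> English revision for fluency.
    A real system would call a decoder trained on ("ENGLISH", English)
    pairs; this placeholder fakes a single correction."""
    toy_table = {"i read book": "i read a book"}  # illustrative only
    return toy_table.get(pseudo_english, pseudo_english)

def translate(japanese: str) -> str:
    """The full pipeline: RBMT output is post-edited by the SMT stage."""
    return smt_translate(rbmt_translate(japanese))

print(translate("watashi wa hon o yomu"))  # i read a book
```

The design point is that the second stage sees a much easier problem than direct Japanese-English SMT: its source side ("ENGLISH") is already close to its target side (English), so it only has to repair fluency, not bridge the full structural gap between the two languages.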
We used a trial version of a state-of-the-art rule-based machine translation system for the first stage. We used general statistical machine translation tools for the second stage, namely "Giza++" [5], "moses" [7], and "training-phrase-model.perl" [14]. We used the NTCIR-7 and NTCIR-8 data, that is, 3,186,284 sentences in total. Although the scores were not optimized, our method was still very promising. With these data and tools, we participated in the JE and EJ tasks at NTCIR-9.
In our experiments, we obtained a BLEU score of 0.1996 in the JE task using our proposed method, compared with 0.1436 using a standard method (moses). In the EJ task, we obtained a BLEU score of 0.2775 using our proposed method, compared with 0.0831 using the standard method (moses). This means that our proposed method was effective for both tasks. On the other hand, our system ranked 28th out of 36 systems in BLEU score for the JE task, but 7th out of 19 systems in average adequacy score for the same task. This indicates that our system performs better under human evaluation, and that the BLEU score is not a reliable measure for it. The same trend was observed for the EJ task.
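For reference, BLEU combines modified n-gram precisions with a brevity penalty. The sketch below is a simplified sentence-level version with add-one smoothing (official evaluations use corpus-level BLEU via the task's scoring script, so this is only an illustration of the mechanics; the example sentences are invented).

```python
import math
from collections import Counter

def ngrams(words, n):
    """Counter of all n-grams of length n in the word list."""
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU: geometric mean of modified n-gram
    precisions (add-one smoothed so one empty order does not zero the
    score) times a brevity penalty for short candidates."""
    cand, ref = candidate.split(), reference.split()
    log_prec = 0.0
    for n in range(1, max_n + 1):
        c, r = ngrams(cand, n), ngrams(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(cnt, r[g]) for g, cnt in c.items())
        total = max(sum(c.values()), 1)
        log_prec += math.log((overlap + 1) / (total + 1)) / max_n
    bp = min(1.0, math.exp(1 - len(ref) / max(len(cand), 1)))
    return bp * math.exp(log_prec)

ref = "the cat sat on the mat"
print(bleu(ref, ref))                      # perfect match scores 1.0
print(bleu("the cat is on a mat", ref))    # partial overlap scores lower
```

Because BLEU only counts surface n-gram overlap against the references, a translation can be adequate and grammatical yet score poorly, which is consistent with the gap we observed between our BLEU rank and our adequacy rank.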
In future work, we will try to improve the performance of the RBMT system. We will continue to develop this method and participate again in the future.