

Training

The training process is as follows.

  1. Parallel Corpus

    We prepare a Japanese-English parallel corpus.

  2. Rule-based Machine Translation

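    We use a Japanese-English rule-based machine translation system to translate the Japanese sentences of the parallel corpus. We write the output as "ENGLISH" to distinguish it from the English side of the corpus. Each "ENGLISH" sentence is paired with the corresponding English sentence.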

  3. "ENGLISH"-English Phrase Table

    We make an "ENGLISH"-English phrase table using training-phrase-model.perl [10].

  4. English $N$-gram Model

    We make an $N$-gram model from the English sentences using SRILM [6].
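To illustrate what an $N$-gram model estimates in step 4, here is a minimal sketch of unsmoothed maximum-likelihood estimation in Python. It is not SRILM and does not reproduce its smoothing or file formats; the function names `ngram_counts` and `mle_prob`, the toy sentences, and the choice of $N = 2$ are assumptions made only for this example.

```python
from collections import Counter

def ngram_counts(sentences, n):
    """Count n-grams and their (n-1)-word histories over whitespace-tokenized sentences."""
    grams, histories = Counter(), Counter()
    for sentence in sentences:
        # Pad with start symbols and one end symbol so boundary words are counted.
        tokens = ["<s>"] * (n - 1) + sentence.split() + ["</s>"]
        for i in range(n - 1, len(tokens)):
            history = tuple(tokens[i - n + 1:i])
            grams[history + (tokens[i],)] += 1
            histories[history] += 1
    return grams, histories

def mle_prob(grams, histories, history, word):
    """Unsmoothed estimate P(word | history) = c(history, word) / c(history)."""
    history = tuple(history)
    if histories[history] == 0:
        return 0.0  # unseen history; a real toolkit would smooth or back off here
    return grams[history + (word,)] / histories[history]

# Toy English side of a parallel corpus (hypothetical data).
english = ["the cat sat", "the cat ran", "a dog sat"]
grams, histories = ngram_counts(english, n=2)
p_cat_given_the = mle_prob(grams, histories, ["the"], "cat")
p_sat_given_cat = mle_prob(grams, histories, ["cat"], "sat")
```

In practice the model would be trained on the full English side of the corpus with smoothing, which this sketch omits.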

Fig. 1 shows the flowchart of the training process.

Fig. 1: Flowchart of Training
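The data flow of the training process in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not our actual implementation: `build_training_pairs` is a hypothetical helper, and `toy_rbmt` is a one-entry lookup table standing in for a real rule-based translation system. The resulting pairs would be the input to phrase-table training, and the English side alone would be the input to $N$-gram training.

```python
from typing import Callable, Iterable, List, Tuple

def build_training_pairs(
    parallel_corpus: Iterable[Tuple[str, str]],
    rbmt_translate: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Pair each rule-based MT output ("ENGLISH") with its corpus English sentence."""
    pairs = []
    for japanese, english in parallel_corpus:
        pseudo_english = rbmt_translate(japanese)  # step 2: Japanese -> "ENGLISH"
        pairs.append((pseudo_english, english))    # training pair for step 3
    return pairs

def toy_rbmt(japanese: str) -> str:
    """Hypothetical stand-in for a rule-based MT system (a lookup table)."""
    table = {"これはペンです": "this is pen"}
    return table.get(japanese, japanese)

# Step 1: a one-sentence toy Japanese-English parallel corpus (hypothetical data).
corpus = [("これはペンです", "This is a pen.")]

# Steps 2-3: "ENGLISH"-English pairs that phrase-table training would consume.
training_pairs = build_training_pairs(corpus, toy_rbmt)

# Step 4: the English side alone is the text for the N-gram language model.
lm_text = [english for _, english in corpus]
```

Note that the Japanese sentences are used only to produce the "ENGLISH" side; the statistical components see "ENGLISH" and English text only.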



Jin'ichi Murakami, July 5, 2010