Outline of Statistical Machine Translation

Statistical machine translation (SMT) was proposed in the 1990s [1]. It is trained on pairs of source and target sentences and consists of a translation model and a language model. A decoder uses these two models to output the target sentence with the maximum probability.

The following is the formulation for Japanese-English SMT [9], where $j$ is a Japanese source sentence and $e$ is an English target sentence.

$\displaystyle \hat{e}$	$\displaystyle =$	$\displaystyle \mathop{\mathrm{argmax}}_{e} P(e\vert j)$	(1)
	$\displaystyle \simeq$	$\displaystyle \mathop{\mathrm{argmax}}_{e} P(j\vert e)P(e)$	(2)

Here, $P(j\vert e)$ is the Japanese-English translation model, and $P(e)$ is the English language model. Equation (2) follows from Bayes' rule, because the denominator $P(j)$ does not depend on $e$ and can be dropped from the maximization. The translation model holds the probabilities that Japanese words are translated into English words; these probabilities are estimated from pairs of Japanese and English sentences. The language model holds the probabilities of target-language (English) word strings.
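As a rough illustration of what "probabilities of target word strings" can mean, the following sketch estimates a bigram language model by relative frequency. The bigram order, the absence of smoothing, the toy corpus, and the function names `train_bigram_lm` and `sentence_probability` are all assumptions made for this example; they are not taken from the system described here.

```python
from collections import Counter


def train_bigram_lm(sentences):
    """Estimate P(w_i | w_{i-1}) by relative frequency (no smoothing)."""
    history_counts = Counter()
    bigram_counts = Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        history_counts.update(tokens[:-1])               # every token that has a successor
        bigram_counts.update(zip(tokens[:-1], tokens[1:]))
    return {
        pair: count / history_counts[pair[0]]
        for pair, count in bigram_counts.items()
    }


def sentence_probability(lm, sentence):
    """P(e) as the product of bigram probabilities; unseen bigrams get 0."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    prob = 1.0
    for pair in zip(tokens[:-1], tokens[1:]):
        prob *= lm.get(pair, 0.0)
    return prob


# Hypothetical toy corpus of target-language (English) sentences.
toy_corpus = ["i eat rice", "i eat bread", "you eat rice"]
lm = train_bigram_lm(toy_corpus)

fluent = sentence_probability(lm, "i eat rice")     # word order seen in the corpus
scrambled = sentence_probability(lm, "rice eat i")  # word order never seen
```

With this toy corpus the fluent word order receives a higher probability than the scrambled one, which is the role $P(e)$ plays in equation (2). A practical language model would also need smoothing so that unseen word sequences do not receive zero probability.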

The decoder selects the target sentence that maximizes the product of the translation model and language model probabilities. Statistical machine translation was initially word-based; more recently, it has become phrase-based.
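The selection step in equation (2) can be sketched as an argmax over a small set of candidate sentences. This is only a minimal sketch: a real decoder searches a very large hypothesis space rather than scoring a fixed list, and the probability values, the romanized example sentence, and the names `translation_model`, `language_model`, and `decode` are hypothetical.

```python
import math

# Hypothetical toy values for P(j | e), keyed by (japanese, english).
translation_model = {
    ("watashi wa gohan o taberu", "i eat rice"): 0.20,
    ("watashi wa gohan o taberu", "rice eat i"): 0.20,
    ("watashi wa gohan o taberu", "i eat bread"): 0.01,
}

# Hypothetical toy values for P(e).
language_model = {
    "i eat rice": 0.05,
    "rice eat i": 0.0001,
    "i eat bread": 0.04,
}


def decode(j, candidates):
    """Return the candidate e that maximizes P(j | e) * P(e).

    Log probabilities are summed instead of multiplying raw
    probabilities, which avoids underflow on longer sentences.
    """
    def score(e):
        return math.log(translation_model[(j, e)]) + math.log(language_model[e])

    return max(candidates, key=score)


best = decode(
    "watashi wa gohan o taberu",
    ["i eat rice", "rice eat i", "i eat bread"],
)
```

In this example the translation model alone cannot separate "i eat rice" from "rice eat i", since both contain the same words; the language model breaks the tie in favor of the fluent word order.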

Jin'ichi Murakami 2013-06-26