
Comparison with Hierarchical Phrase-Based MT

The pattern acquisition process in the proposed method is similar to rule extraction in hierarchical phrase-based statistical MT (HSMT). However, the proposed method extracts only confident rules. The reasons are discussed below.

HSMT is similar to a statistical CFG decoder, so the number of HSMT parameters is very large, while the amount of training data is limited. As a result, the parameter estimates are unreliable and HSMT does not perform well, especially with a small amount of training data. In contrast, the proposed method is pattern-based. A pattern-based approach is similar to a network grammar and has few parameters compared with a CFG, so we may be able to estimate these parameters with high reliability.

HSMT also has the problem that reordering must be limited. The number of spans that are filled during chart decoding is quadratic with respect to sentence length, so the decoding cost grows quickly as sentences get longer.
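The quadratic growth in the number of chart spans can be checked with a short Python sketch. This is only an illustration, not part of the proposed method or of any HSMT decoder. The function name count_spans is hypothetical, and a span is assumed to be any contiguous word range (i, j) with i < j.

```python
def count_spans(n: int) -> int:
    """Count contiguous spans (i, j) with 0 <= i < j <= n in an n-word sentence."""
    return sum(1 for i in range(n) for j in range(i + 1, n + 1))


if __name__ == "__main__":
    # The brute-force count agrees with the closed form n * (n + 1) / 2,
    # which is quadratic in the sentence length n.
    for n in (5, 10, 20, 40):
        print(n, count_spans(n), n * (n + 1) // 2)
```

Doubling the sentence length roughly quadruples the number of chart cells that the decoder may have to fill.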

The number of ways in which sub-spans can be combined into a span grows linearly with sentence length for binary rules, quadratically for ternary rules, and so on. In short, long sentences become a problem. To solve this problem, a maximum size is imposed on internal spans. This span limit in turn limits the reordering that hierarchical phrase-based models can perform. On the other hand, the proposed method does not face such problems because it uses patterns. For this reason, we studied the proposed method.
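The growth rates above, and the effect of a maximum span size, can be illustrated with a small Python sketch. It is a simplification under stated assumptions: a rule of arity k is assumed to cut a span into k non-empty contiguous sub-spans with no terminal words in between, and the names split_count and spans_with_limit are hypothetical helpers, not functions of any real decoder.

```python
from math import comb


def split_count(span_len: int, arity: int) -> int:
    """Ways to cut a span of span_len words into `arity` non-empty contiguous sub-spans."""
    # Choose arity - 1 cut positions among the span_len - 1 gaps between words.
    return comb(span_len - 1, arity - 1)


def spans_with_limit(n: int, max_span: int) -> int:
    """Number of spans in an n-word sentence whose length is at most max_span."""
    return sum(n - length + 1 for length in range(1, min(n, max_span) + 1))


if __name__ == "__main__":
    for m in (10, 20, 40):
        # Binary rules: linear in m. Ternary rules: quadratic in m.
        print(m, split_count(m, 2), split_count(m, 3))
    # Capping the span size makes the chart grow linearly instead of quadratically.
    print(spans_with_limit(40, 40), spans_with_limit(40, 10))
```

With a fixed maximum span size the number of chart cells grows only linearly with sentence length, but no rule can then move a word further than that span size, which is the reordering restriction discussed above.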


Jin'ichi Murakami 2012-11-06