Function Word Generation in Statistical Machine Translation Systems

MT Summit 2011

Function words play an important role in sentence
structure and express grammatical relationships
with other words. Most statistical
machine translation (SMT) systems do not
pay enough attention to the translations of function
words, which are noisy due to data sparseness
and word alignment errors. In this paper,
a novel method is designed to separate the
generation of target function words from that of target
content words in SMT decoding. With this
method, target function words are deleted
before translation modeling and inserted
back into the translations during SMT decoding.
To guide the insertion of target function
words, a new statistical model is proposed
and integrated into the log-linear model for
SMT, which can lead to better reordering and
partial hypothesis ranking. The experimental
results show that our approach significantly
improves SMT performance on a Chinese-English
translation task.
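
The delete-then-reinsert idea can be illustrated with a minimal sketch. It is not the model proposed in the paper, whose details are not given here. The function-word list, the greedy insertion rule, the probability threshold, and the feature names (`tm`, `lm`, `fw_insertion`) are all hypothetical and chosen only for illustration. The sketch shows three things: stripping function words from the target side, reinserting them with a toy insertion table, and combining an insertion feature with other features in a log-linear score.

```python
import math

# Hypothetical, tiny function-word list (an assumption for illustration only).
FUNCTION_WORDS = {"the", "a", "an", "of", "to", "in", "on", "is", "are"}


def strip_function_words(tokens):
    """Remove target function words, as would happen before translation modeling."""
    return [t for t in tokens if t.lower() not in FUNCTION_WORDS]


def insert_function_words(content_tokens, insertion_logprob, threshold=math.log(0.5)):
    """Greedily insert the most probable function word before each content word.

    insertion_logprob maps a content word to {function_word: log-probability}.
    This toy table stands in for a statistical insertion model (an assumption).
    """
    output = []
    for token in content_tokens:
        candidates = insertion_logprob.get(token, {})
        if candidates:
            word, logprob = max(candidates.items(), key=lambda kv: kv[1])
            if logprob >= threshold:
                output.append(word)
        output.append(token)
    return output


def loglinear_score(features, weights):
    """Log-linear model score: the weighted sum of feature values."""
    return sum(weights[name] * value for name, value in features.items())


# Toy usage with made-up probabilities and weights.
stripped = strip_function_words("the cat is on the mat".split())
table = {
    "cat": {"the": math.log(0.8), "a": math.log(0.1)},
    "mat": {"the": math.log(0.6)},
}
restored = insert_function_words(stripped, table)
score = loglinear_score(
    {"tm": -2.0, "lm": -3.0, "fw_insertion": -1.0},
    {"tm": 1.0, "lm": 0.5, "fw_insertion": 2.0},
)
```

In a real decoder the insertion feature would be scored for each partial hypothesis alongside the translation and language model features, rather than applied greedily after the fact as in this sketch.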