Qin Gao, Will Lewis, Chris Quirk, and Mei-Yuh Hwang
We investigate two problems in word alignment for machine translation. First, we compare methods for incremental word alignment to save time for large-scale machine translation systems. We compare various methods of using existing word alignment models, trained on a larger, general corpus, to incrementally align smaller new corpora. In addition, by training separate translation tables, we eliminate the need for any re-processing of the baseline data. Experimental results are comparable or even superior to those of the baseline batch-mode training. Based on this success, we explore the possibility of sharpening the alignment model via an incremental training scheme. By first training a general word alignment model on the whole corpus, then dividing the same corpus into domain-specific partitions, and finally applying incremental training to each partition, we can improve machine translation quality as measured by BLEU.
In Proceedings of MT Summit XIII
Publisher Asia-Pacific Association for Machine Translation