Incremental Training and Intentional Over-fitting of Word Alignment

Qin Gao, Will Lewis, Chris Quirk, and Mei-Yuh Hwang

Abstract

We investigate two problems in word alignment for machine translation. First, we compare methods for incremental word alignment to save time in large-scale machine translation systems. We compare various methods of using existing word alignment models, trained on a larger, general corpus, to incrementally align smaller new corpora. In addition, by training separate translation tables, we eliminate the need for any re-processing of the baseline data. Experimental results are comparable or even superior to the baseline batch-mode training. Based on this success, we explore the possibility of sharpening the alignment model via an incremental training scheme. By first training a general word alignment model on the whole corpus, then dividing the same corpus into domain-specific partitions and applying incremental training to each partition, we can improve machine translation quality as measured by BLEU.
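
The two-stage scheme described above can be illustrated with a minimal sketch. It assumes IBM Model 1 as the alignment model, which the abstract does not specify, and the names `train_model1`, `align`, and the `floor` parameter are hypothetical, chosen only for illustration. A general model is trained on the whole toy corpus, then EM is continued on one partition, warm-started from the general translation table, yielding a separate partition-specific table.

```python
from collections import defaultdict

def train_model1(pairs, iterations=5, init=None, floor=1e-3):
    """EM for IBM Model 1 (an assumed stand-in for the paper's models).

    pairs: list of (source_tokens, target_tokens).
    init:  optional table {(f, e): t(f|e)} from an earlier run (warm start).
    floor: assumed small score for pairs missing from `init`.
    """
    t = dict(init) if init else {}
    # Uniform start for batch training; small floor when warm-starting.
    default = floor if init else 1.0
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        for src, tgt in pairs:
            for f in tgt:
                # E-step: distribute one unit of mass for f over source words.
                z = sum(t.get((f, e), default) for e in src)
                for e in src:
                    c = t.get((f, e), default) / z
                    count[(f, e)] += c
                    total[e] += c
        # M-step: renormalize so that t(.|e) sums to 1 for each source word e.
        t = {(f, e): v / total[e] for (f, e), v in count.items()}
    return t

def align(src, tgt, t):
    """For each target word, return the index of its best source word."""
    return [max(range(len(src)), key=lambda i: t.get((f, src[i]), 0.0))
            for f in tgt]

# Toy corpus (invented for illustration, not from the paper).
corpus = [
    ("the house".split(), "das haus".split()),
    ("the book".split(), "das buch".split()),
    ("a book".split(), "ein buch".split()),
]

# Stage 1: general model on the whole corpus.
t_general = train_model1(corpus, iterations=10)

# Stage 2: continue training on one partition, starting from the general table.
partition = corpus[:2]
t_domain = train_model1(partition, iterations=3, init=t_general)

print(align("the house".split(), "das haus".split(), t_domain))
```

Because the second stage writes a new table rather than updating `t_general`, the general model and its data stay untouched, which mirrors the abstract's point about avoiding re-processing of the baseline data.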

Details

Publication type: Inproceedings
Published in: Proceedings of MT Summit XIII
Publisher: Asia-Pacific Association for Machine Translation