Training MRF-Based Phrase Translation Models using Gradient Ascent

The North American Chapter of the Association for Computational Linguistics (NAACL) |

Published by Association for Computational Linguistics

This paper presents a general, statistical framework for modeling phrase translation via Markov random fields. The model allows for arbituary features extracted from a phrase pair to be incorporated as evidence. The parameters of the model are estimated using a large-scale discriminative training approach that is based on stochastic gradient ascent and an N-best list based expected BLEU as the objective function. The model is easy to be incoporated into a standard phrase-based statistical machine translation system, requiring no code change in the runtime engine. Evaluation is performed on two Europarl translation tasks, German-English and French-English. Results show that incoporating the Markov random field model significantly improves the performance of a state-of-the-art phrase-based machine translation system, leading to a gain of 0.8-1.3 BLEU points.