A Discriminative Lexicon Model for Complex Morphology

This paper describes successful applications of discriminative lexicon models to the statistical machine translation (SMT) systems into morphologically complex languages. We extend the previous work on discriminatively trained lexicon models to include more contextual information in making lexical selection decisions by building a single global log-linear model of translation selection. In offline experiments, we show that the use of the expanded contextual information, including morphological and syntactic features, help better predict words in three target languages with complex morphology (Bulgarian, Czech and Korean). We also show that these improved lexical prediction models make a positive impact in the end-to-end SMT scenario from English to these languages.

DiscLex_AMTA2010_camera.pdf
PDF file

In  The Ninth Conference of the Association for Machine Translation in the Americas

Publisher  Association for Computational Linguistics

Details

TypeInproceedings
Share
Share this page on Facebook
Share this page on Twitter
Share this page on LinkedIn
E-mail this page
RSS feeds
> Publications > A Discriminative Lexicon Model for Complex Morphology