Dependency Tree Translation: Syntactically Informed Phrasal SMT

Chris Quirk, Arul Menezes, and Colin Cherry

Abstract

We describe a novel approach to statistical machine translation that combines syntactic information in the source language with recent advances in phrasal translation. We use a source-language dependency parser and a word-aligned parallel corpus. The only target language resource assumed is a word breaker. These are used to produce treelet ("phrase") translation pairs as well as several models, including a channel model, an order model, and a target language model. Together these models and the treelet translation pairs provide a powerful and promising approach to MT that incorporates the power of phrasal SMT with the linguistic generality available in a parser. We evaluate two decoding approaches, one inspired by dynamic programming and the other employing an A* search, comparing the results under a variety of settings.

Details

Publication type: TechReport
Number: MSR-TR-2004-113
Pages: 48
Institution: Microsoft Research