Learning prepositional attachment from sentence aligned bilingual corpora

Takako Aikawa, Chris Quirk, and Lee Schwartz

Abstract

Prepositional phrase attachment (PP attachment) is a major source of ambiguity in English. It poses a substantial challenge to Machine Translation (MT) between English and languages that are not characterized by PP attachment ambiguity. In this paper we present an unsupervised, bilingual, corpus-based approach to the resolution of English PP attachment ambiguity. As data we use aligned linguistic representations of the English and Japanese sentences from a large parallel corpus of technical texts. The premise of our approach is that with large aligned, parsed, bilingual (or multilingual) corpora, languages can learn non-trivial linguistic information from one another with high accuracy. We contend that our approach can be extended to linguistic phenomena other than PP attachment.
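To make the idea of one language informing another concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy setting in which each English word is aligned to at most one Japanese word and the Japanese parse is given as a child-to-head map. The function name `project_attachment`, the data layout, and the romanized example sentences are all hypothetical and chosen only for illustration. The sketch walks up the Japanese dependency chain from the word aligned to the PP's head noun and returns the first English attachment candidate whose Japanese counterpart it reaches.

```python
from typing import Dict, List, Optional


def project_attachment(
    pp_head: str,
    candidates: List[str],
    alignment: Dict[str, str],
    ja_heads: Dict[str, str],
) -> Optional[str]:
    """Pick the English attachment site suggested by the Japanese parse.

    pp_head:    head noun of the English PP (e.g. "telescope")
    candidates: English words the PP could attach to (e.g. verb and noun)
    alignment:  English word -> aligned Japanese word (toy one-to-one map)
    ja_heads:   Japanese word -> its syntactic head in the Japanese parse
    """
    # Invert the alignment for the candidate attachment sites only.
    ja_to_en = {alignment[c]: c for c in candidates if c in alignment}

    node = alignment.get(pp_head)
    seen = set()
    # Climb the Japanese dependency chain until a candidate is reached.
    while node is not None and node not in seen:
        seen.add(node)  # guard against cycles in malformed input
        node = ja_heads.get(node)
        if node in ja_to_en:
            return ja_to_en[node]
    return None  # no evidence available for this sentence pair


# "saw the man with a telescope", instrument reading:
# "bouenkyou de otoko o mita" (the telescope phrase modifies the verb)
alignment = {"telescope": "bouenkyou", "man": "otoko", "saw": "mita"}
verb_reading = {"bouenkyou": "mita", "otoko": "mita"}
result_verb = project_attachment("telescope", ["saw", "man"], alignment, verb_reading)

# Same English string, possession reading:
# "bouenkyou o motta otoko o mita" (the telescope phrase modifies the noun)
noun_reading = {"bouenkyou": "motta", "motta": "otoko", "otoko": "mita"}
result_noun = project_attachment("telescope", ["saw", "man"], alignment, noun_reading)

print(result_verb, result_noun)
```

In this toy setting the same English string receives a different attachment depending on the aligned Japanese structure. Decisions of this kind could then be collected over a large corpus without manual annotation; how the paper itself aggregates or uses such evidence is not specified in the abstract.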

Details

Publication type: Inproceedings
URL: http://www.amtaweb.org/summit/MTSummit/FinalPapers/39-Aikawa-final.pdf
Publisher: Association for Machine Translation in the Americas