Fast and Accurate Sentence Alignment of Bilingual Corpora

We present a new method for aligning sentences with their translations in a parallel bilingual corpus. Previous approaches have generally been based either on sentence length or on word correspondences. Sentence-length-based methods are relatively fast and fairly accurate. Word-correspondence-based methods are generally more accurate but much slower, and usually depend on cognates or a bilingual lexicon. Our method adapts and combines these approaches, achieving high accuracy at a modest computational cost, and requiring no knowledge of the languages or the corpus beyond division into words and sentences.
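
To make the idea of sentence-length-based alignment concrete, here is a minimal sketch of a generic dynamic-programming aligner that uses only word counts. It is not the method of this paper, which combines length-based and word-correspondence-based models. The function names, the set of allowed alignment patterns, and the penalty values are all assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple

# Allowed alignment patterns: (source sentences used, target sentences used)
# mapped to a fixed penalty. These values are illustrative assumptions,
# not parameters taken from the paper.
BEADS: Dict[Tuple[int, int], float] = {
    (1, 1): 0.0,  # one sentence translated as one sentence
    (1, 0): 4.0,  # source sentence with no translation
    (0, 1): 4.0,  # target sentence with no source
    (2, 1): 2.0,  # two source sentences merged into one
    (1, 2): 2.0,  # one source sentence split into two
}


def length_cost(src_words: int, tgt_words: int) -> float:
    """Hypothetical cost: relative difference in word counts, from 0 to 1."""
    total = src_words + tgt_words
    return 0.0 if total == 0 else abs(src_words - tgt_words) / total


def align_by_length(src: List[int], tgt: List[int]) -> List[Tuple[List[int], List[int]]]:
    """Align two sequences of sentence lengths (in words).

    Returns a list of (source indices, target indices) pairs covering
    both sequences in order, chosen to minimise the total cost.
    """
    n, m = len(src), len(tgt)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[(0, 0)] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    # Forward pass: extend every reachable prefix pair by each pattern.
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for (di, dj), penalty in BEADS.items():
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                step = penalty + length_cost(sum(src[i:ni]), sum(tgt[j:nj]))
                if cost[i][j] + step < cost[ni][nj]:
                    cost[ni][nj] = cost[i][j] + step
                    back[ni][nj] = (i, j)

    # Trace back from the end to recover the chosen alignment.
    beads = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        beads.append((list(range(pi, i)), list(range(pj, j))))
        i, j = pi, pj
    beads.reverse()
    return beads


if __name__ == "__main__":
    # Word counts per sentence; the second source sentence is split in two.
    print(align_by_length([5, 12, 7], [6, 4, 9, 7]))
```

Because the sketch sees only word counts, it illustrates why length-based methods need no knowledge of the languages involved, and also why they can be misled when adjacent sentences have similar lengths.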

sent-align2-amta-final.pdf
PDF file

Publisher: Springer-Verlag
All copyrights reserved by Springer 2002.

Details

Type: Inproceedings
URL: http://www.springer-ny.com/