Haitian Creole: How to Build and Ship an MT Engine from Scratch in 4 Days, 17 Hours, & 30 Minutes

William Lewis

Abstract

We describe the effort of the Microsoft Translator team to develop a Haitian Creole statistical machine translation engine from scratch in a matter of days. Haitian Creole presents a number of difficulties for developing an SMT system. Principal among them are the lack of significant amounts of parallel training data and an inconsistent orthography, both of which lead to data sparseness. We demonstrate, however, that it is possible to build a translation engine of reasonable quality over very little data by engaging with the native language community and reducing data sparseness in creative ways. As such, we show that MT as a technology and as a service can be deployed rapidly in crisis situations.

Details

Publication type: Inproceedings
Published in: EAMT 2010: Proceedings of the 14th Annual Conference of the European Association for Machine Translation
Publisher: European Association for Machine Translation