Large-scale deployment of statistical machine translation: Example Microsoft

Chris Wendt

Abstract

Microsoft has a history of being both a supplier of machine translation technology and a user of it. The use scenarios include Microsoft’s own localization and publishing needs, as well as translation for users of Microsoft’s search engine. In this talk we share the design basics of our statistical MT system and its implementation as a web service, including the service architecture and our approach to scale. We discuss the design and quality criteria for users of our external site at http://translator.live.com and the integration with Microsoft’s search engine. Lastly, we describe how a custom-tailored version of our engine helps internal teams publish more content in more languages, thereby extending the reach of localization, and how we monitor the effect on the users of the material we publish.

MT engine basics

Microsoft’s MT uses a hybrid model: for language pairs where substantial linguistic information is available, we apply grammar and syntax knowledge in pre- and post-processing around a statistical core engine. Where we do not have as much information, we resort to a purely statistical model, which scales well to a large number of language pairs. We outline some of the design criteria and experiences with each approach.

Architecture and design for scale

Statistical MT systems are by nature big and slow. An intelligent architecture needs to accommodate that, providing acceptable performance for an individual document translation request while balancing the needs of all users of the system. We describe how an implementation can deliver on these design goals.

Search and Translator

Consumer use of a general-purpose machine translation engine through a search engine and via a Translator web site needs to address the shortcomings of today’s MT services. We explain why we chose the bilingual view that is the trademark of http://translator.live.com and the design criteria that went into it. We also cover the end-user feedback we have received and the next steps for improving the experience for users as well as site owners.

Use in Human Translation and Raw Publishing

Microsoft has experience in using MT to boost the productivity of human translators, both internally and together with our translation service providers. It also publishes raw MT on some of its web sites. We detail our experience with this, and how users react to the material.

Details

Publication type: Proceedings
Published in: AMTA Conference 2008