


Microsoft Research

The Microsoft Research ESL Assistant is a web service that suggests corrections for typical ESL (English as a Second Language) errors, such as the choice of determiners (the/a) and the choice of prepositions. The web service also provides word choice suggestions from a thesaurus. To help the user decide whether to accept a suggestion, the service displays "before and after" web search results, so that the user can see real-life examples of both their original input and the suggested correction. An Outlook plugin that connects to the web service and copies text from an email into the web service UI is also available.



MSR Core Team



External Contributors


Publications


The Web UI

The text to be checked is entered in the box at the top. A checkbox to the right lets the user apply the MS Office spellchecker before the text is checked for ESL errors. Possible errors are marked with a squiggle; hovering over the squiggled word(s) shows a suggested correction. Hovering over the suggested correction shows "before and after" search results in the two parallel Live Search panes in the lower part of the screen. See the web UI in action.

There is also an Outlook 2007 plugin that copies email text into the web UI to be checked. (See the Outlook plugin in action.)



More Details

The basic architecture of our system consists of three parts: a set of modules that identify possible corrections, a large language model that evaluates the possible suggestions, and a module that produces search results using Live Search. Each error module targets a specific type of error; some modules are based on heuristics, while others use machine-learned classifiers. The modules take into account information such as the presence of specific words and the sequence of automatically assigned part-of-speech tags. The language model is trained on the Gigaword corpus, a very large collection of text, and serves as a filter on the suggested corrections: only suggestions that produce a significantly higher language model score than the original user input are shown to the user.
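The propose-then-filter flow described above can be illustrated with a minimal Python sketch. This is not the actual implementation: the names `BigramLM`, `swap_module`, `suggest`, and `min_gain` are hypothetical, the toy add-one-smoothed bigram model merely stands in for the large Gigaword-trained language model, and the fixed log-probability margin is only an assumed reading of "significantly higher".

```python
import math
from collections import Counter
from typing import Callable, FrozenSet, List

Tokens = List[str]
ErrorModule = Callable[[Tokens], List[Tokens]]


class BigramLM:
    """Toy add-one-smoothed bigram model (stand-in for the large language model)."""

    def __init__(self, corpus: List[Tokens]) -> None:
        self.unigrams: Counter = Counter()
        self.bigrams: Counter = Counter()
        for sentence in corpus:
            padded = ["<s>"] + sentence + ["</s>"]
            self.unigrams.update(padded)
            self.bigrams.update(zip(padded, padded[1:]))
        self.vocab_size = len(self.unigrams)

    def log_prob(self, tokens: Tokens) -> float:
        """Sum of smoothed bigram log-probabilities over the padded sentence."""
        padded = ["<s>"] + tokens + ["</s>"]
        total = 0.0
        for prev, cur in zip(padded, padded[1:]):
            numerator = self.bigrams[(prev, cur)] + 1
            denominator = self.unigrams[prev] + self.vocab_size
            total += math.log(numerator / denominator)
        return total


def swap_module(confusion_set: FrozenSet[str]) -> ErrorModule:
    """Heuristic module (assumed): propose every in-set alternative for each in-set word."""

    def propose(tokens: Tokens) -> List[Tokens]:
        candidates = []
        for i, tok in enumerate(tokens):
            if tok in confusion_set:
                for alt in sorted(confusion_set - {tok}):
                    candidates.append(tokens[:i] + [alt] + tokens[i + 1:])
        return candidates

    return propose


def suggest(tokens: Tokens, modules: List[ErrorModule], lm: BigramLM,
            min_gain: float = 1.0) -> List[Tokens]:
    """Keep only candidates whose LM score beats the input by at least min_gain."""
    base = lm.log_prob(tokens)
    kept = []
    for module in modules:
        for candidate in module(tokens):
            gain = lm.log_prob(candidate) - base
            if gain >= min_gain:
                kept.append((gain, candidate))
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [candidate for _, candidate in kept]


# Tiny made-up corpus, for illustration only.
corpus = [s.split() for s in (
    "i am interested in music",
    "she is interested in art",
    "he is good at math",
    "we arrived at the station",
)]
lm = BigramLM(corpus)
modules = [swap_module(frozenset({"in", "at", "on"})),   # prepositions
           swap_module(frozenset({"a", "an", "the"}))]   # determiners

print(suggest("i am interested at music".split(), modules, lm))
# -> [['i', 'am', 'interested', 'in', 'music']]
```

In this sketch the candidate "interested on music" is also generated, but its score barely exceeds that of the input, so the filter drops it. Raising `min_gain` trades fewer suggestions for fewer false alarms.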



Team Blog

News and updates will be provided on our team blog site on MSDN.





©2008 Microsoft Corporation. All rights reserved.