The Microsoft Research ESL Assistant is a web service that provides correction suggestions for typical ESL (English as a Second Language)
errors. Such errors include, for example, the choice of determiners (the/a) and the choice of prepositions. The web service also provides word choice suggestions
from a thesaurus. To help the user decide whether to accept a suggestion, the service displays "before and after"
web search results so that the user can see real-life examples of the usage of both their original input and the suggested correction.
An Outlook plugin that connects to the web service and copies text from an email into the web service UI is also available.
MSR Core Team
External Contributors
Publications
Gamon, M., J. Gao, C. Brockett, A. Klementiev, W. B. Dolan, D. Belenko, and L. Vanderwende. 2008.
Using Contextual Speller Techniques and Language Modeling for ESL Error Correction.
In Proceedings of IJCNLP, Hyderabad, India.
Chris Brockett, Takako Aikawa (相川孝子), Michael Gamon, Dmitriy Belenko, Jianfeng Gao, William B. Dolan. 2008.
WEB検索による英語学習支援ツール:教育と実用の接点にて (An English Writing Support Tool Using Web Queries: At the Intersection of Education and Practice).
In Proceedings of 言語処理学会第14回年次大会: ワークショップ「教育・学習を支援する言語処理」 (Workshop on Language Processing in Support of Instruction and Learning, 14th Annual Meeting of the Society of Natural Language Processing), Tokyo, Japan.
(slides)
The Web UI
The text to be checked is entered in the box at the top. A checkbox to the right lets the user choose whether
the MS Office spellchecker is applied before the text is checked for ESL errors. Possible errors are
marked with a squiggle, and hovering over the squiggled word(s) displays a suggested correction. Hovering over the suggested correction shows "before and after" search results in the two parallel Live Search panes in the lower part of the screen.
See the web UI in action.
There is also an Outlook 2007 plugin that copies email text into the web UI to be checked. (See the Outlook plugin in action.)

More Details
The basic architecture of our system consists of three parts: a set of modules that identify possible corrections,
a large language model that evaluates the possible suggestions, and a module that produces search results using Live Search. Each error module targets a specific type of error; some modules are based on heuristics, while others use machine-learned classifiers. The information the modules take into account includes the presence of specific words and the sequence of automatically assigned part-of-speech tags. The language model is trained on the Gigaword corpus, a very large collection of text, and serves as a filter on the suggested corrections: a suggestion is shown to the user only if it produces a significantly higher language model score than the original input.
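The suggest-then-filter flow described above can be illustrated with a minimal Python sketch. It is not the ESL Assistant's actual code: the `BigramModel` class, the `preposition_module` heuristic, the `suggest` function, and the `margin` threshold are all hypothetical names and values chosen for illustration, and a tiny add-one smoothed bigram model trained on three sentences stands in for the Gigaword-trained language model.

```python
import math
from collections import Counter
from typing import Callable, List


class BigramModel:
    """Toy stand-in for the large language model (add-one smoothed bigrams)."""

    def __init__(self, sentences: List[str]):
        self.unigrams = Counter()
        self.bigrams = Counter()
        for s in sentences:
            tokens = ["<s>"] + s.lower().split() + ["</s>"]
            self.unigrams.update(tokens)
            self.bigrams.update(zip(tokens, tokens[1:]))
        self.vocab_size = len(self.unigrams)

    def log_prob(self, sentence: str) -> float:
        """Return the smoothed log-probability of a whitespace-tokenized sentence."""
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        total = 0.0
        for prev, cur in zip(tokens, tokens[1:]):
            num = self.bigrams[(prev, cur)] + 1
            den = self.unigrams[prev] + self.vocab_size
            total += math.log(num / den)
        return total


def preposition_module(sentence: str) -> List[str]:
    """Hypothetical heuristic module: propose every alternative preposition."""
    preps = ["in", "on", "at", "to"]
    tokens = sentence.split()
    candidates = []
    for i, tok in enumerate(tokens):
        if tok.lower() in preps:
            for alt in preps:
                if alt != tok.lower():
                    candidates.append(" ".join(tokens[:i] + [alt] + tokens[i + 1:]))
    return candidates


def suggest(sentence: str,
            modules: List[Callable[[str], List[str]]],
            lm: BigramModel,
            margin: float = 1.0) -> List[str]:
    """Keep only candidates that beat the original's score by more than `margin`.

    The margin is an assumed stand-in for "significantly higher".
    """
    base = lm.log_prob(sentence)
    kept = []
    for module in modules:
        for cand in module(sentence):
            if lm.log_prob(cand) - base > margin:
                kept.append(cand)
    return kept


# Assumed toy training data, standing in for a large corpus.
corpus = [
    "i arrived at the airport",
    "she arrived at the station",
    "we met at the airport",
]
lm = BigramModel(corpus)
print(suggest("I arrived to the airport", [preposition_module], lm))
```

In this sketch the module deliberately over-generates, and the language model decides which candidates are worth showing. An input that already scores well produces no suggestions, because none of its alternatives clears the margin.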

Team Blog
News and updates will be provided on our team blog site on MSDN.