Speech Recognition Project: Language Modeling
Did I just say "It's fun to recognize speech?" or "It's fun to wreck a nice
beach?" It's hard to tell because they sound about the same. Of course, it's a
lot more likely that I would say "recognize speech" than "wreck a nice beach."
Language models help a speech recognizer figure out how likely a word sequence
is, independent of the acoustics. This lets the recognizer make the right guess
when two different sentences sound the same.
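To make the idea concrete, here is a minimal sketch of a language model scoring the two candidate sentences. It assumes a bigram model with add-one smoothing trained on a tiny invented corpus; the corpus and the names `train_bigram` and `log_prob` are illustrative assumptions, not the models or data used in this project.

```python
import math
from collections import Counter

# Toy training text, invented for illustration only.
CORPUS = [
    "it is fun to recognize speech",
    "we recognize speech with a computer",
    "it is hard to recognize speech",
    "a nice beach is fun",
]

def train_bigram(sentences):
    """Count context words and word pairs, with sentence boundary markers."""
    contexts, bigrams = Counter(), Counter()
    for s in sentences:
        words = ["<s>"] + s.split() + ["</s>"]
        contexts.update(words[:-1])            # every word that has a successor
        bigrams.update(zip(words, words[1:]))  # adjacent word pairs
    vocab = {w for s in sentences for w in s.split()} | {"</s>"}
    return contexts, bigrams, vocab

def log_prob(sentence, model):
    """Log-probability of a sentence under the add-one-smoothed bigram model."""
    contexts, bigrams, vocab = model
    v = len(vocab) + 1  # the extra 1 reserves probability mass for unseen words
    words = ["<s>"] + sentence.split() + ["</s>"]
    return sum(
        math.log((bigrams[(prev, cur)] + 1) / (contexts[prev] + v))
        for prev, cur in zip(words, words[1:])
    )

MODEL = train_bigram(CORPUS)
likely = log_prob("it is fun to recognize speech", MODEL)
unlikely = log_prob("it is fun to wreck a nice beach", MODEL)
print(likely > unlikely)
```

In a real recognizer a score like this would be combined with the acoustic score. With only the language model shown here, the candidate made of familiar word pairs gets the higher score.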
Our language modeling research falls into several categories:
- Language Model Adaptation. Natural language technology in general and
language models in particular are very brittle when moving from one domain to
another. Current statistical language models are built from text specific to
newspapers and TV/radio broadcasts, which has little to do with the everyday use
of language by a particular individual. We are investigating means of adapting
a general-domain statistical language model to a new domain/user when we have
access to limited amounts of sample data from the new domain/user.
- Can Syntactic Structure Help? Current language models make no use of the
syntactic properties of natural language but rather use very simple statistics
such as word co-occurrences. Recent results show that incorporating syntactic
constraints in a statistical language model reduces the word error rate on a
conventional dictation task by 10%. We
are working on finding the best way of "putting language into language models"
as well as exploring the new possibilities opened by such structured language models
for other tasks such as speech and language understanding.
- Speech Utterance Classification. A simple first step toward more natural
user interfaces in interactive voice response systems is automated call routing.
Instead of listening to prompts like "If you are trying to reach department X say Yes,
otherwise say No" or punching keys on your telephone keypad, one could simply state in
a sentence what the problem is, for example "There is a fraudulent transaction on my
last statement" and get connected to the right customer service representative.
We are developing technology that classifies speech utterances into a limited
set of classes, extending the role of the traditional language model so that
it also assigns a category to a given utterance.
- Building the best language models we can. In general, the better the
language model, the lower the error rate of the speech recognizer. By putting
together the best results available on language modeling, we have created a
language model that outperforms a standard baseline by 45%, leading to a 10%
reduction in error rate for our speech recognizer. The system has the best
reported results of any language model.
- Language modeling for other applications. Speech recognition is not the
only use for language models. They are also useful in fields like handwriting
recognition, spelling correction, even typing Chinese! Like speech recognition,
all of these are areas where the input is ambiguous in some way, and a language
model can help us guess the most likely input. We're also working on finding
new uses for language models in other areas.
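As an illustration of the utterance classification item in the list above, the sketch below routes a caller's sentence by training one small unigram language model per class and picking the class whose model gives the utterance the highest probability. The class labels, the training utterances, the uniform class prior, and the function names `train` and `classify` are all assumptions made for this example. It shows one simple way to let a language model assign a category, not the technology being developed here.

```python
import math
from collections import Counter

# Hypothetical routing classes and training utterances, invented for illustration.
TRAINING = {
    "fraud": [
        "there is a fraudulent transaction on my last statement",
        "i do not recognize a charge on my card",
    ],
    "billing": [
        "i want to pay my bill",
        "when is my payment due",
    ],
}

def train(training):
    """Build one unigram count table per class, plus a shared vocabulary."""
    counts = {
        label: Counter(w for utt in utts for w in utt.split())
        for label, utts in training.items()
    }
    vocab = {w for c in counts.values() for w in c}
    return counts, vocab

def classify(utterance, counts, vocab):
    """Return the class whose smoothed unigram model scores the utterance highest.

    A uniform prior over classes is assumed, so only the likelihoods are compared.
    """
    v = len(vocab) + 1  # the extra 1 reserves probability mass for unseen words
    scores = {}
    for label, c in counts.items():
        total = sum(c.values())
        scores[label] = sum(
            math.log((c[w] + 1) / (total + v)) for w in utterance.split()
        )
    return max(scores, key=scores.get)

COUNTS, VOCAB = train(TRAINING)
print(classify("there is a strange transaction on my statement", COUNTS, VOCAB))
print(classify("i want to pay", COUNTS, VOCAB))
```

A deployed system would work from recognizer output rather than clean text and would need far more training data per class. The point of the sketch is only that the same statistics that score a word sequence can also score it against each category.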