Language Modeling for Soft Keyboards
Joshua Goodman, Gina Venolia, Keith Steury, and Chauncey Parker
Microsoft Research
Summary: By combining language models and a pen down position model, we can reduce error rates for soft keyboards by a factor of 1.7 to 1.9.
Soft Keyboard: An image of a keyboard that can be tapped with a stylus. Perhaps the fastest way to enter text on handheld computers.
Language Model: Computes probability of sequences of letters. Probabilities are determined by counting occurrences in real text. Language models are used in speech recognition, handwriting recognition, information retrieval, etc.
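To make "counting occurrences in real text" concrete, here is a minimal sketch of a letter bigram model. The poster does not state the n-gram order or the smoothing method, so the bigram order, the add-one smoothing, the function names, and the tiny corpus are all assumptions made for illustration.

```python
from collections import Counter
import math

def train_bigram_model(text, alphabet):
    """Count letter pairs in `text` and return a function giving P(next | prev)."""
    pair_counts = Counter(zip(text, text[1:]))  # how often each letter pair occurs
    prev_counts = Counter(text[:-1])            # how often each letter has a successor
    vocab_size = len(alphabet)

    def prob(prev, nxt):
        # Add-one smoothing (an assumption) keeps unseen pairs above zero.
        return (pair_counts[(prev, nxt)] + 1) / (prev_counts[prev] + vocab_size)

    return prob

def sequence_log_prob(prob, letters):
    """Log probability of a letter sequence, conditioned on its first letter."""
    return sum(math.log(prob(a, b)) for a, b in zip(letters, letters[1:]))

alphabet = "abcdefghijklmnopqrstuvwxyz "
corpus = "the quick queen quietly quit the quiz"  # stand-in for real text
prob = train_bigram_model(corpus, alphabet)
print(prob("q", "u") > prob("q", "i"))  # "u" is the likelier letter after "q"
```

A real system would train on a much larger corpus; the toy corpus only shows that the counts already prefer "qu" over "qi".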
Experiments: Eight users were asked to type four sets of 1000 characters, two sets with the language model and two sets without, with the order counterbalanced. No significant difference in speed was observed. The error rate using the language model was reduced by a factor of 1.7 to 1.9.
Pen Down Position Model: Probability of pen position given intended key. A simple Gaussian works well, but the mean of the Gaussian is shifted from the center of the key, and there is some covariance between x and y. Other factors, such as pen up position and error at the previous time, were not useful.
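The Gaussian described here can be sketched as a two-dimensional density with a full covariance matrix and a mean that is offset from the key center. The offset and covariance values below are hypothetical placeholders, not measurements from the poster, and the function name is illustrative.

```python
import math

def gaussian_log_density(point, mean, cov):
    """Log density of a 2-D Gaussian with covariance [[sxx, sxy], [sxy, syy]]."""
    (sxx, sxy), (_, syy) = cov
    det = sxx * syy - sxy * sxy
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    # Quadratic form (p - mean)^T cov^-1 (p - mean), using the closed-form 2x2 inverse.
    quad = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return -0.5 * quad - math.log(2 * math.pi) - 0.5 * math.log(det)

key_center = (6.0, 0.0)
# Hypothetical values: the mean is shifted off the key center, and x and y covary.
mean = (key_center[0] + 0.1, key_center[1] - 0.2)
cov = [[0.25, 0.05], [0.05, 0.25]]

# A tap at the shifted mean scores higher than a tap at the geometric key center.
print(gaussian_log_density((6.1, -0.2), mean, cov) > gaussian_log_density(key_center, mean, cov))
```

In practice the mean and covariance for each key would be estimated from recorded taps.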
Conclusion: Language models can be used to substantially reduce error rates. The technique could be applied to many different input types.
Intuition: If the user taps on or near key boundaries, we can use the language model to guess the intended letters. For instance, if the user taps “q” and then taps between “u” and “i”, he intended “qu”. In fact, even if the user hits inside “i”, he probably meant “qu”.
Future information can change our guess for the past.
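Both intuitions can be illustrated with a small Viterbi search that rescores the whole letter sequence after every tap. This is a sketch under stated assumptions, not the poster's implementation: the three-key layout, the bigram table, and `SIGMA` are made-up values, and the pen model is simplified to an isotropic Gaussian centered on each key.

```python
import math

# Hypothetical layout: centers of three keys from one keyboard row, in key widths.
KEY_CENTERS = {"q": (0.0, 0.0), "u": (6.0, 0.0), "i": (7.0, 0.0)}
SIGMA = 0.5  # assumed spread of taps around a key center

# Made-up bigram probabilities P(next | prev); "^" marks the start of the text.
BIGRAM = {
    "^": {"q": 0.20, "u": 0.40, "i": 0.40},
    "q": {"q": 0.01, "u": 0.98, "i": 0.01},
    "u": {"q": 0.10, "u": 0.10, "i": 0.80},
    "i": {"q": 0.45, "u": 0.45, "i": 0.10},
}

def tap_log_prob(tap, key):
    """Log P(tap | key) for an isotropic Gaussian, minus a constant shared by all keys."""
    cx, cy = KEY_CENTERS[key]
    return -((tap[0] - cx) ** 2 + (tap[1] - cy) ** 2) / (2 * SIGMA ** 2)

def decode(taps):
    """Viterbi search for the most probable letter sequence given all taps so far."""
    # best[k] = (log score, letters) for the best sequence ending in key k.
    best = {k: (math.log(BIGRAM["^"][k]) + tap_log_prob(taps[0], k), k)
            for k in KEY_CENTERS}
    for tap in taps[1:]:
        new_best = {}
        for k in KEY_CENTERS:
            # Pick the best previous key to come from, then add the tap score for k.
            score, letters = max(
                (best[p][0] + math.log(BIGRAM[p][k]), best[p][1]) for p in KEY_CENTERS
            )
            new_best[k] = (score + tap_log_prob(tap, k), letters + k)
        best = new_best
    return max(best.values())[1]

print(decode([(0.0, 0.0), (6.6, 0.0)]))   # second tap lands inside "i"
print(decode([(6.55, 0.0)]))              # one ambiguous tap on its own
print(decode([(6.55, 0.0), (7.0, 0.0)]))  # the same tap followed by a clear "i"
```

With these toy numbers the three calls give `qu`, `i`, and `ui`. The first shows the language model overriding a tap that landed just inside "i"; the last two show the guess for the first letter changing once a later tap arrives.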
Mathematics: Find the most probable letter sequence given the observed pen down positions:
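One standard way to write this objective is the noisy-channel decomposition below. The notation is assumed for illustration: $l_1 \dots l_n$ are the intended letters, $p_1 \dots p_n$ the observed pen down positions, and $k$ the (unspecified) context length of the language model. The last step also assumes each pen position depends only on its own intended key, in line with the pen down position model.

```latex
\begin{align*}
\hat{l}_1 \dots \hat{l}_n
  &= \arg\max_{l_1 \dots l_n} P(l_1 \dots l_n \mid p_1 \dots p_n) \\
  &= \arg\max_{l_1 \dots l_n} P(l_1 \dots l_n)\, P(p_1 \dots p_n \mid l_1 \dots l_n) \\
  &\approx \arg\max_{l_1 \dots l_n} \prod_{i=1}^{n}
     P(l_i \mid l_{i-k} \dots l_{i-1})\, P(p_i \mid l_i)
\end{align*}
```

The second line follows from Bayes' rule, since $P(p_1 \dots p_n)$ does not depend on the letters being chosen.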
Blue letters may change later. When the user hits the r, the system corrects the error.
