Li Deng, Alex Acero, Ye-Yi Wang, Kuansan Wang, Hsiao-Wuen Hon, Jasha Droppo, Milind Mahajan, and XD Huang
Speech technology has been playing a central role
in enhancing human-machine interactions, especially for small
devices for which GUI has obvious limitations. The speech-centric
perspective for human-computer interface advanced in
this paper derives from the view that speech is the only natural
and expressive modality to enable people to access information
from and to interact with any device. In this paper, we describe
the work conducted at Microsoft Research, in the project
codenamed Dr. Who, aimed at the development of enabling
technologies for speech-centric multimodal human-computer
interaction. In particular, we present MiPad as the first Dr.
Who application that specifically addresses the mobile user
interaction scenario. MiPad is a wireless mobile PDA prototype
that enables users to accomplish many common tasks using a
multimodal spoken language interface and wireless-data
technologies. It fully integrates continuous speech recognition
and spoken language understanding, and provides a novel
solution to the current prevailing problem of pecking with tiny
styluses or typing on minuscule keyboards in today's PDAs or
In Proc. of the IEEE Fifth Workshop on Multimedia Signal Processing
Publisher: Institute of Electrical and Electronics Engineers, Inc.
© 2007 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.