It’s 2054 in Washington, D.C. Anderton speaks on the phone while gesturing in the air with both his hands, maneuvering programs and windows on a dazzling transparent screen. Even viewers who were not fans of science fiction or computer games were amazed by the technology in the movie "Minority Report."
In truth, it may be possible to look as cool as Tom Cruise’s Anderton, as gesturing before a machine may soon no longer be a wasteful use of energy. At Microsoft Research Asia, “Writing in the Air” technology has enabled computers, smart appliances, and gaming consoles to recognize the text that one “writes” by waving one’s hands in the air, in languages including Chinese, Japanese, Korean, and English. The process resembles a game of charades in which the one guessing is a machine “trained” to think like a human. The only hardware requirements for “Writing in the Air” are a regular computer, a camera, and a game controller. Interaction between man and machine can be realized without a mouse, a keyboard, or even a touch screen like the one found on the iPhone.
If keyboard-mouse input and touch screen control represent, respectively, the first two stages of man-machine interaction, “Writing in the Air” is perhaps “Man-Machine 3.0.” “I cannot help but get excited when thinking of the numerous imaginative applications that can be developed on the basis of this innovative technology,” said Qiang Huo, Lead Researcher at Microsoft Research Asia, as his colleague Lei Ma, with an orange in hand, demonstrated the technology by “writing” the name of their company in Chinese. “This is the most anticipated moment for a researcher. No one can foresee what applications software developers and Microsoft platform developers will create with the writing-in-the-air recognition technology. It can go as far as we dare to imagine.”
It is not easy to make a machine understand a person’s ideas. Traditional keyboard input and the more advanced touch control input frequently adopted in smart phones are able to clearly “inject” information into machines. “Writing in the Air” technology, by contrast, looks more like the traditional Chinese martial art of Tai Chi, not only because the gestures need to be as smooth as flowing water, but also because the hands must be able to move wherever the brain wants them to.
The project group came up with two separate ways to achieve its goals: the low-cost edition was to have an ordinary video camera observe users’ hand movements; the other edition was to adopt sensors known as gyroscopes and accelerometers to capture hand movements. Once captured, such movements would allow the recognition module to identify the text written in the air.
"The entire process of capturing, mapping, recognition, and display involves two technologies: the capturing of moving objects, and handwriting recognition," said Lei Ma. One has to choose an object whose color is distinct from its background, put it in the middle of the video capture frame, and then press the button on the game controller to start writing. One also has to “teach” the computer how to identify the selected object so that it will not be distracted by the movements of other objects in the background.
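The “teach, then track” step described above can be illustrated with a minimal sketch. It is not the tracker used by the project group; it only assumes that the target's color is sampled from the center of the first frame and that later frames are searched for pixels of a similar color. The function names, the frame layout (rows of RGB tuples), and the `max_distance` threshold are all hypothetical.

```python
def color_distance_sq(a, b):
    """Squared Euclidean distance between two (r, g, b) triples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def learn_target_color(frame):
    """'Teach' step (assumed): sample the pixel at the center of the frame,
    where the user is asked to hold the object."""
    row = frame[len(frame) // 2]
    return row[len(row) // 2]


def locate_target(frame, target_color, max_distance=60):
    """Return the (x, y) centroid of all pixels whose color is within
    max_distance of target_color, or None if no pixel matches.

    frame is a list of rows; each row is a list of (r, g, b) tuples.
    """
    limit = max_distance ** 2
    sum_x = sum_y = count = 0
    for y, row in enumerate(frame):
        for x, pixel in enumerate(row):
            if color_distance_sq(pixel, target_color) <= limit:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None
    return (sum_x / count, sum_y / count)
```

Calling `locate_target` on each incoming frame yields one point per frame, and the sequence of points forms the raw track. A distinctive color matters here because any background pixel that falls inside the threshold would pull the centroid away from the object.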
The camera maps a two-dimensional track based on the three-dimensional movement of the target object, after a proper treatment of all the blurring and jittering in the captured images. With clear-cut tracks, the handwriting recognition module analyzes the image to obtain the desired text. "This seemingly simple process of image analysis and character recognition has been under research for decades in many related areas," said Qiang Huo. "You get similar ‘tracks’ from similar characters such as ‘3’ and ‘g’, ‘0’ and ‘o’, and ‘1’ and ‘l’, which certainly adds to the difficulty in recognition." The final results take into consideration the analysis of both the writing procedure and the track image. “Shape information is very important, and comes first,” says Qiang Huo, “but the writing process provides a very useful reference."
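Two ideas in this paragraph, cleaning up jitter in the track and weighing shape against the writing process, can be sketched roughly as follows. This is an illustration under stated assumptions, not the researchers' algorithm: a simple moving average stands in for the jitter treatment, and a weighted sum stands in for the way the two kinds of evidence are combined. The names `smooth_track` and `rank_candidates`, the window size, and the 0.7 shape weight are hypothetical.

```python
def smooth_track(points, window=5):
    """Moving-average smoothing of a list of (x, y) points.

    Each output point is the mean of the input points inside a window
    centered on it; the window is truncated at the ends of the track.
    """
    half = window // 2
    smoothed = []
    for i in range(len(points)):
        lo = max(0, i - half)
        hi = min(len(points), i + half + 1)
        n = hi - lo
        sx = sum(p[0] for p in points[lo:hi])
        sy = sum(p[1] for p in points[lo:hi])
        smoothed.append((sx / n, sy / n))
    return smoothed


def rank_candidates(shape_scores, process_scores, shape_weight=0.7):
    """Rank candidate characters by a weighted sum of two scores.

    Both arguments map a character to a score where higher is better.
    shape_weight > 0.5 reflects the remark that shape "comes first";
    the exact value is an assumption.
    """
    combined = {}
    for ch, shape in shape_scores.items():
        process = process_scores.get(ch, 0.0)
        combined[ch] = shape_weight * shape + (1 - shape_weight) * process
    return sorted(combined, key=combined.get, reverse=True)
```

For look-alike characters such as ‘0’ and ‘o’, the shape scores may be nearly tied, in which case the writing-process score decides the ranking.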
According to Lei Ma, “Writing in the Air” differs from writing pad recognition in that people write characters with separate strokes and sharp turning points on a pad, but with connected strokes and no sharp turning points in the air. The process of writing in the air is like “Tai Chi, as smooth as flowing water,” Lei Ma says. “There is no sharp turning point in the track of a target object even if its speed drops to zero, so a lot of special processing is needed here.”
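The article does not say what the “special processing” consists of. One plausible ingredient, shown here purely as an assumption, is to look at the speed of the tracked object rather than at corners in the track: places where the hand nearly stops can serve as candidate boundaries between strokes. The function names, the sampling interval `dt`, and the `threshold` value are hypothetical.

```python
import math


def speeds(points, dt=1.0):
    """Speed over each segment between consecutive (x, y) samples
    taken dt time units apart."""
    return [math.dist(a, b) / dt for a, b in zip(points, points[1:])]


def slow_segments(points, dt=1.0, threshold=0.5):
    """Return indices of track segments where the object nearly stops.

    Segment i joins points[i] and points[i + 1]. A segment is reported
    when its speed is below threshold and is a local minimum compared
    with the segments on either side.
    """
    v = speeds(points, dt)
    hits = []
    for i in range(1, len(v) - 1):
        if v[i] < threshold and v[i] <= v[i - 1] and v[i] <= v[i + 1]:
            hits.append(i)
    return hits
```

A track drawn at constant speed produces no candidates, while a brief pause before a change of direction shows up as a single slow segment even though the track itself has no sharp corner.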
Bill Gates once predicted that human-computer interaction would become increasingly similar to the interaction between people. Both Apple’s iPhone and Nintendo’s Wii game console owe their popularity to their outstanding interactive user interfaces. As part of its natural user interface, Microsoft's next-generation operating system, Windows 7, will also have built-in touch control support. Microsoft Surface, the company’s desktop with smart touch control, looks more like a multi-functional desktop screen.
"It was meant to address the issue of gesture-based character input in the absence of keyboards or writing pads,” said Frank Soong, Principal Researcher and Research Manager of the Speech Group at Microsoft Research Asia, when asked about the original purpose of this technology. “This is surely an innovation, either as a technology or as an improvement in user experience. I believe this innovation will likely be applied to many of Microsoft's current and future products and services."
As far as R&D is concerned, the “Writing in the Air” technology brings people one step closer to the things they can imagine and provides a preview of the mainstream interactive user experience of the future. The result will be more and more “hand talkers” who can enjoy Microsoft’s human-computer interaction experience.
Lei Ma described three possible applications of the “Writing in the Air” technology. In interactive gaming, the user can input characters, such as names, answers, or even symbols, for a different and potentially more enjoyable experience. On devices with Internet access, such as Internet TV (IPTV), the Xbox, and smart appliances, remote “gestures” could be a convenient option for character input when searching for video content on an IPTV or indexing game animations on the Xbox. “Hand talk” can also serve as an amusing way for parents to teach young children how to move around. Thanks to the advantages of remote input over a keyboard or mouse, “Writing in the Air” recognition technology would help Microsoft Research Asia address practical problems and enable other interesting services. Moreover, upgraded editions of this technology are expected to allow for more than one tracking point, which could be applied in Microsoft's map search and zooming.
"All our research and development efforts aim to enhance user experience. Microsoft Research Asia works with the Product Division to look for possible applications of these innovative technologies, or to decide on the proper time for technology transfer,” said Qiang Huo. “It can be the most exciting thing for researchers to see their innovative ideas adopted by thousands of families."
Qiang Huo and Lei Ma have just returned from a trip to Microsoft's headquarters in Redmond, where they presented their “Writing in the Air” technology at the annual TechFest event at Microsoft Research. The demonstration attracted the attention of Microsoft’s Products Division, colleagues from other Microsoft Research offices, and the mass media. Well-known blogger Chris Pirillo wrote in one of his articles that he was amazed by the technology, though it still has no specific application. Xbox fans, he added, should keep their eyes open for when this technology is applied to gaming devices.
Such innovations in basic research are powerful simply because of their infinite possibilities. They may one day lead to countless products that can make a wonderful impact on everyday life. Microsoft has been good at laying a solid foundation for future development by mobilizing its extensive research resources. “Writing in the air” recognition is one step forward in enriching its technological arsenal for the future. It’s also a reflection of the “charm of R&D” at Microsoft Research Asia today.