By Janie Chang
October 17, 2011 9:00 AM PT
Microsoft Research Redmond researchers Hrvoje Benko and Scott Saponas have been investigating the use of touch interaction in computing devices since the mid-’00s. Now, two sharply different yet related projects demonstrate novel approaches to the world of touch and gestures.
Wearable Multitouch Interaction gives users the ability to make an entire wall a touch surface, while PocketTouch enables users to interact with smartphones inside a pocket or purse, a small surface area for touch. Both projects will be unveiled during UIST 2011, the Association for Computing Machinery’s 24th Symposium on User Interface Software and Technology, being held Oct. 16-19 in Santa Barbara, Calif.
Wearable Multitouch Interaction turns any surface in the user’s environment into a touch interface. A paper co-authored by Chris Harrison, a Ph.D. student at Carnegie Mellon University and a former Microsoft Research intern; Benko; and Andy Wilson describes a wearable system that enables graphical, interactive, multitouch input on arbitrary, everyday surfaces.
“We wanted to capitalize on the tremendous surface area the real world provides,” explains Benko, of the Natural Interaction Research group. “The surface area of one hand alone exceeds that of typical smart phones. Tables are an order of magnitude larger than a tablet computer. If we could appropriate these ad hoc surfaces in an on-demand way, we could deliver all of the benefits of mobility while expanding the user’s interactive capability.”
The Wearable Multitouch Interaction prototype is built to be wearable and is a novel combination of a laser-based pico projector and a depth-sensing camera. The camera is an advanced, custom prototype provided by PrimeSense. Once the camera and projector are calibrated to each other, the user can don the system and begin using it.
“This custom camera works on a similar principle to Kinect,” Benko says, “but it is modified to work at short range. This camera and projector combination simplified our work because the camera reports depth in world coordinates, which are used when modeling a particular graphical world; the laser-based projector delivers an image that is always in focus, so [we] didn’t need to calibrate for focus.”
The early phases of this work raised some metaphysical questions. If any surface can act as an interactive surface, then what does the user interact with and what is the user interacting on? The team also debated the notion of turning everything in the environment into a touch surface. Sensing touch on an arbitrary deformable surface is a difficult problem that no one has tackled before. Touch surfaces are usually highly engineered devices, yet the team wanted to turn walls, notepads, and hands into interactive surfaces while enabling the user to move about. The researchers agree that the first three weeks of the project were the most challenging.
Harrison recalls their early brainstorming sessions.
“We had to assume it was possible,” he says, “then go about defining the system and its interactions, then conduct initial tests with different technologies to see how we could implement the concept. It was during those initial weeks that we achieved the biggest breakthroughs in our thinking. That was a really exciting stage of research.”
One of the key decisions for Wearable Multitouch Interaction was that the system would interact with fingers. This raised the challenge of finger segmenting: defining to the system what fingers look like so that it could identify fingers or shapes that looked like fingers. Following this decision was the notion that any surface underneath those fingers is potentially a projected surface for touch interaction.
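To make the idea of “shapes that look like fingers” concrete, here is a minimal sketch in Python. It is not the researchers’ algorithm: it assumes a single row of depth readings in millimeters, a known background depth for the surface, and hypothetical width limits in pixels. The function name `finger_candidates` and all of its parameters are illustrative only.

```python
from typing import List, Sequence, Tuple


def finger_candidates(
    depth_row_mm: Sequence[float],
    background_mm: float,
    min_height_mm: float = 5.0,   # assumed: how far above the surface counts as "raised"
    min_width_px: int = 3,        # assumed: narrowest run accepted as a finger
    max_width_px: int = 8,        # assumed: widest run accepted as a finger
) -> List[Tuple[int, int]]:
    """Return (start, end) pixel index pairs of finger-width raised runs in one depth row."""
    candidates: List[Tuple[int, int]] = []
    run_start = None
    # A sentinel background value at the end closes any run that reaches the row's edge.
    for i, depth in enumerate(list(depth_row_mm) + [background_mm]):
        raised = depth < background_mm - min_height_mm
        if raised and run_start is None:
            run_start = i
        elif not raised and run_start is not None:
            width = i - run_start
            if min_width_px <= width <= max_width_px:
                candidates.append((run_start, i - 1))
            run_start = None
    return candidates


# A narrow raised run (finger-like) followed by a wide one (more like a palm).
row = [500.0] * 3 + [480.0] * 4 + [500.0] * 2 + [470.0] * 12 + [500.0] * 2
print(finger_candidates(row, background_mm=500.0))
```

Only the narrow run passes the width filter; the wide run is rejected as too broad to be a finger. A real system would need to work in two dimensions and cope with noise, which this sketch ignores.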
Then came the next problem: click detection. How can the system detect a touch when the surface being touched contains no sensors?
“In this case, we’re detecting proximity at a very fine level,” Benko explains. “The system decides the finger is touching the surface if it’s close enough to constitute making contact. This was fairly tricky, and we used a depth map to determine proximity. In practice, a finger is seen as ‘clicked’ when its hover distance drops to one centimeter or less above a surface, and we even manage to maintain the clicked state for dragging operations.”
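The click rule Benko describes can be sketched as a small state machine. The one-centimeter click threshold comes from the quote above. The looser release threshold is an assumption, added here as one plausible way to keep the clicked state stable while a finger drags across a surface. The class and constant names are hypothetical.

```python
from dataclasses import dataclass

CLICK_DOWN_CM = 1.0  # from the article: clicked at a hover distance of 1 cm or less
CLICK_UP_CM = 1.5    # assumed release threshold, so small bounces do not end a drag


@dataclass
class ClickDetector:
    """Tracks whether one finger is currently considered 'clicked'."""

    clicked: bool = False

    def update(self, hover_cm: float) -> bool:
        """Feed the finger's hover distance above the surface for one frame."""
        if self.clicked:
            # Already clicked: stay clicked until the finger clearly lifts.
            self.clicked = hover_cm <= CLICK_UP_CM
        else:
            self.clicked = hover_cm <= CLICK_DOWN_CM
        return self.clicked


detector = ClickDetector()
for hover in [3.0, 1.2, 0.8, 1.3, 1.4, 2.0]:
    print(hover, detector.update(hover))
```

With two thresholds, a finger that wobbles between 0.8 and 1.4 centimeters during a drag stays clicked, while a finger that merely approaches to 1.2 centimeters never registers a click.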
One of the more interesting discussions during this project was how to determine where to place the interface surface. The team explored two approaches. The first was a classification-driven model in which the system classified specific objects that could be used as a surface: a hand, an arm, a notepad, or a wall. This required creating a machine-learning classifier to learn these objects.
The second approach took a completely user-driven model, enabling the user to finger-draw a working area on any surface in front of the camera/projector system.
“We wanted the ability to use any surface,” Benko says. “Let the user define the area of where they want the interface to be, and have the system do its best to track it frame to frame. This creates a highly flexible, on-demand user interface. You can tap on your hand or drag your interface out to specify the top left and bottom right border. All this stems from the main idea that if everything around you is a potential interface, then the first action has to be defining an interface area.”
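As a rough illustration of the user-driven model, the sketch below builds an interface rectangle from the two corners of a drag and maps later touch points into that rectangle’s own coordinates. It assumes a flat, axis-aligned area in surface coordinates and leaves out frame-to-frame tracking entirely. `InterfaceArea` and its methods are hypothetical names, not part of the prototype.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class InterfaceArea:
    """An axis-aligned interface rectangle in surface coordinates."""

    left: float
    top: float
    right: float
    bottom: float

    @classmethod
    def from_drag(cls, start: Point, end: Point) -> "InterfaceArea":
        """Build the area from the two corners of a drag, in either order."""
        (x0, y0), (x1, y1) = start, end
        return cls(min(x0, x1), min(y0, y1), max(x0, x1), max(y0, y1))

    def to_local(self, point: Point) -> Optional[Point]:
        """Map a touch to (u, v) in [0, 1] x [0, 1], or None if it falls outside."""
        width = self.right - self.left
        height = self.bottom - self.top
        if width <= 0 or height <= 0:
            return None  # degenerate area: the drag had no extent
        u = (point[0] - self.left) / width
        v = (point[1] - self.top) / height
        if 0.0 <= u <= 1.0 and 0.0 <= v <= 1.0:
            return (u, v)
        return None


area = InterfaceArea.from_drag((10.0, 20.0), (30.0, 60.0))
print(area.to_local((20.0, 40.0)))  # center of the area
print(area.to_local((5.0, 5.0)))    # outside the area
```

Normalizing touches to the unit square means the same interface layout can be reused whether the user drags out a small area on a hand or a large one on a wall.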
The team stresses that, although the prototype is not as small as they would like it to be, there are no significant barriers to miniaturization and that it is entirely possible that a future version of Wearable Multitouch Interaction could be the size of a matchbox and as easy to wear as a pendant or a watch.
PocketTouch: Through-Fabric Capacitive Touch Input—written by Saponas, Harrison, and Benko—describes a prototype that consists of a custom, multitouch capacitive sensor mounted on the back of a smartphone. It uses the capacitive sensors to enable eyes-free multitouch input on the device through fabric, giving users the convenience of a rich set of gesture interactions, ranging from simple touch strokes to full alphanumeric text entry, without having to remove the device from a pocket or bag.
“People already try to interact with a computing device through fabric,” says Saponas, of the Computational User Experiences group. “Think of when you try to reach through your pocket to the slider that silences your phone. We wanted to take a different spin by asking: Can we use a higher-bandwidth touch surface to provide a wider range of actual input?”
The challenge was detecting multitouch strokes through fabric in a reliable manner, and one of the key problems to solve was orientation. Harrison recalls the brainstorming around PocketTouch.
“If you think about a device that is randomly positioned in your pocket or purse,” Harrison explains, “you really have no idea of user orientation. Sure, you can have a gyroscope tell if the device is facing up or down, but you still wouldn’t know from which side the user is going to approach the device.”
The team resolved this by using an orientation-defining unlock gesture to determine the coordinate plane, thus initializing the device for interaction. Once initialized, user orientation can be from any direction as long as it’s consistent. PocketTouch then separates purposeful finger strokes from background noise and uses them as input.
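One way to picture the orientation step is as a coordinate rotation. The sketch below assumes, purely for illustration, that the unlock gesture is a single stroke the user draws in what they consider the rightward direction. The angle of that stroke in sensor coordinates is then used to rotate later touch points into the user’s frame. The function names are hypothetical and the actual unlock gesture may differ.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def orientation_from_unlock(start: Point, end: Point) -> float:
    """Angle, in radians, of the unlock stroke measured in sensor coordinates."""
    return math.atan2(end[1] - start[1], end[0] - start[0])


def to_user_frame(point: Point, angle: float) -> Point:
    """Rotate a sensor-space point by -angle so the unlock direction becomes +x."""
    c, s = math.cos(-angle), math.sin(-angle)
    x, y = point
    return (c * x - s * y, s * x + c * y)


# The user's "rightward" stroke runs along the sensor's +y axis,
# meaning the phone sits rotated a quarter turn in the pocket.
angle = orientation_from_unlock((0.0, 0.0), (0.0, 5.0))
print(to_user_frame((0.0, 2.0), angle))
```

After this rotation, a stroke along the sensor’s +y axis reads as a stroke along the user’s +x axis, so gestures can be interpreted the same way regardless of how the phone landed in the pocket.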
The next challenge was to process strokes to enable text recognition of characters written over the same small physical area. Happily, the problem of recognizing multistrokes as input turned out to be a matter of adapting existing solutions.
“Microsoft Windows already contains a very rich and adaptive stroke-recognition engine,” Benko says. “So if the user is sloppy with strokes—and believe me, when you're doing it through the pocket of a jacket, the results are sloppy—these systems have the language model to handle it. That made PocketTouch a lot more robust than one would expect, as you can see from the video.”
Finally, the researchers tested the feasibility of using finger strokes through enclosing material to control a device equipped with capacitive sensing. They tested according to thickness, fiber type, types of garments, and pocket location. The results exceeded expectations.
“We didn’t think that a heavy fleece or a jacket pocket would provide enough of a signal to the sensor,” Saponas recalls. “We only included them during testing to demonstrate a full range of options. To our astonishment, they worked anyway. So we knew we had solved the toughest challenge, which was to figure out a reliable way to detect and segment strokes from the capacitive touch sensor through fabric.”
The defining difference of this work is adaptability. Touch devices such as touch screens are carefully engineered, manufactured, and calibrated to give users an optimal experience. PocketTouch is fundamentally different in that, instead of calibrating once for a particular surface, it calibrates continuously, adaptively optimizing the touch experience to account for different surfaces.
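The article does not say how PocketTouch performs its continuous calibration. A common technique for this kind of problem is baseline tracking, sketched below under that assumption: a running baseline follows slow changes in the raw reading, such as a different fabric or a shifting pocket, and a touch is reported only when the reading jumps well above the baseline. The class name, the smoothing factor `alpha`, and the `threshold` are all hypothetical.

```python
from typing import Optional


class AdaptiveBaseline:
    """Tracks a slowly drifting baseline for one capacitive sensor reading."""

    def __init__(self, alpha: float = 0.1, threshold: float = 5.0) -> None:
        self.alpha = alpha          # assumed smoothing factor for the running baseline
        self.threshold = threshold  # assumed jump above baseline that counts as a touch
        self.baseline: Optional[float] = None

    def update(self, raw: float) -> bool:
        """Feed one raw reading; return True if it looks like a touch."""
        if self.baseline is None:
            self.baseline = raw  # the first reading seeds the baseline
            return False
        touched = (raw - self.baseline) > self.threshold
        if not touched:
            # Adapt only while idle, so a held touch is not absorbed into the baseline.
            self.baseline += self.alpha * (raw - self.baseline)
        return touched


sensor = AdaptiveBaseline(alpha=0.5, threshold=5.0)
for reading in [100, 102, 104, 106, 108, 116, 108]:
    print(reading, sensor.update(reading))
```

In this run the slow climb from 100 to 108 is absorbed into the baseline, and only the jump to 116 is reported as a touch.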
“Sometimes the best technologies are surprises to the people who build them,” Harrison grins. “This was an elegant idea that worked much better than we’d hoped. It's a blessing that researchers don't often get.”
Benko also stresses that both Wearable Multitouch Interaction and PocketTouch are evolutionary steps of a larger effort by Microsoft Research to investigate the unconventional use of touch in devices to extend Microsoft’s vision of ubiquitous computing. He notes that PocketTouch has a lineage dating to the Mouse 2.0 project and work on the multitouch pen, while Wearable Multitouch Interaction shares concepts with LightSpace.
“It’s interesting to isolate these projects,” Benko remarks, “but sometimes, it’s much more interesting to look at them as evolving toward a broader vision. We are trying to push the boundaries of this rich space of touch and gestures, making gestural interactions available on any surface and with any device.”
Besides the Wearable Multitouch Interaction and PocketTouch papers, five others from Microsoft Research are being presented during UIST 2011:
PocketTouch: Through-Fabric Capacitive Touch Input
T. Scott Saponas, Microsoft Research Redmond; Chris Harrison, Carnegie Mellon University; and Hrvoje Benko, Microsoft Research Redmond.
Pause-and-Play: Automatically Linking Screencast Video Tutorials with Applications
Suporn Pongnumkul, University of Washington; Mira Dontcheva, Adobe Systems; Wilmot Li, Adobe Systems; Jue Wang, Adobe Systems; Lubomir Bourdev, Adobe Systems; Shai Avidan, Adobe Systems; and Michael Cohen, Microsoft Research Redmond.
Access Overlays: Improving Non-Visual Access to Large Touch Screens for Blind Users
Shaun K. Kane, University of Washington and University of Maryland, Baltimore County; Meredith Ringel Morris, Microsoft Research Redmond; Annuska Perkins, Microsoft; Daniel Wigdor, University of Toronto and Microsoft Research Redmond; Richard E. Ladner, University of Washington; and Jacob O. Wobbrock, University of Washington.
Portico: Tangible Interaction on and Around a Tablet
Daniel Avrahami, Intel and University of Washington; Jacob O. Wobbrock, University of Washington; and Shahram Izadi, Microsoft Research Cambridge.
KinectFusion: Real-time 3D Reconstruction and Interaction Using a Moving Depth Camera
Shahram Izadi, Microsoft Research Cambridge; David Kim, Microsoft Research Cambridge; Otmar Hilliges, Microsoft Research Cambridge; David Molyneaux, Microsoft Research Cambridge; Richard Newcombe, Imperial College London; Pushmeet Kohli, Microsoft Research Cambridge; Jamie Shotton, Microsoft Research Cambridge; Steve Hodges, Microsoft Research Cambridge; Dustin Freeman, University of Toronto; Andrew Davison, Imperial College London; and Andrew Fitzgibbon, Microsoft Research Cambridge.
Vermeer: Direct Interaction with a 360-degree Viewable 3D Display
Alex Butler, Microsoft Research Cambridge; Otmar Hilliges, Microsoft Research Cambridge; Shahram Izadi, Microsoft Research Cambridge; Steve Hodges, Microsoft Research Cambridge; David Molyneaux, Microsoft Research Cambridge; David Kim, Microsoft Research Cambridge; and Danny Kong, Microsoft Research Cambridge.