A Strong Sense for Natural Interactions

Hrvoje Benko

This week, Microsoft researcher Hrvoje Benko (@hrvojebenko) is in Hawaii, but not on one of the islands’ beautiful beaches. As conference chair for UIST 2014, the 27th Association for Computing Machinery (ACM) Symposium on User Interface Software and Technology, Benko will be busy ensuring that the event, the premier forum for innovations in the software and technology of human-computer interaction (HCI), proceeds smoothly.

In addition to serving as conference chair, Benko, the subject of a new Microsoft Research Luminaries video, has contributed to three of the eight Microsoft papers being presented during the conference, including Sensing Techniques for Tablet+Stylus Interaction, which won an ACM UIST 2014 Best Paper Award. This is the fourth paper he has co-written to receive a best-paper award from a major conference, evidence of a remarkable career for a young scientist who joined Microsoft Research in 2007, just after earning his Ph.D.

Bringing Augmented Reality to Ordinary Spaces

The award-winning paper, though, might not be the most attention-grabbing contribution to UIST 2014 from Benko and his colleagues. Many attendees are certain to be amazed by the research featured in the paper RoomAlive: Magical Experiences Enabled by Scalable, Adaptive Projector-Camera Units, written by Benko and a host of Microsoft and academic collaborators.

RoomAlive uses a unified, scalable, multiprojector system that adapts gaming content to its room and explores additional methods for physical interaction in the room, resulting in an immersive gaming experience.

“RoomAlive,” Benko says, “enables any space to be transformed into an augmented, interactive display.”

It will be a while before that proof of concept becomes a practical reality, but it certainly is tantalizing. The RoomAlive prototype deploys projectors and depth cameras to cover the entire room, including the furniture and people inside, with pixels capable of being used for both input and output. The result is an experience that coexists seamlessly with the existing physical environment.

The system that drives RoomAlive uses multiple projector-camera units, referred to as “procams,” each consisting of a depth camera, a wide-field-of-view projector, and a computer. These devices are combined via a scalable, distributed framework to cover an entire room. The procams are auto-calibrating and can self-localize as long as their views have some overlap.

“Our system enables new interactive projection mapping experiences that dynamically adapts content to any room,” Benko says. “Users can touch, shoot, stomp, dodge, and steer projected content that seamlessly coexists with their existing physical environment.”

Another augmented-reality paper being presented during UIST 2014, written by Benko, Microsoft researcher Andy Wilson, and designer Federico Zannier, uses different technologies, this time to support face-to-face, or “dyadic,” interaction with 3-D virtual objects. Dyadic Projected Spatial Augmented Reality combines dynamic projection mapping, multiple perspective views, and device-less interaction.

The main advantage of spatial augmented reality (SAR) over more traditional augmented-reality approaches, such as handheld devices with composited graphics or see-through, head-worn displays, is that users can interact with 3-D virtual objects and with each other without bulky equipment, which typically has a limited field of view and hinders face-to-face interaction. Instead, SAR projects the augmenting graphics over the physical object itself and avoids diverting the users’ attention from the real world.

“That used to be the domain of theme parks or highly immersive theaters,” Benko says. “You needed a space that was designed for the experience. But now, with enough computing power, depth cameras, and projectors, it’s possible to create these immersive environments within an ordinary living space.

“Augmented reality fundamentally changes the nature of communication, with rich interactions not just for entertainment, but also for work and collaboration.”

Enabling More Natural, Nuanced Interactions

The paper that won a UIST 2014 best-paper award represents a huge joint effort. Written by team lead Ken Hinckley along with colleagues Benko, Michel Pahud, Pourang Irani, François Guimbretière, Marcel Gavriliu, Xiang ‘Anthony’ Chen, Fabrice Matulic, Bill Buxton, and Wilson, it explores grip and motion sensing with a tablet and a stylus.

“We’re at the point now where small mobile devices, such as pens, tablets, or phones, can be equipped with sensors to help us understand their use,” Benko says. “The way you grasp the tablet, whether you hold the stylus in a writing grip or tucked between your fingers—these all affect the position and movement of your gestures.”

The biomechanics behind each task are anything but ordinary, involving the interplay between two hands, each containing 27 bones, more than 30 muscles, nearly 50 nerves, and 30 or so arteries.

The team’s goal was to capitalize on the hand’s dexterity to explore new frontiers in human-computer interaction using new sensing techniques.

“How can we interpret the signals correctly, more accurately?” Benko muses. “That’s the larger goal of the work: Instead of assuming explicit interaction, how can we enable a more natural and nuanced interface based on context of use? How can we impart new subtleties to interactions on mobile devices?”

Current device-interaction models tend to rely on having the user explicitly select the state of interaction. Models in which devices present context-based behaviors are few and still fairly simple: accelerometers, which flip screens between portrait and landscape modes, or Bluetooth, which makes automatic, appropriate connections to a vehicle, a home phone, or a computer. The researchers wanted to extend the interaction vocabulary and build on the existing range of gestures so that more nuanced context is possible.

“It’s an attempt to create a set of gestures that are meaningful to a particular task and span the space of possibilities,” he says. “We know that not all the gestures are going to be successful. But it’s only by introducing them and obtaining feedback that this vocabulary will eventually reduce down to a few highly functional interactions that get widely adopted. This is still a long ways off from moving into the mainstream, but this is the exciting part for researchers—opening up new possibilities and working to make them useful.”

Fascinated by the Human Aspects of Computing

HCI first caught Benko’s interest during his undergraduate years.

“I was doing a lot of programming but found myself asking, ‘Where do people come into the equation?’” says Benko, a native of Croatia. “So for my graduate studies at Columbia University, I continued on with computing but focused on the human aspects. I was really lucky to work there with some of the pioneers in augmented reality.”

Currently, augmented reality is what captures his imagination the most. The notion that technology can augment our senses and alter how we comprehend reality was what initially inspired him to go into HCI research.

“It’s almost like you’re giving people superpowers,” he says, laughing. “I was fascinated with the idea that computing could be a tool that lets you do things you couldn’t do before, or do them faster, or change how you perceive reality. I was just really interested in the notion that computing should be about interacting with people.”

Flying High at Microsoft Research

Since joining Microsoft, Benko’s work has spanned many different areas, from augmented reality, computational illumination, surface computing, and new input form factors and devices, to touch and freehand gestural input. When he arrived in the United States at age 16, though, Benko never imagined he would contribute to award-winning papers or collaborate with eminent scientists on conference papers and journal articles.

“I came to the U.S. through an exchange-student scholarship,” he says. “I attended a prep school, at one of those places with old buildings straight out of the Dead Poets Society. I had a really good time, my scholarship got extended for a second year, and I went on to university in the U.S., and then graduate school.”

Benko calls his 2005 internship at Microsoft an “eye-opener.” For one thing, he found himself in an office a few doors down the hall from researchers whose papers he had been reading and referencing.

“There were so many luminaries whose work I revered,” he recalls. “Not in my wildest dreams could I have imagined having lunch and chatting about my work with Andy Wilson or Ken Hinckley. They’re world-renowned experts in their fields.”

While sometimes working at Microsoft Research can feel like a day at the beach for Benko, this week won’t. With three papers and a demo to present, not to mention his responsibilities as conference chair, the beach, ironically, might have to be experienced virtually.
