Loris D'Antoni, Alan Dunn, Suman Jana, Tadayoshi Kohno, Benjamin Livshits, David Molnar, Alexander Moshchuk, Eyal Ofek, Franziska Roesner, Scott Saponas, Margus Veanes, and Helen J. Wang
Augmented reality (AR) takes natural user input (NUI), such as gestures, voice, and eye gaze, and produces digital visual overlays on top of the reality seen by the user. Today, multiple shipping AR applications exist, most notably titles for the Microsoft Kinect and smartphone applications such as Layar, Wikitude, and Junaio. Despite this activity, little attention has been paid to operating system support for AR applications. Instead, each AR application today does its own sensing and rendering, with the help of user-level libraries such as OpenCV or the Microsoft Kinect SDK.
In this paper, we explore how operating systems should evolve to support AR applications. Because AR applications work with fundamentally new inputs and outputs, an OS that supports them must rethink the input and display abstractions exposed to applications. Unlike the mouse and keyboard, which form explicit, separate channels for user input, NUI requires continuous sensing of the real-world environment, which often mixes sensitive data with user input. Hence, the OS input abstractions must ensure that user privacy is not violated, and the OS must provide a fine-grained permission system for access to recognized objects such as a user's face and skeleton. In addition, because the visual outputs of AR applications mix real-world and virtual objects, the synthetic window abstraction of traditional GUIs is no longer viable, and OSes must rethink display abstractions and their management. We discuss research directions for solving these and other issues and for building an OS that lets multiple applications share one (augmented) reality.