 
 
Selected Publications
 
 
S. Jana, D. Molnar, A. Moshchuk, A. Dunn, B. Livshits, H. J. Wang & E. Ofek

Enabling fine-grained permissions for augmented reality applications with recognizers

The 22nd USENIX Security Symposium 2013 

Augmented reality (AR) applications sense the environment, then create virtual objects overlaid on human senses. Examples include annotating store fronts with reviews and Xbox Kinect games that show "avatars" mimicking human movements. No current OS has special support for such applications, so they must perform all sensing and object recognition themselves. As a result, permissions for AR applications are necessarily coarse-grained: applications must ask for access to raw sensor feeds, such as video and audio. These raw feeds expose significant additional information over and above what applications need, including sensitive information such as the user's location, face, or surroundings.
Instead of exposing raw sensor data to applications directly, we introduce a new OS abstraction: the recognizer. A recognizer takes raw sensor data as input, then exposes objects with higher-level semantics to applications, such as a skeleton or a face. We propose a new fine-grained permission system where applications request permission at the granularity of recognizer objects. We isolate a core recognizer set that is sufficient for over 90% of shipping AR applications, and we survey users' attitudes towards the data revealed by these recognizers. To manage permissions, we introduce privacy goggles, which allow users to inspect sensitive data exposed to an application, both at install time and afterwards.
Recognizers can make errors (e.g., mistakenly recognizing a whiteboard drawing as a human face), which could lead to a privacy leak. We introduce a new OS component, recognizer error correction, to reduce recognizer errors in a recognizer-independent way. We demonstrate that our component can reduce false positives on three common recognizers for computer vision benchmarks. Our abstraction also improves performance, as we can eliminate cross-app duplication of heavyweight recognizers.
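As a rough illustration of the recognizer abstraction described in this abstract, the following Python sketch shows a broker that runs recognizers over a raw frame and hands an application only the recognizer objects it has been granted. All names here (`RecognizerBroker`, `register`, `grant`, `deliver`, and the toy `skeleton` and `face` recognizers) are hypothetical and are not taken from the paper's implementation.

```python
from typing import Callable, Dict, Set

# A recognizer maps a raw sensor frame to a higher-level object.
Recognizer = Callable[[dict], object]


class RecognizerBroker:
    """Hypothetical OS-side broker: applications never see the raw frame."""

    def __init__(self) -> None:
        self._recognizers: Dict[str, Recognizer] = {}
        self._grants: Dict[str, Set[str]] = {}

    def register(self, name: str, recognizer: Recognizer) -> None:
        self._recognizers[name] = recognizer

    def grant(self, app: str, recognizer_name: str) -> None:
        # Permission is granted per recognizer, not per raw sensor.
        self._grants.setdefault(app, set()).add(recognizer_name)

    def deliver(self, app: str, raw_frame: dict) -> Dict[str, object]:
        # Run only the recognizers this app was granted and return their outputs.
        allowed = self._grants.get(app, set())
        return {
            name: self._recognizers[name](raw_frame)
            for name in sorted(allowed)
            if name in self._recognizers
        }


broker = RecognizerBroker()
# Toy recognizers over a fake frame; real ones would process video or depth data.
broker.register("skeleton", lambda frame: frame["joints"])
broker.register("face", lambda frame: frame["face_box"])
broker.grant("dance_game", "skeleton")

frame = {"joints": [(0, 1), (2, 3)], "face_box": (10, 10, 50, 50), "pixels": b"..."}
result = broker.deliver("dance_game", frame)
```

In this sketch the game receives the skeleton joints but neither the face bounding box nor the raw pixels, which is the granularity of permission the abstract argues for.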

   

   

L. D'Antoni, A. Dunn, S. Jana, T. Kohno, B. Livshits, D. Molnar, A. Moshchuk, E. Ofek, F. Roesner, S. Saponas, M. Veanes & H. J. Wang

Operating System Support For Augmented Reality Applications


To appear in the 14th Workshop on Hot Topics in Operating Systems (HotOS XIV), May 2013, New Mexico, USA

   

B. Jones, H. Benko, E. Ofek & A. Wilson

IllumiRoom: Peripheral Projected Illusions for Interactive Experiences

ACM CHI 2013, Paris, France

CHI 2013 Best Paper Award

CHI 2013 Golden Mouse Award (Best Video)

IllumiRoom is a proof-of-concept system that augments the area surrounding a television with projected visualizations to enhance traditional gaming experiences. We investigate how projected visualizations in the periphery can negate, include, or augment the existing physical environment and complement the content displayed on the television screen. Peripheral projected illusions can change the appearance of the room, induce apparent motion, extend the field of view, and enable entirely new physical gaming experiences. Our system is entirely self-calibrating and is designed to work in any room. We present a detailed exploration of the design space of peripheral projected illusions and we demonstrate ways to trigger and drive such illusions from gaming content. We also contribute specific feedback from two groups of target users (10 gamers and 15 game designers), providing insights for enhancing game experiences through peripheral projected illusions.

 (additional materials at Brett's website)

   

 

E. Ofek, K. Strauss & S.T. Iqbal

Reducing Disruption from Subtle Information Delivery during a Conversation: Mode and Bandwidth Investigation

ACM CHI 2013, Paris, France.

With the proliferation of mobile devices that provide ubiquitous access to information, the question arises of how distracting processing information can be in social settings, especially during a face-to-face conversation. At the same time, relevant information presented at opportune moments may help enhance conversation quality. In this paper, we investigate how much information users can consume during a conversation and what information delivery mode, via audio or visual aids, helps them effectively conceal the fact that they are receiving information. We observe that users can internalize the most information while still disguising this fact when information is delivered visually in batches (multiple pieces of information at a time), and that they perform better on both dimensions if information is delivered while they are not speaking. Participants qualitatively did not prefer this mode as being the easiest to use, preferring modes that displayed one piece of information at a time.

Paper (pdf)

 

   


M. Kroepfl, Y. Wexler & E. Ofek

Efficiently Locating Photographs in Many Panoramas

ACM SIGSPATIAL GIS 2010 (Full Paper)

We present a method for efficient and reliable geo-positioning of images. It relies on image-based matching of the query images onto a trellis of existing images that provides accurate 5-DOF calibration (camera position and orientation without scale). As such, it can handle any image input, including old historical images, matched against a whole city. At that scale, care needs to be taken with the size of the database. We deviate from previous work by using panoramas to simultaneously reduce the database size and increase the coverage. To reduce the likelihood of false matches, we restrict the range of angles for matched features. Furthermore, we enhance the RANSAC procedure to include two phases. The second phase includes guided feature matching to increase the likelihood of positive matches. Hence, we devise a matching confidence score that separates true from false matches. We demonstrate the algorithm on a large-scale database covering a whole city and show its use in a vision-based augmented reality system.

 

   

  

B. Zhang, Q. Li, H. Chao, B. Chen, E. Ofek & Y.Q. Xu

Annotating and Navigating Tourist Videos

ACM SIGSPATIAL GIS 2010 (Full paper)

 Due to the rapid increase in video capture technology, more and more tourist videos are captured every day, creating a challenge for organization and association with metadata.

In this paper, we present a novel system for annotating and navigating tourist videos. Placing annotations in a video is difficult because of the need to track the movement of the camera. Navigation of a regular video is also challenging due to the sequential nature of the media. To overcome these challenges, we introduce a system for registering videos to geo-referenced 3D models and analyzing the video contents.

We also introduce a novel scheduling algorithm for showing annotations in video. We show results in automatically annotated videos and in a map-based application for browsing videos. Our user study indicates the system is very useful.

   


B. Epshtein, Y. Wexler & E. Ofek

 

Detecting Text in Natural Scenes with Stroke Width Transform

CVPR 2010 (Oral)

 We present a novel image operator that seeks to find the value of stroke width for each image pixel, and demonstrate its use on the task of text detection in natural images. The suggested operator is local and data dependent, which makes it fast and robust enough to eliminate the need for multi-scale computation or scanning windows. Extensive testing shows that the suggested scheme outperforms the latest published algorithms. Its simplicity allows the algorithm to detect texts in many fonts and languages.

Video of the CVPR talk

Database of detected texts

 

   


R. Gal, Y. Wexler, H. Hoppe, E. Ofek & D. Cohen-Or

Seamless Montage for Texturing Models

EuroGraphics 2010 (Oral)

We present an automatic method to recover high-resolution texture over an object shape by mapping detailed photographs onto its surface. Such high-resolution detail often reveals inaccuracies in geometry and registration, as well as lighting variations and surface reflections. Simple image projection results in visible seams on the surface. We minimize such seams using a global optimization that assigns compatible texture to adjacent triangles. The key idea is to search not only combinatorially over the source images, but also over a set of local image transformations that compensate for geometric misalignment. This broad search space is traversed using a discrete labeling algorithm, aided by a coarse-to-fine strategy. Our approach significantly improves resilience to acquisition errors, thereby allowing simple, easy creation of textured models for use in computer graphics.

   


 

N. Villar, S. Izadi, H. Benko, J. Helmes, D. Rosenfeld, E. Ofek, J. Westhues, A. Butler, X. Cao, B. Chen & S. Hodges

Mouse 2.0: Multi-Touch Meets the Mouse

UIST 2009, Victoria, Canada

UIST 2009 Best Paper Award

 

 In this paper we explore the possibilities for augmenting the standard computer mouse with multi-touch capabilities so that it can sense the position of the user’s fingers and thereby complement traditional pointer-based desktop interactions with touch and gestures. We present five different multi-touch mouse implementations, each of which explores a different touch sensing strategy, which leads to differing form-factors and hence interaction possibilities. In addition to the detailed description of hardware and software implementations of our prototypes, we discuss the relative strengths, limitations and affordances of these different input devices as informed by the results of a preliminary user study.

 

   


B. Chen, B. Neubert, E. Ofek, O. Deussen & M. F. Cohen

Integrated Videos and Maps for Driving Directions

UIST 2009, Victoria, Canada

While onboard navigation systems are gaining in importance, maps are still the medium of choice for laying out a route to a destination and for wayfinding. However, even with a map, one is almost always more comfortable navigating a route the second time due to the visual memory of the route. To make the first time navigating a route feel more familiar, we present a system that integrates a map with a video automatically constructed from panoramic imagery captured at close intervals along the route. The routing information is used to create a variable speed video depicting the route. During playback of the video, the frame and field of view are dynamically modulated to highlight salient features along the route and connect them back to the map. A user interface is demonstrated to allow exploration of the combined map, video, and textual driving directions. We discuss the construction of the hybrid map and video interface. Finally, we report the results of a study that provides evidence of the effectiveness of such a system for route following.


T. Yamaguchi, Wilburn, Z. Cao & E. Ofek
Video-based modeling of dynamic hair

Pacific-Rim Symposium on Image & Video Technology 2009, Tokyo, Japan

 

 

 


P. Mishra, E. Ofek & G. Kimchi
Validation of Vector Data using Oblique Images

ACM SIGGIS 2008, CA, USA (Full Paper)

Oblique images are aerial photographs taken at oblique angles to the earth’s surface. Projections of vector and other geospatial data in these images depend on camera parameters, positions of the entities, surface terrain, and visibility. This paper presents a robust and scalable algorithm to detect inconsistencies in vector data using oblique images. The algorithm uses image descriptors to encode the local appearance of a geospatial entity in images. These image descriptors combine color, pixel-intensity gradients, texture, and steerable filter responses. A Support Vector Machine classifier is trained to detect image descriptors that are not consistent with underlying vector data, digital elevation maps, building models, and camera parameters. In this paper, we train the classifier on visible road segments and non-road data. Thereafter, the trained classifier detects inconsistencies in vectors, which include both occluded and misaligned road segments. The consistent road segments validate our vector, DEM, and 3-D model data for those areas while inconsistent segments point out errors. We further show that a search for descriptors that are consistent with visible road segments in the neighborhood of a misaligned road yields the desired road alignment that is consistent with pixels in the image.

 

 


J. Xiao, T. Fang, P. Tan, Z. Peng, E. Ofek & L. Quan
Image Based Facade Modeling
ACM SIGGRAPH Asia 2008, Singapore

We propose in this paper a semi-automatic image-based approach to facade modeling that uses images captured along the streets and relies on structure from motion to automatically recover the camera positions and point clouds as the initial stage for the modeling. We start by considering a building facade as a flat rectangular plane or a developable surface, and the texture image of the flat facade is composited from the multiple visible images with handling of occluding objects. A facade is then decomposed and structured into a Directed Acyclic Graph of rectilinear elementary patches. The decomposition is carried out top-down by a recursive subdivision, followed by a bottom-up merging with the detection of architectural bilateral symmetry and repetitive patterns. Each subdivided patch of the flat facade is augmented with a depth that is optimized from the 3D points. Our system also allows the user to easily provide feedback in the 2D image space for the proposed decomposition and augmentation. Finally, our approach is demonstrated on a large number of facades from a variety of street-side images.

 

 


H. Wang, H. Hoppe, Y. Wexler and E. Ofek

Factoring Repeated Content Within and Among Images (11M)

ACM SIGGRAPH 2008, Los Angeles USA

 

We reduce transmission bandwidth and memory space for images by factoring their repeated content. A transform map and a condensed epitome are created such that all image blocks can be reconstructed from transformed epitome patches. The transforms may include affine deformation and color scaling to account for perspective and tonal variations across the image. The factored representation allows efficient random access through a simple indirection, and can therefore be used for real-time texture mapping without expansion in memory. Our scheme is orthogonal to traditional image compression, in the sense that the epitome is amenable to further compression such as DXT. Moreover, it allows a new mode of progressivity, whereby generic features appear before unique detail. Factoring is also effective across a collection of images, particularly in the context of image-based rendering. Eliminating redundant content lets us include textures that are several times as large in the same memory space.
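The indirection mentioned in this abstract can be pictured with a small Python sketch. It is a simplification under stated assumptions: the transform per block is reduced to a translation into the epitome plus a color scale (the paper also allows affine deformation), and the block size, the `epitome` values, and the names `transform_map`, `sample`, and `reconstruct` are all illustrative rather than taken from the paper.

```python
from typing import List, Tuple

BLOCK = 2  # block size in pixels (illustrative)

# Condensed epitome: a small image holding the unique content.
epitome: List[List[float]] = [
    [10.0, 20.0, 30.0],
    [40.0, 50.0, 60.0],
    [70.0, 80.0, 90.0],
]

# Transform map: one entry per image block, giving the epitome offset
# (row, col) of the source patch and a color scale factor.
transform_map: List[List[Tuple[int, int, float]]] = [
    [(0, 0, 1.0), (1, 1, 1.0)],
    [(0, 0, 0.5), (0, 1, 2.0)],
]


def sample(row: int, col: int) -> float:
    """Random access to one pixel through a single indirection."""
    src_row, src_col, scale = transform_map[row // BLOCK][col // BLOCK]
    return scale * epitome[src_row + row % BLOCK][src_col + col % BLOCK]


def reconstruct() -> List[List[float]]:
    """Expand the full image; only needed for inspection, not for sampling."""
    height = len(transform_map) * BLOCK
    width = len(transform_map[0]) * BLOCK
    return [[sample(r, c) for c in range(width)] for r in range(height)]


image = reconstruct()
```

Here a 4x4 image is represented by a 3x3 epitome plus four map entries, and `sample` reads any pixel without ever expanding the image, which is the property that makes the representation usable for texture lookups.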

 


B. Epshtein, E. Ofek, Y. Wexler and P. Zhang

Hierarchical photo organization using geo-relevance

ACM GIS 2007, Seattle WA, USA (Full paper)

 

We present a novel framework for organizing large collections of images in a hierarchical way, based on scene semantics. Rather than score images directly, we use them to score the scene in order to identify typical views and important locations which we term Geo-Relevance. This is done by relating each image with its viewing frustum which can be readily computed for huge collections of images nowadays. The frustum contains much more information than only camera position that has been used so far. For example, it distinguishes between a photo of the Eiffel Tower and a photo of a garbage bin taken from the exact same place. The proposed framework enables a summarized display of the information and facilitates efficient browsing.
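To make the frustum argument concrete, here is a minimal 2D Python sketch, assuming a camera described only by position, heading, horizontal field of view, and a maximum range. The function name `in_frustum`, its parameters, and the coordinates are assumptions made for illustration; the paper's actual frustum computation is not reproduced here.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def in_frustum(camera: Point, heading_deg: float, fov_deg: float,
               max_range: float, target: Point) -> bool:
    """Return True if target lies inside a 2D viewing frustum (a wedge)."""
    dx, dy = target[0] - camera[0], target[1] - camera[1]
    distance = math.hypot(dx, dy)
    if distance == 0.0 or distance > max_range:
        return False
    bearing = math.degrees(math.atan2(dy, dx))
    # Smallest signed angle between the bearing and the camera heading.
    offset = (bearing - heading_deg + 180.0) % 360.0 - 180.0
    return abs(offset) <= fov_deg / 2.0


camera = (0.0, 0.0)
landmark = (100.0, 0.0)  # due east of the camera

# Two photos taken from the exact same place, facing opposite directions.
facing_landmark = in_frustum(camera, heading_deg=0.0, fov_deg=60.0,
                             max_range=500.0, target=landmark)
facing_away = in_frustum(camera, heading_deg=180.0, fov_deg=60.0,
                         max_range=500.0, target=landmark)
```

Both photos share a camera position, yet only the first one contains the landmark, so only it would contribute to that location's score in a frustum-based scheme.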

 


N. Li, N. Moraveji, H. Kimura and E. Ofek

Improving the Experience of Controlling Avatars in Camera-Based Games Using Physical Input

The 14th ACM International Conference on Multimedia, Santa Barbara, CA, USA 2006

 

This paper investigates two methods of improving the user experience of camera-based interaction. First, problems that arise when avatars are designed to mimic a user’s physical actions are presented. Second, a solution is proposed: adding a layer of separation between user and avatar while retaining intuitive user control. Two methods are proposed for this separation: spatial and temporal. Implementations of these methods are then presented in the context of a simple game, and their effect on performance and satisfaction is evaluated. Results of a human subject experiment are presented, showing that reducing the amount of user control can maintain, and even improve, user satisfaction if the design of such a reduction is appropriate. This is followed by a discussion of how the findings inform camera-based game design.

 

 

 

 


H. Jiang, E. Ofek, N. Moraveji and S. Yuanchun

Direct Pointer: Direct Manipulation for Large Display Interaction using Handheld Cameras

SIG CHI 2006, Montreal, Canada

 

This paper describes the design and evaluation of a technique, Direct Pointer, that enables users to interact intuitively with large displays using the cameras built into handheld devices, such as mobile phones and personal digital assistants (PDAs). In contrast to many existing interaction methods that attempt to address the same problem, ours offers direct manipulation of the pointer position with continuous visual feedback. The primary advantage of this technique is that it only requires equipment that is readily available: an electronic display, a handheld digital camera, and a connection between the two. No special visual markers in the display content are needed, nor are fixed cameras pointing at the display. We evaluated the performance of Direct Pointer as an interaction product, showing that it performs as well as comparable techniques that require more sophisticated equipment.

 

 

 

 


Y. Matsushita, E. Ofek, X. Tang and H. Shum

Full Frame Video Stabilization with Motion Inpainting

IEEE PAMI 2006

 

Video stabilization is an important video enhancement technology which aims at removing annoying shaky motion from videos. We propose a practical and robust approach of video stabilization that produces full-frame stabilized videos with good visual quality. While most previous methods end up with producing low resolution stabilized videos, our completion method can produce full-frame videos by naturally filling in missing image parts by locally aligning image data of neighboring frames. To achieve this, motion inpainting is proposed to enforce spatial and temporal consistency of the completion in both static and dynamic image areas. In addition, image quality in the stabilized video is enhanced with a new practical deblurring algorithm. Instead of estimating point spread functions, our method transfers and interpolates sharper image pixels of neighboring frames to increase the sharpness of the frame. The proposed video completion and deblurring methods enabled us to develop a complete video stabilizer which can naturally keep the original image quality in the stabilized videos. The effectiveness of our method is confirmed by extensive experiments over a wide variety of videos.

 

 


Y. Wei, E. Ofek, L. Quan and H. Shum

Modeling Hair from Multi Views

SIGGRAPH 2005, Los Angeles, CA.

 

In this paper, we propose a novel image-based approach to model hair geometry from images taken at multiple viewpoints. Unlike previous hair modeling techniques that require intensive user interaction or rely on a special capturing setup under controlled illumination conditions, we use a handheld camera to capture hair images under uncontrolled illumination conditions. Our multi-view approach is natural and flexible for capturing. It also provides inherently strong and accurate geometric constraints to recover hair models. In our approach, the hair fibers are synthesized from local image orientations. Each synthesized fiber segment is validated and optimally triangulated from all visible views. The hair volume and the visibility of synthesized fibers can also be reliably estimated from multiple views. Flexibility of acquisition, little user interaction, and high-quality results of recovered complex hair models are the key advantages of our method.

 

 

 


Y. Matsushita, E. Ofek, X. Tang and H. Shum

Video Completion with Motion Inpainting for Video Stabilization

CVPR 2005, San Diego, CA. (Oral)

 

Video stabilization is an important video enhancement technology which aims at removing annoying shaky motion from videos. We propose a practical and robust approach of video stabilization that produces full-frame stabilized videos with good visual quality. While most previous methods end up with producing low resolution stabilized videos, our completion method can produce full-frame videos by naturally filling in missing image parts by locally aligning image data of neighboring frames. To achieve this, motion inpainting is proposed to enforce spatial and temporal consistency of the completion in both static and dynamic image areas. In addition, image quality in the stabilized video is enhanced with a new practical deblurring algorithm. Instead of estimating point spread functions, our method transfers and interpolates sharper image pixels of neighboring frames to increase the sharpness of the frame. The proposed video completion and deblurring methods enabled us to develop a complete video stabilizer which can naturally keep the original image quality in the stabilized videos. The effectiveness of our method is confirmed by extensive experiments over a wide variety of videos.

 

 

 


 

B. Chen, E. Ofek, H. Shum and M. Levoy

Interactive Deformation of Light Fields

SIGGRAPH I3D 2005, Washington D.C.

 

 

 


X. Cao, E. Ofek, and D. Vronay

Evaluation of Alternative Presentation Control Techniques (PDF, 420K)

SIG CHI 2005, Portland, OR.

 


 

 

 


R. Gvili, A. Kaplan, E. Ofek, G. Yahav

Depth Key 

Stereoscopic Displays and Applications: The Engineering Reality of Virtual Reality 2003  (Proceedings of SPIE/IS&T Volume 5006), San Jose, CA.

 

 

 


A. Redert, M. Op de Beeck, C. Fehn, W. IJsselsteijn, M. Pollefeys, L. J. Van Gool, E. Ofek, I. Sexton and P. Surman

ATTEST: Advanced Three-dimensional Television System Technologies.

3DPVT 2002, Padova, Italy.

 

 


E. Ofek and A. Rappoport

Interactive reflections on curved objects

ACM SIGGRAPH 1998

Global view-dependent illumination phenomena, in particular reflections, greatly enhance the realism of computer-generated imagery. Current interactive rendering methods do not provide satisfactory support for reflections on curved objects.
In this paper we present a novel method for interactive computation of reflections on curved objects. We transform potentially reflected scene objects according to reflectors, to generate virtual objects. These are rendered by the graphics system as ordinary objects, creating a reflection image that is blended with the primary image. Virtual objects are created by tessellating scene objects and computing a virtual vertex for each resulting scene vertex. Virtual vertices are computed using a novel space subdivision, the reflection subdivision.
For general polygonal mesh reflectors, we present an associated approximate acceleration scheme, the explosion map. For specific types of objects (e.g., linear extrusions of planar curves) the reflection subdivision can be reduced to a 2-D one that is utilized more accurately and efficiently.

 

 

 


E. Ofek, E. Shilat, A. Rappoport, and M. Werman.

Multi-resolution Textures from Image Sequences (PDF, 515K)

IEEE Computer Graphics and Applications 1997

 

 

 

 Copyrights:

ACM Copyright Notice

Copyright by the Association for Computing Machinery, Inc. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or permissions@acm.org. The definitive version of this paper can be found at ACM's Digital Library http://www.acm.org/dl/.

Eurographics Association Copyright Notice

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than Eurographics must be honored.

IEEE Copyright Notice

This material is posted here with permission of the IEEE. Internal or personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution must be obtained from the IEEE by writing to pubs-permissions@ieee.org.

Springer-Verlag Copyright Notice

The Author may publish his/her contribution on his/her personal Web page provided that he/she creates a link to the above mentioned volume of LNCS at the Springer-Verlag server or to the LNCS series Homepage (URL: http://www.springer.de/comp/lncs/index.html) and that together with this electronic version it is clearly pointed out, by prominently adding "© Springer-Verlag", that the copyright for this contribution is held by Springer.