Daniel C. Robbins

3D Industry (whitepaper)


Thoughts on the state of 3D in our industry

Daniel C. Robbins: UI Designer, Microsoft Research

The problem is this: today’s computer users are presented with an ever-expanding amount of information, so much so that they can easily be overwhelmed. At Microsoft Research, and at many other groups, efforts are underway to determine how best to present this information to users and give them control over it. Since most information coming over our computers is inherently abstract, one of the designer’s main goals is to determine the proper form of representation. Often we choose to recast abstract data into a familiar theme or metaphor. The bet is that by placing this information into a 3D world, we can better take advantage of users’ natural perceptual abilities. Some attempts at 3D interfaces have borrowed from real-world metaphors: the office, the city, the forest, or the inner workings of the computer itself. The type of metaphor chosen and the type of information represented have broad consequences throughout the design of a 3D user interface. Current explorations of 3D UI range from add-ons to existing applications, such as 3D charting for common data (encyclopedia entries or web categories), to formative desktop overlays that let users assign items of information to readily recognized real-world metaphors. Work is just now beginning on replacing the entire desktop experience with rich 3D environments.

Look and feel is just the tip of the iceberg. The Windows UI must evolve to handle many new user experiences. The kind of information a typical user explores in everyday computing is more varied and dynamic than it used to be. A user may sort through text files, play a series of music clips, look at some maps, and fill out web-based forms, all the while switching between tasks on different devices, each with different displays and input characteristics. To put it concisely, there are several problems with current user interfaces that 3D can hope to address:

  • Discretely shifting spatial configurations: Views tend to shift around on users in very “choppy” ways. Click on a sort button and boom, all of a sudden, everything in the view changes.
  • Multiple simultaneous views: Many user interfaces force users to look in multiple places at one time. The user may be dragging a file in one place but have to look in another to read status information about the object.
  • Homogeneous representations: Objects such as files, folders, and links tend to look the same no matter where they are on the screen, where they sit in the hierarchy, or what type of object they are (image, text document, movie, etc.), making it difficult to distinguish one object from another.
  • No context: Because every place in our current representations of information hierarchies (file systems, sets of web pages, etc.) tends to look the same, users get lost. There is no sense of place or locale. This means that when examining a detailed piece of information (such as reading a particular document), it is difficult to determine how that document fits into the overall information space.

The explosion of information provides wonderful opportunities (both within Microsoft and for the industry as a whole) to develop new methods of interaction and presentation. Many groups within Microsoft are working hard to extend, evolve, and sometimes replace the current user experience with new paradigms.

Within Microsoft Research we are looking at some of the following: 1) ways to display non-distracting peripheral notifications of highly salient information, 2) speech-enabled applications, 3) alternative input devices (such as touch-sensing and room awareness), and 4) 3D user interfaces.

3D will be a component of future user interfaces. The question is for which tasks it actually makes sense. Our observation is that most of the time users just want to get their work done. They want to write a letter to someone, balance a budget, or order something off the web. For most of these common tasks, the UI will probably remain full-screen, single-application, and 2D. The magic starts to happen when users want to bring in information from multiple places, see relationships between multiple items, and quickly get overviews of changing data streams (from stock quotes to weather to traffic to email). 3D will provide a natural way to do this.

First, 3D allows users to switch between multiple tasks in a fluid, discoverable, and repeatable way. Second, it can show multi-dimensional data in easily comprehensible ways (such as channel guides for future televisions). Lastly, it lets users manage large collections of information in a personal and comfortable manner.

A great deal of our research takes advantage of natural human abilities. Our intent is to shift much of the user’s effort from the cognitive (“thinking”) to the perceptual (“sensing”). By doing this, we hope to address many of the problems with current interfaces mentioned before:

  • Spatial memory: In the “real” world people generally remember where things are: not necessarily which drawer the keys are in, but probably where financial records are kept in a home office. By using 3D, and thus presenting a richer environment, we take better advantage of our spatial and navigational abilities. A sense of place is enhanced via visual and auditory landmarks (which current 2D UIs do not provide) and via maintaining high spatial stability (things should stay where they are put unless the user moves them).
  • Human attention: The scarcest resource we have to deal with is not CPU power, network bandwidth, or hard drive space; it is human attention. Users are bombarded with more and more information throughout their day. A 3D environment allows for more ways to hierarchically arrange regions for information organization, and a user can quickly change their focus from the narrow (such as a file) to the broad (an overview of all the changing information coming into the user’s environment).
  • Scalability: We can present more information to users by using 3D. Collections of information can be tilted, warped, stacked, arranged in various geometric clusters, and conglomerated into iconic neighborhoods. In 2D, as information density increases, we are left with noise. In 3D, we can take advantage of various metaphors to show natural groupings.
  • Metaphor: This is a vastly underexplored area, ripe for innovation. Most 3D desktop replacements have focused on borrowing a particular metaphor and using that for information representation. Files in a 3D folder are one-to-one proxies for files in the file system, buildings on the horizon denote network resources, and so on. Our current bet, though, is that great wins come when the metaphor is used to give a sense of place, not as a stand-in for actual items of information. Elements from the real world (and other metaphors) are very useful as landmarks, grouping mechanisms, and status indicators, but not as items of information in themselves.

With the emergence of the web and the soon-to-be explosion in wireless devices, the pace of change in terms of what information we provide users is rapid. However, the pace of change in how we present that information is glacial. In some ways, it looks like “reverse” progress when you see how most web pages are just collections of text input fields and submit buttons. The ubiquity of the web demands that services presented over the web be accessible to the largest audience. Right now we don’t have the luxury of asking users to access their information in a 3D environment, which requires 3D hardware acceleration, large high-resolution displays, and decent audio systems, but that is changing. Today, even a mid-range PC has some kind of 3D acceleration built in.

The harder issue to address is UI design. We have the great fortune at Microsoft Research of having multi-disciplinary teams working on this. I am a designer with a sculpture background. Our small group alone includes several cognitive psychologists, several decision theory experts, and an audio composer! We can take the time to look at the very deep questions of 3D user interface design. We regularly run user studies to tease out and understand underlying human perceptual abilities. Is a user better at remembering where things are in a virtual 3D environment than in a 2D one? How does tilting text in 3D affect readability? What are the right durations for animated transitions between different information configurations? These kinds of questions have to be answered before we can even begin to make full-fledged 3D applications and desktop software.

But surely one of the biggest impediments is the disconnect between any 3D information environment we provide and the existing, strictly 2D nature of most desktop productivity applications. New technology, developed jointly by our group and the Windows 2000 team, is just now allowing the tentative embedding of standard, unmodified Windows applications into a 3D environment. Yes, each of these applications is still generally 2D, but with our redirection technology each application “lives” seamlessly in a 3D environment. It is a subtle difference conceptually, but during user studies we are seeing interesting effects.

Some of the most interesting and most rapidly evolving 3D UI design work is taking place in the gaming world. Many high-profile games are incorporating 3D representations of virtual worlds, shown from multiple perspectives. In a way, many game designers are tackling the same issues that any 3D windowing environment will need to address:

  • Show cause-and-effect: Show what the user can do (affordances), what they are doing (direct manipulation), and what they have done (situational awareness).
  • Combine related user actions: If the user wants to look at something, just let them click on it, rather than requiring free-form 3D navigation.
  • Distinguish UI from data: When objects and data in the 3D environment are represented very richly and are subject to perspective, lighting, and animation, it can be hard to find the controls for the objects.

Any work on 3D UIs is valuable. The more of these interfaces we get into users’ hands, the better we will be able to tell what works and what doesn’t. We have had much time to gather data on the existing Windows metaphor, and we are just starting to gain an understanding of how users can navigate, personalize, and share information within a virtual 3D environment.