3-D Talking Heads On the Horizon
MSR Project Enables Users To Create Images of Themselves

MicroNews News Service

  In his office in Building 112, Zhengyou Zhang reaches for a stack of books with a small digital video camera propped on top. He slides the stack this way and that until his face is directly in front of the camera and the image of his face is in the center of the video-display window on his monitor. He clicks a button and steadily turns his head from left to right for five seconds.

From the video, he selects two individual images in which he is facing front. He uses his mouse to make a dot in the corner of each eye, at each corner of the mouth, and at the nose tip. He clicks "Okay."

See for Yourself During Tech Festival

See the 3-D face-modeling technology and more during the MSR Tech Festival 2001. It will be held from 10 a.m. to 7 p.m. Jan. 22 in the McKinley and Hood/Baker rooms of the Microsoft Conference Center, located in Building 33. The event, open to employees, will feature more than 120 research-technology demonstrations. International research offices also will showcase their latest technologies.

The program does its dance, and, less than a minute later, an image, also called a texture map, is created. The texture map is applied to, or "wrapped" around, a mesh that mimics the shape of a head. It's not a standard mesh, either. The mesh that is the underlying structure for the model changes based on the shape of the face used in the video. So, when Zhang applies the texture map of his face to the mesh the program generated, the final model has the same face and bone structure as Zhang's. Zhang's animated face floats around the window on the monitor and blinks. Then it smiles and says, "Hello."

Nearly two years ago, Zhang, from the Vision Based Modeling group within Microsoft Research (MSR), and Zicheng Liu, from MSR's 3-D Graphics group, wanted to know if they could enable a user to create an animated, accurate, three-dimensional model of a face: not just any face, but the user's own. At the time, they had no idea whether it could work, which is the mark of risk-taking researchers.

They went to work over the next 1½ years, trying a variety of technologies and methods. Finally, early last year, they accomplished their goal. They created a program that can make a 3-D face model from a simple digital video camera, the kind that many users have at home on their PCs. Not only that, but the 3-D models can talk, blink, frown, smile, and look sad. The model can be created in less than three minutes, too. Existing technologies such as text-to-speech and "visemes" (visual representations of spoken sounds, such as the round shape the lips take on when pronouncing the letter "o") helped make the face interactive.
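For readers curious about how visemes can drive a talking face, here is a minimal sketch in Python. It is an illustration only, not the MSR program: the letter-to-shape table and the function name visemes_for are hypothetical, and real text-to-speech systems typically map phonemes, not letters, to mouth shapes.

```python
# Hypothetical letter-to-viseme table, for illustration only.
# Real systems map phonemes (speech sounds), not letters, to mouth shapes.
VISEMES = {
    "o": "rounded",
    "u": "rounded",
    "a": "open",
    "e": "spread",
    "i": "spread",
    "m": "closed",
    "b": "closed",
    "p": "closed",
    "f": "lip-to-teeth",
    "v": "lip-to-teeth",
}


def visemes_for(text):
    """Return the sequence of mouth shapes for a typed message.

    Letters missing from the table fall back to "neutral", non-letters
    are skipped, and consecutive repeats are collapsed so the face only
    changes when the mouth shape actually changes.
    """
    shapes = []
    for ch in text.lower():
        if not ch.isalpha():
            continue
        shape = VISEMES.get(ch, "neutral")
        if not shapes or shapes[-1] != shape:
            shapes.append(shape)
    return shapes


if __name__ == "__main__":
    print(visemes_for("Hello"))
```

An animation layer could then step through the returned shapes in time with the synthesized audio, blending the face mesh from one mouth shape to the next.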

Now, they need to sell the technology inside Microsoft. Since last year, the two have been enticing groups to use the technology. Zhang and Liu propose many interesting uses for 3-D face modeling.

Games are an obvious choice, as Microsoft Chairman Bill Gates demonstrated during his keynote address at the Game Developers Conference, held in San Jose, Calif., in March, when he announced the launch of Xbox. With the face-modeling technology, players can put their own face on a character. Importing a 3-D image of a player's face and applying it to a character would dramatically enhance the role-playing experience offered in many games.

"The avatar becomes a true representation of the controller of the character," Liu said. "Being able to use facial expressions in a game and talk would also enhance the gaming experience."

Users could send greeting cards using their own faces on various backgrounds and have the faces talk to the recipient. They could type a message and have their 3-D face model speak it. Users also could record their voices to be used with the models they create. Quick expressions such as a smile or a frown could be triggered at the click of a button or by typing a simple command.

"There are lots of applications for this technology," Zhang said, "and we have plans to improve the realism of expressions, which we will be working on in the very near future."