See the World Through Microsoft
By Hui Ma, China Internet Weekly
July 5, 2009 11:00 AM PT

Why do objects appear to us the way they do? How does a beam of light change after multiple reflections? How can we enable a computer to digitally present physical principles in the real world? Finding answers to these questions was part of Yasuyuki Matsushita’s job description at Microsoft Research Asia. How can one deal with a picture blurred due to a jittering camera? Matsushita has also had to overcome his own personal challenges.

Matsushita, in a language “exclusive to researchers,” explained that “low-level vision research” and “full-frame video stabilization technology” could be applied to solve the aforementioned problems. “These happen to fall into the two subjects I'm looking into: photometry and video analysis.” Now the Lead Researcher of Visual Computing Group at Microsoft Research Asia, Matsushita said, “There is no close relation between the two, but that is just what interested me.”

A doctoral candidate from the University of Tokyo and an intern at Microsoft Research Asia; Tokyo, Japan, and the Sigma Building in Beijing; a major in electrical engineering and physics-based computer vision and video analysis and synthesis. The absence of relationships can be found everywhere in Matsushita’s life, and they have led to nice surprises and joyful coincidences. All of these have resulted in a story about incidence and necessity, as well as choice and insistence. In his story, it does not take an expert to notice the rigorousness of Japanese culture, the vividness of the American style, and the kindness shared by his Chinese colleagues at Microsoft Research Asia.

Encounter the Future

“Innovation is more of an accident,” said Hsiao-Wuen Hon, Managing Director of Microsoft Research Asia. The tie between Matsushita and Microsoft Research Asia, however, has to be an “inevitable accident.”

Matsushita obtained all of his degrees at the University of Tokyo in electrical engineering, with a focus on intelligent transportation systems. “However, I realized that my real passion is for more basic research that allows my work to be applied to different fields.” Two years before finishing his doctoral studies, Matsushita became strongly interested in computer vision. Electrical engineering is closely related to computer science in that computer system architecture and software are compulsory subjects in both disciplines, so shifting from electrical engineering to computer vision was not a leap for him.

In 2002, while still a doctoral candidate, Matsushita ran into Harry Shum, then Managing Director of Microsoft Research Asia, at an international conference on computer vision. “I knew him from long before, as he had been highly celebrated among researchers in computer vision,” Matsushita said. “I wanted to have the opportunity to work with, and learn more from, him.” Matsushita recommended himself to Shum, and “accidentally” became an intern at Microsoft Research Asia. During the four-month internship, Matsushita fell in love with the working environment there, and the internship resulted in a job contract.

It has been a mandate of Microsoft Research Asia to change more people’s lives by solving practical problems with technology. With Matsushita the opposite was also true: life itself turned out to be a constant source of inspiration and unexpected benefit to his work.

The birth of “full-frame video stabilization technology,” for instance, owes thanks to Matsushita’s wedding. His bride was disappointed with blurry images from hand-held cameras at the ceremony. The loving groom was motivated to put an end to blurry images through his own research efforts. “The digital image mosaic technology that existed then effectively stabilizes the images of still objects, but not moving ones,” he said. “With ‘full-frame video stabilization technology,’ lost pixels can be naturally filled in.” In a similar way, this technology is also able to cover unwanted text on screen, or remove dark dots due to dirty lenses.

Technology-based Light and Shadow Magic

“Video analysis will become increasingly important, as the boundary between still images and motion pictures is getting thinner and thinner. I believe images will eventually all be ‘on the move,’” Matsushita asserted.

Microsoft Research Asia’s research in computer vision is divided into two schools: high-level vision (such as face recognition technology) and low-level vision (such as photometry, which looks into the interaction between light and objects). Matsushita's projects fall into the latter.

“Photometry is also very important, because if we cannot understand what happens at the ‘lower’ levels, we will not be able to make breakthroughs at ‘higher’ levels. Development in research on ‘low-level vision’ is bound to inspire progress in ‘high-level vision’ research.”

Although it is all about changes undetectable to the “naked eye,” Matsushita gave a vivid description of one application of photometry: the 3-D restoration and digitization of physical images, which clearly relies on extraordinary vision well beyond that of a human.

“Multi-view stereo has been a traditional method for computer vision, where pictures taken from different perspectives are used to reproduce 3-D visual effects, but you find few details in the image. Photometric stereo is another approach where the video camera and the object are fixed, but the lighting conditions are manipulated to obtain a variety of observation values that are translated into surface orientations.”
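To make the photometric stereo idea concrete, here is a minimal sketch of the textbook version of the technique, not necessarily the method Matsushita's group uses. It assumes a matte (Lambertian) surface, exactly three known light directions, and no shadows; under those assumptions, the brightness of a pixel under each light is the dot product of the light direction with the surface normal scaled by the surface's reflectivity (albedo). The function names and the synthetic numbers below are illustrative only.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(a, b):
    """Solve the 3x3 linear system a @ x = b using Cramer's rule."""
    d = det3(a)
    if abs(d) < 1e-12:
        raise ValueError("light directions must not be coplanar")
    x = []
    for col in range(3):
        m = [row[:] for row in a]      # copy, then swap one column for b
        for r in range(3):
            m[r][col] = b[r]
        x.append(det3(m) / d)
    return x


def surface_orientation(light_dirs, intensities):
    """Recover (unit normal, albedo) for one pixel.

    Assumed model (Lambertian, no shadows):
        intensity_i = albedo * dot(light_dirs[i], normal)
    """
    g = solve3(light_dirs, intensities)          # g = albedo * normal
    albedo = math.sqrt(sum(c * c for c in g))
    if albedo == 0.0:
        raise ValueError("pixel is black under every light")
    normal = [c / albedo for c in g]
    return normal, albedo


# Synthetic check: camera and object stay fixed, only the lighting changes.
true_normal = [0.0, 0.6, 0.8]
true_albedo = 0.5
lights = [[0.0, 0.0, 1.0], [0.6, 0.0, 0.8], [0.0, 0.6, 0.8]]
observed = [true_albedo * sum(l * n for l, n in zip(light, true_normal))
            for light in lights]

normal, albedo = surface_orientation(lights, observed)
```

Repeating this per pixel yields a field of surface orientations, which is the kind of fine detail the quote describes; real systems typically use more than three lights and a least-squares fit to cope with noise and shadows.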

The first approach captures the overall shape of an object but does not provide details, while the second offers surface orientations instead of the overall shape. How does one combine the advantages of the two methods to obtain the most realistic 3-D graphics?

“If we mount a sustainable light source on a camera, we can manipulate the light source and the camera at the same time.” Matsushita and his interns from the University of Tokyo hammered out a 3-D digital video camera that did not look too different from one used for household purposes. “All accessories in this 3-D camera are easily available on the market. Hand-held devices should always be simple, as no one wants to carry around a monster,” said Matsushita.

Cultural Convergence

Now the Area Chairman for the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) 2009 and the International Conference on Computer Vision (ICCV) 2009, and an editorial board member of the International Journal of Computer Vision (IJCV) and the Journal of Computer Vision Applications (CVA), Matsushita believes that it is Microsoft Research Asia's support for its researchers to freely exchange ideas with the entire academic community that allows professionals to “see farther and clearer.”

“Through these positions, I’ve gained a better understanding of my research, which in turn helps me decide which research projects carry greater value for the future. Moreover, I’ve been able to network with many others of my kind in the field of computer vision,” Matsushita said with a smile.

“Most of my friends are my colleagues at Microsoft. We came from different cultural backgrounds, which itself is an interesting mixture. My wife had been to Beijing before, and she also likes the atmosphere and food here.” Like other researchers at Microsoft Research Asia, Matsushita is subject to influences from various cultures. Apart from photography and skiing, Matsushita loves to hunt for specialty foods in Beijing or play badminton and mahjong with his co-workers.

“It is apples and oranges to compare the culture inside Sigma Building and the traditional one outside in Beijing. My friends sometimes joke around, saying I am a Japanese man working for Americans in China,” said a laughing Matsushita.
