Use Voice to "Watch" Videos
China Internet Weekly
China Internet Weekly published a full-page article on Microsoft Research Asia's recent progress in voice recognition systems. The article examines new possibilities for further research in the area, as well as existing problems that Microsoft Research Asia is working on, such as searching efficiently among the growing number of audio and video files on the Internet, and improving the accuracy of search results when the quality and content of the video and audio may be poor. The new technologies developed by Microsoft Research Asia are highly impressive.
The article states that computers can easily handle text, but voice presents a much greater challenge. Over the past 20 years, researchers have put great effort into aligning computers with human needs, and through that effort voice recognition technology has matured. In response to the problem of inaccurate search results, Microsoft Research Asia has developed a technology that is based on video content and allows computers to "learn" from how that content appears.
Another technology that helps match voices to the correct words analyzes the voice content and then offers multiple candidate interpretations. The system also has two models, "voice" and "language," which work together to produce the most logical result, greatly enhancing accuracy. These are only a sample of the many developments made at Microsoft Research Asia, with more expected in the near future to further improve voice recognition.
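The idea of combining a "voice" model and a "language" model over several candidate interpretations can be illustrated with a minimal sketch. This is not Microsoft Research Asia's system: the function name `rescore`, the `lm_weight` parameter, the candidate phrases, and all probabilities below are invented for illustration. It assumes each candidate already carries a log-probability from each model and simply ranks candidates by a weighted sum of the two.

```python
import math


def rescore(candidates, lm_weight=0.5):
    """Rank candidate transcriptions by a combined score, best first.

    Each candidate is (text, voice_logp, language_logp), where the two
    log-probabilities are assumed to come from a "voice" model and a
    "language" model respectively. All names here are hypothetical.
    """
    scored = []
    for text, voice_logp, language_logp in candidates:
        # Weighted sum of log-probabilities: the language model can
        # overrule a candidate that merely sounds slightly closer.
        combined = voice_logp + lm_weight * language_logp
        scored.append((combined, text))
    scored.sort(reverse=True)
    return [text for _, text in scored]


# Made-up scores: the first phrase sounds closest, but the second is
# far more plausible as language.
candidates = [
    ("wreck a nice beach", math.log(0.40), math.log(0.001)),
    ("recognize speech", math.log(0.35), math.log(0.05)),
    ("recognise peach", math.log(0.25), math.log(0.0005)),
]

best = rescore(candidates)[0]
print(best)
```

With these invented numbers, the voice score alone would favor the first phrase, while the combined score favors "recognize speech", which is the sense in which the two models together can yield a more logical result than either alone.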