
Asia Faculty Summit 2012
October 26–27, 2012 | Tianjin, China


Title and Description


SIRE: A Social Image Retrieval Engine

This demo presents a web-based multimodal paradigm for large-scale social image retrieval, termed "Social Image Retrieval Engine" (SIRE), which exploits both textual and visual content in attempting to close the semantic gap between high-level concepts and low-level visual features. A relevance feedback mechanism is also provided to learn from users' feedback and refine the search results.

Steven C.H. Hoi, Nanyang Technological University 


EventTeller: Event Detection and Topic Tracking Based on Web Data

Event detection and topic tracking within the Internet environment is very important not only for managing web data effectively but also for satisfying users’ information needs. We propose a solution that provides real-time event detection and topic tracking from web data, and we have implemented a prototype system to validate the effectiveness of our proposed methods.


Our solution employs the following methods.

  • We select mainstream news site data as the source for event detection and topic tracking and integrate it with users’ micro-blogs to provide more comprehensive information.
  • We focus on the algorithm efficiency so that the proposed methods can detect events and topics in real time.
  • We exploit the structure information of web data, which could help compute the similarity between news pages and better support event detection and topic tracking.
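The structure-based similarity in the last bullet could, for example, weight matches in different page fields differently. The following sketch illustrates the idea with invented field names and weights; it is not the authors' actual algorithm.

```python
from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bags of words."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def page_similarity(p1: dict, p2: dict,
                    weights={"title": 0.5, "body": 0.3, "anchor": 0.2}) -> float:
    """Combine per-field cosine similarities, weighting structural fields
    (title, body text, anchor text) differently."""
    score = 0.0
    for field, w in weights.items():
        score += w * cosine(Counter(p1.get(field, "").lower().split()),
                            Counter(p2.get(field, "").lower().split()))
    return score

a = {"title": "earthquake hits city", "body": "a strong earthquake struck the city"}
b = {"title": "earthquake hits city center", "body": "rescue teams arrive after earthquake"}
print(round(page_similarity(a, b), 3))  # 0.488
```

A real system would use richer fields (headlines, publication time, named entities) and tuned weights; the point is only that structurally important fields can dominate the similarity score.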

The prototype system we developed crawls and indexes event-related web data in real time. To do that, we use multiple types of storage simultaneously. The system is built on the Hadoop framework and programmed on a cluster of computers based on the MapReduce model.

Jun He, Renmin University of China;
Xiaoyong Du, Renmin University of China


SoWise: Managing the Wisdom of Crowds on Social Media Services

Recently, the “Wisdom of Crowds” has attracted a great deal of interest from both the research and industrial communities. So far, most of the focus has been on specific crowdsourcing marketplaces such as Amazon MTurk or CrowdFlower, on which “requesters” publish tasks and “workers” select tasks according to their own needs. However, users on social media services can also serve as candidate “workers” for crowdsourcing tasks, and it is possible for “requesters” to actively manage the quality and cost of such crowds.


In this demo, we will present SoWise, a novel application for managing the wisdom of crowds on social media services. The system builds on the findings of our recent studies on crowdsourcing on social media services. SoWise can analyze the ability and cost of a given set of social media users when they are treated as potential “workers” for crowdsourcing tasks. Moreover, SoWise has two built-in mechanisms for enrolling such “workers,” namely “pay-as-you-go” and “prediction market.” For each mechanism, an optimal set of users can be selected according to specific quality or budget requirements.

Lei Chen, Hong Kong University of Science and Technology

Whisper: Tracing the Spatiotemporal Process of Information Diffusion in Twitter

Whisper is an interactive visualization for monitoring information diffusion in Twitter. Our design highlights three major characteristics of diffusion processes related to a topic of interest in social media: the temporal trend, the social-spatial extent, and the community response. These social, spatiotemporal processes are conveyed through a sunflower metaphor, whose seeds are often dispersed far away. In Whisper, we summarize the collective responses of communities to a given topic based on sentiments extracted from tweet data, and trace the pathways of retweets on a spatial hierarchical layout.

Huamin Qu, Hong Kong University of Science and Technology 

Practical Approaches for Dummy-based Location Privacy Protection

We will show a demo system that presents two methods for location privacy protection, which we proposed in joint work with Microsoft Research Asia, including the CORE 7 project. In the demo system, a user can arbitrarily specify a route and the stopping points of a trip on a map, and the system then generates dummy trajectories in real time to protect the user's location privacy (the user's current and past positions and trajectory). Our methods take into account physical restrictions in the real world, such as the user's moving speed and the road topology. They try to distribute the dummies and the user as uniformly as possible in the target region, and to make their paths cross frequently, which reduces the traceability of the user's trajectory.
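As a toy illustration of the speed restriction mentioned above (our own simplification, not the proposed method), a dummy can be walked through waypoints while never exceeding the user's maximum speed per time step:

```python
def step_dummy(pos, target, max_speed):
    """Move a dummy one time step toward a target, never exceeding max_speed,
    so the dummy obeys the same physical speed limit as the real user."""
    dx, dy = target[0] - pos[0], target[1] - pos[1]
    dist = (dx * dx + dy * dy) ** 0.5
    if dist <= max_speed:
        return target
    scale = max_speed / dist
    return (pos[0] + dx * scale, pos[1] + dy * scale)

def dummy_trajectory(start, waypoints, max_speed, steps):
    """Generate a physically plausible dummy trajectory through waypoints."""
    traj = [start]
    targets = iter(waypoints)
    target = next(targets, start)
    for _ in range(steps):
        pos = step_dummy(traj[-1], target, max_speed)
        if pos == target:
            target = next(targets, target)
        traj.append(pos)
    return traj

traj = dummy_trajectory((0.0, 0.0), [(3.0, 4.0), (6.0, 0.0)], max_speed=1.0, steps=12)
```

The authors' methods additionally constrain dummies to the road topology and optimize where the dummy paths cross the user's path; this sketch only shows the speed constraint.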

Takahiro Hara, Osaka University 


From SenseCampus to SenseCity: Towards Real-World Event Detection with Place-Triggered Geotagged Tweets and Sensor Networks

Recently, much research has addressed the detection of real-world events from social media such as Twitter. However, geotagged tweets often contain noise (tweets whose content is unrelated to the user's location), which is a problem for detecting real-world events. To address this problem, we define the Place-Triggered Geotagged Tweet: a tweet that has both a geotag and a content-based relation to the user's location. We designed and implemented a keyword-based matching technique to detect and classify place-triggered geotagged tweets, and we also present two example applications for visualizing them. This project was originally supported by Microsoft Research Asia as the SenseCampus project. We also discuss possibilities for detecting further real-world events with sensor networks.
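A keyword-based matching step of the kind described above might, in its simplest form, look like the following sketch (the keyword lists and place names are invented for illustration):

```python
def is_place_triggered(tweet_text, place_keywords):
    """Return True if the tweet's content mentions any keyword associated
    with the place it was geotagged at (the content-based relation)."""
    text = tweet_text.lower()
    return any(kw.lower() in text for kw in place_keywords)

# Hypothetical keyword lists per place; in practice these would be built
# from place names, facility vocabularies, and so on.
keywords = {"cafeteria": ["lunch", "cafeteria", "menu"],
            "library": ["library", "book", "study"]}

def classify(tweet_text, geotagged_place):
    """Label a geotagged tweet as place-triggered or noise."""
    kws = keywords.get(geotagged_place, [])
    return "place-triggered" if is_place_triggered(tweet_text, kws) else "noise"

print(classify("Having lunch with friends!", "cafeteria"))   # place-triggered
print(classify("Check out my new blog post", "cafeteria"))   # noise
```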

Takuro Yonezawa, Keio University


Extending from Our COMPAD Simulator to the Cloud Platform

Our previous COMPAD simulator, highlighted by Microsoft Research Asia, was intended to help students understand various computer systems and low-level program behavior on registers and related devices in a fundamental computer course. Building on that success, we will first demonstrate the enhanced COMPAD simulator, which models systems through the IEEE Learning Object Metadata standard for greater flexibility and extensibility. On top of this, we will show how our flexible framework for the COMPAD simulator can be adapted to a cloud platform, namely Windows Azure, as a repository of simulation models, demonstrating the potential uses of high-performance computing models for solving various problems in advanced computer courses.

Vincent Tam, The University of Hong Kong


Vibrotactile Applications for Mobile Devices

Several applications of vibrotactile feedback in mobile devices will be demonstrated. The demonstration includes a real-time audio-to-vibrotactile conversion algorithm and a new vibrotactile pattern authoring method based on the user’s demonstration.

Seungmoon Choi, Pohang University of Science and Technology (POSTECH)

Mobile Context Monitoring Platform for Sensor-Rich Environments

We present a mobile sensing platform for context computing in sensor-rich mobile environments. The platform runs as a middleware on top of smartphones and sensor devices. We plan to demonstrate the key functionalities of the platform.

  • Context monitoring: the platform provides sensing applications with APIs to specify the contexts of interest (such as location or activity) in a declarative query. We will show diverse context types and sensing devices that the platform supports, and the detailed operation of context monitoring.
  • Gesture monitoring: the platform supports mobility-robust gesture recognition with a hand-worn sensor device. On the platform, we show the control of an MP3 player by hand gestures in dynamic mobile situations such as walking and running; for example, a rotating-hand gesture adjusts the volume.
  • Cooperative monitoring: we address the severe energy problem caused by continuous sensing by means of opportunistic cooperation among nearby mobile users. The platform automatically detects nearby cooperators and performs cooperative monitoring.
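To illustrate the declarative context-monitoring API from the first bullet, here is a minimal sketch; the class and method names are our assumptions, not the platform's real interface.

```python
class ContextMonitor:
    """Minimal sketch of a context-monitoring middleware: applications
    register declarative queries and get callbacks when the context holds."""
    def __init__(self):
        self.queries = []

    def register(self, context_type, predicate, callback):
        """Register a context-of-interest query, e.g. activity == running."""
        self.queries.append((context_type, predicate, callback))

    def on_sensor_update(self, context_type, value):
        # Evaluate every registered query against the new sensor reading.
        for ctype, predicate, callback in self.queries:
            if ctype == context_type and predicate(value):
                callback(value)

events = []
monitor = ContextMonitor()
# "Notify me when the user's activity is running."
monitor.register("activity", lambda v: v == "running", events.append)
monitor.on_sensor_update("activity", "walking")   # no match
monitor.on_sensor_update("activity", "running")   # fires the callback
print(events)  # ['running']
```

The real platform sits between many sensing applications and many sensor devices, so queries like this can be shared and scheduled efficiently rather than each application polling sensors itself.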

In addition, as a proof-of-concept, we plan to show a pervasive exercise game on top of the mobile sensing platform. It is an interactive and collaborative racing game running on multiple exercise devices, such as the jump rope and hula hoop. The platform extracts exercise intensity information from the devices and utilizes it as a game input value.

Junehwa Song, Korea Advanced Institute of Science & Technology (KAIST)



Constructing Mobile Phone-Based Context Recognition Systems for Real World Use

In this project, we aim to construct a context-aware system that performs well for any purpose. Most conventional research has not achieved both accuracy and light weight at the same time. Moreover, in the actual use of context-aware systems, there are many problems that conventional research does not consider: it is difficult to collect data from users, developers and users are not in the same place, and the number of possible gestures is enormous. This project consists of three themes that address these problems.

  1. We construct accurate, online activity recognition for the real world by using smartphones (Windows Phones) and a cloud server (Windows Azure). Training the system with data from each user is common in conventional studies, but it is not realistic and leads to deteriorated performance, since collecting and annotating data are difficult tasks for ordinary users. Although duplicated processing and a large dataset could replace individual training, they are a heavy load for a smartphone. We therefore construct a general context-aware system by offloading heavy tasks to the server.
  2. This theme investigates the effect of communication errors between developers and users through instructions for gesture recognition, and proposes a method to bridge the gap.
  3. This theme investigates the effect of the explosion in the variety of gestures that cover daily life, and proposes a method to recognize them where problems arise. The output of this project should make it easy to construct reliable context-aware products that can spread widely around the world.

Terada Tsutomu, Kobe University


Windows Phone Applications for Campus of USTC

Thanks to the development of wireless networks, most people at universities already use smartphones. For teachers and students, many application demands are closely related to campus features, such as location information. These demands cannot always be met by general mobile applications, and they give computer science students a good opportunity to practice developing suitable applications.


The demo shows a variety of Windows Phone applications for the campus of the University of Science and Technology of China, including a course schedule, a school bus schedule, a campus map, a tool for finding available classrooms, lecture notes, and campus card information queries. All these applications were designed and developed by the students themselves, and most have already received positive user feedback.

Guangzhong Sun, University of Science and Technology of China 


Mimic Motion: Action Sharing System with Windows Phone

In this demo, we present an application that motivates people to share body-action data, which will lead to a large dataset for research on human activity recognition.


This system shares the video, acceleration, and motion data that users publish, and automatically evaluates it. First, the user watches a video in a web app and mimics the video's action while using a Windows Phone app. Then, the system evaluates the result from the acceleration data. Finally, the system displays the evaluation results in the web app.
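The description does not specify how the acceleration data is scored, but one simple possibility is correlating the mimicked trace against the reference action, as in this illustrative sketch:

```python
import math

def correlation(a, b):
    """Pearson correlation between two equal-length acceleration traces:
    1.0 means a perfect mimic of the motion's shape, 0 means no relation."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb) if sa and sb else 0.0

reference  = [0.1, 0.9, 0.2, 0.8, 0.1]  # acceleration magnitude of the video's action
good_mimic = [0.2, 1.0, 0.3, 0.7, 0.2]
bad_mimic  = [0.5, 0.5, 0.5, 0.4, 0.5]

print(correlation(reference, good_mimic) > correlation(reference, bad_mimic))  # True
```

A deployed evaluator would also need to align the traces in time (for example, with dynamic time warping) before comparing them.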

Sozo Inoue, Kyushu Institute of Technology;
Yuichi Hattori, Kyushu Institute of Technology 


Genius-on-the-Go: Design and Exploration of One Windows Phone Accessory Using FM Radio

Smartphones, with their rich sets of sensors and communication interfaces, provide a convenient platform for social applications and have become an integral part of our daily lives. In Genius-on-the-Go, we explore a new proximity-based social sharing mechanism for mobile users. Our key contribution is an efficient discovery layer that enables interactions with nearby devices using FM radio. However, since current mobile phones do not expose radio drivers that support transmit mode (even though most hardware chips have this capability), we designed a custom accessory that incorporates an FM chip and communicates with the phone via the headphone jack. Using this accessory, we demonstrate efficient discovery and communication between nearby smartphones, as well as a mobile-DJ application that employs the accessory.

Jiao Wang, Northeast University, China 


Report of Cloud Computing Curriculum at SJTU

As cloud computing evolves, we have recognized that even within the academic and industrial communities there are people with little or no understanding of its fundamentals or of how it is realized. As cloud computing will likely be influential in the future, it is imperative to clear away any misconceptions or confusion.


It is for this reason that SJTU has endeavored to develop a cloud computing curriculum to complement its current operating-systems and Windows courses. The objective is to introduce students to the principles and intricacies of cloud computing, with hands-on experience in cloud application development and deployment. The curriculum is based on the Windows Azure platform, with comparisons to other cloud platforms such as Cloud Foundry and App Engine. In-depth discussion of how cloud computing affects business and society is included, to enable recognition of the changed landscape in this cloudy era. Specific topics include cloud computing basics, cloud ontology, cloud software architecture, virtualization, Windows Azure, and cloud construction. This course would not have been possible without the support of Microsoft Research Asia and the Microsoft China Cloud Innovation Center, which provided a wide range of support including funding, materials, manpower, and Azure Access Passes. This support greatly facilitated the course development.

Hengming Zou, Shanghai Jiao Tong University 


Microsoft Research Asia Curriculum Program

Microsoft Research Asia encourages the advancement of curricula in computer science and related disciplines and works with academia to enhance teaching content and explore innovative teaching methodologies. Researchers at Microsoft Research Asia have volunteered to teach credit courses at universities, primarily on cutting-edge technologies that are not yet addressed by formal courses. With curriculum support from Microsoft Research Asia, the Windows Phone, Kinect, Windows Core, and Windows Azure Cloud courses were taught extensively at leading Asian universities, and the new Windows Embedded curriculum has seen dramatic expansion among universities. The Software Engineering Curriculum Program cultivated high-quality software talent to boost rapid development of the software industry through partnerships with leading universities in Asia. Moving forward, our curriculum program is going online to benefit more students throughout Asia. We look forward to developing more collaborations with academia to provide instruction in the latest Microsoft technology and cultivate the best talent.

Bei Li, Microsoft Research Asia;
Heidi Fu, Microsoft Research Asia


Microsoft Research Asia Talent Program

At Microsoft Research Asia, we are not only creating technologies for tomorrow, but also for the day after tomorrow. Microsoft Research Connections supports innovative projects to advance state-of-the-art research and teaching in computer science and computational science around the world. We recognize and support top talent as interns, fellows, and scholars who are invited to work at Microsoft Research labs worldwide. The Microsoft Research Asia fellowship program and Young Faculty program support outstanding students and early-career faculty to help advance computer science and the computational sciences in academia. Our internship program supports students who are considering careers in research.

Wenxi Cai, Microsoft Research Asia;
Jennifer Cao, Microsoft Research Asia


ChronoZoom: Bringing History into the Future

Learn about and try out ChronoZoom, an intuitive, online tool that accesses multimedia data to provide cross-discipline insight into all of history and to bridge the gap between the sciences and the humanities. Explore this master timeline of the cosmos, Earth, life, and human experience, from billions of years ago to today. ChronoZoom unifies a wide variety of data and historical perspectives, enabling researchers, educators, and students to examine historical events, trends, and themes and to synthesize unexpected relationships and historical convergences that help explain the sweep of Big History. 

Richie Huang, Microsoft Research Asia


Urban Computing with City Dynamics

We will show a prototype system that discovers sections of a city that are used for various purposes (for example, residential, commercial, and educational areas) by using human mobility among areas and the points of interest located in each area. We further identify the intensity of each function in different locations. The results generated by our system can benefit a variety of applications, including urban planning, location choosing for a business, and social recommendations. We evaluated our method by using large-scale, real-world datasets, consisting of two point-of-interest datasets of Beijing (from 2010 and 2011) and two 3-month GPS trajectory datasets (representing human mobility) generated by more than 12,000 taxicabs in Beijing in 2010 and 2011.
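As a rough illustration of identifying the intensity of each function in a region, one could score POI categories with a TF-IDF-style weighting (a simplification of our own; the actual system also exploits human mobility data):

```python
import math
from collections import Counter

def function_intensity(region_pois, all_regions):
    """Score each POI category in a region by its frequency there, weighted
    by how rare the category is across all regions (TF-IDF style)."""
    total = sum(region_pois.values())
    n = len(all_regions)
    scores = {}
    for cat, count in region_pois.items():
        tf = count / total                                # frequency in this region
        df = sum(1 for r in all_regions if cat in r)      # regions containing the category
        scores[cat] = tf * math.log(n / df)
    return scores

regions = [
    Counter(restaurant=40, shop=30, school=2),   # commercial-looking area
    Counter(school=20, restaurant=5),            # educational-looking area
    Counter(shop=25, restaurant=20),
]
scores = function_intensity(regions[1], regions)
print(max(scores, key=scores.get))  # school
```

Categories that appear everywhere (like restaurants) score low, so the dominant function of each region stands out.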

Xing Xie, Microsoft Research Asia 


Engkoo Pinyin

Microsoft Research Asia utilized cutting-edge research in natural language processing (NLP), machine learning (ML), data mining, and cloud computing to build a state-of-the-art input method editor (IME) called Engkoo Pinyin. Advanced natural language processing was applied to build a novel core engine that utilizes new decoder algorithms based on statistical machine translation techniques and modeling. The resulting engine is state-of-the-art in accuracy, and optimized for high performance. Additionally, NLP is used to offer an innovative English input engine and writing assistance experience. Machine learning and data mining were utilized to discover new language usage on the web as it changes daily, and to build highly efficient, large-scale language models and lexicons for both the cloud and the client. Essentially, as the web continues to grow in language knowledge, Engkoo Pinyin learns from it and improves itself automatically over time. Cloud computing is used for delivering the IME as a high performance, scalable service. We also use web search and other cloud-based applications to power rich candidates—a new kind of input suggestion technology that Microsoft invented, which can better match user intent and needs for task completion.

Mu Li, Microsoft Research Asia


DualView: Enabling Concurrent Dual Views on Common LCD Screens

We present a pure software solution (with no hardware modification) that allows us to present two independent views concurrently on the most widely used and affordable type of LCD (liquid crystal display) screen, namely, twisted nematic (TN). This enables many useful and interesting applications in everyday scenarios. 

Xiang Cao, Microsoft Research Asia 


SlickFeel: Sliding and Clicking Haptic Feedback on a Touchscreen

We present SlickFeel, a single haptic display setup that can deliver two distinct types of feedback to a finger on a touchscreen during typical operations of sliding and clicking. Sliding feedback enables the sliding finger to feel interactive objects on a touchscreen through variations in friction. Clicking feedback provides a key-click sensation for confirming a key or button click. Two scenarios have been developed to demonstrate the utility of the two haptic effects. In the first, simple button-click scenario, a user feels the positions of four buttons on a touchscreen by sliding a finger over them and feels a simulated key-click signal by pressing on any of the buttons. In the second scenario, the advantage of haptic feedback is demonstrated in a haptically-enhanced thumb-typing scenario. A user enters text on a touchscreen with two thumbs without having to monitor the thumbs’ locations on the screen. By integrating SlickFeel with a Kindle Fire tablet, we demonstrate that SlickFeel can be used with existing mobile touchscreen devices. 

Hong Tan, Microsoft Research Asia 


Face SDK

The Microsoft Research Face software development kit (SDK) integrates the latest face technologies from Microsoft research teams. It provides state-of-the-art algorithms for processing face images, such as face detection, alignment, tracking, recognition, and cartoon generation. Developers can use these technologies to experiment with interesting scenarios on Windows Phone, Windows PC, and Windows Slate devices.


We’ll demonstrate some basic functionality, like face detection, alignment, tracking, recognition, and swap. 

Ning Xu, Microsoft Research Asia;
Qiufeng Yin, Microsoft Research Asia 


Kinect Reality: Real-Time 3-D Scanning-Based Augmented Reality

We present Kinect Reality, an augmented reality system that combines real-time, view-oriented 3-D scanning, an HMD (head-mounted display), and hand-gesture interaction. For real-time, view-oriented 3-D scanning, we designed a Kinect Helmet: a Kinect attached to a helmet worn by the user and aligned with the user's view direction. With the Kinect Helmet, we can achieve WYSIWYR (What You See Is What You Reconstruct!), meaning that the parts of the scene in the user's view range are reconstructed by real-time 3-D scanning with the Kinect. The HMD displays the reconstructed scene, possibly with augmented virtual objects, and the user can interact with the scanned and augmented objects using hand gestures on a tablet. To show the capability and performance of Kinect Reality, we present digital graffiti as an interactive demo application, in which arbitrary objects (for example, building walls or car bodies) can be used as the canvas for augmented graffiti.

Seungyong Lee, Pohang University of Science and Technology (POSTECH) 


Multi-User Source Localization Using Audio-Visual Information of Kinect

Along with the advancement of computer technology, more natural interfaces for human-computer interaction have been introduced. Natural user interfaces (NUIs) enable us to interact with computers through modalities such as hearing, voice, gesture, and touch. However, most of these NUIs are limited to single-sensor, single-user scenarios.


In this demonstration, we bring forward a real-time framework for multi-user source localization by using audio/visual information from Kinect. Video cues, which are immune to acoustic effects, such as reverberation and background noise, are employed primarily to extract candidate locations for speech sources by using head-detection/tracking algorithms. An audio stream that is acquired by four-channel microphones is then used to detect speaking users among the candidate locations.


Since the proposed framework is designed to overcome limitations and drawbacks of each sensor, it not only works well in a controlled environment, but also in noisy and reverberant conditions. Our demonstration system provides a new insight for NUIs that use multiple sensors in a multi-user environment.

Hong-Goo Kang, Yonsei University


Performance Capture of Interacting Characters Using Handheld Kinect Sensors

We present an algorithm for marker-less performance capture of interacting humans using only three handheld Kinect sensors. Our method reconstructs human skeletal poses, deforming surface geometry, and camera poses for every time step of the depth video. As opposed to previous performance capture methods, our algorithm succeeds in general, uncontrolled indoor scenes with potentially dynamic backgrounds, even when the cameras are moving. We believe that advances in depth camera technology and improvements in sensor resolution, quality, portability, and compatibility (between Kinect sensors) will enable the efficient production of 3-D content in everyday life.

Yebin Liu, Tsinghua University;
Genzhi Ye, Tsinghua University;
Qionghai Dai, Tsinghua University 


EnseWing: Ensemble Playing Experience for Children Without Musical Training

Playing in a music ensemble not only offers children music education opportunities, but also contributes to the development of other skills, such as collaboration. However, playing in an ensemble usually requires high musical instrument skills, and may exclude those children without such skills. To open up this opportunity to more children, we designed EnseWing, an interactive experience to enable children without musical instrument training to experience ensemble playing. A field deployment was conducted in a primary school in Beijing. 

Fei Lu, Chinese Academy of Sciences 


Writing in the Air—Recognition of Virtual Handwritten Characters Based on Kinect

We present a real-time handwritten Chinese character recognition system that uses Kinect. By moving a hand to “write” virtual characters in the air, people can input characters in a new way. The recognition system runs in a real-world environment, providing an enhanced input experience for human-computer interaction. We designed a fast finger-tracking and fingertip-detection algorithm that takes advantage of depth, color, and motion information. The system recognizes all Chinese characters, the 26 English letters (uppercase and lowercase), and the 10 digits (0 to 9) in real time.

Lianwen Jin, South China University of Technology;
Xin Zhang, South China University of Technology 


Towards Better Communication with Kinect

Sign language, the primary means of communication for the hearing impaired, is only understandable to those who have learned the language, a situation that can lead to debilitating social isolation. This demo will show our initial efforts on sign language recognition and translation with Kinect. Because Kinect captures RGB images and depth simultaneously, we apply it to capture the signer's actions and recognize sign language from both the hand trajectory and the hand shape.

Guang Li, Yushun Lin, Yili Tang, Zhihao Xu, Hanjing Li, and Xilin Chen,
Chinese Academy of Sciences and Beijing Union University 


Scanning 3-D Full Human Bodies Using Kinect Sensors

Depth cameras, such as the one used by Microsoft Kinect, are much cheaper than conventional 3-D scanning devices, so everyday users can easily acquire them. However, the depth data that Kinect captures beyond a certain distance is of extremely low quality. In this demo, we present a novel scanning system for capturing 3-D full human body models by using multiple Kinect sensors. To avoid interference, we use two Kinect sensors to capture the upper and lower parts of a human body without overlapping regions, and a third Kinect sensor to capture the middle part of the body from the opposite direction.


We propose a practical approach for registering the various body parts from different views under non-rigid deformation. First, a rough mesh template is constructed and used to deform successive frames in pairs. Second, global alignment is performed to distribute errors in the deformation space, which solves the loop closure problem efficiently. Misalignment caused by complex occlusion can also be handled reasonably by our global alignment algorithm. The experimental results demonstrate the efficiency and applicability of our system, which obtains impressive results in a few minutes with low-priced devices and thus is practically useful for generating personalized avatars for everyday users. Our system has been used for 3-D human animation and virtual try-on, and can further facilitate a range of home-oriented virtual reality (VR) applications.

Ligang Liu, University of Science and Technology of China 


The Kite Runner

The kite plays an important role in Chinese cultural heritage. This project uses technical innovations with Microsoft Kinect technology to enable users to fly kites indoors and to help digitally preserve this aspect of our cultural heritage, which is on the verge of extinction.

Yingqing Xu, Tsinghua University 


A Hand Gesture API for Kinect

We demonstrate “Winect,” an open-source API that uses the Kinect sensor's depth camera to recognize a variety of hand gestures and additional low-level features, such as finger positions and hand orientation. We show several apps that use this API, including one that lets you control the computer's mouse cursor. Different gestures can be mapped to the various mouse functions, such as right, left, and middle click, and gestures can also be used to scroll the screen. We also show how to use depth position and hand gestures together for various types of game play. The API is publicly available.
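A gesture-to-mouse mapping of the kind described could be consumed like the following sketch; the gesture names and handler are hypothetical, not Winect's actual interface.

```python
# Hypothetical gesture-to-mouse mapping in the style described above;
# the gesture vocabulary and event format are our assumptions.
MOUSE_ACTIONS = {
    "pinch": "left_click",
    "two_finger_pinch": "right_click",
    "fist": "middle_click",
    "two_finger_swipe": "scroll",
}

def handle_gesture(gesture, hand_position):
    """Translate a recognized hand gesture plus hand position into a mouse
    event: the cursor follows the hand, and gestures trigger clicks/scrolls."""
    action = MOUSE_ACTIONS.get(gesture, "move")
    return {"action": action, "x": hand_position[0], "y": hand_position[1]}

event = handle_gesture("pinch", (120, 340))
print(event)  # {'action': 'left_click', 'x': 120, 'y': 340}
```

An unrecognized gesture falls back to a plain cursor move, so tracking continues smoothly between discrete gestures.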

Ho Kok Wei (Daniel), National University of Singapore