External Research & Programs: Awards

eScience 2007 Awards

Learn more about the biomedical computing projects selected by Microsoft Research.

eScience Award Recipients

Data Mining Everywhere
Alexander G. Gray
Georgia Institute of Technology, U.S.

Data is everywhere: scientists and businesses are dealing with an explosion of data that opens unprecedented possibilities, and everyone now has gigabytes of music, video, photos, and email on their computers and smaller devices. In response to needs in science and business, data mining technology is now fairly mature. So why isn't data mining everywhere? And what if it were? Having developed data mining methods and software for users in over 30 different applications, from scientists to non-scientists, the investigator has found that the main factors that make statistical machine learning tools useful are the same whether or not the user is mathematically trained.

This project will map three state-of-the-art algorithms used in these other applications onto a customized, large-scale, relational astronomical SQL Server database: 1. Fast all-nearest-neighbors, 2. Fast density estimation, 3. Rules of thumb for algorithmic transfer (a synthesis of the tradeoffs and best practices in developing fast algorithms within the relational database setting). The two statistical methods have been chosen for their general applicability to central problems in observational cosmology, their need for tractable solutions due to their inherent difficulty, and their use in exploring the space of efficient algorithmic behavior and interaction with SQL. Kernel density estimation is approximate, requiring notions of error control, and returns a fixed-size query result, while all-nearest-neighbors is exact and returns a variable-size query result. Both may explore small and large spatial ranges, though with different emphases, which may lead to different heuristics for mapping the abstract algorithms to the database setting. Exploring these two different methods should give a good understanding of the overall tradeoff space in mapping algorithms to relational databases, and should provide a critical foundation for implementing a large number of the most useful machine learning methods in a highly efficient manner within relational databases.
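To make the contrast between the two statistical methods concrete, here is a minimal Python sketch of their naive baselines. It is not the project's fast algorithm and does not involve SQL Server. The function names `all_nearest_neighbors` and `kde_at`, the Gaussian kernel, and the `cutoff` parameter are all assumptions chosen for illustration. The nearest-neighbor search is exact. The density estimate can drop distant points, which introduces an error that is easy to bound, a simple stand-in for the kind of error control mentioned above.

```python
import math

def all_nearest_neighbors(points):
    """Exact brute-force search: for each point, return (index, distance)
    of its closest other point. Cost is O(N^2) distance computations."""
    result = []
    for i, p in enumerate(points):
        best_j, best_d = None, math.inf
        for j, q in enumerate(points):
            if i == j:
                continue
            d = math.dist(p, q)
            if d < best_d:
                best_j, best_d = j, d
        result.append((best_j, best_d))
    return result

def kde_at(points, query, bandwidth, cutoff=math.inf):
    """Gaussian kernel density estimate at one query point.

    Points farther than `cutoff` (a hypothetical parameter) are skipped.
    Each skipped kernel value is at most the kernel value at distance
    `cutoff`, so the estimate is too low by no more than that amount.
    """
    dim = len(query)
    norm = (2.0 * math.pi * bandwidth ** 2) ** (dim / 2.0)
    total = 0.0
    for p in points:
        d = math.dist(p, query)
        if d <= cutoff:
            total += math.exp(-d * d / (2.0 * bandwidth ** 2))
    return total / (len(points) * norm)

def truncation_error_bound(dim, bandwidth, cutoff):
    """Upper bound on the absolute error introduced by `cutoff` in kde_at."""
    norm = (2.0 * math.pi * bandwidth ** 2) ** (dim / 2.0)
    return math.exp(-cutoff * cutoff / (2.0 * bandwidth ** 2)) / norm

if __name__ == "__main__":
    pts = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
    print(all_nearest_neighbors(pts))
    print(kde_at(pts, (0.0, 0.0), bandwidth=1.0))
    print(kde_at(pts, (0.0, 0.0), bandwidth=1.0, cutoff=2.0))
```

Both baselines touch every pair of points, which is what makes them intractable at survey scale and why fast variants are needed. The sketch only shows what is being computed and where an approximation can enter.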

Henri ter Hofte
Telematica Instituut, The Netherlands

Smartphones go wherever people go. This puts these devices in an ideal position to capture aspects of human behavior. In prior work, the Xensor application enabled the development of new methods and tools to study human behavior in naturalistic contexts, capitalizing on the growing capabilities and adoption of smartphones and other mobile devices. It showed how Windows Mobile 5.0 smartphones can be used to capture data about human behavior by combining experience sampling with automatic logging of data.

RouteXensor is a step towards bringing Xensor to the sports and wellness domain. This project applies and extends the generic Xensor method and tools to the domain of outdoor recreational sports. GPS, accelerometer, and physiological sensor readings are captured during performance field tests, along with contextual data about routes and route segments: both objective data (e.g., speed, direction, heart rate, steps/strokes per minute) and subjective ratings (experience samples about, e.g., road quality and scenery quality). This data is then analyzed to generate ratings for route segments and for route difficulty. RouteXensor is designed to make it easier for people to collect and share ratings about routes and route segments for recreational outdoor sport activities.
