Kristin M. Tolle is the Director of the Data Science Initiative in Microsoft Research Outreach, Redmond, WA.
Since joining Microsoft in 2000, Dr. Tolle has acquired numerous patents and worked for several product teams, including the Natural Language Group, Visual Studio, and the Microsoft Office Excel Team. Since joining Microsoft Research’s outreach program in 2006, she has run several major initiatives, ranging from biomedical computing and environmental science to more traditional computer and information science programs around natural user interactions and data curation. She also directed the development of the Microsoft Translator Hub and the Environmental Science Services Toolkit.
Dr. Tolle is an editor, along with Tony Hey and Stewart Tansley, of one of the earliest books on data science, The Fourth Paradigm: Data-Intensive Scientific Discovery. Her current focus is developing an outreach program to engage with academics on data science in general and, more specifically, on using data to create meaningful and useful user experiences across devices and platforms.
Prior to joining Microsoft, Tolle was an Oak Ridge Science and Engineering Research Fellow for the National Library of Medicine and a Research Associate at the University of Arizona Artificial Intelligence Lab managing the group on medical information retrieval and natural language processing. She earned her Ph.D. in Management of Information Systems with a minor in Computational Linguistics.
Dr. Tolle's present research interests include global public health, climate change, mobile computing to enable field scientists and inform the public, sensors used to gather ecological and environmental data, and integration and interoperability of large heterogeneous environmental data sources. She collaborates with several major research groups in Microsoft Research including the natural language processing group, eScience, computational science laboratory, computational ecology and environmental science, and the sensing and energy research group.
- Environmental Science Services Initiative: Development Director of this project for creating a common, easy-to-use infrastructure and user experience that gives ready access to existing and new Microsoft Research tools targeted at climate change and environmental science.
- National Flood Interoperability Experiment: Working with researchers at several major institutions, as well as the government agencies that provide much of the data, to develop a real-time flood mapping system to improve national disaster response.
- DataUp: Project Director and Development Manager to create a data curation tool to enable environmental scientists to preserve and share their datasets (largely in Microsoft Excel) with the broader community.
- Microsoft Translator Hub: Product and Program Director to create a tool to enable the creation of custom translation models for businesses and language communities.
- Cultural Preservation Initiative: Primary member of this group focused on language preservation.
- Devices Sensors and Mobility for Healthcare: Founder and Program Director for this program, founded in 2006, which seeded the development of over 25 mHealth projects and culminated in the start of an annual mHealth Summit run by the Foundation for the National Institutes of Health.
- Computational Challenges in Genome Wide Association Studies: Program Manager for this program designed to advance computer science research in support of scientific discoveries in GWAS.
- UCL Data Science Student Challenge
6-7 February 2016
Exeter, Oxford, UK
22-25 February 2016
San Francisco, CA
11-14 April 2016
Montreal, Quebec, Canada
- Intl Conf on Big Data Management and Analytics
25-26 April 2016
- ICML @ NYC
19-24 June 2016
New York City, New York
13-17 August 2016
San Francisco, CA
- 2020 Science
- Conservation at Microsoft
- Data Mining
- Data science at Microsoft Research
- Data-Driven Conversation
- Deep Learning for Natural Language Processing: Theory and Practice (CIKM2014 Tutorial)
- Environmental Informatics Framework
- Microsoft Translator Hub
- Open source for academics
- Protecting Biodiversity
- Standards-based open source data management solutions for earth science
- Anshumali Srivastava, Arnd Christian König, and Misha Bilenko, Time Adaptive Sketches (Ada-Sketches) for Summarizing Data Streams, in ACM SIGMOD Conference, ACM – Association for Computing Machinery, 26 June 2016.
- Mohan Yang, Bolin Ding, Surajit Chaudhuri, and Kaushik Chakrabarti, Finding Patterns in a Knowledge Base using Keywords to Compose Table Answers, in VLDB – Very Large Data Bases, August 2015.
- Fotis Psallidas, Bolin Ding, Kaushik Chakrabarti, and Surajit Chaudhuri, S4: Top-k Spreadsheet-Style Search for Query Discovery, in Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD 2015), ACM – Association for Computing Machinery, June 2015.
- Philip A. Bernstein, Sudipto Das, Bailu Ding, and Markus Pilman, Optimizing Optimistic Concurrency Control for Tree-Structured, Log-Structured Databases, in International Conference on Management of Data (SIGMOD), ACM – Association for Computing Machinery, 31 May 2015.
- Chi Wang, Kaushik Chakrabarti, Yeye He, Kris Ganjam, Zhimin Chen, and Phil A. Bernstein, Concept Expansion Using Web Tables, in WWW, ACM – Association for Computing Machinery, May 2015.