Our research
Content type
Downloads (445)
Events (401)
Groups (150)
News (2608)
People (739)
Projects (1064)
Publications (12061)
Videos (5296)
Labs
Research areas
Algorithms and theory (734)
Communication and collaboration (1366)
Computational linguistics (448)
Computational sciences (751)
Computer systems and networking (1997)
Computer vision (1054)
Data mining and data management (175)
Economics and computation (292)
Education (773)
Gaming (365)
Graphics and multimedia (1141)
Hardware and devices (999)
Health and well-being (437)
Human-computer interaction (2133)
Machine learning and intelligence (1642)
Mobile computing (97)
Quantum computing (49)
Search, information retrieval, and knowledge management (1678)
Security and privacy (739)
Social media (72)
Social sciences (803)
Software development, programming principles, tools, and languages (1438)
Speech recognition, synthesis, and dialog systems (112)
Technology for emerging markets (37)
Nicholas Jing Yuan, Yu Zheng, Xing Xie, Yingzi Wang, Kai Zheng, and Hui Xiong

The pace of urbanization and modern civilization fosters different functional zones in a city, such as residential areas, business districts, and educational areas. In a metropolis, people commute between these functional zones every day to engage in different socioeconomic activities, e.g., work, shopping, and entertainment. In this paper, we propose a data-driven framework to discover functional zones in a city. Specifically, we introduce the concept of Latent Activity Trajectory (LAT), which...

Publication details
Date: 1 August 2016
Type: Article
Publisher: IEEE – Institute of Electrical and Electronics Engineers
Abram Hindle, Christian Bird, Thomas Zimmermann, and Nachiappan Nagappan

Large organizations like Microsoft tend to rely on formal requirements documentation in order to specify and design the software products that they develop. These documents are meant to be tightly coupled with the actual implementation of the features they describe. In this paper we evaluate the value of high-level topic-based requirements traceability and issue report traceability in the version control system, using Latent Dirichlet Allocation (LDA). We evaluate LDA topics on practitioners...

Publication details
Date: 1 December 2015
Type: Article
Publisher: Springer
Yanjie Fu, Yong Ge, Yu Zheng, Yao, Yanchi Liu, Hui Xiong, and Nicholas Jing Yuan

Ranking residential real estate based on investment value can provide decision-making support for home buyers and thus plays an important role in the estate marketplace. In this paper, we aim to develop methods for ranking estates based on investment values by mining users' opinions about estates from online user reviews and offline moving behaviors (e.g., taxi traces, smart card transactions, check-ins). While a variety of features could be extracted from these data, these features are intercorrelated and...

Publication details
Date: 1 December 2015
Type: Inproceeding
Publisher: IEEE – Institute of Electrical and Electronics Engineers
Publication details
Date: 1 December 2015
Type: Article
Emerson Murphy-Hill, Thomas Zimmermann, Christian Bird, and Nachiappan Nagappan

When software engineers fix bugs, they may have several options as to how to fix those bugs. Which fix they choose has many implications, both for practitioners and researchers: What is the risk of introducing other bugs during the fix? Is the bug fix in the same code that caused the bug? Is the change fixing the cause or just covering a symptom? In this paper, we investigate alternative fixes to bugs and present an empirical study of how engineers make design choices about how to fix bugs. We start...

Publication details
Date: 1 December 2015
Type: Article
Publisher: IEEE – Institute of Electrical and Electronics Engineers
Badrish Chandramouli, Jonathan Goldstein, Mike Barnett, Robert DeLine, Danyel Fisher, John C. Platt, James F. Terwilliger, and John Wernsing

This paper introduces Trill – a new query processor for analytics. Trill fulfills a combination of three requirements for a query processor to serve the diverse big data analytics space: (1) Query Model: Trill is based on a tempo-relational model that enables it to handle streaming and relational queries with early results, across the latency spectrum from real-time to offline; (2) Fabric and Language Integration: Trill is architected as a high-level language library that supports...

Publication details
Date: 1 August 2015
Type: Inproceeding
Publisher: VLDB – Very Large Data Bases
Mohan Yang, Bolin Ding, Surajit Chaudhuri, and Kaushik Chakrabarti

We aim to provide table answers to keyword queries using a knowledge base. For queries referring to multiple entities, like “Washington cities population” and “Mel Gibson movies”, it is better to represent each relevant answer as a table which aggregates a set of entities or joins of entities within the same table scheme or pattern. In this paper, we study how to find highly relevant patterns in a knowledge base for user-given keyword queries to compose table answers. A knowledge base is...

Publication details
Date: 1 August 2015
Type: Inproceeding
Publisher: VLDB – Very Large Data Bases
Fuzheng Zhang, Nicholas Jing Yuan, David Wilkie, Yu Zheng, and Xing Xie

Urban transportation is an important factor in energy consumption and pollution, and is of increasing concern due to its complexity and economic significance. Its importance will only increase as urbanization continues around the world. In this paper, we explore drivers’ refueling behavior in urban areas. Compared to questionnaire-based methods of the past, we propose a complete data-driven system that pushes towards real-time sensing of individual refueling behavior and citywide petrol consumption. Our...

Publication details
Date: 1 June 2015
Type: Article
Publisher: ACM – Association for Computing Machinery
Shuo Ma, Yu Zheng, and Ouri Wolfson

We proposed and developed a taxi-sharing system that accepts taxi passengers’ real-time ride requests sent from smartphones and schedules proper taxis to pick them up via ridesharing, subject to time, capacity, and monetary constraints. The monetary constraints provide incentives for both passengers and taxi drivers: passengers will not pay more compared with no ridesharing and get compensated if their travel time is lengthened due to ridesharing; taxi drivers will make money for all the detour distance...

Publication details
Date: 1 June 2015
Type: Article
Publisher: IEEE – Institute of Electrical and Electronics Engineers
This is the website of the second International Workshop on Rack-scale Computing, to be co-located with EuroSys'15.
Event details
Date: 21 April 2015
Location: Bordeaux, France
Type: Workshop
Accurate indoor localization has the potential to transform the way people navigate indoors in a similar way that GPS transformed the way people navigate outdoors. Over the last 15 years, several human-centric approaches to indoor localization have been proposed by both academia and industry, but we have yet to see large scale deployments. This competition aims to bring together real-time or near real-time indoor location technologies and compare their performance.
Event details
Date: 11–13 April 2015
Location: Seattle, WA, USA
Type: Workshop
Arvind Arasu, Ken Eguro, Manas Joglekar, Raghav Kaushik, Donald Kossmann, and Ravi Ramamurthy

Cipherbase is a comprehensive database system that provides strong end-to-end data confidentiality through encryption. Cipherbase is based on a novel architecture that combines an industrial strength database engine (SQL Server) with lightweight processing over encrypted data that is performed in secure hardware. Cipherbase has the smallest trusted computing base (TCB) among comparable systems and provides significant benefits over the state-of-the-art in terms of security, performance, and...

Publication details
Date: 1 April 2015
Type: Inproceeding
Wen Hua, Zhongyuan Wang, Haixun Wang, Kai Zheng, and Xiaofang Zhou

Understanding short texts is crucial to many applications, but challenges abound. First, short texts do not always observe the syntax of a written language. As a result, traditional natural language processing methods cannot be easily applied. Second, short texts usually do not contain sufficient statistical signals to support many state-of-the-art approaches for text processing such as topic modeling. Third, short texts are usually more ambiguous. We argue that knowledge is needed in order to better...

Publication details
Date: 1 April 2015
Type: Inproceeding
Jiajun Zhang, Shujie Liu, Mu Li, Ming Zhou, and Chengqing Zong

The language model is one of the most important modules in statistical machine translation, and currently the word-based language model dominates this community. However, many translation models (e.g. phrase-based models) generate the target language sentences by rendering and compositing the phrases rather than the words. Thus, it is much more reasonable to model dependency between phrases, but nearly no research work focuses on this problem. In this paper, we tackle this problem by designing a novel...

Publication details
Date: 1 April 2015
Type: Proceedings
Publisher: AAAI – Association for the Advancement of Artificial Intelligence
Irene Rae, Gina Venolia, John C. Tang, and David Molnar

As a field, telepresence has grown to include a wide range of systems, from multi-view videoconferencing units to humanlike androids. However, the diversity of systems and research makes it difficult to form a holistic understanding of where the field stands. We propose a framework consisting of seven design dimensions for understanding telepresence, iteratively developed from previous literature, a series of three surveys, the construction of two design probes, and a field study. These design...

Publication details
Date: 14 March 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Kathryn Zyskowski, Meredith Ringel Morris, Jeffrey P. Bigham, Mary L. Gray, and Shaun Kane

We present the first formal study of crowdworkers who have disabilities via in-depth open-ended interviews of 17 people (disabled crowdworkers and job coaches for people with disabilities) and a survey of 631 adults with disabilities. Our findings establish that people with a variety of disabilities currently participate in the crowd labor marketplace, despite challenges such as crowdsourcing workflow designs that inadvertently prohibit participation by, and may negatively affect the worker reputations...

Publication details
Date: 1 March 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Publication details
Date: 1 February 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
TechVista is Microsoft Research India's annual research symposium. It brings together the best minds from the scientific and academic worlds onto a common platform. TechVista provides an opportunity for the research community, government, and students to interact and exchange ideas on research and its future directions.
Event details
Date: 23 January 2015
Location: Bangalore
Type: Conference
VMCAI provides a forum for researchers from the communities of Verification, Model Checking, and Abstract Interpretation, facilitating interaction, cross-fertilization, and advancement of hybrid methods that combine these and related areas.
Event details
Date: 11–13 January 2015
Location: Mumbai, India
Type: Conference
Jason D. Williams, Nobal B. Niraula, Pradeep Dasigi, Aparna Lakshmiratan, Carlos Garcia Jurado Suarez, Mouni Reddy, and Geoff Zweig

In personal assistant dialog systems, intent models are classifiers that identify the intent of a user utterance, such as to add a meeting to a calendar, or get the director of a stated movie. Rapidly adding intents is one of the main bottlenecks to scaling — adding functionality to — personal assistants. In this paper we show how interactive learning can be applied to the creation of statistical intent models. Interactive learning [10] combines model definition, labeling, model...

Publication details
Date: 11 January 2015
Type: Inproceeding
Moshe Babaioff, Moran Feldman, and Moshe Tennenholtz

We consider the problem of designing mechanisms that interact with strategic agents through strategic intermediaries (or mediators), and investigate the cost to society due to the mediators' strategic behavior. Selfish agents with private information are each associated with exactly one strategic mediator, and can interact with the mechanism exclusively through that mediator. Each mediator aims to optimize the combined utility of his agents, while the mechanism aims to optimize the combined utility of...

Publication details
Date: 11 January 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Justin Levandoski, David Lomet, Sudipta Sengupta, Ryan Stutsman, and Rui Wang

The Deuteronomy architecture provides a clean separation of transaction functionality (performed in a transaction component, or TC) from data management functionality (performed in a data component, or DC). In prior work we implemented both a TC and DC that achieved modest performance. We recently built a high performance DC (the Bw-tree key value store) that achieves very high performance on modern hardware and is currently shipping as an indexing and storage layer in a number of Microsoft systems....

Publication details
Date: 4 January 2015
Type: Inproceeding
Publisher: Conference on Innovative Data Systems Research (CIDR 2015)
Gao Huang, Jianwen Zhang, Shiji Song, and Zheng Chen

This paper proposes a new approach for discriminative clustering. The intuition is, for a good clustering, one should be able to learn a classifier from the clustering labels with high generalization accuracy. Thus we define a novel metric to evaluate the quality of a clustering labeling, named Minimum Separation Probability (MSP), which is a lower bound of the generalization accuracy of a classifier learnt from the clustering labeling. We take MSP as the objective to maximize and propose our...

Publication details
Date: 1 January 2015
Type: Inproceeding
Publisher: AAAI – Association for the Advancement of Artificial Intelligence
Nihar B. Shah and Dengyong Zhou

Human computation or crowdsourcing involves joint inference of the ground-truth answers and the worker abilities by optimizing an objective function, for instance, by maximizing the data likelihood based on an assumed underlying model. A variety of methods have been proposed in the literature to address this inference problem. As far as we know, none of the objective functions in existing methods is convex. In machine learning and applied statistics, a convex function such as the objective function of...

Publication details
Date: 1 January 2015
Type: Inproceeding
Publisher: AAAI – Association for the Advancement of Artificial Intelligence
Margus Veanes, Todd Mytkowicz, David Molnar, and Benjamin Livshits

String-manipulating programs are an important class of programs with applications in malware detection, graphics, input sanitization for Web security, and large-scale HTML processing. This paper extends prior work on BEK, an expressive domain-specific language for writing string-manipulating programs, with algorithmic insights that make BEK both analyzable and data-parallel. By analyzable we mean that unlike most general purpose programming languages, many algebraic properties of a BEK...

Publication details
Date: 1 January 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery