Our research
1–25 of 668
Bhaskar Mitra

Search logs contain examples of frequently occurring patterns of user reformulations of queries. Intuitively, the reformulation "san francisco" → "san francisco 49ers" is semantically similar to "detroit" → "detroit lions". Likewise, "london" → "things to do in london" and "new york" → "new york tourist attractions" can also be considered similar transitions in intent. The reformulations "movies" → "new movies" and "york" → "new york", however, are clearly different despite the lexical similarities in the two...

Publication details
Date: 1 August 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Peter Bailey, Alistair Moffat, Falk Scholer, and Paul Thomas

Test collection design eliminates sources of user variability to make statistical comparisons among information retrieval (IR) systems more affordable. Does this choice unnecessarily limit generalizability of the outcomes to real usage scenarios? We explore two aspects of user variability with regard to evaluating the relative performance of IR systems, assessing effectiveness in the context of a subset of topics from three TREC collections, with the embodied information needs categorized against three...

Publication details
Date: 1 August 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Publication details
Date: 1 August 2015
Type: Proceedings
Publisher: ACL – Association for Computational Linguistics
Emre Kıcıman and Matthew Richardson

Every day, people take action, trying to achieve their personal, high-order goals. People decide what actions to take based on their personal experience, knowledge and gut instinct. While this leads to positive outcomes for some people, many others do not have the necessary experience, knowledge and instinct to make good decisions. What if, rather than making decisions based solely on their own personal experience, people could take advantage of the reported experiences of hundreds of millions of other...

Publication details
Date: 1 August 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Anne Schuth, Katja Hofmann, and Filip Radlinski

The gold standard for online retrieval evaluation is AB testing. Rooted in the idea of a controlled experiment, AB tests compare the performance of an experimental system (treatment) on one sample of the user population, to that of a baseline system (control) on another sample. Given an online evaluation metric that accurately reflects user satisfaction, these tests enjoy high validity. However, due to the high variance across users, these comparisons often have low sensitivity, requiring millions of...

Publication details
Date: 1 August 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Zhongyuan Wang, Kejun Zhao, Haixun Wang, Xiaofeng Meng, and Ji-Rong Wen

The goal of query conceptualization is to map instances in a query to concepts defined in a certain ontology or knowledge base. Queries usually do not observe the syntax of a written language, nor do they contain enough signals for statistical inference. However, the available context, i.e., the verbs related to the instances, the adjectives and attributes of the instances, do provide valuable clues to understand instances. In this paper, we first mine a variety of relations among terms from a large web...

Publication details
Date: 1 July 2015
Type: Inproceeding
Jialu Liu, Jingbo Shang, Chi Wang, Xiang Ren, and Jiawei Han

Text data are ubiquitous and play an essential role in big data applications. However, text data are mostly unstructured. Transforming unstructured text into structured units (e.g., semantically meaningful phrases) will substantially reduce semantic ambiguity and enhance the power and efficiency at manipulating such data using database technology. Thus mining quality phrases is a critical research problem in the field of databases. In this paper, we propose a new framework that extracts quality...

Publication details
Date: 1 June 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Fotis Psallidas, Bolin Ding, Kaushik Chakrabarti, and Surajit Chaudhuri

An enterprise information worker is often aware of a few example tuples that should be present in the output of the query. Query discovery systems have been developed to discover project-join queries that contain the given example tuples in their output. However, they require the output to exactly contain all the example tuples and do not perform any ranking. To address this limitation, we study the problem of efficiently discovering top-k project join queries which approximately contain the...

Publication details
Date: 1 June 2015
Type: Proceedings
Publisher: ACM – Association for Computing Machinery
Yiwei Chen and Katja Hofmann

Online learning to rank holds great promise for learning personalized search result rankings. First algorithms have been proposed, namely absolute feedback approaches, based on contextual bandits learning; and relative feedback approaches, based on gradient methods and inferred preferences between complete result rankings. Both types of approaches have shown promise, but they have not previously been compared to each other. It is therefore unclear which type of...

Publication details
Date: 20 May 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Elad Yom-Tov

Syndromic surveillance refers to the analysis of medical information for the purpose of detecting outbreaks of disease earlier than would have been possible otherwise and to estimate the prevalence of the disease in a population. Internet data, especially search engine queries and social media postings, have shown promise in contributing to syndromic surveillance for influenza and dengue fever. Here we focus on the recent outbreak of Ebola Virus Disease and ask whether three major sources of Internet...

Publication details
Date: 18 May 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Kuansan Wang

Humans are the only species on earth that has mastered the technologies of writing and printing to capture ephemeral thoughts and scientific discoveries. The capabilities to pass along knowledge, not only geographically but also generationally, have formed the bedrock of our civilizations. We are in the midst of a silent revolution driven by technological advancements: no longer are computers just a fixture of our physical world, but they have been so deeply woven into our daily routines that they are...

Publication details
Date: 18 May 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Yi Wei, Nirupama Chandrasekaran, Sumit Gulwani, and Youssef Hamadi

Software developers heavily rely on code snippets and API usage examples searched on the Internet. This paper presents Bing Code Search, a Visual Studio extension that allows developers to write, within an IDE, free-form natural language questions, and get C# code snippets answering those questions. Bing Code Search automatically adapts the suggested snippets into the user’s programming context via variable renaming, and records users’ interactions to improve its suggestions. Compared to prior related...

Publication details
Date: 11 May 2015
Type: Technical report
Number: MSR-TR-2015-36
Helen J. Wang, Alexander Moshchuk, Michael Gamon, Mona Haraty, Shamsi Iqbal, Eli T. Brown, Ashish Kapoor, Chris Meek, Eric Chen, Yuan Tian, Jaime Teevan, Mary Czerwinski, and Susan Dumais

In this paper, we advocate “activity” to be a central abstraction between people and computing instead of applications. We outline the vision of the activity platform as the next-generation social platform.

Publication details
Date: 8 May 2015
Type: Technical report
Publisher: Microsoft Research
Number: MSR-TR-2015-38
Ryen W. White, Matthew Richardson, and Wen-tau Yih

Search systems traditionally require searchers to formulate information needs as keywords rather than in a more natural form, such as questions. Recent studies have found that Web search engines are observing an increase in the fraction of queries phrased as natural language. As part of building better search engines, it is important to understand the nature and prevalence of these intentions, and the impact of this increase on search engine performance. In this work, we show that while 10.3% of queries...

Publication details
Date: 1 May 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Jeong-Min Yun, Yuxiong He, Sameh Elnikety, and Shaolei Ren

A web search engine often employs a partition-aggregate architecture, where an aggregator propagates a user query to all index serving nodes (ISNs) and collects the responses from them. An aggregation policy determines how long the aggregators wait for the ISNs before returning aggregated results to users, crucially affecting both query latency and quality. Designing an aggregation policy is, however, challenging: Response latency among queries and among ISNs varies significantly, and aggregators lack of...

Publication details
Date: 1 May 2015
Type: Technical report
Number: MSR-TR-2015-39
Huan Sun, Hao Ma, Wen-tau Yih, Chen-Tse Tsai, Jingjing Liu, and Ming-Wei Chang

Most recent question answering (QA) systems query large-scale knowledge bases (KBs) to answer a question, after parsing and transforming natural language questions into KB-executable forms (e.g., logical forms). It is well known that KBs are far from complete, so information required to answer questions may not always exist in KBs. In this paper, we develop a new QA system that mines answers directly from the Web, and meanwhile employs KBs as a significant auxiliary to further boost the QA...

Publication details
Date: 1 May 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Milad Shokouhi and Qi Guo

The growing accessibility of mobile devices has substantially reformed the way users access information. While the reactive search by query remains as common as before, recent years have witnessed the emergence of various proactive systems such as Google Now and Microsoft Cortana. In these systems, relevant content is presented to users based on their context without a query. Interestingly, despite the increasing popularity of such services, there is very little known about how users...

Publication details
Date: 1 May 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Chi Wang, Kaushik Chakrabarti, Yeye He, Kris Ganjam, Zhimin Chen, and Phil A. Bernstein

We study the following problem: given the name of an ad-hoc concept as well as a few seed entities belonging to the concept, output all entities belonging to it. Since producing the exact set of entities is hard, we focus on returning a ranked list of entities belonging to the concept. Previous approaches either use seed entities as the only input, or inherently require negative examples. They suffer from input ambiguity and semantic drift, or are not viable options for ad-hoc tail concepts. In this...

Publication details
Date: 1 May 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Milad Shokouhi, Marc Sloan, Paul N. Bennett, Kevyn Collins-Thompson, and Siranush Sarkizova

Queries issued to a search engine are often under-specified or ambiguous. The user’s search context or background may provide information that disambiguates their information need in order to automatically predict and issue a more effective query. The disambiguation can take place at different stages of the retrieval process. For instance, contextual query suggestions may be computed and recommended to users on the result page when appropriate, an approach that does not require modifying the...

Publication details
Date: 1 May 2015
Type: Proceedings
Publisher: ACM – Association for Computing Machinery
Gennady Pekhimenko, Dimitrios Lymberopoulos, Oriana Riva, Karin Strauss, and Doug Burger

Trending search topics cause unpredictable query load spikes that hurt the end-user search experience, particularly the mobile one, by introducing longer delays. To understand how trending search topics are formed and evolve over time, we analyze 21 million queries submitted during periods where popular events caused search query volume spikes. Based on our findings, we design and evaluate PocketTrend, a system that automatically detects trending topics in real time, identifies the search...

Publication details
Date: 1 May 2015
Type: Inproceeding
Ali Mamdouh Elkahky, Yang Song, and Xiaodong He

Recent online services rely heavily on automatic personalization to recommend relevant content to a large number of users. This requires systems to scale promptly to accommodate the stream of new users visiting the online services for the first time. In this work, we propose a content-based recommendation system to address both the recommendation quality and the system scalability. We propose to use a rich feature set to represent users, according to their web browsing history and search queries. We use...

Publication details
Date: 1 May 2015
Type: Inproceeding
Publisher: WWW – World Wide Web Consortium (W3C)
Chen-Tse Tsai, Wen-tau Yih, and Christopher J.C. Burges

Web-based QA, pioneered by Kwok et al. (2001), successfully demonstrated the power of Web redundancy. Early Web-QA systems, such as AskMSR (Brill et al., 2001), rely on various kinds of rewriting and pattern-generation methods for identifying answer paragraphs and for extracting answers. In this paper, we conducted an experimental study to examine the impact of advances in search engine technologies and the growth of the Web on such Web-QA approaches. When applying AskMSR to a new question answering...

Publication details
Date: 1 April 2015
Type: Technical report
Number: MSR-TR-2015-20
Wen Hua, Zhongyuan Wang, Haixun Wang, Kai Zheng, and Xiaofang Zhou

Understanding short texts is crucial to many applications, but challenges abound. First, short texts do not always observe the syntax of a written language. As a result, traditional natural language processing methods cannot be easily applied. Second, short texts usually do not contain sufficient statistical signals to support many state-of-the-art approaches for text processing such as topic modeling. Third, short texts are usually more ambiguous. We argue that knowledge is needed in order to better...

Publication details
Date: 1 April 2015
Type: Inproceeding
Awards: Best Paper Award
Milad Shokouhi, Ryen W. White, and Emine Yilmaz

People's tendency to overly rely on prior information has been well studied in psychology in the context of anchoring and adjustment. Anchoring biases pervade many aspects of human behavior. In this paper, we present a study of anchoring bias in information retrieval (IR) settings. We provide strong evidence of anchoring during the estimation of document relevance via both human relevance judging and in natural user behavior collected via search log analysis. In particular, we show that...

Publication details
Date: 1 April 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Shiri Dori-Hacohen, Elad Yom-Tov, and James Allan

Seeking information on a controversial topic is often a complex task, for both the user and the search engine. There are multiple subtleties involved with information seeking on controversial topics. Here we discuss some of the challenges in addressing these complex tasks, describing the spectrum between cases where there is a clear right answer, through fact disputes and moral debates, and discuss cases where search queries have a measurable effect on the well-being of people. We briefly survey the...

Publication details
Date: 29 March 2015
Type: Inproceeding
Publisher: Elsevier