Our research
Search, information retrieval, and knowledge management (604)
Rakesh Agrawal, Sreenivas Gollapudi, Anitha Kannan, and Krishnaram Kenthapadi

The rapid proliferation of hand-held devices has led to the development of rich, interactive and immersive applications, such as e-readers for electronic books. These applications motivate retrieval systems that can implicitly satisfy any information need of the reader by exploiting the context of the user’s interactions. Such retrieval systems differ from traditional search engines in that the queries constructed using the context are typically complex objects (including the document and its...

Publication details
Date: 4 November 2014
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Sreenivas Gollapudi and Debmalya Panigrahi

A key characteristic of a successful online market is the large participation of agents (producers and consumers) on both sides of the market. While there has been a long line of impressive work on understanding such markets in terms of revenue maximizing (also called max-sum) objectives, particularly in the context of allocating online impressions to interested advertisers, fairness considerations have surprisingly not received much attention in online...

Publication details
Date: 4 November 2014
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Katja Hofmann, Bhaskar Mitra, Filip Radlinski, and Milad Shokouhi

Query Auto Completion (QAC) suggests possible queries to web search users from the moment they start entering a query. This popular feature of web search engines is thought to reduce physical and cognitive effort when formulating a query.

Perhaps surprisingly, despite QAC being widely used, users’ interactions with it are poorly understood. This paper begins to address this gap. We present the results of an in-depth user study of user interactions with QAC in web search. While study participants...

Publication details
Date: 1 November 2014
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Yelong Shen, Xiaodong He, Jianfeng Gao, Li Deng, and Gregoire Mesnil

In this paper, we propose a new latent semantic model that incorporates a convolutional-pooling structure over word sequences to learn low-dimensional, semantic vector representations for search queries and Web documents. In order to capture the rich contextual structures in a query or a document, we start with each word within a temporal context window in a word sequence to directly capture contextual features at the word n-gram level. Next, the salient word n-gram features in the word sequence are...

Publication details
Date: 1 November 2014
Type: Inproceeding
Publisher: CIKM
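As a rough illustration of the convolutional-pooling idea in the abstract by Shen et al. above, the sketch below slides a context window over a word sequence and then max-pools window-level feature vectors into one fixed-length vector. The function names, the `<s>` padding token, and the hand-written feature values are assumptions made for illustration; the learned projection layers of the paper are not reproduced here.

```python
from typing import List


def word_ngram_windows(words: List[str], n: int = 3, pad: str = "<s>") -> List[List[str]]:
    """Return one window of n words centred on each word (n is assumed odd)."""
    half = n // 2
    padded = [pad] * half + words + [pad] * half
    return [padded[i:i + n] for i in range(len(words))]


def max_pool(feature_vectors: List[List[float]]) -> List[float]:
    """Element-wise maximum over all window-level feature vectors."""
    return [max(column) for column in zip(*feature_vectors)]


# Each word contributes one trigram window.
windows = word_ngram_windows(["deep", "semantic", "model"])

# Stand-in features, one vector per window; a trained model would compute
# these from the window contents instead.
features = [[0.1, 0.9], [0.7, 0.2], [0.3, 0.4]]
pooled = max_pool(features)  # [0.7, 0.9]
```

Max pooling keeps, for every feature dimension, the strongest response found anywhere in the sequence, so inputs of different lengths map to vectors of the same size.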
Emine Yilmaz, Manisha Verma, Nick Craswell, Filip Radlinski, and Peter Bailey

Relevance judgments sit at the core of test collection construction, and are assumed to model the utility of documents to real users. However, comparisons of judgments with signals of relevance obtained from real users, such as click counts and dwell time, have demonstrated a systematic mismatch.

In this paper, we study one important source of the mismatch between user data and relevance judgments: the high degree of effort required of users to identify and consume the information in...

Publication details
Date: 1 November 2014
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Jianfeng Gao, Patrick Pantel, Michael Gamon, Xiaodong He, Li Deng, and Yelong Shen

This paper presents a deep semantic model (DSM) for recommending target documents that are likely to interest a user, based on a source document she is reading. We observe, identify, and detect naturally occurring signals of interestingness in click transitions on the Web between source and target documents, which we collect from commercial Web browser logs. The DSM is trained on millions of Web transitions, and maps source-target document pairs to feature vectors in a latent space in such a...

Publication details
Date: 1 October 2014
Type: Proceedings
Publisher: EMNLP
Siyu Qiu, Qing Cui, Jiang Bian, Bin Gao, and Tie-Yan Liu

The techniques of using neural networks to learn distributed word representations (i.e., word embeddings) have been used to solve a variety of natural language processing tasks. The recently proposed methods, such as CBOW and Skip-gram, have demonstrated their effectiveness in learning word embeddings based on context information such that the obtained word embeddings can capture both semantic and syntactic relationships between words. However, it is quite challenging to produce high-quality word...

Publication details
Date: 1 August 2014
Type: Inproceeding
Publisher: Choose...
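Skip-gram, mentioned in the abstract by Qiu et al. above, trains on (center word, context word) pairs drawn from a window around each position. The sketch below shows only that pair extraction; the helper name `skipgram_pairs` and the window size are assumptions for illustration, and the embedding training itself is not shown.

```python
from typing import List, Tuple


def skipgram_pairs(words: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """List (center, context) pairs for every word and its neighbours."""
    pairs = []
    for i, center in enumerate(words):
        lo = max(0, i - window)
        hi = min(len(words), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((center, words[j]))
    return pairs


pairs = skipgram_pairs(["word", "embeddings", "capture", "context"], window=1)
```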
Edith Cohen

Distance queries are a basic tool in data analysis. They are used for detection and localization of change for the purpose of anomaly detection, monitoring, or planning. Distance queries are particularly useful when data sets such as measurements, snapshots of a system, content, traffic matrices, and activity logs are collected repeatedly.

Random sampling, which can be efficiently performed over streamed or distributed data, is an important tool for scalable data analysis. The sample...

Publication details
Date: 1 August 2014
Type: Technical report
Publisher: ACM – Association for Computing Machinery
Number: MSR-TR-2014-111
Fei Tian, Hanjun Dai, Jiang Bian, Bin Gao, Rui Zhang, Enhong Chen, and Tie-Yan Liu

Distributed word representations have been widely used and proven useful in many natural language processing and text mining tasks. Most existing word embedding models generate only one embedding vector for each word, which limits their effectiveness because a large number of words are polysemous (such as bank and star). To address this problem, it is necessary to build multiple embedding vectors to represent different meanings of a word...

Publication details
Date: 1 August 2014
Type: Inproceeding
Publisher: Choose...
Publication details
Date: 1 August 2014
Type: Technical report
Publisher: Choose...
Number: MSR-TR-2014-108
Edith Cohen, Daniel Delling, Thomas Pajor, and Renato Werneck

Propagation of contagion through networks is a fundamental process. It is used to model the spread of information, influence, or a viral infection. Diffusion patterns can be specified by a probabilistic model, such as Independent Cascade (IC), or captured by a set of representative traces.

Basic computational problems in the study of diffusion are influence queries (determining the potency of a specified seed set of nodes) and Influence Maximization (identifying the...

Publication details
Date: 1 August 2014
Type: Technical report
Publisher: Microsoft Research
Number: MSR-TR-2014-110
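To make the Independent Cascade (IC) model and the notion of an influence query from the abstract by Cohen et al. above concrete, here is a minimal sketch that estimates the expected spread of a seed set by plain Monte Carlo simulation. This is a textbook baseline, not the method of the report; the graph encoding, function names, and run count are assumptions.

```python
import random
from typing import Dict, Iterable, List, Set, Tuple

# Adjacency list: node -> list of (neighbour, activation probability).
Graph = Dict[str, List[Tuple[str, float]]]


def simulate_ic(graph: Graph, seeds: Iterable[str], rng: random.Random) -> Set[str]:
    """Run one cascade; each newly active node gets one chance per out-edge."""
    active = set(seeds)
    frontier = list(active)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, p in graph.get(u, []):
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active


def estimate_influence(graph: Graph, seeds: Iterable[str], runs: int = 1000, seed: int = 0) -> float:
    """Average number of activated nodes over independent cascades."""
    rng = random.Random(seed)
    seeds = list(seeds)
    return sum(len(simulate_ic(graph, seeds, rng)) for _ in range(runs)) / runs
```

Influence maximization would then search for the seed set with the largest estimate, which is why the cost of each influence query matters.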
Sandeep Panem, Manish Gupta, and Vasudeva Varma

As soon as natural disaster events happen, users are eager to know more about them. However, search engines currently provide a ten-blue-links interface for queries related to such events. Relevance of results for such queries can be significantly improved if users are shown a structured summary of the fresh events related to such queries. This would not just reduce the number of user clicks to get the relevant information but would also help users get updated with more fine-grained attribute-level...

Publication details
Date: 1 August 2014
Type: Inproceeding
Publisher: Choose...
Nina Mishra, Ryen White, Samuel Ieong, and Eric Horvitz

We study time-critical search, where users have urgent information needs in the context of an acute problem. As examples, users may need to know how to stem a severe bleed, help a baby who is choking on a foreign object, or respond to an epileptic seizure. While time-critical situations and actions have been studied in the realm of decision-support systems, little has been done with time-critical search and retrieval, and little direct support is offered by search systems. Critical challenges with...

Publication details
Date: 10 July 2014
Type: Inproceeding
Publisher: ACM
Isabelle Stanton, Samuel Ieong, and Nina Mishra

Circumlocution is when many words are used to describe what could be said with fewer, e.g., “a machine that takes moisture out of the air” instead of “dehumidifier”. Web search is a perfect backdrop for circumlocution where people struggle to name what they seek. In some domains, not knowing the correct term can have a significant impact on the search results that are retrieved. We study the medical domain, where professional medical terms are not commonly known and where the consequence of not...

Publication details
Date: 8 July 2014
Type: Inproceeding
Publisher: ACM
Sunandan Chakraborty, Filip Radlinski, Milad Shokouhi, and Paul Baecke

Online search evaluation metrics are typically derived from implicit user feedback, for instance the number of page clicks, the number of queries, or the dwell time on a search result. In a recent paper, Dupret and Lalmas introduced a new metric called absence time, which uses the time interval between successive sessions of users to measure their satisfaction with the system. They evaluated this metric on a version of Yahoo! Answers. In this paper, we investigate the effectiveness of...

Publication details
Date: 1 July 2014
Type: Inproceeding
Publisher: ACM
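The absence-time metric described in the abstract by Chakraborty et al. above is based on the interval between a user's successive sessions. A minimal sketch of that computation follows, assuming each session is a (start, end) pair of timestamps in seconds and that absence is measured from the end of one session to the start of the next; both choices are assumptions, since the abstract does not fix them.

```python
from typing import List, Tuple

Session = Tuple[float, float]  # (start, end) timestamps, assumed in seconds


def absence_times(sessions: List[Session]) -> List[float]:
    """Gaps between the end of each session and the start of the next one."""
    ordered = sorted(sessions)
    return [
        next_start - prev_end
        for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
    ]


gaps = absence_times([(0, 10), (100, 130), (40, 50)])  # [30, 50]
```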
Bhaskar Mitra, Milad Shokouhi, Filip Radlinski, and Katja Hofmann

Query Auto-Completion (QAC) is a popular feature of web search engines that aims to assist users to formulate queries faster and avoid spelling mistakes by presenting them with possible completions as soon as they start typing. However, despite the wide adoption of auto-completion in search systems, there is little published on how users interact with such services.

In this paper, we present the first large-scale study of user interactions with auto-completion based on query logs of Bing, a...

Publication details
Date: 1 July 2014
Type: Proceedings
Publisher: ACM
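As an illustration of what a query auto-completion feature does, the sketch below ranks completions of a typed prefix by how often they occur in a past query log. This popularity ranking is a common simple baseline and is an assumption here, not the ranking used by any particular search engine; the function name and sample log are hypothetical.

```python
from collections import Counter
from typing import List


def complete(prefix: str, query_log: List[str], k: int = 3) -> List[str]:
    """Return up to k logged queries starting with prefix, most frequent first."""
    counts = Counter(q for q in query_log if q.startswith(prefix))
    # Sort by descending frequency, then alphabetically to break ties.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [query for query, _ in ranked[:k]]


log = ["weather", "web search", "weather", "wells fargo", "web search", "weather"]
suggestions = complete("we", log, k=2)  # ["weather", "web search"]
```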
Marios Kokkodis, Anitha Kannan, and Krishnaram Kenthapadi

The emergence of tablet devices, cloud computing, and abundant online multimedia content presents new opportunities to transform traditional paper-based textbooks into tablet-based electronic textbooks. Towards this goal, techniques have been proposed to automatically augment textbook sections with relevant web content such as online educational videos. However, a highly relevant video can be created at a granularity that may not mimic the organization of the textbook. We focus on the video...

Publication details
Date: 1 July 2014
Type: Inproceeding
Publisher: International Educational Data Mining Society
Marios Kokkodis, Anitha Kannan, and Krishnaram Kenthapadi

The emergence of tablet devices, cloud computing, and abundant online multimedia content presents new opportunities to transform traditional paper-based textbooks into tablet-based electronic textbooks, and to further augment the educational experience by enriching them with relevant supplementary materials. The use of multimedia content such as educational videos along with textual content has been shown to improve learning outcomes. While such videos are becoming increasingly available, even a highly...

Publication details
Date: 1 July 2014
Type: Technical report
Publisher: Microsoft Research
Number: MSR-TR-2014-62
Parth Gupta, Kalika Bali, Rafael E. Banchs, Monojit Choudhury, and Paolo Rosso

For many languages that use non-Roman based indigenous scripts (e.g., Arabic, Greek and Indic languages) one can often find a large amount of user generated transliterated content on the Web in the Roman script. Such content creates a monolingual or multi-lingual space with more than one script which we refer to as the Mixed-Script space. IR in the mixed-script space is challenging because queries written in either the native or the Roman script need to be matched to the documents written in both the...

Publication details
Date: 1 July 2014
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Ryen W. White, Matthew Richardson, and Wen-tau Yih

Search systems have traditionally required searchers to formulate information needs as a set of keywords rather than in a more natural form, such as questions. Recent studies have found that search engines are observing an increase in the fraction of Web search queries that take the form of natural language. As part of building better search engines, it is important to understand the nature and prevalence of these intentions, and the impact of this increase on search engine performance and searcher...

Publication details
Date: 1 July 2014
Type: Technical report
Publisher: Microsoft Research
Number: MSR-TR-2014-96
Yang Song, Xiaolin Shi, Ryen White, and Ahmed Hassan
Publication details
Date: 1 July 2014
Type: Inproceeding
Publisher: ACM
Fei Tian, Bin Gao, Qing Cui, Enhong Chen, and Tie-Yan Liu

Recently, deep learning has been successfully adopted in many applications such as speech recognition and image classification. In this work, we explore the possibility of employing deep learning in graph clustering. We propose a simple method, which first learns a nonlinear embedding of the original graph with a stacked autoencoder, and then runs the k-means algorithm on the embedding to obtain the clustering result. We show that this simple method has a solid theoretical foundation, due to the similarity...

Publication details
Date: 1 July 2014
Type: Inproceeding
Publisher: Choose...
Edith Cohen

Random samples are lossy summaries which allow queries posed over the data to be approximated by applying an appropriate estimator to the sample. The effectiveness of sampling, however, hinges on estimator selection. The choice of estimators is subject to global requirements, such as unbiasedness and range restrictions on the estimate value, and ideally, we seek estimators that are both efficient to derive and apply and admissible (not dominated, in terms of variance, by other estimators)...

Publication details
Date: 1 July 2014
Type: Proceedings
Publisher: ACM
Myeongjae Jeon, Saehoon Kim, Seung-won Hwang, Yuxiong He, Sameh Elnikety, Alan L. Cox, and Scott Rixner

Web search engines are optimized to reduce the high-percentile response time to consistently provide fast responses to almost all user queries. This is a challenging task because the query workload exhibits large variability, consisting of many short-running queries and a few long-running queries that significantly impact the high-percentile response time. With modern multicore servers, parallelizing the processing of an individual query is a promising solution to reduce query execution time, but it...

Publication details
Date: 1 July 2014
Type: Inproceeding
Publisher: ACM
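The abstract by Jeon et al. above notes that a few long-running queries dominate the high-percentile response time. The sketch below shows this with a nearest-rank percentile over made-up latencies; the helper name, the nearest-rank definition, and the sample values are assumptions for illustration.

```python
import math
from typing import List


def percentile(values: List[float], p: int) -> float:
    """Nearest-rank percentile: smallest value with at least p% of data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# 98 short queries and 2 long ones (milliseconds, invented numbers).
latencies = [10] * 98 + [500, 900]
median = percentile(latencies, 50)  # 10
p99 = percentile(latencies, 99)     # 500
```

Only two of the hundred queries are slow, yet they alone determine the 99th percentile, which is why shortening long-running queries matters more for tail latency than speeding up the typical query.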
Zhen Wang, Jianwen Zhang, Jianlin Feng, and Zheng Chen

We deal with embedding a large scale knowledge graph composed of entities and relations into a continuous vector space. TransE is a promising method proposed recently, which is very efficient while achieving state-of-the-art predictive performance. We discuss some mapping properties of relations which should be considered in embedding, such as reflexive, one-to-many, many-to-one, and many-to-many. We note that TransE does not do well in dealing with these properties. Some complex models are capable of...

Publication details
Date: 1 July 2014
Type: Inproceeding
Publisher: AAAI - Association for the Advancement of Artificial Intelligence
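TransE, discussed in the abstract by Wang et al. above, models a relation as a translation in the embedding space, so a triple (head, relation, tail) is plausible when head + relation lies close to tail. A minimal sketch of that distance follows, with hand-picked toy vectors that are assumptions rather than trained embeddings.

```python
import math
from typing import List


def transe_distance(h: List[float], r: List[float], t: List[float], norm: int = 1) -> float:
    """Distance between h + r and t under the L1 (norm=1) or L2 (norm=2) norm."""
    diffs = [hi + ri - ti for hi, ri, ti in zip(h, r, t)]
    if norm == 1:
        return sum(abs(d) for d in diffs)
    return math.sqrt(sum(d * d for d in diffs))


head = [1.0, 0.0]
relation = [0.0, 1.0]
good_tail = [1.0, 1.0]  # exactly head + relation
bad_tail = [4.0, 5.0]

good = transe_distance(head, relation, good_tail)        # 0.0
bad_l1 = transe_distance(head, relation, bad_tail)       # 7.0
bad_l2 = transe_distance(head, relation, bad_tail, 2)    # 5.0
```

Because a perfect fit forces the tail to equal head + relation, two different tails for the same head and relation cannot both reach distance zero, which is one way to see the difficulty with one-to-many relations that the abstract mentions.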