Our research
1–25 of 647
Jialu Liu, Jingbo Shang, Chi Wang, Xiang Ren, and Jiawei Han

Text data are ubiquitous and play an essential role in big data applications. However, text data are mostly unstructured. Transforming unstructured text into structured units (e.g., semantically meaningful phrases) will substantially reduce semantic ambiguity and enhance the power and efficiency at manipulating such data using database technology. Thus mining quality phrases is a critical research problem in the field of databases. In this paper, we propose a new framework that extracts quality...

Publication details
Date: 1 June 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Kuansan Wang

Humans are the only species on earth to have mastered the technologies of writing and printing to capture ephemeral thoughts and scientific discoveries. The capabilities to pass along knowledge, not only geographically but also generationally, have formed the bedrock of our civilizations. We are in the midst of a silent revolution driven by technological advancements: no longer are computers just a fixture of our physical world; they have been so deeply woven into our daily routines that they are...

Publication details
Date: 18 May 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Elad Yom-Tov

Syndromic surveillance refers to the analysis of medical information for the purpose of detecting outbreaks of disease earlier than would have been possible otherwise and to estimate the prevalence of the disease in a population. Internet data, especially search engine queries and social media postings, have shown promise in contributing to syndromic surveillance for influenza and dengue fever. Here we focus on the recent outbreak of Ebola Virus Disease and ask whether three major sources of Internet...

Publication details
Date: 18 May 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Gennady Pekhimenko, Dimitrios Lymberopoulos, Oriana Riva, Karin Strauss, and Doug Burger

Trending search topics cause unpredictable query load spikes that hurt the end-user search experience, particularly the mobile one, by introducing longer delays. To understand how trending search topics are formed and evolve over time, we analyze 21 million queries submitted during periods where popular events caused search query volume spikes. Based on our findings, we design and evaluate PocketTrend, a system that automatically detects trending topics in real time, identifies the search...

Publication details
Date: 1 May 2015
Type: Inproceeding
Ryen W. White, Matthew Richardson, and Wen-tau Yih

Search systems traditionally require searchers to formulate information needs as keywords rather than in a more natural form, such as questions. Recent studies have found that Web search engines are observing an increase in the fraction of queries phrased as natural language. As part of building better search engines, it is important to understand the nature and prevalence of these intentions, and the impact of this increase on search engine performance. In this work, we show that while 10.3% of queries...

Publication details
Date: 1 May 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Ali Mamdouh Elkahky, Yang Song, and Xiaodong He

Recent online services rely heavily on automatic personalization to recommend relevant content to a large number of users. This requires systems to scale promptly to accommodate the stream of new users visiting the online services for the first time. In this work, we propose a content-based recommendation system to address both the recommendation quality and the system scalability. We propose to use a rich feature set to represent users, according to their web browsing history and search queries. We use...

Publication details
Date: 1 May 2015
Type: Inproceeding
Publisher: WWW – World Wide Web Consortium (W3C)
Huan Sun, Hao Ma, Wen-tau Yih, Chen-Tse Tsai, Jingjing Liu, and Ming-Wei Chang

Most recent question answering (QA) systems query large-scale knowledge bases (KBs) to answer a question, after parsing and transforming natural language questions to KBs-executable forms (e.g., logical forms). As a well-known fact, KBs are far from complete, so that information required to answer questions may not always exist in KBs. In this paper, we develop a new QA system that mines answers directly from the Web, and meanwhile employs KBs as a significant auxiliary to further boost the QA...

Publication details
Date: 1 May 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Wen Hua, Zhongyuan Wang, Haixun Wang, Kai Zheng, and Xiaofang Zhou

Understanding short texts is crucial to many applications, but challenges abound. First, short texts do not always observe the syntax of a written language. As a result, traditional natural language processing methods cannot be easily applied. Second, short texts usually do not contain sufficient statistical signals to support many state-of-the-art approaches for text processing such as topic modeling. Third, short texts are usually more ambiguous. We argue that knowledge is needed in order to better...

Publication details
Date: 1 April 2015
Type: Inproceeding
Shiri Dori-Hacohen, Elad Yom-Tov, and James Allan

Seeking information on a controversial topic is often a complex task, for both the user and the search engine. There are multiple subtleties involved with information seeking on controversial topics. Here we discuss some of the challenges in addressing these complex tasks, describing the spectrum between cases where there is a clear right answer, through fact disputes and moral debates, and discuss cases where search queries have a measurable effect on the well-being of people. We briefly survey the...

Publication details
Date: 29 March 2015
Type: Inproceeding
Publisher: Elsevier
Emre Kıcıman

While today’s structured knowledge bases (e.g., Freebase) contain a sizable collection of information about entities, from celebrities and locations to concepts and common objects, there is a class of knowledge that has minimal coverage: actions. A large-scale knowledge base of actions would provide an opportunity for computing devices to aid and support people’s reasoning about their own actions and outcomes, leading to improved decision-making and goal achievement. In this short paper, we...

Publication details
Date: 23 March 2015
Type: Inproceeding
Publisher: AAAI - Association for the Advancement of Artificial Intelligence
Bin Gao and Tie-Yan Liu

Advertisement (ad) selection plays an important role and will heavily influence the effectiveness of the subsequent methods regard ad selection as a relatively independent module, queries and keywords during the ad selection process. In this paper, Our proposal is to formulate ad selection as such an optimization downstream components (e.g., the auction mechanism) to achieve and search engine revenue (we call the combination of these objective reference). To this end, we 1) extract a bunch of features...

Publication details
Date: 1 March 2015
Type: Article
Publication details
Date: 1 February 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Elad Yom-Tov, Diana Borsa, Andrew C Hayward, Rachel A McKendry, and Ingemar J Cox

Background: The escalating cost of global health care is driving the development of new technologies to identify early indicators of an individual’s risk of disease. Traditionally, epidemiologists have identified such risk factors using medical databases and lengthy clinical studies but these are often limited in size and cost and can fail to take full account of diseases where there are social stigmas or to identify transient acute risk factors.

Objective: Here we report that Web...

Publication details
Date: 1 January 2015
Type: Article
Publisher: JMIR
Xian-Sheng Hua and Jin Li

With the advances in distributed computation, machine learning and deep neural networks, we are entering an era in which it is possible to build a real-world image recognition system. There are three essential components to building a real-world image recognition system: 1) creating representative features, 2) designing powerful learning approaches, and 3) identifying massive training data. While extensive research has been done on the first two aspects, much less attention has been paid to the third. In...

Publication details
Date: 1 January 2015
Type: Inproceeding
Publisher: AAAI - Association for the Advancement of Artificial Intelligence
Nihar B. Shah and Dengyong Zhou

Human computation or crowdsourcing involves joint inference of the ground-truth-answers and the worker abilities by optimizing an objective function, for instance, by maximizing the data likelihood based on an assumed underlying model. A variety of methods have been proposed in the literature to address this inference problem. As far as we know, none of the objective functions in existing methods is convex. In machine learning and applied statistics, a convex function such as the objective function of...

Publication details
Date: 1 January 2015
Type: Inproceeding
Publisher: AAAI - Association for the Advancement of Artificial Intelligence
Bishan Yang, Wen-tau Yih, Xiaodong He, Jianfeng Gao, and Li Deng

In this paper we present a unified framework for modeling multi-relational representations, scoring, and learning, and conduct an empirical study of several recent multi-relational embedding models under the framework. We investigate the different choices of relation operators based on linear and bilinear transformations, and also the effects of entity representations by incorporating unsupervised vectors pre-trained on extra textual resources. Our results show several interesting findings, enabling the...

Publication details
Date: 12 December 2014
Type: Inproceeding
Fang Wang, Zhongyuan Wang, Senzhang Wang, and Zhoujun Li

Keyphrase extraction is essential for many IR and NLP tasks. Existing methods usually use the phrases of the document separately without distinguishing the potential semantic correlations among them, or other statistical features from knowledge bases such as WordNet and Wikipedia. However, the mutual semantic information between phrases is also important, and exploiting their correlations may potentially help us more effectively extract the keyphrases. Generally, phrases in the title are more likely to...

Publication details
Date: 1 December 2014
Type: Inproceeding
Larry Heck and Hongzhao Huang

This paper presents an unsupervised neural knowledge graph embedding model and a coherence-based approach for semantic parsing of Twitter dialogs. The approach learns embeddings directly from knowledge graphs and scales to all of Wikipedia. Experiments show a 23.6% reduction in semantic parsing errors compared to the previously best reported results.

Publication details
Date: 1 December 2014
Type: Inproceeding
Publisher: IEEE – Institute of Electrical and Electronics Engineers
Michael J. Paul, Ryen W. White, and Eric Horvitz

We seek to understand the evolving needs of people who are faced with a life-changing medical diagnosis based on analyses of queries extracted from an anonymized search query log. Focusing on breast cancer, we manually tag a set of Web searchers as showing disruptive shifts in focus of attention and long-term patterns of search behavior consistent with the diagnosis and treatment of breast cancer. We build and apply probabilistic classifiers to detect these searchers from multiple sessions and to detect...

Publication details
Date: 15 November 2014
Type: Technical report
Publisher: Microsoft Research
Number: MSR-TR-2014-144
Sreenivas Gollapudi and Debmalya Panigrahi

A key characteristic of a successful online market is the large participation of agents (producers and consumers) on both sides of the market. While there has been a long line of impressive work on understanding such markets in terms of revenue maximizing (also called max-sum) objectives, particularly in the context of allocating online impressions to interested advertisers, fairness considerations have surprisingly not received much attention in online...

Publication details
Date: 4 November 2014
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Rakesh Agrawal, Sreenivas Gollapudi, Anitha Kannan, and Krishnaram Kenthapadi

The rapid proliferation of hand-held devices has led to the development of rich, interactive and immersive applications, such as e-readers for electronic books. These applications motivate retrieval systems that can implicitly satisfy any information need of the reader by exploiting the context of the user’s interactions. Such retrieval systems differ from traditional search engines in that the queries constructed using the context are typically complex objects (including the document and its...

Publication details
Date: 4 November 2014
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Emine Yilmaz, Manisha Verma, Nick Craswell, Filip Radlinski, and Peter Bailey

Relevance judgments sit at the core of test collection construction, and are assumed to model the utility of documents to real users. However, comparisons of judgments with signals of relevance obtained from real users, such as click counts and dwell time, have demonstrated a systematic mismatch.

In this paper, we study one important source of the mismatch between user data and relevance judgments: Those due to the high degree of effort required by users to identify and consume the information in...

Publication details
Date: 1 November 2014
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Zhaohui Wu, Yuanhua Lv, and Ariel Fuxman

When consuming content, users typically encounter entities that they are not familiar with. A common scenario is when users want to find information about entities directly within the content they are consuming. For example, when reading the book "Adventures of Huckleberry Finn", a user may lose track of the character Mary Jane and want to find some paragraph in the book that gives relevant information about her. The way this is achieved today is by invoking the ubiquitous Find function ("Ctrl-F")....

Publication details
Date: 1 November 2014
Type: Inproceeding
Publisher: ACM
Katja Hofmann, Bhaskar Mitra, Filip Radlinski, and Milad Shokouhi

Query Auto Completion (QAC) suggests possible queries to web search users from the moment they start entering a query. This popular feature of web search engines is thought to reduce physical and cognitive effort when formulating a query.

Perhaps surprisingly, despite QAC being widely used, users’ interactions with it are poorly understood. This paper begins to address this gap. We present the results of an in-depth user study of user interactions with QAC in web search. While study participants...

Publication details
Date: 1 November 2014
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Yelong Shen, Xiaodong He, Jianfeng Gao, Li Deng, and Gregoire Mesnil

In this paper, we propose a new latent semantic model that incorporates a convolutional-pooling structure over word sequences to learn low-dimensional, semantic vector representations for search queries and Web documents. In order to capture the rich contextual structures in a query or a document, we start with each word within a temporal context window in a word sequence to directly capture contextual features at the word n-gram level. Next, the salient word n-gram features in the word sequence are...

Publication details
Date: 1 November 2014
Type: Inproceeding
Publisher: CIKM