Fernando Diaz, Bhaskar Mitra, and Nick Craswell

Continuous space word embeddings have received a great deal of attention in the natural language processing and machine learning communities for their ability to model term similarity and other relationships. We study the use of term relatedness in the context of query expansion for information retrieval. We demonstrate that word embeddings such as word2vec and GloVe, when trained globally, underperform corpus- and query-specific embeddings for retrieval tasks. These results suggest that other tasks...

Publication details
Date: 7 August 2016
Type: Inproceeding
Publisher: ACL – Association for Computational Linguistics
Zhongyuan Wang and Haixun Wang

Billions of short texts are produced every day, in the form of search queries, ad keywords, tags, tweets, messenger conversations, social network posts, etc. Unlike documents, short texts have some unique characteristics which make them difficult to handle. First, short texts, especially search queries, do not always observe the syntax of a written language. This means traditional NLP techniques, such as syntactic parsing, do not always apply to short texts. Second, short texts contain limited context....

Publication details
Date: 1 August 2016
Type: Inproceeding
Peter Bailey and Nick Craswell

Why do people start a search? Why do they stop? Why do they do what they do in between? Our goal in this paper is to provide a simple yet general explanation for these acts that has its basis in neuropsychology and observed user behavior. We coin the term “ingram”, as an information counterpart to Richard Semon's “engram” or “memory trace”. People search to create ingrams. People stop searching because they have created sufficient ingrams, or given up. We describe these acts through a pair of user...

Publication details
Date: 17 July 2016
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Peter Bailey, Alistair Moffat, Falk Scholer, and Paul Thomas

We describe the UQV100 test collection, designed to incorporate variability from users. Information need “backstories” were written for 100 topics (or sub-topics) from the TREC 2013 and 2014 Web Tracks. Crowd workers were asked to read the backstories, and provide the queries they would use; plus effort estimates of how many useful documents they would have to read to satisfy the need. A total of 10,835 queries were collected from 263 workers. After normalization and spell-correction, 5,764 unique...

Publication details
Date: 17 July 2016
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Masrour Zoghi, Tomáš Tunys, Lihong Li, Damien Jose, Junyan Chen, Chun Ming Chin, and Maarten de Rijke
Publication details
Date: 1 July 2016
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Yuval Marton and Kristina Toutanova

We present E-TIPSY, a search query corpus annotated with named Entities, Term Importance, POS tags, and SYntactic parses. This corpus contains crowdsourced (gold) annotations of the three most important terms in each query. In addition, it contains automatically produced annotations of named entities, part-of-speech tags, and syntactic parses for the same queries. This corpus comes in two formats: (1) Sober Subset: annotations that two or more crowd workers agreed upon, and (2) Full Glass: all...

Publication details
Date: 24 May 2016
Type: Inproceeding
Publisher: ELRA
Ayelet Ben-Sasson, Dan Pelleg, and Elad Yom-Tov

The growing diagnosis and public awareness of Autism Spectrum Disorders (ASD) are leading more parents to turn to Internet forums with their suspicions that their child has ASD.

This study describes an analysis of the quality of content of 371 answers on Yahoo Answers (YA), a social question and answer forum, to parents querying whether their child has ASD. We contrasted the perceived quality of answers by clinicians with that of parents. The study tested the feasibility of automatically assisting...

Publication details
Date: 18 May 2016
Type: Inproceeding
Publisher: AAAI - Association for the Advancement of Artificial Intelligence
Royi Ronen, Gal Lavee, and Elad Yom-Tov

Collaborative filtering (CF) recommendation systems are one of the most popular and successful methods for recommending products to people. CF systems work by finding similarities between different people according to their past purchases, and using these similarities to suggest possible items of interest. Here we investigate how CF systems can be enhanced using Internet browsing data and search engine query logs, both of which represent a rich profile of individuals’ interests. We introduce two...

Publication details
Date: 16 May 2016
Type: Inproceeding
Publisher: IEEE – Institute of Electrical and Electronics Engineers
Elad Yom-Tov, Anat Brunstein-Klomek, Arie Hadas, Or Tamir, and Silvana Fennig

Background: There is a debate about the effects of pro-anorexia (colloquially referred to as pro-ana) websites. Research suggests that the effect of these websites is not straightforward. Indeed, the actual function of these sites is disputed, with studies indicating both negative and positive effects.

Aim: This is the first study which systematically examined the differences between pro-anorexia web communities in four main aspects: web language used (posts); web interests/search behaviors...

Publication details
Date: 10 May 2016
Type: Article
Publisher: Elsevier
Marta E. Cecchinato, Abigail Sellen, Milad Shokouhi, and Gavin Smyth

Email is far from dead; in fact the volume of messages exchanged daily, the number of accounts per user, and the number of devices on which email is accessed have been constantly growing. Most previous studies on email have focused on management and retrieval behaviour within a single account and on a single device. In this paper, we examine how people retrieve email in today’s ecosystem through an in-depth qualitative diary study with 16 participants. We found that personal and work accounts are...

Publication details
Date: 1 May 2016
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Jaime Teevan, Shamsi Iqbal, Carrie J. Cai, Jeffrey P. Bigham, Michael S. Bernstein, and Elizabeth M. Gerber

It is difficult to accomplish meaningful goals with limited time and attentional resources. However, recent research has shown that concrete plans with actionable steps allow people to complete tasks better and faster. With advances in techniques that can decompose larger tasks into smaller units, we envision that a transformation from larger tasks to smaller microtasks will impact when and how people perform complex information work, enabling efficient and easy completion of tasks that currently seem...

Publication details
Date: 1 May 2016
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Jaime Teevan

I found it hard to start writing this document, and put it off until the last possible moment. Complex tasks like writing are difficult to do because they seem to require long, uninterrupted periods of deep engagement to make meaningful progress. My goal is to change this. My colleagues and I are exploring the idea of selfsourcing as a way to help people easily perform large personal information tasks by breaking them all the way down into microtasks that only take a few seconds each to complete....

Publication details
Date: 1 May 2016
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Jaime Teevan, Shamsi T. Iqbal, and Curtis von Veh

This paper presents the MicroWriter, a system that decomposes the task of writing into three types of microtasks to produce a single report: 1) generating ideas, 2) labeling ideas to organize them, and 3) writing paragraphs given a few related ideas. Because each microtask can be completed individually with limited awareness of what has been already done and what others are doing, this decomposition can change the experience of collaborative writing. Prior work has used microtasking to support...

Publication details
Date: 1 May 2016
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Eric Nalisnick, Bhaskar Mitra, Nick Craswell, and Rich Caruana

This paper investigates the popular neural word embedding method word2vec as a source of evidence in document ranking. In contrast to NLP applications of word2vec, which tend to use only the input embeddings, we retain both the input and the output embeddings, allowing us to calculate a different word similarity that may be more suitable for document ranking. We map the query words into the input space and the document words into the output space, and compute a relevance score by aggregating the cosine...

Publication details
Date: 11 April 2016
Type: Inproceeding
Publisher: WWW – World Wide Web Consortium (W3C)
Publication details
Date: 1 April 2016
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Zhiyi Luo, Yuchen Sha, Kenny Zhu, Seung-Won Hwang, and Zhongyuan Wang
Publication details
Date: 1 April 2016
Type: Inproceeding
Publication details
Date: 1 April 2016
Type: Inproceeding
Publication details
Date: 1 April 2016
Type: Inproceeding
Publisher: WWW – World Wide Web Consortium (W3C)
Sandro Bauer, Filip Radlinski, and Ryen W. White

People commonly need to purchase things in person, from large garden supplies to home decor. Although modern search systems are very effective at finding online products, little research attention has been paid to helping users find places that sell a specific product offline. For instance, users searching for an apron are not typically directed to a nearby kitchen store by a standard search engine.

In this paper, we investigate "where can I buy"-style queries related to...

Publication details
Date: 1 April 2016
Type: Inproceeding
Publisher: WWW – World Wide Web Consortium (W3C)
Liu Yang, Qi Guo, Yang Song, Sha Meng, Milad Shokouhi, Kieran McDonald, and W. Bruce Croft

Proactive search systems like Google Now and Microsoft Cortana have gained increasing popularity with the growth of mobile Internet. Unlike traditional reactive search systems where search engines return results in response to queries issued by the users, proactive systems actively push information cards to the users on mobile devices based on the context around time, location, environment (e.g., weather), and user interests. A proactive system is a zero-query information retrieval system, which makes...

Publication details
Date: 1 March 2016
Type: Inproceeding
Rishiraj Saha Roy, Anusha Suresh, Niloy Ganguly, and Monojit Choudhury

In this research, we explore nested or hierarchical query segmentation, where segments are defined recursively as consisting of contiguous sequences of segments or query words, as a more effective representation of a query. We design a lightweight and unsupervised nested segmentation scheme, and propose how to use the tree arising out of the nested representation of a query to improve ranking performance. We show that nested segmentation can lead to significant gains over state-of-the-art at...

Publication details
Date: 1 March 2016
Type: Inproceeding
Publisher: ECIR
Yun-Nung Chen, Dilek Hakkani-Tur, and Xiaodong He

The recent surge of intelligent personal assistants motivates spoken language understanding of dialogue systems. However, the domain constraint along with the inflexible intent schema remains a big issue. This paper focuses on the task of intent expansion, which helps remove the domain limit and make an intent schema flexible. A convolutional deep structured semantic model (CDSSM) is applied to jointly learn the representations for human intents and associated utterances. Then it can flexibly generate...

Publication details
Date: 1 March 2016
Type: Inproceeding
Publisher: IEEE – Institute of Electrical and Electronics Engineers
Bo Wu, Tao Mei, Wen-Huang Cheng, and Yongdong Zhang

Time information plays a crucial role in social media popularity. Existing research on popularity prediction, though effective, ignores temporal information, which is highly related to user-item associations, and thus often achieves only limited success. An essential step is to consider all of these factors (user, item, and time), which together capture the dynamic nature of photo popularity. In this paper, we present a novel approach to factorize the popularity into user-item context and time-sensitive context for...

Publication details
Date: 1 February 2016
Type: Inproceeding
Publisher: AAAI - Association for the Advancement of Artificial Intelligence
Zhaohui Wu, Yang Song, and C. Lee Giles

Continuously discovering novel entities in news and Web data is important for Knowledge Base (KB) maintenance. One of the key challenges is to decide whether an entity mention refers to an in-KB or out-of-KB entity. We propose a principled approach that learns a novel entity classifier by modeling mention and entity representation into multiple feature spaces, including contextual, topical, lexical, neural embedding and query spaces. Different from most previous studies that address novel entity...

Publication details
Date: 1 February 2016
Type: Inproceeding
Publisher: AAAI - Association for the Advancement of Artificial Intelligence
Milan Vojnovic and Se-Young Yun

We consider a team selection problem that requires hiring a team of individuals that maximizes a profit function defined as the difference between the utility of production and the cost of hiring. We show that for any monotone submodular utility of production and any increasing cost function of the team size with increasing marginal costs, a natural greedy algorithm guarantees a (1 − log(a)/(a − 1))-approximation when a ≤ e and a (1 − a/(e(a − 1)))-approximation when a ≥ e, where a is the ratio of the utility of...

Publication details
Date: 1 February 2016
Type: Technical report
Publisher: Microsoft Research
Number: MSR-TR-2016-7