Our research
Yanjie Fu, Yong Ge, Yu Zheng, Yao, Yanchi Liu, Hui Xiong, and Nicholas Jing Yuan

Ranking residential real estates based on investment values can provide decision-making support for home buyers and thus plays an important role in the estate marketplace. In this paper, we aim to develop methods for ranking estates based on investment values by mining users' opinions about estates from online user reviews and offline moving behaviors (e.g., taxi traces, smart card transactions, check-ins). While a variety of features could be extracted from these data, these features are intercorrelated and...

Publication details
Date: 1 December 2015
Type: Inproceeding
Publisher: IEEE – Institute of Electrical and Electronics Engineers
Publication details
Date: 1 December 2015
Type: Article
Fuzheng Zhang, Nicholas Jing Yuan, David Wilkie, Yu Zheng, and Xing Xie

Urban transportation is an important factor in energy consumption and pollution, and is of increasing concern due to its complexity and economic significance. Its importance will only increase as urbanization continues around the world. In this paper, we explore drivers’ refueling behavior in urban areas. Compared to questionnaire-based methods of the past, we propose a complete data-driven system that pushes towards real-time sensing of individual refueling behavior and citywide petrol consumption. Our...

Publication details
Date: 1 June 2015
Type: Article
Publisher: ACM – Association for Computing Machinery
Publication details
Date: 1 February 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Jason D. Williams, Nobal B. Niraula, Pradeep Dasigi, Aparna Lakshmiratan, Carlos Garcia Jurado Suarez, Mouni Reddy, and Geoff Zweig

In personal assistant dialog systems, intent models are classifiers that identify the intent of a user utterance, such as to add a meeting to a calendar, or get the director of a stated movie. Rapidly adding intents is one of the main bottlenecks to scaling — adding functionality to — personal assistants. In this paper we show how interactive learning can be applied to the creation of statistical intent models. Interactive learning [10] combines model definition, labeling, model...

Publication details
Date: 11 January 2015
Type: Inproceeding
Gao Huang, Jianwen Zhang, Shiji Song, and Zheng Chen

This paper proposes a new approach for discriminative clustering. The intuition is, for a good clustering, one should be able to learn a classifier from the clustering labels with high generalization accuracy. Thus we define a novel metric to evaluate the quality of a clustering labeling, named Minimum Separation Probability (MSP), which is a lower bound of the generalization accuracy of a classifier learnt from the clustering labeling. We take MSP as the objective to maximize and propose our...

Publication details
Date: 1 January 2015
Type: Inproceeding
Publisher: AAAI - Association for the Advancement of Artificial Intelligence
Nihar B. Shah and Dengyong Zhou

Human computation or crowdsourcing involves joint inference of the ground-truth answers and the worker abilities by optimizing an objective function, for instance, by maximizing the data likelihood based on an assumed underlying model. A variety of methods have been proposed in the literature to address this inference problem. As far as we know, none of the objective functions in existing methods is convex. In machine learning and applied statistics, a convex function such as the objective function of...

Publication details
Date: 1 January 2015
Type: Inproceeding
Publisher: AAAI - Association for the Advancement of Artificial Intelligence
Shipra Agrawal and Nikhil R. Devanur

We introduce the online stochastic Convex Programming (CP) problem, a very general version of stochastic online problems which allows arbitrary concave objectives and convex feasibility constraints. Many well-studied problems like online stochastic packing and covering, online stochastic matching with concave returns, etc. form a special case of online stochastic CP. We present fast algorithms for these problems, which achieve near-optimal regret guarantees for both the i.i.d. and the...

Publication details
Date: 1 January 2015
Type: Inproceeding
Publisher: SIAM – Society for Industrial and Applied Mathematics
Xiaohu Liu and Ruhi Sarikaya

Spoken language understanding (SLU) systems use various features to detect the domain, intent and semantic slots of a query. In addition to n-grams, features generated from entity dictionaries are often used in model training. Clean or properly weighted dictionaries are critical to improving the model’s coverage and accuracy for unseen entities during test time. However, clean dictionaries are hard to obtain for some applications since they are automatically generated and can potentially contain millions of...

Publication details
Date: 1 December 2014
Type: Proceedings
Publisher: IEEE – Institute of Electrical and Electronics Engineers
Publication details
Date: 1 December 2014
Type: Inproceeding
Publisher: IEEE – Institute of Electrical and Electronics Engineers
Purushottam Kar, Harikrishna Narasimhan, and Prateek Jain

Modern applications in sensitive domains such as biometrics and medicine frequently require the use of non-decomposable loss functions such as precision@k, F-measure, etc. Compared to point loss functions such as hinge-loss, these offer much more fine-grained control over prediction, but at the same time present novel challenges in terms of algorithm design and analysis. In this work we initiate a study of online learning techniques for such non-decomposable loss functions with an aim to enable...

Publication details
Date: 1 December 2014
Type: Inproceeding
Publisher: Neural Information Processing Systems
Kenton O'Hara, Gerardo Gonzalez, Abigail Sellen, Graeme Penney, Varnavas, Helena Mentis, Antonio Criminisi, Robert Corish, Mark Rouncefield, Neville Dastur, and Tom Carrell
Publication details
Date: 1 December 2014
Type: Article
Prateek Jain, Ambuj Tewari, and Purushottam Kar

The use of M-estimators in generalized linear regression models in high dimensional settings requires risk minimization with hard L0 constraints. Of the known methods, the class of projected gradient descent (also known as iterative hard thresholding (IHT)) methods is known to offer the fastest and most scalable solutions. However, the current state-of-the-art is only able to analyze these methods in very restrictive settings which do not hold in high dimensional statistical models. In this...

Publication details
Date: 1 December 2014
Type: Inproceeding
Publisher: Neural Information Processing Systems
Qi Li, Gokhan Tur, Dilek Hakkani-Tur, Xiang Li, Tim Paek, Asela Gunawardana, and Chris Quirk

Traditional spoken dialog systems are usually based on a centralized architecture, in which the number of domains is predefined, and the provider is fixed for a given domain and intent. The spoken language understanding (SLU) component is responsible for detecting domain and intents, and filling domain-specific slots. It is expensive and time-consuming for this architecture to add new and/or competing domains, intents, or providers. The rapid growth of service providers in the mobile computing market calls...

Publication details
Date: 1 December 2014
Type: Inproceeding
Publisher: IEEE – Institute of Electrical and Electronics Engineers
Michael J. Paul, Ryen W. White, and Eric Horvitz

We seek to understand the evolving needs of people who are faced with a life-changing medical diagnosis based on analyses of queries extracted from an anonymized search query log. Focusing on breast cancer, we manually tag a set of Web searchers as showing disruptive shifts in focus of attention and long-term patterns of search behavior consistent with the diagnosis and treatment of breast cancer. We build and apply probabilistic classifiers to detect these searchers from multiple sessions and to detect...

Publication details
Date: 15 November 2014
Type: Technical report
Publisher: Microsoft Research
Number: MSR-TR-2014-144
James D. McCaffrey

The Python language is well-suited for creating neural network systems in hybrid-technology environments.

Publication details
Date: 15 November 2014
Type: Article
Bojun Huang

In this paper we show that the alpha-beta algorithm and its successor MT-SSS*, as two classic minimax search algorithms, can be implemented as rollout algorithms, a generic algorithmic paradigm widely used in many domains. Specifically, we define a family of rollout algorithms, in which the rollout policy is restricted to select successor nodes only from a certain subset of the children list. We show that any rollout policy in this family (either deterministic or randomized) guarantees...

Publication details
Date: 1 November 2014
Type: Inproceeding
Publisher: AAAI - Association for the Advancement of Artificial Intelligence
Jie Bao, Yu Zheng, David Wilkie, and Mohamed F. Mokbel

Recent advances in position localization techniques have fundamentally enhanced social networking services, allowing users to share their locations and location-related content, such as geo-tagged photos and notes. We refer to these social networks as location-based social networks (LBSNs). Location data both bridges the gap between the physical and digital worlds and enables a deeper understanding of user preferences and behavior. This addition of vast geospatial datasets has stimulated research into...

Publication details
Date: 1 November 2014
Type: Article
Publisher: Springer
Publication details
Date: 1 November 2014
Type: Inproceeding
Publisher: CIKM
Katja Hofmann, Bhaskar Mitra, Filip Radlinski, and Milad Shokouhi

Query Auto Completion (QAC) suggests possible queries to web search users from the moment they start entering a query. This popular feature of web search engines is thought to reduce physical and cognitive effort when formulating a query.

Perhaps surprisingly, despite QAC being widely used, users’ interactions with it are poorly understood. This paper begins to address this gap. We present the results of an in-depth user study of user interactions with QAC in web search. While study participants...

Publication details
Date: 1 November 2014
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Yelong Shen, Xiaodong He, Jianfeng Gao, Li Deng, and Gregoire Mesnil

In this paper, we propose a new latent semantic model that incorporates a convolutional-pooling structure over word sequences to learn low-dimensional, semantic vector representations for search queries and Web documents. In order to capture the rich contextual structures in a query or a document, we start with each word within a temporal context window in a word sequence to directly capture contextual features at the word n-gram level. Next, the salient word n-gram features in the word sequence are...

Publication details
Date: 1 November 2014
Type: Inproceeding
Publisher: CIKM
, , jibian, bingao, , and tyliu

Representing words as vectors in continuous space can form a potentially powerful basis to generate high-quality textual features for many text mining and natural language processing tasks. Some recent efforts, such as the skip-gram model, have attempted to learn word representations that can capture both syntactic and semantic information in a text corpus. However, they still lack the capability of encoding the properties of words and the complex relationships among words very well, since text...

Publication details
Date: 1 November 2014
Type: Inproceeding
Andrew D. Gordon, Claudio Russo, Marcin Szymczak, Johannes Borgstrom, Nicolas Rolland, Thore Graepel, and Daniel Tarlow

We describe the design, semantics, and implementation of a probabilistic programming language where programs are spreadsheet queries. Given an input database consisting of tables held in a spreadsheet, a query constructs a probabilistic model conditioned by the spreadsheet data, and returns an output database determined by inference. This work extends probabilistic programming systems in three novel aspects: (1) embedding in spreadsheets, (2) dependently-typed functions, and (3) typed distinction...

Publication details
Date: 1 November 2014
Type: Technical report
Publisher: Microsoft Research
Number: MSR-TR-2014-135
James D. McCaffrey

Simplex optimization is an old, geometry-based numerical optimization technique that can be used to train a neural network.

Publication details
Date: 16 October 2014
Type: Article
Michael Auli, Michel Galley, and Jianfeng Gao

Recent work by Cherry (2013) has shown that directly optimizing phrase-based reordering models towards BLEU can lead to significant gains. Their approach is limited to small training sets of a few thousand sentences and a similar number of sparse features. We show how the expected BLEU objective allows us to train a simple linear discriminative reordering model with millions of sparse features on hundreds of thousands of sentences resulting in significant improvements. A comparison to likelihood...

Publication details
Date: 1 October 2014
Type: Proceedings
Publisher: EMNLP