Machine learning and intelligence (710)
1–25 of 710
Fuzheng Zhang, Nicholas Jing Yuan, David Wilkie, Yu Zheng, and Xing Xie

Urban transportation is an important factor in energy consumption and pollution, and is of increasing concern due to its complexity and economic significance. Its importance will only increase as urbanization continues around the world. In this paper, we explore drivers’ refueling behavior in urban areas. Compared to questionnaire-based methods of the past, we propose a complete data-driven system that pushes towards real-time sensing of individual refueling behavior and citywide petrol consumption. Our...

Publication details
Date: 1 June 2015
Type: Article
Publisher: ACM – Association for Computing Machinery
Shipra Agrawal and Nikhil R. Devanur

We introduce the online stochastic Convex Programming (CP) problem, a very general version of stochastic online problems which allows arbitrary concave objectives and convex feasibility constraints. Many well-studied problems like online stochastic packing and covering, online stochastic matching with concave returns, etc. form a special case of online stochastic CP. We present fast algorithms for these problems, which achieve near-optimal regret guarantees for both the i.i.d. and the...

Publication details
Date: 1 January 2015
Type: Inproceeding
Publisher: SIAM – Society for Industrial and Applied Mathematics
Publication details
Date: 1 December 2014
Type: Inproceeding
Publisher: IEEE – Institute of Electrical and Electronics Engineers
Purushottam Kar, Harikrishna Narasimhan, and Prateek Jain

Modern applications in sensitive domains such as biometrics and medicine frequently require the use of non-decomposable loss functions such as precision@k, F-measure, etc. Compared to point loss functions such as hinge-loss, these offer much more fine-grained control over prediction, but at the same time present novel challenges in terms of algorithm design and analysis. In this work we initiate a study of online learning techniques for such non-decomposable loss functions with an aim to enable...

Publication details
Date: 1 December 2014
Type: Inproceeding
Publisher: Neural Information Processing Systems
Prateek Jain, Ambuj Tewari, and Purushottam Kar

The use of M-estimators in generalized linear regression models in high dimensional settings requires risk minimization with hard L0 constraints. Of the known methods, the class of projected gradient descent (also known as iterative hard thresholding (IHT)) methods is known to offer the fastest and most scalable solutions. However, the current state-of-the-art is only able to analyze these methods in very restrictive settings which do not hold in high dimensional statistical models. In this...

Publication details
Date: 1 December 2014
Type: Inproceeding
Publisher: Neural Information Processing Systems
Yelong Shen, Xiaodong He, Jianfeng Gao, Li Deng, and Gregoire Mesnil

In this paper, we propose a new latent semantic model that incorporates a convolutional-pooling structure over word sequences to learn low-dimensional, semantic vector representations for search queries and Web documents. In order to capture the rich contextual structures in a query or a document, we start with each word within a temporal context window in a word sequence to directly capture contextual features at the word n-gram level. Next, the salient word n-gram features in the word sequence are...

Publication details
Date: 1 November 2014
Type: Inproceeding
Publisher: CIKM
[…], […], Jiang Bian, Bin Gao, […], and Tie-Yan Liu

Representing words as vectors in a continuous space can form a potentially powerful basis for generating high-quality textual features for many text mining and natural language processing tasks. Some recent efforts, such as the skip-gram model, have attempted to learn word representations that capture both syntactic and semantic information in a text corpus. However, they still lack the capability to encode the properties of words and the complex relationships among words very well, since text...

Publication details
Date: 1 November 2014
Type: Inproceeding
Katja Hofmann, Bhaskar Mitra, Filip Radlinski, and Milad Shokouhi

Query Auto Completion (QAC) suggests possible queries to web search users from the moment they start entering a query. This popular feature of web search engines is thought to reduce physical and cognitive effort when formulating a query.

Perhaps surprisingly, despite QAC being widely used, users’ interactions with it are poorly understood. This paper begins to address this gap. We present the results of an in-depth user study of user interactions with QAC in web search. While study participants...

Publication details
Date: 1 November 2014
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Jianfeng Gao, Patrick Pantel, Michael Gamon, Xiaodong He, Li Deng, and Yelong Shen

This paper presents a deep semantic model (DSM) for recommending target documents to be of interest to a user based on a source document she is reading. We observe, identify, and detect naturally occurring signals of interestingness in click transitions on the Web between source and target documents, which we collect from commercial Web browser logs. The DSM is trained on millions of Web transitions, and maps source-target document pairs to feature vectors in a latent space in such a...

Publication details
Date: 1 October 2014
Type: Proceedings
Publisher: EMNLP
Michael Auli, Michel Galley, and Jianfeng Gao

Recent work by Cherry (2013) has shown that directly optimizing phrase-based reordering models towards BLEU can lead to significant gains. Their approach is limited to small training sets of a few thousand sentences and a similar number of sparse features. We show how the expected BLEU objective allows us to train a simple linear discriminative reordering model with millions of sparse features on hundreds of thousands of sentences resulting in significant improvements. A comparison to likelihood...

Publication details
Date: 1 October 2014
Type: Proceedings
Publisher: EMNLP
Zhen Wang, Jianwen Zhang, Jianlin Feng, and Zheng Chen

We examine the embedding approach to reasoning about new relational facts from a large-scale knowledge graph and a text corpus. We propose a novel method of jointly embedding entities and words into the same continuous vector space. The embedding process attempts to preserve the relations between entities in the knowledge graph and the co-occurrences of words in the text corpus. Entity names and Wikipedia anchors are utilized to align the embeddings of entities and words in the same space. Large scale experiments...

Publication details
Date: 1 October 2014
Type: Inproceeding
Publisher: ACL – Association for Computational Linguistics
Kai-Wei Chang, Wen-tau Yih, Bishan Yang, and Christopher Meek

While relation extraction has traditionally been viewed as a task relying solely on textual data, recent work has shown that by taking as input existing facts in the form of entity-relation triples from both knowledge bases and textual data, the performance of relation extraction can be improved significantly. Following this new paradigm, we propose a tensor decomposition approach for knowledge base embedding that is highly scalable, and is especially suitable for relation extraction. By leveraging...

Publication details
Date: 1 October 2014
Type: Inproceeding
Publisher: ACL – Association for Computational Linguistics
Lihong Li, Rémi Munos, and Csaba Szepesvari

This paper studies the off-policy evaluation problem, where one aims to estimate the value of a target policy based on a sample of observations collected by another policy. We first consider the multi-armed bandit case, establish a minimax risk lower bound, and analyze the risk of two standard estimators. It is shown, and verified in simulation, that one is minimax optimal up to a constant, while another can be arbitrarily worse, despite its empirical success and popularity. The results are applied to...

Publication details
Date: 15 September 2014
Type: Technical report
Publisher: Microsoft Research
Number: MSR-TR-2014-124
Mohammed Shoaib, Jie Liu, and Matthai Philipose

High functional complexity is leading us towards new architectures for sensing systems. Multi-tiered design is one among the many emerging alternatives. Such architectures bring new opportunities for effective system-level power management. For instance, varying one or more tier-level parameters can provide substantial end-to-end energy scaling. In this paper, we review an existing approach that shows how one such parameter, namely data compression, can help us scale energy at the cost of algorithmic...

Publication details
Date: 14 September 2014
Type: Inproceeding
Publisher: IEEE – Institute of Electrical and Electronics Engineers
Zhenghao Wang, Shengquan Yan, Huaming Wang, and Xuedong Huang

Question answering (QA) over an existing knowledge base (KB) such as Microsoft Satori or open Freebase is one of the most important natural language processing applications. There are approaches based on web-search-motivated statistical techniques as well as linguistically oriented knowledge engineering. Both methods face the key challenge of how to handle diverse ways of naturally expressing predicates and entities existing in the KB. The domain independent web information extracted from the massive...

Publication details
Date: 3 September 2014
Type: Technical report
Publisher: Microsoft Research
Number: MSR-TR-2014-121
Puyang Xu and Ruhi Sarikaya

In slot filling with conditional random fields (CRFs), the strong current word and dictionary features tend to swamp the effect of contextual features, a phenomenon also known as feature undertraining. This is a dangerous tradeoff, especially when training data is small and dictionaries are limited in their coverage of the entities observed during testing. In this paper, we propose a simple and effective solution that extends the feature dropout algorithm, directly aiming at boosting the...

Publication details
Date: 1 September 2014
Type: Proceedings
Publisher: ISCA - International Speech Communication Association
Yuchen Zhang and Lin Xiao

We consider a generic convex optimization problem associated with regularized empirical risk minimization of linear predictors. The problem structure allows us to reformulate it as a convex-concave saddle point problem. We propose a stochastic primal-dual coordinate (SPDC) method, which alternates between maximizing over a randomly chosen dual variable and minimizing over the primal variable. An extrapolation step on the primal variable is performed to obtain an accelerated convergence rate. We also...

Publication details
Date: 1 September 2014
Type: Technical report
Number: MSR-TR-2014-123
Ben Glocker, Darko Zikic, and David R. Haynor

Accurate and reliable registration of longitudinal spine images is essential for assessment of disease progression and surgical outcome. Implementing a fully automatic and robust registration for clinical use, however, is challenging since standard registration techniques often fail due to poor initial alignment. The main causes of registration failure are the small overlap between scans which focus on different parts of the spine and/or substantial change in shape (e.g. after correction of abnormal...

Publication details
Date: 1 September 2014
Type: Inproceeding
Publisher: Springer
Jean-Philippe Robichaud, Paul A. Crook, Puyang Xu, Omar Zia Khan, and Ruhi Sarikaya

We present a novel application of hypothesis ranking (HR) for the task of domain detection in a multi-domain, multi-turn dialog system. Alternate, domain-dependent semantic frames from a spoken language understanding (SLU) analysis are ranked using a gradient boosted decision trees (GBDT) ranker to determine the most likely domain. The ranker, trained using LambdaRank, makes use of a range of signals derived from the SLU and previous turn context to improve domain detection. On a multi-turn corpus we...

Publication details
Date: 1 September 2014
Type: Inproceeding
Publisher: ISCA - International Speech Communication Association
Li Deng and John C. Platt

Deep learning systems have dramatically improved the accuracy of speech recognition, and various deep architectures and learning methods have been developed with distinct strengths and weaknesses in recent years. How ensemble learning can be applied to these varying deep learning systems to achieve greater recognition accuracy is the focus of this paper. We develop and report linear and log-linear stacking methods for ensemble learning with applications specifically to speech-class posterior...

Publication details
Date: 1 September 2014
Type: Inproceeding
Publisher: Proc. Interspeech
Jiang Bian, Bin Gao, and Tie-Yan Liu

Recent years have witnessed increasing efforts to apply deep learning techniques to text mining and natural language processing tasks. The basis of these tasks is to obtain high-quality distributed representations of words, i.e., word embeddings, from large amounts of text data. However, text itself usually contains limited information, which makes it necessary to leverage extra knowledge to understand it. Fortunately, since text is generated by humans, it already contains well-defined...

Publication details
Date: 1 September 2014
Type: Inproceeding
Publisher: Springer
Darko Zikic, Ben Glocker, and Antonio Criminisi

We propose a segmentation method which transfers the advantages of multi-atlas label propagation (MALP) to correspondence-free scenarios. MALP is a branch of segmentation approaches with attractive properties, which is currently applicable only in correspondence-based regimes such as brain labeling, which assume correspondence between atlases and test image. This precludes its use for the large class of tasks without this property, such as tumor segmentation. In this work, we propose a method which...

Publication details
Date: 1 September 2014
Type: Inproceeding
Publisher: Springer
Eric Brachmann, Alexander Krull, Frank Michel, Stefan Gumhold, Jamie Shotton, and Carsten Rother

This work addresses the problem of estimating the 6D pose of specific objects from a single RGB-D image. We present a flexible approach that can deal with generic objects, both textured and texture-less. The key new concept is a learned, intermediate representation in the form of a dense 3D object coordinate labelling paired with a dense class labelling. We are able to show that for a common dataset with texture-less objects, where template-based techniques are suitable and state of the art, our approach is...

Publication details
Date: 1 September 2014
Type: Inproceeding
Publisher: Springer
Alex Marin, Roman Holenstein, Ruhi Sarikaya, and Mari Ostendorf

This paper explores a novel method for learning phrase pattern features for text classification, employing a mapping of selected words into a knowledge graph and self-training over unlabeled data. Using Support Vector Machine classification, we obtain improvements over lexical and fully-supervised phrase pattern features in domain and intent detection for language understanding, particularly in conjunction with the use of unlabeled data. Our best results are obtained using unlabeled data filtered for...

Publication details
Date: 1 September 2014
Type: Proceedings
Publisher: ISCA - International Speech Communication Association
Publication details
Date: 1 August 2014
Type: Technical report
Publisher: Microsoft Research
Number: MSR-TR-2014-109