Our research
Fuzheng Zhang, Nicholas Jing Yuan, David Wilkie, Yu Zheng, and Xing Xie

Urban transportation is an important factor in energy consumption and pollution, and is of increasing concern due to its complexity and economic significance. Its importance will only increase as urbanization continues around the world. In this paper, we explore drivers’ refueling behavior in urban areas. Compared to questionnaire-based methods of the past, we propose a complete data-driven system that pushes towards real-time sensing of individual refueling behavior and citywide petrol consumption. Our...

Publication details
Date: 1 June 2015
Type: Article
Publisher: ACM – Association for Computing Machinery
Shipra Agrawal and Nikhil R. Devanur

We introduce the online stochastic Convex Programming (CP) problem, a very general version of stochastic online problems which allows arbitrary concave objectives and convex feasibility constraints. Many well-studied problems like online stochastic packing and covering, online stochastic matching with concave returns, etc. form a special case of online stochastic CP. We present fast algorithms for these problems, which achieve near-optimal regret guarantees for both the i.i.d. and the...

Publication details
Date: 1 January 2015
Type: Inproceeding
Publisher: SIAM – Society for Industrial and Applied Mathematics
Prateek Jain, Ambuj Tewari, and Purushottam Kar

The use of M-estimators in generalized linear regression models in high dimensional settings requires risk minimization with hard L0 constraints. Of the known methods, the class of projected gradient descent (also known as iterative hard thresholding (IHT)) methods is known to offer the fastest and most scalable solutions. However, the current state-of-the-art is only able to analyze these methods in very restrictive settings which do not hold in high dimensional statistical models. In this...

Publication details
Date: 1 December 2014
Type: Inproceeding
Publisher: Neural Information Processing Systems
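
The abstract above refers to projected gradient descent with hard L0 constraints, also called iterative hard thresholding (IHT). The following is only a minimal sketch of the generic idea, assuming a plain squared loss rather than the general M-estimators the paper considers; the function names, step size, and toy data are illustrative assumptions and are not taken from the paper.

```python
def hard_threshold(w, k):
    """Project w onto the set of k-sparse vectors: keep the k
    largest-magnitude entries and set every other entry to zero."""
    order = sorted(range(len(w)), key=lambda i: abs(w[i]), reverse=True)
    keep = set(order[:k])
    return [w[i] if i in keep else 0.0 for i in range(len(w))]


def iht_least_squares(X, y, k, step, iters=200):
    """Iterative hard thresholding for least squares (assumed loss).

    Each iteration takes a gradient step on the average squared error
    and then projects the result back onto the k-sparse vectors.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(iters):
        # Residuals X w - y for the current iterate.
        residual = [sum(X[i][j] * w[j] for j in range(d)) - y[i] for i in range(n)]
        # Gradient of (1 / 2n) * ||X w - y||^2 with respect to w.
        grad = [sum(X[i][j] * residual[i] for i in range(n)) / n for j in range(d)]
        w = hard_threshold([w[j] - step * grad[j] for j in range(d)], k)
    return w


# Toy problem (hypothetical data): the target depends on one feature only.
X_toy = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
y_toy = [3.0, 0.0, 0.0]
w_toy = iht_least_squares(X_toy, y_toy, k=1, step=1.0)
```

The projection step is what distinguishes IHT from ordinary gradient descent: the iterate never has more than k nonzero entries, so the sparsity constraint holds at every step rather than only at convergence.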
Purushottam Kar, Harikrishna Narasimhan, and Prateek Jain

Modern applications in sensitive domains such as biometrics and medicine frequently require the use of non-decomposable loss functions such as precision@k, F-measure etc. Compared to point loss functions such as hinge-loss, these offer much more fine grained control over prediction, but at the same time present novel challenges in terms of algorithm design and analysis. In this work we initiate a study of online learning techniques for such non-decomposable loss functions with an aim to enable...

Publication details
Date: 1 December 2014
Type: Inproceeding
Publisher: Neural Information Processing Systems
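
The abstract above contrasts point losses such as hinge loss with non-decomposable measures such as precision@k and F-measure. As a rough illustration of the standard definitions (not the online learning techniques of the paper), the sketch below computes both; the function names and sample data are assumptions made for this example.

```python
def precision_at_k(scores, labels, k):
    """Fraction of the k highest-scored items whose label is 1."""
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return sum(labels[i] for i in top) / k


def f_measure(predictions, labels, beta=1.0):
    """F-beta score computed from binary predictions and binary labels."""
    tp = sum(1 for p, y in zip(predictions, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(predictions, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(predictions, labels) if p == 0 and y == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical scores and labels for four items.
sample_scores = [0.9, 0.2, 0.8, 0.4]
sample_labels = [1, 0, 0, 1]
p_at_2 = precision_at_k(sample_scores, sample_labels, k=2)
f1 = f_measure([1, 0, 1, 1], sample_labels)
```

Neither measure can be written as a sum of independent per-example losses: precision@k depends on the ranking of the whole set, and the F-measure depends on counts aggregated over all predictions.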
Publication details
Date: 1 December 2014
Type: Inproceeding
Publisher: IEEE – Institute of Electrical and Electronics Engineers
Qi Li, Gokhan Tur, Dilek Hakkani-Tur, Xiang Li, Tim Paek, Asela Gunawardana, and Chris Quirk

Traditional spoken dialog systems are usually based on a centralized architecture, in which the number of domains is predefined, and the provider is fixed for a given domain and intent. The spoken language understanding (SLU) component is responsible for detecting domain and intents, and filling domain-specific slots. It is expensive and time-consuming for this architecture to add new and/or competing domains, intents, or providers. The rapid growth of service providers in the mobile computing market calls...

Publication details
Date: 1 December 2014
Type: Inproceeding
Publisher: IEEE – Institute of Electrical and Electronics Engineers
Yelong Shen, Xiaodong He, Jianfeng Gao, Li Deng, and Gregoire Mesnil

In this paper, we propose a new latent semantic model that incorporates a convolutional-pooling structure over word sequences to learn low-dimensional, semantic vector representations for search queries and Web documents. In order to capture the rich contextual structures in a query or a document, we start with each word within a temporal context window in a word sequence to directly capture contextual features at the word n-gram level. Next, the salient word n-gram features in the word sequence are...

Publication details
Date: 1 November 2014
Type: Inproceeding
Publisher: CIKM
Katja Hofmann, Bhaskar Mitra, Filip Radlinski, and Milad Shokouhi

Query Auto Completion (QAC) suggests possible queries to web search users from the moment they start entering a query. This popular feature of web search engines is thought to reduce physical and cognitive effort when formulating a query.

Perhaps surprisingly, despite QAC being widely used, users’ interactions with it are poorly understood. This paper begins to address this gap. We present the results of an in-depth user study of user interactions with QAC in web search. While study participants...

Publication details
Date: 1 November 2014
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
[…], […], Jiang Bian, Bin Gao, […], and Tie-Yan Liu

Representing words as vectors in a continuous space can form a potentially powerful basis for generating high-quality textual features for many text mining and natural language processing tasks. Some recent efforts, such as the skip-gram model, have attempted to learn word representations that can capture both syntactic and semantic information from a text corpus. However, they still lack the capability to encode the properties of words and the complex relationships among words very well, since text...

Publication details
Date: 1 November 2014
Type: Inproceeding
Jie Bao, Yu Zheng, David Wilkie, and Mohamed F. Mokbel

Recent advances in position localization techniques have fundamentally enhanced social networking services, allowing users to share their locations and location-related content, such as geo-tagged photos and notes. We refer to these social networks as location-based social networks (LBSNs). Location data both bridges the gap between the physical and digital worlds and enables a deeper understanding of user preferences and behavior. This addition of vast geospatial datasets has stimulated research into...

Publication details
Date: 1 November 2014
Type: Article
Publisher: Springer
O. Abdel-Hamid, A. Mohamed, H. Jiang, L. Deng, G. Penn, and D. Yu
Publication details
Date: 1 October 2014
Type: Article
Number: 10
Kai-Wei Chang, Wen-tau Yih, Bishan Yang, and Christopher Meek

While relation extraction has traditionally been viewed as a task relying solely on textual data, recent work has shown that by taking as input existing facts in the form of entity-relation triples from both knowledge bases and textual data, the performance of relation extraction can be improved significantly. Following this new paradigm, we propose a tensor decomposition approach for knowledge base embedding that is highly scalable, and is especially suitable for relation extraction. By leveraging...

Publication details
Date: 1 October 2014
Type: Inproceeding
Publisher: ACL – Association for Computational Linguistics
Michael Auli, Michel Galley, and Jianfeng Gao

Recent work by Cherry (2013) has shown that directly optimizing phrase-based reordering models towards BLEU can lead to significant gains. Their approach is limited to small training sets of a few thousand sentences and a similar number of sparse features. We show how the expected BLEU objective allows us to train a simple linear discriminative reordering model with millions of sparse features on hundreds of thousands of sentences resulting in significant improvements. A comparison to likelihood...

Publication details
Date: 1 October 2014
Type: Proceedings
Publisher: EMNLP
Zhen Wang, Jianwen Zhang, Jianlin Feng, and Zheng Chen

We examine the embedding approach to reasoning about new relational facts from a large-scale knowledge graph and a text corpus. We propose a novel method of jointly embedding entities and words into the same continuous vector space. The embedding process attempts to preserve the relations between entities in the knowledge graph and the co-occurrences of words in the text corpus. Entity names and Wikipedia anchors are utilized to align the embeddings of entities and words in the same space. Large-scale experiments...

Publication details
Date: 1 October 2014
Type: Inproceeding
Publisher: ACL – Association for Computational Linguistics
Andrew D. Gordon, Claudio Russo, Marcin Szymczak, Johannes Borgstrom, Nicolas Rolland, Thore Graepel, and Daniel Tarlow

We describe the design, semantics, and implementation of a probabilistic programming language where programs are spreadsheet queries. Given an input database consisting of tables held in a spreadsheet, a query constructs a probabilistic model conditioned by the spreadsheet data, and returns an output database determined by inference. This work extends probabilistic programming systems in three novel aspects: (1) embedding in spreadsheets, (2) dependently-typed functions, and (3) typed distinction...

Publication details
Date: 1 October 2014
Type: Technical report
Publisher: Microsoft Research
Number: MSR-TR-2014-135
Jianfeng Gao, Patrick Pantel, Michael Gamon, Xiaodong He, Li Deng, and Yelong Shen

This paper presents a deep semantic model (DSM) for recommending target documents likely to be of interest to a user based on a source document she is reading. We observe, identify, and detect naturally occurring signals of interestingness in click transitions on the Web between source and target documents, which we collect from commercial Web browser logs. The DSM is trained on millions of Web transitions, and maps source-target document pairs to feature vectors in a latent space in such a...

Publication details
Date: 1 October 2014
Type: Proceedings
Publisher: EMNLP
Lihong Li, Rémi Munos, and Csaba Szepesvari

This paper studies the off-policy evaluation problem, where one aims to estimate the value of a target policy based on a sample of observations collected by another policy. We first consider the multi-armed bandit case, establish a minimax risk lower bound, and analyze the risk of two standard estimators. It is shown, and verified in simulation, that one is minimax optimal up to a constant, while another can be arbitrarily worse, despite its empirical success and popularity. The results are applied to...

Publication details
Date: 15 September 2014
Type: Technical report
Publisher: Microsoft Research
Number: MSR-TR-2014-124
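
The abstract above describes off-policy evaluation in the multi-armed bandit case: estimating the value of a target policy from data logged by another policy. The sketch below shows two commonly used estimators, an importance-weighted one and a per-action regression (plug-in) one. The abstract does not name the estimators the report analyzes, so this choice, along with all function names and the toy log, is an assumption made for illustration.

```python
from collections import defaultdict


def importance_weighted_value(logged, behavior, target):
    """Average logged reward, reweighted by target/behavior action probabilities."""
    total = 0.0
    for action, reward in logged:
        total += target[action] / behavior[action] * reward
    return total / len(logged)


def regression_value(logged, target):
    """Plug-in estimate: empirical mean reward per action, averaged under
    the target policy. Actions never seen in the log contribute zero
    (a simplifying assumption of this sketch)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for action, reward in logged:
        sums[action] += reward
        counts[action] += 1
    value = 0.0
    for action, prob in target.items():
        if counts[action] > 0:
            value += prob * sums[action] / counts[action]
    return value


# Hypothetical log of (action, reward) pairs from a uniform behavior policy.
logged_data = [(0, 1.0), (1, 0.0), (0, 1.0), (1, 1.0)]
behavior_policy = {0: 0.5, 1: 0.5}
target_policy = {0: 0.8, 1: 0.2}

iw_estimate = importance_weighted_value(logged_data, behavior_policy, target_policy)
reg_estimate = regression_value(logged_data, target_policy)
```

The importance-weighted estimator needs the behavior policy's action probabilities, while the regression estimator uses only the logged actions and rewards.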
Mohammed Shoaib, Jie Liu, and Matthai Philipose

High functional complexity is leading us towards new architectures for sensing systems. Multi-tiered design is one of the many emerging alternatives. Such architectures bring new opportunities for effective system-level power management. For instance, varying one or more tier-level parameters can provide substantial end-to-end energy scaling. In this paper, we review an existing approach that shows how one such parameter, namely data compression, can help us scale energy at the cost of algorithmic...

Publication details
Date: 14 September 2014
Type: Inproceeding
Publisher: IEEE – Institute of Electrical and Electronics Engineers
Zhenghao Wang, Shengquan Yan, Huaming Wang, and Xuedong Huang

Question answering (QA) over an existing knowledge base (KB) such as Microsoft Satori or open Freebase is one of the most important natural language processing applications. There are approaches based on web-search-motivated statistical techniques as well as linguistically oriented knowledge engineering. Both methods face the key challenge of how to handle diverse ways of naturally expressing predicates and entities existing in the KB. The domain independent web information extracted from the massive...

Publication details
Date: 3 September 2014
Type: Technical report
Publisher: Microsoft Research
Number: MSR-TR-2014-121
Jean-Philippe Robichaud, Paul A. Crook, Puyang Xu, Omar Zia Khan, and Ruhi Sarikaya

We present a novel application of hypothesis ranking (HR) for the task of domain detection in a multi-domain, multi-turn dialog system. Alternate, domain-dependent semantic frames from a spoken language understanding (SLU) analysis are ranked using a gradient boosted decision trees (GBDT) ranker to determine the most likely domain. The ranker, trained using LambdaRank, makes use of a range of signals derived from the SLU and previous turn context to improve domain detection. On a multi-turn corpus we...

Publication details
Date: 1 September 2014
Type: Inproceeding
Publisher: ISCA – International Speech Communication Association
Jiang Bian, Bin Gao, and Tie-Yan Liu

Recent years have witnessed increasing efforts to apply deep learning techniques to text mining and natural language processing tasks. The basis of these tasks is to obtain high-quality distributed representations of words, i.e., word embeddings, from large amounts of text data. However, text itself usually contains limited information, which makes it necessary to leverage extra knowledge to understand it. Fortunately, since text is generated by humans, it already contains well-defined...

Publication details
Date: 1 September 2014
Type: Inproceeding
Publisher: Springer
Ben Glocker, Darko Zikic, and David R. Haynor

Accurate and reliable registration of longitudinal spine images is essential for assessment of disease progression and surgical outcome. Implementing a fully automatic and robust registration for clinical use, however, is challenging since standard registration techniques often fail due to poor initial alignment. The main causes of registration failure are the small overlap between scans which focus on different parts of the spine and/or substantial change in shape (e.g. after correction of abnormal...

Publication details
Date: 1 September 2014
Type: Inproceeding
Publisher: Springer
Li Deng and John C. Platt

Deep learning systems have dramatically improved the accuracy of speech recognition, and various deep architectures and learning methods have been developed with distinct strengths and weaknesses in recent years. How ensemble learning can be applied to these varying deep learning systems to achieve greater recognition accuracy is the focus of this paper. We develop and report linear and log-linear stacking methods for ensemble learning with applications specifically to speech-class posterior...

Publication details
Date: 1 September 2014
Type: Inproceeding
Publisher: Proc. Interspeech
Bjoern H. Menze, Andras Jakab, Stefan Bauer, Jayashree Kalpathy-Cramer, Keyvan Farahani, Justin Kirby, Yuliya Burren, Nicole Porz, Johannes Slotboom, Roland Wiest, Levente Lanczi, Elizabeth Gerstner, Marc-Andre Weber, Tal Arbel, Brian B. Avants, Nicholas Ayache, Patricia Buendia, D. Louis Collins, Nicolas Cordier, Jason J. Corso, Antonio Criminisi, Tilak Das, Herve Delingette, Cagatay Demiralp, Christopher R. Durst, Michel Dojat, Senan Doyle, Joana Festa, Florence Forbes, Ezequiel Geremia, Ben Glocker, Polina Golland, Xiaotao Guo, Andac Hamamci, Khan M. Iftekharuddin, Raj Jena, Nigel M. John, Ender Konukoglu, Danial Lashkari, Jose Antonio Mariz, Raphael Meier, Sergio Pereira, Doina Precup, Stephen J. Price, Tammy Riklin Raviv, Syed M. S. Reza, Michael Ryan, Duygu Sarikaya, Lawrence Schwartz, Hoo-Chang Shin, Jamie Shotton, Carlos A. Silva, Nuno Sousa, Nagesh K. Subbanna, Gabor Szekely, Thomas J. Taylor, Owen M. Thomas, Nicholas J. Tustison, Gozde Unal, Flor Vasseur, Max Wintermark, Dong Hye Ye, Liang Zhao, Binsheng Zhao, Darko Zikic, Marcel Prastawa, Mauricio Reyes, and Koen Van Leemput

In this paper we report the set-up and results of the Multimodal Brain Tumor Image Segmentation Benchmark (BRATS) organized in conjunction with the MICCAI 2012 and 2013 conferences. Twenty state-of-the-art tumor segmentation algorithms were applied to a set of 65 multi-contrast MR scans of low- and high-grade glioma patients – manually annotated by up to four raters – and to 65 comparable scans generated using tumor image simulation software. Quantitative evaluations revealed considerable disagreement...

Publication details
Date: 1 September 2014
Type: Article
Publication details
Date: 1 September 2014
Type: Article
Publisher: IEEE – Institute of Electrical and Electronics Engineers