Our research
Content type
Downloads (425)
Events (356)
Groups (147)
News (2477)
People (826)
Projects (1017)
Publications (11449)
Videos (4867)
Research areas
Algorithms and theory (210)
Communication and collaboration (177)
Computational linguistics (143)
Computational sciences (161)
Computer systems and networking (599)
Computer vision (8)
Data mining and data management (5)
Economics and computation (81)
Education (71)
Gaming (63)
Graphics and multimedia (178)
Hardware and devices (173)
Health and well-being (63)
Human-computer interaction (716)
Machine learning and intelligence (597)
Mobile computing (7)
Quantum computing (3)
Search, information retrieval, and knowledge management (561)
Security and privacy (218)
Social media (6)
Social sciences (220)
Software development, programming principles, tools, and languages (497)
Speech recognition, synthesis, and dialog systems (12)
Technology for emerging markets (22)
1–25 of 143
Ali El-Kahky, Derek Liu, Ruhi Sarikaya, Gokhan Tur, Dilek Hakkani-Tur, and Larry Heck
This paper proposes a new technique to enable Natural Language Understanding (NLU) systems to handle user queries beyond their original semantic schemas defined by their intents and slots. A knowledge graph and search query logs are used to extend an NLU system's coverage by transferring intents from other domains to a given domain. The transferred intents, as well as existing intents, are then applied to a set of new slots that they were not trained with. The knowledge graph and search click logs are used to...
Publication details
Date: 1 May 2014
Type: Inproceeding
Publisher: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
Yangfeng Ji, Dilek Hakkani-Tur, Asli Celikyilmaz, Larry Heck, and Gokhan Tur
State-of-the-art spoken language understanding models that automatically capture user intents in human-to-machine dialogs are often trained with a small number of manually annotated examples collected from the application domain. Search query logs provide a large number of unlabeled queries that would be beneficial for improving such supervised classification. Furthermore, the contents of user queries, as well as the URLs they click, provide information about the user's intent. In this paper, we propose a...
Publication details
Date: 1 May 2014
Type: Inproceeding
Publisher: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
Yann Dauphin, Gokhan Tur, Dilek Hakkani-Tur, and Larry Heck
We propose a novel zero-shot learning method for semantic utterance classification (SUC). It learns a classifier f : X -> Y for problems where none of the semantic categories Y are present in the training set. The framework uncovers the link between categories and utterances through a semantic space. We show that this semantic space can be learned by deep neural networks trained on large amounts of search engine query log data. What’s more, we propose a novel method that can learn discriminative semantic...
Publication details
Date: 1 April 2014
Type: Inproceeding
Publisher: International Conference on Learning Representations (ICLR)
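The zero-shot idea described in this abstract can be sketched as follows: map both utterances and category names into a shared semantic space and classify by nearest category, so no labeled examples of the categories are needed. This is a toy illustration, not the paper's model; the hand-written word vectors stand in for the semantic space the paper learns with deep networks over query-log data.

```python
# Toy sketch of zero-shot utterance classification via a shared
# semantic space. The EMBED table is a hypothetical stand-in for
# embeddings learned from search engine query logs.
import math

EMBED = {
    "weather":  [0.9, 0.1, 0.0],
    "forecast": [0.8, 0.2, 0.1],
    "rain":     [0.7, 0.1, 0.2],
    "song":     [0.0, 0.9, 0.1],
    "play":     [0.1, 0.8, 0.2],
    "music":    [0.0, 0.9, 0.2],
}

def embed(text):
    """Average the vectors of known words (zero vector if none match)."""
    vecs = [EMBED[w] for w in text.lower().split() if w in EMBED]
    if not vecs:
        return [0.0, 0.0, 0.0]
    return [sum(c) / len(vecs) for c in zip(*vecs)]

def cosine(a, b):
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den if den else 0.0

def classify(utterance, categories):
    """Pick the category whose name embeds closest to the utterance --
    no training examples for the target categories are required."""
    return max(categories, key=lambda c: cosine(embed(utterance), embed(c)))

print(classify("play a song", ["weather forecast", "music"]))  # music
```

The key property is that a new category can be added at classification time simply by embedding its name.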
Lu Wang, Larry Heck, and Dilek Hakkani-Tur
Training statistical dialog models in spoken dialog systems (SDS) requires large amounts of annotated data. The lack of scalable methods for data mining and annotation poses a significant hurdle for state-of-the-art SDS. This paper presents an approach that directly leverages billions of web search and browse sessions to overcome this hurdle. The key insight is that task completion through web search and browse sessions is (a) predictable and (b) generalizes to spoken dialog task completion. The new...
Publication details
Date: 1 January 2014
Type: Inproceeding
Publisher: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
Seyed Omid Sadjadi and Larry Heck
Co-channel speech, which occurs in monaural audio recordings of two or more overlapping talkers, poses a great challenge for automatic speech applications. Automatic speech recognition (ASR) performance, in particular, has been shown to degrade significantly in the presence of a competing talker. In this paper, assuming a known target talker scenario, we present two different masking strategies based on speaker verification to alleviate the impact of the competing talker (a.k.a. masker) interference on ASR...
Publication details
Date: 1 January 2014
Type: Inproceeding
Publisher: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
Michael Gamon, Tae Yano, Xinying Song, Johnson Apacible, and Patrick Pantel
We propose a system that determines the salience of entities within web documents. Many recent advances in commercial search engines leverage the identification of entities in web pages. However, for many pages, only a small subset of entities are central to the document, which can lead to degraded relevance for entity triggered experiences. We address this problem by devising a system that scores each entity on a web page according to its centrality to the page content. We propose salience classification...
Publication details
Date: 1 November 2013
Type: Inproceeding
Publisher: ACM International Conference on Information and Knowledge Management (CIKM)
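The salience-scoring setup in this abstract can be illustrated with a minimal sketch: score each candidate entity on a page by how central its mentions are. This is not the paper's system; the feature choices below (mention frequency and first-mention position) and their equal weighting are assumptions for illustration only.

```python
# Illustrative entity-salience scorer (hypothetical features, not the
# paper's classifier): entities mentioned often and early score higher.
def salience_scores(doc_tokens, entities):
    scores = {}
    n = len(doc_tokens)
    for ent in entities:
        positions = [i for i, tok in enumerate(doc_tokens) if tok == ent]
        if not positions:
            scores[ent] = 0.0
            continue
        freq = len(positions) / n        # how often the entity is mentioned
        early = 1.0 - positions[0] / n   # earlier first mention -> more central
        scores[ent] = 0.5 * freq + 0.5 * early
    return scores

doc = "acme acquires widgetco acme shares rose while widgetco fell".split()
print(salience_scores(doc, ["acme", "widgetco", "gadgetinc"]))
```

In a real system these scores would feed a trained classifier rather than a fixed linear combination.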
Michael Gamon, Tae Yano, Xinying Song, Johnson Apacible, and Patrick Pantel
We propose a system that determines the salience of entities within web documents. Many recent advances in commercial search engines leverage the identification of entities in web pages. However, for many pages, only a small subset of entities are important, or central, to the document, which can lead to degraded relevance for entity triggered experiences. We address this problem by devising a system that scores each entity on a web page according to its centrality to the page content. We propose salience...
Publication details
Date: 27 October 2013
Type: Technical report
Publisher: Microsoft Technical Report
Number: MSR-TR-2013-73
Michel Galley, Chris Quirk, Colin Cherry, and Kristina Toutanova
Minimum Error Rate Training (MERT) remains one of the preferred methods for tuning linear parameters in machine translation systems, yet it faces significant issues. First, MERT is an unregularized learner and is therefore prone to overfitting. Second, it is commonly used on a noisy, non-convex loss function that becomes more difficult to optimize as the number of parameters increases. To address these issues, we study the addition of a regularization term to the MERT objective function. Since standard...
Publication details
Date: 1 October 2013
Type: Inproceeding
Publisher: Association for Computational Linguistics
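The core move described in this abstract, adding a regularization term to a tuning objective, can be sketched generically. The toy loss below is a stand-in, not MERT's error surface; the L2 penalty and the `lam` weight are the standard regularization pattern, shown only to illustrate the idea.

```python
# Generic sketch of regularizing a parameter-tuning objective:
# add an L2 penalty so the learner prefers small weights and is
# less prone to overfitting. `loss_fn` stands in for the task loss.
def regularized_loss(weights, loss_fn, lam=0.1):
    """loss_fn(weights) plus an L2 penalty scaled by lam."""
    return loss_fn(weights) + lam * sum(w * w for w in weights)

base = lambda w: (w[0] - 1.0) ** 2   # toy convex loss, minimized at w0 = 1
print(regularized_loss([1.0], base, lam=0.1))  # 0.1: the penalty alone
```

With `lam = 0` this reduces to the unregularized objective; larger values trade task loss for smaller weights.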
Seyed Omid Sadjadi, Malcolm Slaney, and Larry Heck
This report serves as a user manual for the tools available in the Microsoft Research (MSR) Identity Toolbox. This toolbox contains a collection of MATLAB tools and routines that can be used for research and development in speaker recognition. It provides researchers with a test bed for developing new front-end and back-end techniques, allowing replicable evaluation of new advancements. It will also help newcomers in the field by lowering the “barrier to entry”, enabling them to quickly build baseline...
Publication details
Date: 1 September 2013
Type: Technical report
Publisher: Microsoft Research Technical Report
Number: MSR-TR-2013-133
Riham Hassan Mansour, Nesma Refaei, Michael Gamon, Khaled Sami, and Ahmed Abdel-Hamid
In this paper we undertake a large cross-domain investigation of sentiment domain adaptation, challenging the practical necessity of sentiment domain adaptation algorithms. We first show that across a wide set of domains, a simple “all-in-one” classifier that utilizes all available training data from all but the target domain tends to outperform published domain adaptation methods. A very simple ensemble classifier also performs well in these scenarios. Combined with the fact that labeled data nowadays is...
Publication details
Date: 1 September 2013
Type: Inproceeding
Publisher: ACL/SIGPARSE
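The "all-in-one" baseline this abstract describes is simple to sketch: pool the labeled data from every domain except the target and train one classifier on the union. The tiny keyword-count classifier below is a stand-in for a real learner, and the example domains are invented for illustration.

```python
# Sketch of the "all-in-one" cross-domain baseline: train a single
# classifier on pooled out-of-domain data, no adaptation step.
from collections import Counter

def train_all_in_one(domains, target):
    """Pool (text, label) examples from every domain except `target`."""
    pooled = [ex for name, data in domains.items() if name != target
              for ex in data]
    counts = {"pos": Counter(), "neg": Counter()}
    for text, label in pooled:
        counts[label].update(text.lower().split())
    return counts

def predict(counts, text):
    words = text.lower().split()
    pos = sum(counts["pos"][w] for w in words)
    neg = sum(counts["neg"][w] for w in words)
    return "pos" if pos >= neg else "neg"

domains = {
    "books": [("great plot", "pos"), ("boring story", "neg")],
    "music": [("great album", "pos"), ("awful sound", "neg")],
    "movies": [("great film", "pos")],  # target domain: excluded
}
model = train_all_in_one(domains, target="movies")
print(predict(model, "great movie"))  # pos
```

The paper's finding is that this kind of pooled classifier is a surprisingly strong baseline against dedicated domain-adaptation methods.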
Anoop Deoras and Ruhi Sarikaya
This paper investigates the use of deep belief networks (DBN) for semantic tagging, a sequence classification task, in spoken language understanding (SLU). We evaluate the performance of the DBN based sequence tagger on the well-studied ATIS task and compare our technique to conditional random fields (CRF), a state-of-the-art classifier for sequence classification. In conjunction with lexical and named entity features, we also use dependency parser based syntactic features and part of speech (POS) tags...
Publication details
Date: 1 September 2013
Type: Inproceeding
Publisher: ISCA
Xingxing Zhang, Jianwen Zhang, Junyu Zeng, Jun Yan, Zheng Chen, and Zhifang Sui
Distant supervision (DS) is an appealing learning method which learns from existing relational facts to extract more from a text corpus. However, the accuracy is still not satisfying. In this paper, we point out and analyze some critical factors in DS which have great impact on accuracy, including valid entity type detection, negative training examples construction and ensembles. We propose an approach to handle these factors. By experimenting on Wikipedia articles to extract the facts in Freebase (the top...
Publication details
Date: 1 August 2013
Type: Inproceeding
Publisher: Association for Computational Linguistics
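The distant-supervision setup this abstract analyzes can be sketched in a few lines: any sentence mentioning both entities of a known fact is (noisily) labeled with that relation, and sentences matching no fact become negatives. The toy knowledge base and matching rule below are illustrative assumptions, not the paper's pipeline.

```python
# Minimal sketch of distant supervision: project known relational
# facts onto a corpus to generate (noisy) training labels.
kb = {("paris", "france"): "capital_of"}

def ds_label(sentences, kb):
    labeled = []
    for sent in sentences:
        words = set(sent.lower().split())
        for (e1, e2), rel in kb.items():
            if e1 in words and e2 in words:
                labeled.append((sent, rel))   # both entities present
                break
        else:
            labeled.append((sent, "NO_RELATION"))  # negative example
    return labeled

sents = ["Paris is the capital of France", "France won the match"]
print(ds_label(sents, kb))
```

The factors the paper studies, entity type checking, negative-example construction, and ensembles, all target the label noise this naive projection introduces.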
Xingxing Zhang, Jianwen Zhang, Junyu Zeng, Jun Yan, and Zheng Chen
Distant supervision (DS) is an appealing learning method which learns from existing relational facts to extract more from a text corpus. However, the accuracy is still not satisfying. In this paper, we point out and analyze some critical factors in DS which have great impact on accuracy, including valid entity type detection, negative training examples construction and ensembles. We propose an approach to handle these factors. By experimenting on Wikipedia articles to extract the facts in Freebase (the top...
Publication details
Date: 1 August 2013
Type: Proceedings
Publisher: Association for Computational Linguistics
Asli Celikyilmaz, Gokhan Tur, and Dilek Hakkani-Tur
While data-driven methods for spoken language understanding (SLU) provide state-of-the-art performance and reduce maintenance and model adaptation costs compared to handcrafted parsers, the collection and annotation of domain-specific natural language utterances for training remains a time-consuming task. A recent line of research has focused on enriching the training data with in-domain utterances by mining search engine query logs to improve the SLU tasks. However, genre mismatch is a big obstacle, as...
Publication details
Date: 1 August 2013
Type: Inproceeding
Publisher: Annual Conference of the International Speech Communication Association (Interspeech)
Jiahong Yuan, Neville Ryant, Mark Liberman, Andreas Stolcke, Vikramjit Mitra, and Wen Wang
This study attempts to improve automatic phonetic segmentation within the HMM framework. Experiments were conducted to investigate the use of phone boundary models, the use of precise phonetic segmentation for training HMMs, and the difference between context-dependent and context-independent phone models in terms of forced alignment performance. Results show that the combination of special one-state phone boundary models and monophone HMMs can significantly improve forced alignment accuracy. HMM-based...
Publication details
Date: 1 August 2013
Type: Inproceeding
Publisher: International Speech Communication Association
Dilek Hakkani-Tur, Asli Celikyilmaz, Larry Heck, and Gokhan Tur
State-of-the-art spoken language understanding models that automatically capture user intents in human-to-machine dialogs are trained with manually annotated data, which is cumbersome and time-consuming to prepare. For bootstrapping the learning algorithm that detects relations in natural language queries to a conversational system, one can rely on publicly available knowledge graphs, such as Freebase, and mine corresponding data from the web. In this paper, we present an unsupervised approach to discover...
Publication details
Date: 1 August 2013
Type: Inproceeding
Publisher: Annual Conference of the International Speech Communication Association (Interspeech)
Elizabeth Shriberg, Andreas Stolcke, and Suman Ravuri
As dialog systems evolve to handle unconstrained input and for use in open environments, addressee detection (detecting speech to the system versus to other people) becomes an increasingly important challenge. We study a corpus in which speakers talk both to a system and to each other, and model two dimensions of speaking style that talkers modify when changing addressee: speech rhythm and vocal effort. For each dimension we design features that do not require speech recognition output, session...
Publication details
Date: 1 August 2013
Type: Inproceeding
Publisher: International Speech Communication Association
Larry Heck, Dilek Hakkani-Tur, and Gokhan Tur
The past decade has seen the emergence of web-scale structured and linked semantic knowledge resources (e.g., Freebase, DBPedia). These semantic knowledge graphs provide a scalable “schema for the web”, representing a significant opportunity for the spoken language understanding (SLU) research community. This paper leverages these resources to bootstrap a web-scale semantic parser with no requirement for semantic schema design, no data collection, and no manual annotations. Our approach is based on an...
Publication details
Date: 1 August 2013
Type: Inproceeding
Publisher: International Speech Communication Association
Yan Xu, Yining Wang, Jian-Tao Sun, Jianwen Zhang, Junichi Tsujii, and Eric Chang
The aim is to build large collections of medical terms from semi-structured information sources (e.g., tables and lists) and encyclopedia sites on the web. The terms are classified into three semantic categories, Medical Problems, Medications, and Medical Tests, which were used in the i2b2 challenge tasks. We developed two systems, one for Chinese and another for English terms. The two systems share the same methodology and use the same software with minimal language-dependent parts. We produced large collections...
Publication details
Date: 9 July 2013
Type: Article
Publisher: PLoS
Munmun De Choudhury, Scott Counts, Eric Horvitz, and Michael Gamon
Major depression constitutes a serious challenge in personal and public health. Tens of millions of people each year suffer from depression and only a fraction receives adequate treatment. We explore the potential to use social media to detect and diagnose major depressive disorder in individuals. We first employ crowdsourcing to compile a set of Twitter users who report being diagnosed with clinical depression, based on a standard psychometric instrument. Through their social media postings over a year...
Publication details
Date: 1 July 2013
Type: Inproceeding
Publisher: AAAI
Rohan Ramanath, Monojit Choudhury, and Kalika Bali
Hierarchical or nested annotation of linguistic data often co-exists with simpler non-hierarchical or flat counterparts, a classic example being that of annotations used for parsing and chunking. In this work, we propose a general strategy for comparing across these two schemes of annotation using the concept of entailment that formalizes a correspondence between them. We use crowdsourcing to obtain query and sentence chunking and show that entailment can not only be used as an effective evaluation metric...
Publication details
Date: 1 July 2013
Type: Inproceeding
Publisher: Association for Computational Linguistics
Rohan Ramanath, Monojit Choudhury, Kalika Bali, and Rishiraj Saha Roy
Query segmentation, like text chunking, is the first step towards query understanding. In this study we explore the effectiveness of crowdsourcing for this task. Through carefully designed control experiments and Inter Annotator Agreement metrics for analysis of experimental data, we show that crowdsourcing may not be a suitable approach for query segmentation because the crowd seems to have a very strong bias towards dividing the query into roughly equal (often only two) parts. Similarly, in the case of...
Publication details
Date: 1 July 2013
Type: Inproceeding
Publisher: Association for Computational Linguistics
Heeyoung Lee, Andreas Stolcke, and Elizabeth Shriberg
Addressee detection (AD) is an important problem for dialog systems in human-human-computer scenarios (contexts involving multiple people and a system) because system-directed speech must be distinguished from human-directed speech. Recent work on AD (Shriberg et al., 2012) showed good results using prosodic and lexical features trained on in-domain data. In-domain data, however, is expensive to collect for each new domain. In this study we focus on lexical models and investigate how well out-of-domain data...
Publication details
Date: 1 June 2013
Type: Inproceeding
Publisher: Association for Computational Linguistics
Vikramjit Mitra, Wen Wang, Andreas Stolcke, Hosung Nam, Colleen Richey, Jiahong Yuan, and Mark Liberman
Studies have demonstrated that articulatory information can model speech variability effectively and can potentially help to improve speech recognition performance. Most of the studies involving articulatory information have focused on effectively estimating them from speech, and few studies have actually used such features for speech recognition. Speech recognition studies using articulatory information have been mostly confined to digit or medium vocabulary speech recognition, and efforts to incorporate...
Publication details
Date: 1 May 2013
Type: Inproceeding
Publisher: IEEE SPS