Our research
1–25 of 23810
Publication details
Date: 1 December 2015
Type: Article
Abram Hindle, Christian Bird, Thomas Zimmermann, and Nachiappan Nagappan

Large organizations like Microsoft tend to rely on formal requirements documentation in order to specify and design the software products that they develop. These documents are meant to be tightly coupled with the actual implementation of the features they describe. In this paper we evaluate the value of high-level topic-based requirements traceability and issue report traceability in the version control system, using Latent Dirichlet Allocation (LDA). We evaluate LDA topics on practitioners...

Publication details
Date: 1 December 2015
Type: Article
Publisher: Springer
Youshan Miao, Wentao Han, Kaiwei Li, Ming Wu, Fan Yang, Lidong Zhou, Vijayan Prabhakaran, Enhong Chen, and Wenguang Chen

Temporal graphs that capture graph changes over time are attracting increasing interest from research communities, for functions such as understanding temporal characteristics of social interactions on a time-evolving social graph. ImmortalGraph is a storage and execution engine designed and optimized specifically for temporal graphs. Locality is at the center of ImmortalGraph’s design: temporal graphs are carefully laid out in both persistent storage and memory, taking into account data locality in...

Publication details
Date: 1 December 2015
Type: Article
Publisher: ACM – Association for Computing Machinery
Yanjie Fu, Yong Ge, Yu Zheng, Yao, Yanchi Liu, Hui Xiong, and Nicholas Jing Yuan

Ranking residential real estates based on investment values can provide decision-making support for home buyers and thus plays an important role in the estate marketplace. In this paper, we aim to develop methods for ranking estates based on investment values by mining users' opinions about estates from online user reviews and offline moving behaviors (e.g., taxi traces, smart card transactions, check-ins). While a variety of features could be extracted from these data, these features are intercorrelated and...

Publication details
Date: 1 December 2015
Type: Inproceeding
Publisher: IEEE – Institute of Electrical and Electronics Engineers
Video details
Date: 5 November 2015
Duration: 00:03:15
Publisher: Microsoft
Jointly organized by Harvard University, Massachusetts Institute of Technology, and Microsoft Research New England, the Charles River Lectures on Probability and Related Topics is a one-day event for the benefit of the greater Boston area mathematics community.
Event details
Date: 2 October 2015
Location: Cambridge, Mass.
Type: Conference
H. Lombaert, A. Criminisi, and N. Ayache

This paper presents a new method for classifying surface data via spectral representations of shapes. Our approach benefits classification problems that involve data living on surfaces, such as in cortical parcellation. For instance, current methods for labeling cortical points into surface parcels often involve a slow mesh deformation toward pre-labeled atlases, requiring as much as 4 hours with the established FreeSurfer. This may burden neuroscience studies involving region-specific measurements....

Publication details
Date: 1 October 2015
Type: Inproceeding
Publisher: Springer
Dongwook Yoon, Nicholas Chen, François Guimbretière, and Abigail Sellen

This paper introduces a novel document annotation system that aims to enable the kinds of rich communication that usually only occur in face-to-face meetings. Our system, RichReview, lets users create annotations on top of digital documents using three main modalities: freeform inking, voice for narration, and deictic gestures in support of voice. RichReview uses novel visual representations and time-synchronization between modalities to simplify annotation access and navigation. Moreover, RichReview’s...

Publication details
Date: 1 October 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Vasileios Lampos, Elad Yom-Tov, Richard Pebody, and Ingemar J. Cox

Assessing the effect of a health-oriented intervention by traditional epidemiological methods is commonly based only on population segments that use healthcare services. Here we introduce a complementary framework for evaluating the impact of a targeted intervention, such as a vaccination campaign against an infectious disease, through a statistical analysis of user-generated content submitted on web platforms. Using supervised learning, we derive a nonlinear regression model for estimating the...

Publication details
Date: 7 September 2015
Type: Article
Publisher: Springer
Publication details
Date: 1 September 2015
Type: Inproceeding
Publisher: Springer
Sree Harsha Yella and Andreas Stolcke

Speaker diarization finds contiguous speaker segments in an audio stream and clusters them by speaker identity, without using a priori knowledge about the number of speakers or enrollment data. Diarization typically clusters speech segments based on short-term spectral features. In prior work, we showed that neural networks can serve as discriminative feature transformers for diarization by training them to perform same/different speaker comparisons on speech segments, yielding improved diarization...

Publication details
Date: 1 September 2015
Type: Inproceeding
Publisher: ISCA - International Speech Communication Association
Dilek Hakkani-Tur, Yun-Cheng Ju, Geoffrey Zweig, and Gokhan Tur

Spoken language understanding (SLU) in today’s conversational systems focuses on recognizing a set of domains, intents, and associated arguments that are determined by application developers. User requests that are not covered by these are usually directed to search engines, and may remain unhandled. We propose a method that aims to find common user intents amongst these uncovered, out-of-domain utterances, with the goal of supporting future phases of dialog system design. Our approach relies on...

Publication details
Date: 1 September 2015
Type: Inproceeding
Publisher: Interspeech 2015 Conference
M. Levit, A. Stolcke, R. Subba, S. Parthasarathy, S. Chang, S. Xie, T. Anastasakos, and B. Dumoulin

We continue our investigations of Word-Phrase-Entity (WPE) Language Models that unify words, phrases and classes, such as named entities, into a single probabilistic framework for the purpose of language modeling. In the present study we show how WPE LMs can be adapted to work in a personalized scenario where class definitions change from user to user or even from utterance to utterance. Compared to traditional class-based LMs in various conditions, WPE LMs exhibited comparable or better...

Publication details
Date: 1 September 2015
Type: Inproceeding
Publisher: ISCA - International Speech Communication Association
Rui Ding, Qiang Wang, Yingnong Dang, Qiang Fu, Haidong Zhang, and Dongmei Zhang

Fast and scalable analysis techniques are becoming increasingly important in the era of big data, because they are the enabling techniques to create real-time and interactive experiences in data analysis. Time series are widely available in diverse application areas. Due to the large number of time series instances (e.g., millions) and the high dimensionality of each time series instance (e.g., thousands), it is challenging to conduct clustering on large-scale time series, and it is even more challenging...

Publication details
Date: 1 September 2015
Type: Proceedings
Publisher: VLDB – Very Large Data Bases
Yoli Shavit, Boyan Yordanov, Sara-Jane Dunn, Christoph M. Wintersteiger, Youssef Hamadi, and Hillel Kugler
Publication details
Date: 1 September 2015
Type: Inproceeding
Publisher: Springer
Suman Ravuri and Andreas Stolcke

Utterance classification is a critical pre-processing step for many speech understanding and dialog systems. In multi-user settings, one needs to first identify if an utterance is even directed at the system, followed by another level of classification to determine the intent of the user’s input. In this work, we propose RNN and LSTM models for both these tasks. We show how both models outperform baselines based on n-gram-based language models (LMs), feedforward neural network LMs, and boosting...

Publication details
Date: 1 September 2015
Type: Inproceeding
Publisher: ISCA - International Speech Communication Association
TJ Tsai and Andreas Stolcke

This paper proposes a robust and efficient way to temporally align a set of unsynchronized meeting recordings, such as might be collected by participants’ cell phones. We propose an adaptive audio fingerprint which is learned on-the-fly in a completely unsupervised manner to adapt to the characteristics of a given set of unaligned recordings. The design of the adaptive audio fingerprint is formulated as a series of optimization problems which can be solved very efficiently using eigenvector routines. We...

Publication details
Date: 1 September 2015
Type: Inproceeding
Publisher: ISCA - International Speech Communication Association
Yu Zheng

The advances in location-acquisition and mobile computing techniques have generated massive spatial trajectory data, which represent the mobility of a diversity of moving objects, such as people, vehicles and animals. Many techniques have been proposed for processing, managing and mining trajectory data in the past decade, fostering a broad range of applications. In this article, we conduct a systematic survey on the major research into trajectory data mining, providing a panorama of the field...

Publication details
Date: 1 September 2015
Type: Article
Publisher: ACM – Association for Computing Machinery
Ahmed Kharrufa, James Nicholson, Paul Dunphy, Steve Hodges, Pam Briggs, and Patrick Olivier

In addition to their popularity as personal devices, tablets are becoming increasingly prevalent in work and public settings. In many of these newly established application domains a supervisor user – such as the teacher in a classroom – oversees the function of one or more devices. Access to supervisory functions is typically controlled through the use of a passcode, but experience shows that keeping this passcode secret can be problematic. In this paper we introduce SwipeID, a method of identifying...

Publication details
Date: 1 September 2015
Type: Inproceeding
Publisher: IFIP
Young-Bum Kim, Ruhi Sarikaya, and Minwoo Jeong

In natural language understanding (NLU), a user utterance can be labeled differently depending on the domain or application (e.g., weather vs. calendar). Standard domain adaptation techniques are not directly applicable to take advantage of the existing annotations because they assume that the label set is invariant. We propose a solution based on label embeddings induced from canonical correlation analysis (CCA) that reduces the problem to a standard domain adaptation task and allows use of a number of...

Publication details
Date: 29 August 2015
Type: Proceedings
Publisher: ACL – Association for Computational Linguistics
Young-Bum Kim and Ruhi Sarikaya

In this paper, we apply the concept of pre-training to hidden-unit conditional random fields (HUCRFs) to enable learning on unlabeled data. We present a simple yet effective pre-training technique that learns to associate words with their clusters, which are obtained in an unsupervised manner. The learned parameters are then used to initialize the supervised learning process. We also propose a word clustering technique based on canonical correlation analysis (CCA) that is sensitive to multiple word...

Publication details
Date: 28 August 2015
Type: Proceedings
Publisher: ACL – Association for Computational Linguistics
Young-Bum Kim, Xiaohu Liu, and Ruhi Sarikaya

In this paper, we introduce the task of selecting a compact lexicon from large, noisy gazetteers. This scenario arises often in practice, in particular in spoken language understanding (SLU). We propose a simple and effective solution based on matrix decomposition techniques: canonical correlation analysis (CCA) and rank-revealing QR (RRQR) factorization. CCA is first used to derive low-dimensional gazetteer embeddings from domain-specific search logs. Then RRQR is used to find a subset of...

Publication details
Date: 27 August 2015
Type: Proceedings
Publisher: ACL – Association for Computational Linguistics
Paolo Costa, Hitesh Ballani, Kaveh Razavi, and Ian Kash

Rack-scale computers, comprising a large number of micro-servers connected by a direct-connect topology, are poised to replace servers as the building block in data centers. We focus on the problem of routing and congestion control across the rack's network, and find that high path diversity in rack topologies, in combination with workload diversity across it, means that traditional solutions are inadequate.

We present R2C2, a network stack for rack-scale computers providing flexible and...

Publication details
Date: 1 August 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
He Zhu, Aditya V. Nori, and Suresh Jagannathan

We propose the integration of a random test generation system (capable of discovering program bugs) and a refinement type system (capable of expressing and verifying program invariants), for higher-order functional programs, using a novel lightweight learning algorithm as an effective intermediary between the two. Our approach is based on the well-understood intuition that useful, but difficult to infer, program properties can often be observed from concrete program states generated by tests; these...

Publication details
Date: 1 August 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Darko Makreshanski, Justin Levandoski, and Ryan Stutsman

The release of hardware transactional memory (HTM) in commodity CPUs has major implications for the design and implementation of main-memory databases, especially for the architecture of high-performance lock-free indexing methods at the core of several of these systems. This paper studies the interplay of HTM and lock-free indexing methods. First, we evaluate whether HTM will obviate the need for crafty lock-free index designs by integrating it in a traditional B-tree architecture. HTM performs well for...

Publication details
Date: 1 August 2015
Type: Article
Number: 11