Our research
Yanjie Fu, Yong Ge, Yu Zheng, Yao, Yanchi Liu, Hui Xiong, and Nicholas Jing Yuan

Ranking residential real estates based on investment values can provide decision-making support for home buyers and thus plays an important role in the estate marketplace. In this paper, we aim to develop methods for ranking estates based on investment values by mining users' opinions about estates from online user reviews and offline moving behaviors (e.g., taxi traces, smart card transactions, check-ins). While a variety of features could be extracted from these data, these features are intercorrelated and...

Publication details
Date: 1 December 2015
Type: Inproceeding
Publisher: IEEE – Institute of Electrical and Electronics Engineers
Youshan Miao, Wentao Han, Kaiwei Li, Ming Wu, Fan Yang, Lidong Zhou, Vijayan Prabhakaran, Enhong Chen, and Wenguang Chen

Temporal graphs that capture graph changes over time are attracting increasing interest from research communities, for functions such as understanding temporal characteristics of social interactions on a time-evolving social graph. ImmortalGraph is a storage and execution engine designed and optimized specifically for temporal graphs. Locality is at the center of ImmortalGraph’s design: temporal graphs are carefully laid out in both persistent storage and memory, taking into account data locality in...

Publication details
Date: 1 December 2015
Type: Article
Publisher: ACM – Association for Computing Machinery
Abram Hindle, Christian Bird, Thomas Zimmermann, and Nachiappan Nagappan

Large organizations like Microsoft tend to rely on formal requirements documentation in order to specify and design the software products that they develop. These documents are meant to be tightly coupled with the actual implementation of the features they describe. In this paper we evaluate the value of high-level topic-based requirements traceability and issue report traceability in the version control system, using Latent Dirichlet Allocation (LDA). We evaluate LDA topics on practitioners...

Publication details
Date: 1 December 2015
Type: Article
Publisher: Springer
Publication details
Date: 1 December 2015
Type: Article
Video details
Date: 5 November 2015
Duration: 00:03:15
Publisher: Microsoft
Jointly organized by Harvard University, Massachusetts Institute of Technology, and Microsoft Research New England, the Charles River Lectures on Probability and Related Topics is a one-day event for the benefit of the greater Boston area mathematics community.
Event details
Date: 2 October 2015
Location: Cambridge, Mass.
Type: Conference
H. Lombaert, A. Criminisi, and N. Ayache

This paper presents a new method for classifying surface data via spectral representations of shapes. Our approach benefits classification problems that involve data living on surfaces, such as in cortical parcellation. For instance, current methods for labeling cortical points into surface parcels often involve a slow mesh deformation toward pre-labeled atlases, requiring as much as 4 hours with the established FreeSurfer. This may burden neuroscience studies involving region-specific measurements....

Publication details
Date: 1 October 2015
Type: Inproceeding
Publisher: Springer
Dongwook Yoon, Nicholas Chen, François Guimbretière, and Abigail Sellen

This paper introduces a novel document annotation system that aims to enable the kinds of rich communication that usually only occur in face-to-face meetings. Our system, RichReview, lets users create annotations on top of digital documents using three main modalities: freeform inking, voice for narration, and deictic gestures in support of voice. RichReview uses novel visual representations and time-synchronization between modalities to simplify annotation access and navigation. Moreover, RichReview’s...

Publication details
Date: 1 October 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Vasileios Lampos, Elad Yom-Tov, Richard Pebody, and Ingemar J. Cox

Assessing the effect of a health-oriented intervention by traditional epidemiological methods is commonly based only on population segments that use healthcare services. Here we introduce a complementary framework for evaluating the impact of a targeted intervention, such as a vaccination campaign against an infectious disease, through a statistical analysis of user-generated content submitted on web platforms. Using supervised learning, we derive a nonlinear regression model for estimating the...

Publication details
Date: 7 September 2015
Type: Article
Publisher: Springer
Ahmed Kharrufa, James Nicholson, Paul Dunphy, Steve Hodges, Pam Briggs, and Patrick Olivier

In addition to their popularity as personal devices, tablets are becoming increasingly prevalent in work and public settings. In many of these newly established application domains a supervisor user – such as the teacher in a classroom – oversees the function of one or more devices. Access to supervisory functions is typically controlled through the use of a passcode, but experience shows that keeping this passcode secret can be problematic. In this paper we introduce SwipeID, a method of identifying...

Publication details
Date: 1 September 2015
Type: Inproceeding
Publisher: IFIP
Sree Harsha Yella and Andreas Stolcke

Speaker diarization finds contiguous speaker segments in an audio stream and clusters them by speaker identity, without using a-priori knowledge about the number of speakers or enrollment data. Diarization typically clusters speech segments based on short-term spectral features. In prior work, we showed that neural networks can serve as discriminative feature transformers for diarization by training them to perform same/different speaker comparisons on speech segments, yielding improved diarization...

Publication details
Date: 1 September 2015
Type: Inproceeding
Publisher: ISCA - International Speech Communication Association
Dilek Hakkani-Tur, Yun-Cheng Ju, Geoffrey Zweig, and Gokhan Tur

Spoken language understanding (SLU) in today’s conversational systems focuses on recognizing a set of domains, intents, and associated arguments that are determined by application developers. User requests that are not covered by these are usually directed to search engines, and may remain unhandled. We propose a method that aims to find common user intents amongst these uncovered, out-of-domain utterances, with the goal of supporting future phases of dialog system design. Our approach relies on...

Publication details
Date: 1 September 2015
Type: Inproceeding
Publisher: Interspeech 2015 Conference
Rui Ding, Qiang Wang, Yingnong Dang, Qiang Fu, Haidong Zhang, and Dongmei Zhang

Fast and scalable analysis techniques are becoming increasingly important in the era of big data, because they are the enabling techniques to create real-time and interactive experiences in data analysis. Time series are widely available in diverse application areas. Due to the large number of time series instances (e.g., millions) and the high dimensionality of each time series instance (e.g., thousands), it is challenging to conduct clustering on large-scale time series, and it is even more challenging...

Publication details
Date: 1 September 2015
Type: Proceedings
Publisher: VLDB – Very Large Data Bases
Suman Ravuri and Andreas Stolcke

Utterance classification is a critical pre-processing step for many speech understanding and dialog systems. In multi-user settings, one needs to first identify if an utterance is even directed at the system, followed by another level of classification to determine the intent of the user’s input. In this work, we propose RNN and LSTM models for both these tasks. We show how both models outperform baselines based on n-gram language models (LMs), feedforward neural network LMs, and boosting...

Publication details
Date: 1 September 2015
Type: Inproceeding
Publisher: ISCA - International Speech Communication Association
TJ Tsai and Andreas Stolcke

This paper proposes a robust and efficient way to temporally align a set of unsynchronized meeting recordings, such as might be collected by participants’ cell phones. We propose an adaptive audio fingerprint which is learned on-the-fly in a completely unsupervised manner to adapt to the characteristics of a given set of unaligned recordings. The design of the adaptive audio fingerprint is formulated as a series of optimization problems which can be solved very efficiently using eigenvector routines. We...

Publication details
Date: 1 September 2015
Type: Inproceeding
Publisher: ISCA - International Speech Communication Association
Publication details
Date: 1 September 2015
Type: Inproceeding
Publisher: Springer
Yoli Shavit, Boyan Yordanov, Sara-Jane Dunn, Christoph M. Wintersteiger, Youssef Hamadi, and Hillel Kugler
Publication details
Date: 1 September 2015
Type: Inproceeding
Publisher: Springer
Yu Zheng

The advances in location-acquisition and mobile computing techniques have generated massive spatial trajectory data, which represent the mobility of a diversity of moving objects, such as people, vehicles and animals. Many techniques have been proposed for processing, managing and mining trajectory data in the past decade, fostering a broad range of applications. In this article, we conduct a systematic survey on the major research into trajectory data mining, providing a panorama of the field...

Publication details
Date: 1 September 2015
Type: Article
Publisher: ACM – Association for Computing Machinery
M. Levit, A. Stolcke, R. Subba, S. Parthasarathy, S. Chang, S. Xie, T. Anastasakos, and B. Dumoulin

We continue our investigations of Word-Phrase-Entity (WPE) Language Models that unify words, phrases and classes, such as named entities, into a single probabilistic framework for the purpose of language modeling. In the present study we show how WPE LMs can be adapted to work in a personalized scenario where class definitions change from user to user or even from utterance to utterance. Compared to traditional class-based LMs in various conditions, WPE LMs exhibited comparable or better...

Publication details
Date: 1 September 2015
Type: Inproceeding
Publisher: ISCA - International Speech Communication Association
Young-Bum Kim, Ruhi Sarikaya, and Minwoo Jeong

In natural language understanding (NLU), a user utterance can be labeled differently depending on the domain or application (e.g., weather vs. calendar). Standard domain adaptation techniques are not directly applicable to take advantage of the existing annotations because they assume that the label set is invariant. We propose a solution based on label embeddings induced from canonical correlation analysis (CCA) that reduces the problem to a standard domain adaptation task and allows use of a number of...

Publication details
Date: 29 August 2015
Type: Proceedings
Publisher: ACL – Association for Computational Linguistics
Young-Bum Kim and Ruhi Sarikaya

In this paper, we apply the concept of pre-training to hidden-unit conditional random fields (HUCRFs) to enable learning on unlabeled data. We present a simple yet effective pre-training technique that learns to associate words with their clusters, which are obtained in an unsupervised manner. The learned parameters are then used to initialize the supervised learning process. We also propose a word clustering technique based on canonical correlation analysis (CCA) that is sensitive to multiple word...

Publication details
Date: 28 August 2015
Type: Proceedings
Publisher: ACL – Association for Computational Linguistics
Young-Bum Kim, Xiaohu Liu, and Ruhi Sarikaya

In this paper, we introduce the task of selecting a compact lexicon from large, noisy gazetteers. This scenario arises often in practice, in particular in spoken language understanding (SLU). We propose a simple and effective solution based on matrix decomposition techniques: canonical correlation analysis (CCA) and rank-revealing QR (RRQR) factorization. CCA is first used to derive low-dimensional gazetteer embeddings from domain-specific search logs. Then RRQR is used to find a subset of...

Publication details
Date: 27 August 2015
Type: Proceedings
Publisher: ACL – Association for Computational Linguistics
Publication details
Date: 1 August 2015
Type: Inproceeding
Publisher: KDD
Ravi Mangal, Xin Zhang, Aditya V. Nori, and Mayur Naik

Program analysis tools often produce undesirable output due to various approximations. We present an approach and a system that allows user feedback to guide such approximations towards producing the desired output. We formulate the problem of user-guided program analysis in terms of solving a combination of hard rules and soft rules: hard rules capture soundness while soft rules capture degrees of approximations and preferences of users. Our technique solves the...

Publication details
Date: 1 August 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Virajith Jalaparti, Peter Bodik, Ishai Menache, Sriram Rao, Konstantin Makarychev, and Matt Caesar

To reduce the impact of network congestion on big data jobs, cluster management frameworks use various heuristics to schedule compute tasks and/or network flows. Most of these schedulers consider the job input data fixed and greedily schedule the tasks and flows that are ready to run. However, a large fraction of production jobs are recurring with predictable characteristics, which allows us to plan ahead for them. Coordinating the placement of data and tasks of these jobs allows for significantly...

Publication details
Date: 1 August 2015
Type: Inproceeding
Publisher: ACM SIGCOMM