Our research
Publication details
Date: 1 November 2015
Type: Technical report
Publisher: USENIX – Advanced Computing Systems Association
Number: MSR-TR-2015-59
Yu Zheng, Huichu Zhang, and Yong Yu

A collective anomaly denotes a collection of nearby locations that are anomalous during a few consecutive time intervals in terms of phenomena collectively witnessed by multiple datasets. Collective anomalies suggest there are underlying problems that may not be identified based on a single data source or in a single location. They also associate individual locations and time intervals, formulating a panoramic view of an event. Detecting a collective anomaly is very challenging, however, as...

Publication details
Date: 1 November 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Zhongyuan Wang, Haixun Wang, Ji-Rong Wen, and Yanghua Xiao

Humans understand the world by classifying objects into an appropriate level of categories. This process is often automatic and subconscious. Psychologists and linguists call it Basic-level Categorization (BLC). BLC can benefit many applications, such as knowledge panels, advertising, and recommendation. However, how to quantify basic-level concepts is still an open problem. Recently, much work has focused on constructing knowledge bases or semantic networks from web-scale text corpora, which makes it...

Publication details
Date: 1 October 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
M. D'Souza, J. Burggraaff, P. Kontschieder, J. Dorn, C. P. Kamm, S. Steinheimer, P. Tewarie, C. Morrison, A. Sellen, A. Criminisi, F. Dahlke, B. Uitdehaag, and L. Kappos
Publication details
Date: 1 October 2015
Type: Inproceeding
Jianpeng Cheng, Zhongyuan Wang, Ji-Rong Wen, Jun Yan, and Zheng Chen

Representing discrete words in a continuous vector space turns out to be useful for natural language applications related to text understanding. Meanwhile, it poses extensive challenges, one of which is due to the polysemous nature of human language. A common solution (a.k.a. word sense induction) is to separate each word into multiple senses and create a separate representation for each sense. However, this approach is usually computationally expensive and prone to data sparsity, since each sense...

Publication details
Date: 1 October 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
H. Lombaert, A. Criminisi, and N. Ayache

This paper presents a new method for classifying surface data via spectral representations of shapes. Our approach benefits classification problems that involve data living on surfaces, such as in cortical parcellation. For instance, current methods for labeling cortical points into surface parcels often involve a slow mesh deformation toward pre-labeled atlases, requiring as much as 4 hours with the established FreeSurfer. This may burden neuroscience studies involving region-specific measurements....

Publication details
Date: 1 October 2015
Type: Inproceeding
Publisher: Springer
Bhaskar Mitra and Nick Craswell

Query auto-completion (QAC) systems typically suggest queries that have previously been observed in search logs. Given a partial user query, the system looks up this query prefix against a precomputed set of candidates, then orders them using ranking signals such as popularity. Such systems can only recommend queries for prefixes that have been previously seen by the search engine with adequate frequency. They fail to recommend if the prefix is sufficiently rare such that it has no matches in the...
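
The baseline lookup described in this abstract can be sketched as follows. This is a minimal illustration only, not the authors' system: the function names (build_index, complete), the in-memory sorted list, and the use of raw log counts as the sole ranking signal are all assumptions made for the example.

```python
from bisect import bisect_left


def build_index(query_counts):
    """Precompute the candidate set: (query, count) pairs sorted by query.

    Sorting makes every completion of a given prefix contiguous.
    query_counts is a hypothetical dict of query -> frequency in the logs.
    """
    return sorted(query_counts.items())


def complete(index, prefix, k=3):
    """Return up to k previously observed queries that start with prefix,
    ordered by popularity (ties broken alphabetically)."""
    # (prefix,) sorts before any (query, count) whose query starts with prefix.
    start = bisect_left(index, (prefix,))
    matches = []
    for query, count in index[start:]:
        if not query.startswith(prefix):
            break  # left the contiguous block of completions
        matches.append((query, count))
    matches.sort(key=lambda qc: (-qc[1], qc[0]))
    return [query for query, _ in matches[:k]]


# Toy search log (made-up counts, for illustration only).
toy_log = {
    "weather today": 50,
    "weather tomorrow": 30,
    "web mail": 80,
    "world cup": 10,
}
toy_index = build_index(toy_log)
```

With this toy log, complete(toy_index, "we") ranks the three matching queries by count, while a prefix that never appears in the log returns an empty list, which is the rare-prefix limitation the abstract points out.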

Publication details
Date: 1 October 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
C. Morrison, K. Huckvale, A. Sakar, P. Kontschieder, J. Dorn, S. Steinheimer, C. P. Kamm, J. Burggraaff, M. D'Souza, F. Dahlke, L. Kappos, B. Uitdehaag, A. Criminisi, and A. Sellen
Publication details
Date: 1 October 2015
Type: Inproceeding
J. Burggraaff, J. Dorn, M. D'Souza, C. P. Kamm, P. Tewarie, P. Kontschieder, C. Morrison, A. Sellen, A. Criminisi, F. Dahlke, L. Kappos, and B. M. J. Uitdehaag
Publication details
Date: 1 October 2015
Type: Inproceeding
Yi Yang, Wen-tau Yih, and Christopher Meek

We describe the WikiQA dataset, a new publicly available set of question and sentence pairs, collected and annotated for research on open-domain question answering. Most previous work on answer sentence selection focuses on a dataset created using the TREC-QA data, which includes editor-generated questions and candidate answer sentences selected by matching content words in the question. WikiQA is constructed using a more natural process and is more than an order of magnitude larger than the previous...

Publication details
Date: 21 September 2015
Type: Inproceeding
Publisher: ACL – Association for Computational Linguistics
Kristina Toutanova, Danqi Chen, Patrick Pantel, Hoifung Poon, Pallavi Choudhury, and Michael Gamon

Models that learn to represent textual and knowledge base relations in the same continuous latent space are able to perform joint inferences among the two kinds of relations and obtain high accuracy on knowledge base completion (Riedel et al. 2013). In this paper we propose a model that captures the compositional structure of textual relations, and jointly optimizes entity, knowledge base, and textual relation representations. The proposed model significantly improves performance over a model that...

Publication details
Date: 17 September 2015
Type: Inproceeding
Publisher: ACL – Association for Computational Linguistics
Vasileios Lampos, Elad Yom-Tov, Richard Pebody, and Ingemar J. Cox

Assessing the effect of a health-oriented intervention by traditional epidemiological methods is commonly based only on population segments that use healthcare services. Here we introduce a complementary framework for evaluating the impact of a targeted intervention, such as a vaccination campaign against an infectious disease, through a statistical analysis of user-generated content submitted on web platforms. Using supervised learning, we derive a nonlinear regression model for estimating the...

Publication details
Date: 7 September 2015
Type: Article
Publisher: Springer
Yu Zheng

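Traditional data mining usually deals with data from a single domain. In the big data era, we face a diversity of datasets from different sources in different domains. These datasets consist of multiple modalities, each of which has a different representation, distribution, scale, and density. How to unlock the power of knowledge from multiple disparate (but potentially connected) datasets is paramount in big data research, essentially distinguishing big data from traditional data mining tasks. This calls for advanced techniques that can fuse the knowledge from various datasets organically in a machine learning and data mining task. This paper summarizes the data fusion methodologies, classifying them into three categories: stage-based, feature level-based, and semantic meaning-based data fusion methods. The last category of data fusion methods is further divided into four groups: multi-view...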

Publication details
Date: 1 September 2015
Type: Article
Publisher: IEEE – Institute of Electrical and Electronics Engineers
Publication details
Date: 1 September 2015
Type: Proceedings
Publisher: ACL – Association for Computational Linguistics
Dilek Hakkani-Tur, Yun-Cheng Ju, Geoffrey Zweig, and Gokhan Tur

Spoken language understanding (SLU) in today’s conversational systems focuses on recognizing a set of domains, intents, and associated arguments that are determined by application developers. User requests that are not covered by these are usually directed to search engines, and may remain unhandled. We propose a method that aims to find common user intents amongst these uncovered, out-of-domain utterances, with the goal of supporting future phases of dialog system design. Our approach relies on...

Publication details
Date: 1 September 2015
Type: Inproceeding
Publisher: Interspeech 2015 Conference
A.J. Bernheim Brush, John Krumm, Sidhant Gupta, and Shwetak Patel

Time-of-use tiered pricing schedules encourage shifting electricity demand from peak to off-peak hours. Charging times for electric vehicles (EVs) can be shifted into overnight hours, which are usually off-peak. EVs can also be used as energy storage devices, available during certain peak hours to power a house with electricity stored during off-peak hours. Studies suggest both techniques are practical, but they were based on simulated demand patterns or large commercial fleets. To investigate feasibility on...

Publication details
Date: 1 September 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Yu Zheng

The advances in location-acquisition and mobile computing techniques have generated massive spatial trajectory data, which represent the mobility of a diversity of moving objects, such as people, vehicles and animals. Many techniques have been proposed for processing, managing and mining trajectory data in the past decade, fostering a broad range of applications. In this article, we conduct a systematic survey on the major research into trajectory data mining, providing a panorama of the field...

Publication details
Date: 1 September 2015
Type: Article
Publisher: ACM – Association for Computing Machinery
Young-Bum Kim, Karl Stratos, Ruhi Sarikaya, and Minwoo Jeong

In natural language understanding (NLU), a user utterance can be labeled differently depending on the domain or application (e.g., weather vs. calendar). Standard domain adaptation techniques are not directly applicable to take advantage of the existing annotations because they assume that the label set is invariant. We propose a solution based on label embeddings induced from canonical correlation analysis (CCA) that reduces the problem to a standard domain adaptation task and allows use of a number of...

Publication details
Date: 29 August 2015
Type: Proceedings
Publisher: ACL – Association for Computational Linguistics
Young-Bum Kim, Karl Stratos, and Ruhi Sarikaya

In this paper, we apply the concept of pre-training to hidden-unit conditional random fields (HUCRFs) to enable learning on unlabeled data. We present a simple yet effective pre-training technique that learns to associate words with their clusters, which are obtained in an unsupervised manner. The learned parameters are then used to initialize the supervised learning process. We also propose a word clustering technique based on canonical correlation analysis (CCA) that is sensitive to multiple word...

Publication details
Date: 28 August 2015
Type: Proceedings
Publisher: ACL – Association for Computational Linguistics
Young-Bum Kim, Karl Stratos, Xiaohu Liu, and Ruhi Sarikaya

In this paper, we introduce the task of selecting a compact lexicon from large, noisy gazetteers. This scenario arises often in practice, in particular in spoken language understanding (SLU). We propose a simple and effective solution based on matrix decomposition techniques: canonical correlation analysis (CCA) and rank-revealing QR (RRQR) factorization. CCA is first used to derive low-dimensional gazetteer embeddings from domain-specific search logs. Then RRQR is used to find a subset of...

Publication details
Date: 27 August 2015
Type: Proceedings
Publisher: ACL – Association for Computational Linguistics
Hsun-Ping Hsieh, Shou-De Lin, and Yu Zheng

This paper tries to answer two questions. First, how to infer the real-time air quality of any arbitrary location given environmental data and historical air quality data from very sparse monitoring locations. Second, if one needs to establish a few new monitoring stations to improve the inference quality, how to determine the best locations for this purpose? The problems are challenging since for most of the locations (>99%) in a city we do not have any air quality data to train a model from. We design a...

Publication details
Date: 12 August 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Timothy Baldwin, Marie-Catherine de Marneffe, Bo Han, Young-Bum Kim, Alan Ritter, and Wei Xu

This paper presents the results of the two shared tasks associated with W-NUT 2015: (1) a text normalization task with 10 participants; and (2) a named entity tagging task with 8 participants. We outline the task, annotation process and dataset statistics, and provide a high-level overview of the participating systems for each shared task.

Publication details
Date: 1 August 2015
Type: Proceedings
Publisher: ACL – Association for Computational Linguistics
Publication details
Date: 1 August 2015
Type: Inproceeding
Publisher: KDD
J. Valentin, V. Vineet, M.-M. Cheng, D. Kim, J. Shotton, P. Kohli, M. Niessner, A. Criminisi, S. Izadi, and P. Torr

We present a new interactive and online approach to 3D scene understanding. Our system, SemanticPaint, allows users to simultaneously scan their environment, whilst interactively segmenting the scene simply by reaching out and touching any desired object or surface. Our system continuously learns from these segmentations, and labels new unseen parts of the environment. Unlike offline systems, where capture, labeling and batch learning often take hours or even days to perform, our approach is fully...

Publication details
Date: 1 August 2015
Type: Article
Publisher: ACM – Association for Computing Machinery
Jian Tang, Meng Qu, and Qiaozhu Mei

Unsupervised text embedding methods, such as Skip-gram and Paragraph Vector, have been attracting increasing attention due to their simplicity, scalability, and effectiveness. However, compared to sophisticated deep learning architectures such as convolutional neural networks, these methods usually yield inferior results when applied to particular machine learning tasks. One possible reason is that these text embedding methods learn the representation of text in a fully unsupervised way, without...

Publication details
Date: 1 August 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery