Papers are listed in the order they were submitted. You can download all the papers in one zip file.
|
Title: |
MCMC for Hierarchical Semi-Markov Conditional Random Fields |
|
|
Abstract: |
Deep architectures such as hierarchical semi-Markov models are an important class of models for nested sequential data. However, inference can be expensive for problems with arbitrary sequence length and depth. In this contribution, we propose a new approximation technique that has the potential to achieve sub-cubic time complexity in both length and depth, at the cost of some controllable loss of quality. The idea is based on two well-known methods: Gibbs sampling and Rao-Blackwellisation. We provide a simulation-based evaluation of the quality of the resulting Rao-Blackwellised Gibbs sampling scheme (RGBS) with respect to run time and sequence length. |
|
|
Author Names: |
Truyen Tran*, Curtin Uni of Technology |
|
|
Files: |
|
|
Title: |
Learning in the Deep-Structured Conditional Random Fields |
|
|
Abstract: |
We recently proposed deep-structured conditional random fields (CRFs) for sequential labeling and classification. The core of this model is its deep structure and its discriminative nature. This paper outlines the learning strategies and algorithms we have developed for the deep-structured CRFs, with a focus on a new strategy that combines layer-wise unsupervised pre-training using entropy-based multi-objective optimization with conditional likelihood-based back-propagation fine-tuning, as inspired by recent developments in learning deep belief networks. |
|
|
Author Names: |
Dong Yu*, Microsoft Research |
|
|
Files: |
|
|
Title: |
A Hierarchy of Recurrent Networks for Speech Recognition |
|
|
Abstract: |
Generative models for sequential data based on directed graphs of Restricted Boltzmann Machines (RBMs) are able to accurately model high dimensional sequences as recently shown. In these models, temporal dependencies in the input are discovered by either buffering previous visible variables or by recurrent connections of the hidden variables. Here we propose a modification of these models, the Temporal Reservoir Machine (TRM). It utilizes a recurrent artificial neural network (ANN) for integrating information from the input over |
|
|
Author Names: |
Benjamin Schrauwen*, Ghent University |
|
|
Files: |
|
|
Title: |
Competitive Learning for Deep Temporal Networks |
|
|
Abstract: |
We propose the use of competitive learning in deep networks for understanding sequential data. Hierarchies of competitive learning algorithms have been found in the brain [1] and their use in deep vision networks has been validated [2]. The algorithm is simple to comprehend and yet provides fast, sparse learning. To understand temporal patterns we use the depth of the network and delay blocks to encode time. The delayed feedback from higher layers provides meaningful predictions to lower layers. We evaluate a multi-factor network design by using it to predict frames in movies it has never seen before. At this task our system outperforms the prediction of the Recurrent Temporal Restricted Boltzmann Machine [3] on novel frame changes. |
|
|
Author Names: |
Robert Gens*, University of Washington |
|
|
Files: |
|
|
Title: |
A Deep Learning Architecture Comprising Homogeneous Cortical Circuits for Scalable Spatiotemporal Pattern Inference |
|
|
Abstract: |
A key challenge associated with the design of scalable deep learning architectures pertains to efficiently capturing spatiotemporal dependencies in a scalable framework that is modality independent. This paper presents a novel discriminative deep learning architecture, which relies on an identical cortical circuit populating the hierarchical structure. Belief states formed across the hierarchy intrinsically capture sequences of patterns, rather than static patterns, thereby facilitating the embedding of temporal dependencies. At the core of the adaptation mechanism are two learned constructs, one of which relies on a fast and stable incremental clustering. Moreover, the proposed methodology does not require layer-by-layer training and lends itself naturally to massively-parallel processing platforms. A simple test case demonstrates the validity of the architecture and learning algorithm. The system can be efficiently applied to various modalities, including those associated with complex visual and audio information representation. |
|
|
Author Names: |
Itamar Arel*, University of Tennessee |
|
|
Files: |
|
|
Title: |
Deep Belief Networks for phone recognition |
|
|
Abstract: |
Hidden Markov Models (HMMs) have been the state-of-the-art techniques for acoustic modeling despite their unrealistic independence assumptions and the very limited representational capacity of their hidden states. There are many proposals in the research community for deeper models that are capable of modeling the many types of variability present in the speech generation process. Deep Belief Networks (DBNs) have recently proved to be very effective for a variety of machine learning problems and this paper applies DBNs to acoustic modeling. On the standard TIMIT corpus, DBNs consistently |
|
|
Author Names: |
Abdel-rahman Mohamed*, University of Toronto |
|
|
Files: |
|
|
Title: |
Deep Learning For Semantic Parsing |
|
|
Abstract: |
Recently, Poon and Domingos (2009) developed the first approach for unsupervised semantic parsing, the USP system. They applied it to extracting a knowledge base from biomedical abstracts for question answering and showed that it substantially outperforms state-of-the-art systems such as TextRunner and DIRT. In this paper, we show that USP can be viewed as learning a deep network for semantic parsing. The hidden units in the network represent clusters of meaning expressions, whereas the visible units represent dependency trees of input sentences. USP starts with a network where each atomic expression has its own cluster, and learns the final architecture by incrementally combining hidden units to abstract away syntactic and lexical variations of the same meaning. USP can be naturally generalized to a new approach for deep learning based on structure search; we discuss the implications of this. |
|
|
Author Names: |
Hoifung Poon*, Univ. of Washington (CSE) |
|
|
Files: |
|
|
Title: |
Neural conditional random fields |
|
|
Abstract: |
We propose a non-linear graphical model for structured prediction. It combines the power of deep networks to extract high level features with the graphical framework of Markov networks, yielding a powerful and scalable model that we apply to signal labeling tasks. |
|
|
Author Names: |
Trinh-Minh-Tri Do, LIP6-UPMC |
|
|
Files: |
|
|
Title: |
A Multi-Objective Programming-Based Approach to Language Model Adaptation |
|
Abstract: |
The overall objective function of a MAP-based language model (LM) adaptation technique is implicitly a composition of two objective |
|
Author Names: |
Sibel Yaman*, ICSI |
| Files: | full paper in pdf |
|
Title: |
Deep learning for spoken language identification |
|
Abstract: |
Empirical results have shown that many spoken language identification systems based on hand-coded features perform poorly on small speech samples where a human would be successful. A hypothesis for this low performance is that the set of extracted features is insufficient. A deep architecture that learns features automatically is implemented and evaluated on several datasets. |
|
Author Names: |
Grégoire Montavon, Berlin Institute of Technology |
| Files: | full paper in pdf |
|
Title: |
Unsupervised feature learning for audio classification using convolutional deep belief networks |
|
Abstract: |
In recent years, deep learning approaches have gained significant interest as a way of building hierarchical representations from unlabeled data. However, to our knowledge, these deep learning approaches have not been extensively studied for auditory data. In this paper, we apply convolutional deep belief networks to audio data and empirically evaluate them on various audio classification tasks. In the case of speech data, we show that the learned features correspond to phones/phonemes. In addition, our feature representations learned from unlabeled audio data show very good performance for multiple audio classification tasks. We hope that this paper will inspire more research on deep learning approaches applied to a wide range of audio recognition tasks. |
|
Author Names: |
Honglak Lee, Yan Largman, Peter Pham, Andrew Y. Ng, Stanford University |
| Files: | full paper in pdf |