Bayesian Nonparametric Hidden Markov Models

The Bayesian approach to statistical modelling is a consistent and intuitive framework for reasoning under uncertainty: we encode prior knowledge about the variables in a model and infer a posterior distribution over the unobserved ones. The most common Bayesian models to date are parametric: they are specified with a finite, fixed number of unobserved variables. With vast amounts of data readily available today, these models forgo a learning opportunity: no structure beyond what was fixed in the prior can be discovered, so additional data only sharpens the posterior over the same fixed set of variables. Nonparametric Bayesian models address this limitation: they are probabilistic models whose effective complexity can grow with the data, allowing them to learn the structure of complex datasets.
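To make this contrast concrete, consider the Chinese restaurant process (CRP), the partition distribution underlying Dirichlet process mixture models: the number of occupied clusters is not fixed in advance but grows (roughly logarithmically) with the number of observations. The following is a minimal simulation sketch, not code from the thesis; the concentration value alpha = 1.0 is an arbitrary illustrative choice.

```python
import random

def sample_crp_partition(n, alpha, seed=0):
    """Simulate a Chinese restaurant process seating of n customers.

    Customer t joins an existing table with probability proportional to
    its current size, or opens a new table with probability proportional
    to the concentration parameter alpha.  Returns the table sizes.
    """
    rng = random.Random(seed)
    tables = []  # tables[k] = number of customers at table k
    for t in range(n):
        # Unnormalised probabilities: existing tables, then a new one.
        weights = tables + [alpha]
        r = rng.random() * sum(weights)
        acc = 0.0
        for k, w in enumerate(weights):
            acc += w
            if r < acc:
                break
        if k == len(tables):
            tables.append(1)   # open a new table: a new cluster is born
        else:
            tables[k] += 1     # join an existing table

    return tables

# The number of clusters keeps growing as more data arrives.
for n in [10, 100, 1000]:
    print(n, "customers ->", len(sample_crp_partition(n, alpha=1.0)), "clusters")
```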

In this thesis we present new models and inference algorithms for nonparametric Bayesian hidden Markov models. Our contribution is threefold. First, we introduce, for the first time, a family of algorithms for efficient and exact Monte Carlo inference in nonparametric Bayesian hidden Markov models. Second, we apply nonparametric Bayesian hidden Markov models to the part-of-speech tagging problem in natural language processing. Third, we introduce a new family of nonparametric Bayesian hidden Markov models with a factorial latent Markov chain structure.

More specifically, in Chapter 1 we motivate nonparametric Bayesian models using a simple mixture model example and review the literature on Bayesian approaches to hidden Markov modelling. Chapter 2 lays the foundations for nonparametric Bayesian modelling by introducing a number of fundamental and well-understood nonparametric building blocks.
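Among the most fundamental of these building blocks is the Dirichlet process. As an illustration (this is the standard stick-breaking construction, not code from the thesis), the sketch below draws the random mixing weights of a Dirichlet process, truncated to finitely many sticks for practicality:

```python
import numpy as np

def stick_breaking_weights(alpha, truncation, rng):
    """Truncated stick-breaking construction of Dirichlet process weights.

    Draw beta_k ~ Beta(1, alpha) and set
        pi_k = beta_k * prod_{j<k} (1 - beta_j),
    i.e. repeatedly break off a random fraction of the remaining stick.
    The infinite sequence sums to 1; we truncate it here for illustration.
    """
    betas = rng.beta(1.0, alpha, size=truncation)
    remaining = np.concatenate([[1.0], np.cumprod(1.0 - betas)[:-1]])
    return betas * remaining

rng = np.random.default_rng(0)
pi = stick_breaking_weights(alpha=2.0, truncation=20, rng=rng)
print(pi.round(3), "total mass kept:", pi.sum().round(3))
```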

Using the building blocks introduced in Chapter 2, Chapter 3 describes a nonparametric extension of the hidden Markov model, called the infinite hidden Markov model (iHMM), and introduces a family of fast and exact Monte Carlo inference algorithms for this model. We also review extensions of the iHMM from the literature and introduce some new ones.
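The central idea behind these samplers is beam sampling: auxiliary slice variables u_t are introduced so that, conditioned on them, only the finitely many transitions with probability above the slice remain possible, and exact forward-filtering backward-sampling applies despite the infinite state space. The sketch below shows one such sweep on a fixed finite transition matrix for concreteness; a full iHMM beam sampler would additionally resample the HDP transition matrix and instantiate new states on the fly, and the function and variable names here are our own, not the thesis's.

```python
import numpy as np

def beam_sample_states(pi0, pi, loglik, s_prev, rng):
    """One beam-sampling sweep over the hidden state sequence.

    pi0:    (K,) initial state distribution
    pi:     (K, K) transition matrix (finite stand-in for the iHMM's
            infinite HDP transition matrix)
    loglik: (T, K) emission log-likelihoods log p(y_t | s_t = k)
    s_prev: (T,) current state sequence, conditioned on to draw slices
    """
    T, K = loglik.shape

    # 1. Slice variables under the current trajectory:
    #    u_t ~ Uniform(0, pi[s_{t-1}, s_t]).
    u = np.empty(T)
    u[0] = rng.uniform(0.0, pi0[s_prev[0]])
    for t in range(1, T):
        u[t] = rng.uniform(0.0, pi[s_prev[t - 1], s_prev[t]])

    # 2. Forward filtering: conditioned on u, each transition factor
    #    becomes the 0/1 indicator [pi > u_t], so sums stay finite.
    lik = np.exp(loglik - loglik.max(axis=1, keepdims=True))
    alpha = np.zeros((T, K))
    alpha[0] = lik[0] * (pi0 > u[0])
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):
        reachable = (pi > u[t]).astype(float)        # (K, K) indicator
        alpha[t] = lik[t] * (alpha[t - 1] @ reachable)
        alpha[t] /= alpha[t].sum()

    # 3. Backward sampling of a new trajectory.
    s = np.empty(T, dtype=int)
    s[T - 1] = rng.choice(K, p=alpha[T - 1])
    for t in range(T - 2, -1, -1):
        w = alpha[t] * (pi[:, s[t + 1]] > u[t + 1])
        s[t] = rng.choice(K, p=w / w.sum())
    return s
```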

Chapter 4 presents a case study of the iHMM in natural language processing. In particular, we look at the task of unsupervised part-of-speech tagging: we compare the nonparametric Bayesian approach against its parametric counterpart and introduce an alternative way of evaluating any unsupervised part-of-speech tagger.

Chapter 5, our final chapter, introduces a new nonparametric Bayesian building block called the Markov Indian buffet process (Markov IBP), which we then use to build a nonparametric extension of the factorial hidden Markov model, called the infinite factorial hidden Markov model (iFHMM). We apply this model to the well-known cocktail party problem, separating the audio of an arbitrary number of speakers using a limited number of microphones.
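To give a feel for the latent structure, the sketch below generates from a finite approximation with K independent binary Markov chains: each chain switches on with probability a_k and persists with probability b_k, and placing a Beta(alpha/K, 1) prior on a_k and letting K grow large yields the nonparametric behaviour in which only finitely many chains are ever active. The parameterisation and priors follow a common finite construction and are our assumption here, not a quotation of the thesis's notation.

```python
import numpy as np

def sample_binary_chains(T, K, alpha, gamma_, delta, rng):
    """Finite approximation to the Markov IBP latent structure.

    Each of K features is an independent binary Markov chain starting
    in state 0 that switches on (0 -> 1) with probability a_k and stays
    on (1 -> 1) with probability b_k.  With a_k ~ Beta(alpha/K, 1),
    taking K to infinity leaves only finitely many chains ever active.
    """
    a = rng.beta(alpha / K, 1.0, size=K)   # per-chain switch-on prob
    b = rng.beta(gamma_, delta, size=K)    # per-chain persistence prob
    S = np.zeros((T, K), dtype=int)
    S[0] = rng.random(K) < a               # first step away from all-zeros
    for t in range(1, T):
        p_on = np.where(S[t - 1] == 1, b, a)
        S[t] = rng.random(K) < p_on
    return S

rng = np.random.default_rng(0)
S = sample_binary_chains(T=10, K=50, alpha=3.0, gamma_=5.0, delta=1.0, rng=rng)
print("active chains:", int((S.sum(axis=0) > 0).sum()), "of", S.shape[1])
```

In the iFHMM, a matrix like S plays the role of the factorial hidden state: each active chain modulates the observations, e.g. each column indicating when one speaker is talking in the cocktail party setting.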

Given the important role of hidden Markov models in time series and sequence modelling, and the flexibility of nonparametric approaches, nonparametric Bayesian hidden Markov models hold great potential for many future applications and extensions.

Details

Type: PhD thesis
Institution: University of Cambridge