Julia Aurélie Lasserre
In machine learning, probabilistic models are usually described as belonging to one of two categories: generative or discriminative. Generative models are built to understand how the samples of a particular category were generated; the category chosen for a new data-point is the one whose model fits the point best. Discriminative models instead focus on the boundaries between categories; the category chosen for a new data-point then depends on which side of the boundary the point falls.
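The two decision rules can be contrasted on a toy one-dimensional problem. This is a hypothetical sketch, assuming unit-variance Gaussian class models with equal priors and a linear boundary; none of these modelling choices come from the thesis itself:

```python
import numpy as np

def generative_predict(x, mus):
    # Generative rule: pick the class whose model fits x best.
    # With unit-variance Gaussians and equal priors this reduces
    # to choosing the nearest class mean.
    return int(np.argmin((x - mus) ** 2))

def discriminative_predict(x, w, b):
    # Discriminative rule: pick the side of a learned linear
    # boundary w*x + b = 0, ignoring how x was generated.
    return int(w * x + b > 0)
```

With class means at -1 and +1, the boundary w = 1, b = 0 sits at their midpoint, so the two rules agree on this toy problem; in general the boundary is learned directly and need not match any generative model.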
While the two approaches offer different, complementary advantages, they cannot be merged in a straightforward way. The challenge we wish to undertake in this thesis is to find rigorous models that blend these two approaches, and to show that such blending can help find good solutions to various problems.
Firstly, we will describe an original hybrid model that allows an elegant blend of the generative and discriminative approaches. By applying this framework to semi-supervised classification on various data-sets, we will show:
- how a hybrid approach can lead to better classification performance when most of the available data is unlabelled,
- how to make an optimal trade-off between the generative and discriminative extremes,
- and how the amount of labelled data influences the best model.
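A common way to interpolate between the two extremes is a convex blend of the joint (generative) and conditional (discriminative) log-likelihoods. The sketch below illustrates that idea on a hypothetical two-class model with unit-variance Gaussian class-conditionals and equal priors; the specific objective and model are assumptions for illustration, not the thesis's actual framework:

```python
import numpy as np

def log_gauss(x, mu):
    # log N(x; mu, 1)
    return -0.5 * (x - mu) ** 2 - 0.5 * np.log(2 * np.pi)

def blended_objective(mus, X, y, alpha):
    """Convex blend of generative and discriminative objectives.

    alpha = 1 -> purely generative (joint log-likelihood),
    alpha = 0 -> purely discriminative (conditional log-likelihood),
    intermediate alpha trades off between the two extremes.
    """
    # class-conditional log densities for both classes, equal priors
    ll = np.stack([log_gauss(X, mu) for mu in mus])       # (2, n)
    log_joint = np.log(0.5) + ll[y, np.arange(len(X))]    # log p(x, y)
    log_marg = np.logaddexp(np.log(0.5) + ll[0],
                            np.log(0.5) + ll[1])          # log p(x)
    log_cond = log_joint - log_marg                       # log p(y | x)
    return alpha * log_joint.sum() + (1 - alpha) * log_cond.sum()
```

Sweeping alpha and picking the value that performs best on held-out data is one simple way to locate the trade-off; the thesis develops a more principled treatment.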
Secondly, we will present a hybrid approximation to the belief propagation algorithm that helps optimise Markov random fields of high cardinality. The contributions of this thesis on this issue are twofold:
- a hybrid generative / discriminative method, Hybrid BP, that significantly reduces the state space of each node in the Markov random field,
- and an effective method for learning the parameters that exploits the memory savings provided by Hybrid BP.
We will then show how reducing the memory requirements, and thereby allowing the Markov random field to take higher cardinality, proves useful.
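The state-space reduction idea can be sketched as follows: a per-node score, e.g. from a discriminative classifier, keeps only the k best candidate states before max-product inference runs on the pruned model. The top-k pruning rule and the chain topology below are assumptions chosen for illustration; this is not the thesis's actual Hybrid BP algorithm:

```python
import numpy as np

def prune_states(unary, k):
    """Keep the k highest-scoring states per node (hypothetical pruning)."""
    # unary: (n_nodes, n_states) log-scores, e.g. from a classifier
    return np.argsort(unary, axis=1)[:, -k:]   # (n_nodes, k) state ids

def chain_map(unary, pairwise, cand):
    """Max-product (Viterbi) MAP over a chain, restricted to candidates."""
    n, k = cand.shape
    score = unary[0, cand[0]]                  # best score ending in each candidate
    back = np.zeros((n, k), dtype=int)
    for t in range(1, n):
        # trans[i, j]: score of reaching candidate j at t via candidate i at t-1
        trans = score[:, None] + pairwise[cand[t - 1][:, None], cand[t]]
        back[t] = trans.argmax(axis=0)
        score = trans.max(axis=0) + unary[t, cand[t]]
    # backtrack through the pruned lattice
    states = np.empty(n, dtype=int)
    j = int(score.argmax())
    states[-1] = cand[-1, j]
    for t in range(n - 1, 0, -1):
        j = back[t, j]
        states[t - 1] = cand[t - 1, j]
    return states
```

After pruning, every message and score vector has length k rather than the full state cardinality, which is where the memory savings come from when the cardinality is high.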
Institution: University of Cambridge