Infer.NET user guide : Tutorials and examples
Factor analysis example

This tutorial shows how to implement Bayesian Factor Analysis in Infer.NET. Factor Analysis explains a high-dimensional data matrix by linearly mapping it into a lower-dimensional manifold. We will construct a probabilistic Factor Analysis model and illustrate how Automatic Relevance Determination (ARD) can be used to automatically determine the most probable number of latent factors.

Inference on Toy Data

The data in this example is purely fictitious: the data matrix Y has been drawn from the generative model of Factor Analysis, Y = WX + ε, where ε is the observation noise. The purpose of the inference is to reconstruct the unknown mixing matrix W and the factor activations X. An important parameter for the inference is the number of factors, which corresponds to the dimensionality of the lower-dimensional manifold.

Automatic Relevance Determination

An important component of the model definition is the prior on the mixing matrix W:
The entries of the RandomVariableArray W are Gaussian with mean 0 and precision Alpha[K], which is shared by all entries belonging to latent factor K. The parameters of the Gamma prior on Alpha[K] are chosen such that the prior mean of Alpha[K] is 1E6. Such a high precision drives the variance of the corresponding entries in the mixing matrix towards 0 and hence encourages factors to be switched off. In the light of data, the posterior precision Alpha[K] of some factors will decrease, allowing those factors to explain variance in the data.
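To see why a large precision switches a factor off, recall that a zero-mean Gaussian with precision α has standard deviation 1/√α. The following is a minimal sketch in plain Python (standard library only, not Infer.NET code). The function name `sample_weight_column`, the column length, and the "active" precision of 1 are assumptions chosen for illustration; only the value 1E6 comes from the text above.

```python
import math
import random


def sample_weight_column(precision, n_rows, rng):
    """Draw n_rows weights from a zero-mean Gaussian with the given precision."""
    std = 1.0 / math.sqrt(precision)  # precision = 1 / variance
    return [rng.gauss(0.0, std) for _ in range(n_rows)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# A factor left at the prior mean precision of 1E6 (effectively switched off).
off_column = sample_weight_column(1e6, 1000, rng)
# A factor whose precision has dropped to 1 (hypothetical value).
on_column = sample_weight_column(1.0, 1000, rng)

max_off = max(abs(w) for w in off_column)
max_on = max(abs(w) for w in on_column)
print(f"largest |w| at precision 1e6: {max_off:.4f}")
print(f"largest |w| at precision 1:   {max_on:.4f}")
```

With precision 1E6 the standard deviation is 0.001, so every weight in that column is negligible compared with a column whose precision is close to 1.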
This mechanism automatically chooses a suitable number of active factors. Alternatively, we could have determined the most probable number of factors using the model evidence, by rerunning the inference with different numbers of factors.

Matrix Product

The remaining components of the model definition specify the latent factor activations x as well as the observation noise model tau.
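The generative process Y = WX + ε that these components describe can be sketched in plain Python (standard library only; this is not the Infer.NET model definition). The dimensions, the unit-variance priors on W and X, the noise precision value, and all function names below are assumptions made for illustration.

```python
import math
import random


def gaussian_matrix(n_rows, n_cols, std, rng):
    """Matrix (list of rows) of independent zero-mean Gaussian entries."""
    return [[rng.gauss(0.0, std) for _ in range(n_cols)] for _ in range(n_rows)]


def mat_mul(a, b):
    """Plain matrix product of a (n x k) and b (k x m)."""
    k, m = len(b), len(b[0])
    return [[sum(row[i] * b[i][j] for i in range(k)) for j in range(m)] for row in a]


def generate_data(n_dims, n_factors, n_obs, tau, rng):
    """Draw W, X and Y = W X + noise, where the noise has precision tau."""
    w = gaussian_matrix(n_dims, n_factors, 1.0, rng)  # mixing matrix
    x = gaussian_matrix(n_factors, n_obs, 1.0, rng)   # factor activations
    noise_std = 1.0 / math.sqrt(tau)
    y = [[v + rng.gauss(0.0, noise_std) for v in row] for row in mat_mul(w, x)]
    return w, x, y


rng = random.Random(1)
W, X, Y = generate_data(n_dims=10, n_factors=3, n_obs=50, tau=100.0, rng=rng)
print(len(Y), "x", len(Y[0]))  # shape of the data matrix
```

Inference runs this process in reverse: given only Y, it recovers posterior distributions over W, x, and tau.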
Note that in the last two lines of the model code, InitialiseTo is used to set a user-defined, random initialization for W and x. This step is necessary to break the symmetry of the system: with the standard uniform initialization, W and x would not change during inference.

Running the Example

After performing inference, the example code first evaluates the reconstruction error of the data (Mean Squared Error); then the posterior mean values of Alpha[K] are printed for each K. From the output it should be apparent that 6 or 7 of the factors are used (low precision) while the remaining ones are switched off (high precision).
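The two diagnostics described here can be sketched in plain Python (standard library only, not the tutorial's own code). The function names, the cut-off `threshold`, and all numbers below are made up for illustration; the tutorial only distinguishes "low" from "high" precision and does not prescribe a threshold.

```python
def mean_squared_error(y, y_hat):
    """Average squared difference between two equally shaped matrices."""
    n = sum(len(row) for row in y)
    total = sum((a - b) ** 2 for r, r_hat in zip(y, y_hat) for a, b in zip(r, r_hat))
    return total / n


def active_factors(alpha_means, threshold):
    """Indices of factors whose posterior mean precision is below threshold."""
    return [k for k, a in enumerate(alpha_means) if a < threshold]


# Made-up values for illustration only, not output of the tutorial.
y = [[1.0, 2.0], [3.0, 4.0]]
y_hat = [[1.0, 2.5], [2.5, 4.0]]
alpha_means = [0.8, 1.2e5, 2.5, 9.7e5]

print(mean_squared_error(y, y_hat))                   # 0.125
print(active_factors(alpha_means, threshold=100.0))   # [0, 2]
```

In practice the posterior means of Alpha[K] for active and inactive factors tend to differ by orders of magnitude, so the exact threshold matters little.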


