Results 1 - 2 of 2
Graphical Models and Variational Methods, 2001
Cited by 37 (2 self)
We review the use of variational methods of approximating inference and learning in probabilistic graphical models. In particular, we focus on variational approximations to the integrals required for Bayesian learning. For models in the conjugate-exponential family, a generalisation of the EM algorithm is derived that iterates between optimising hyperparameters of the distribution over parameters, and inferring the hidden variable distributions. These approximations make use of available propagation algorithms for probabilistic graphical models. We give two case studies of how the variational Bayesian approach can be used to learn model structure: inferring the number of clusters and dimensionalities in a mixture of factor analysers, and inferring the dimension of the state space of a linear dynamical system. Finally, importance sampling corrections to the variational approximations are discussed, along with their limitations.
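As a reading aid, the alternating scheme this abstract describes is usually written as coordinate ascent on a lower bound of the log marginal likelihood. The notation below (data y, hidden variables x, parameters \theta, model m, and factorised approximation q_x q_\theta) is assumed for illustration and is not taken from the listing itself.

```latex
% Lower bound obtained by assuming q(x, \theta) \approx q_x(x)\, q_\theta(\theta)
\ln p(y \mid m) \;\ge\;
  \int q_x(x)\, q_\theta(\theta)\,
  \ln \frac{p(y, x, \theta \mid m)}{q_x(x)\, q_\theta(\theta)}\, dx\, d\theta
  \;=\; \mathcal{F}(q_x, q_\theta)

% Step inferring the hidden-variable distribution (analogue of the E step)
q_x(x) \;\propto\; \exp\!\Big( \mathbb{E}_{q_\theta}\big[ \ln p(x, y \mid \theta, m) \big] \Big)

% Step updating the distribution over parameters (analogue of the M step)
q_\theta(\theta) \;\propto\; p(\theta \mid m)\,
  \exp\!\Big( \mathbb{E}_{q_x}\big[ \ln p(x, y \mid \theta, m) \big] \Big)
```

For a conjugate-exponential model the second update keeps q_\theta in the same family as the prior, so in that setting it reduces to updating hyperparameters.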
Products of Hidden Markov Models
In Proceedings of Artificial Intelligence and Statistics, 2001
Cited by 18 (4 self)
We present products of hidden Markov models (PoHMMs), a way of combining HMMs to form a distributed state time series model. Inference in a PoHMM is tractable and efficient. Learning of the parameters, although intractable, can be effectively done using the Product of Experts learning rule. The distributed state helps the model to explain data that have multiple causes, and the fact that each model need only explain part of the data means a PoHMM can capture longer range structure than a single HMM can. We show some results on modelling character strings, a simple language task and the symbolic family trees problem, which highlight these advantages.
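The contrast between tractable inference and intractable learning can be illustrated with a minimal sketch, assuming discrete-output HMMs and a product-of-experts combination in which the probability of a sequence is proportional to the product of the individual HMM likelihoods. The function names and the parameter layout (`log_pi`, `log_A`, `log_B`) are hypothetical and are not taken from the paper. Each expert is scored with its own forward pass, so the unnormalised log probability is cheap to compute; the normalising constant, which sums over all possible sequences, is left out here.

```python
import numpy as np

def hmm_log_likelihood(obs, log_pi, log_A, log_B):
    """Forward algorithm in log space for one discrete-output HMM.

    obs    : sequence of integer symbols
    log_pi : (K,)   log initial state probabilities
    log_A  : (K, K) log transition probabilities, row = from, column = to
    log_B  : (K, V) log emission probabilities
    """
    alpha = log_pi + log_B[:, obs[0]]
    for symbol in obs[1:]:
        # Log-sum-exp over the previous state, then add the emission term.
        alpha = np.logaddexp.reduce(alpha[:, None] + log_A, axis=0) + log_B[:, symbol]
    return np.logaddexp.reduce(alpha)

def pohmm_unnormalised_log_prob(obs, experts):
    """Sum of per-expert log likelihoods (assumed product-of-experts form).

    experts : list of (log_pi, log_A, log_B) tuples, one per HMM.
    The result omits the log normalising constant.
    """
    return sum(hmm_log_likelihood(obs, *params) for params in experts)

# Illustrative two-state expert with made-up parameters.
expert = (
    np.log(np.array([0.5, 0.5])),
    np.log(np.array([[0.5, 0.5], [0.5, 0.5]])),
    np.log(np.array([[0.9, 0.1], [0.1, 0.9]])),
)
score = pohmm_unnormalised_log_prob([0, 0], [expert, expert])
```

The cost is one forward pass per expert, so it grows linearly with the number of HMMs in the product.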