Variational Approximations between Mean Field Theory and the Junction Tree Algorithm
- In Uncertainty in Artificial Intelligence, 2000
Cited by 33 (1 self)
Recently, variational approximations such as the mean field approximation have received much interest. We extend the standard mean field method by using an approximating distribution that factorises into cluster potentials. This includes undirected graphs, directed acyclic graphs and junction trees. We derive generalised mean field equations to optimise the cluster potentials. We show that the method bridges the gap between the standard mean field approximation and the exact junction tree algorithm. In addition, we address the problem of how to choose the structure and the free parameters of the approximating distribution. From the generalised mean field equations we derive rules to simplify the approximation in advance without affecting the potential accuracy of the model class. We also show how the method fits into some other variational approximations that are currently popular.
A variational approach for approximating bayesian networks by edge deletion
- In Proceedings of the Twenty Second Conference on Uncertainty in Artificial Intelligence (UAI’06)
Cited by 9 (3 self)
We consider in this paper the formulation of approximate inference in Bayesian networks as a problem of exact inference on an approximate network that results from deleting edges (to reduce treewidth). We have shown in earlier work that deleting edges calls for introducing auxiliary network parameters to compensate for lost dependencies, and proposed intuitive conditions for determining these parameters. We have also shown that our earlier method corresponds to Iterative Belief Propagation (IBP) when enough edges are deleted to yield a polytree, and corresponds to some generalizations of IBP when fewer edges are deleted. In this paper, we propose a different criterion for determining auxiliary parameters, based on optimizing the KL-divergence between the original and approximate networks. We discuss the relationship between the two methods for selecting parameters, shedding new light on IBP and its generalizations. We also discuss the application of our new method to approximating inference problems which are exponential in constrained treewidth, including MAP and nonmyopic value of information.
Positional Entropy During Pigeon Homing I: Application Of Bayesian Latent State Modelling
- J. Theor. Biol
Cited by 3 (0 self)
Running headline: Positional Entropy in Bird Navigation I.
Mean Field Inference in a General Probabilistic Setting
- Proceedings of The 7th International Workshop on Artificial Intelligence and
, 1999
Cited by 2 (0 self)
We present a systematic, model-independent formulation of mean field theory (MFT) as an inference method in probabilistic models. "Model-independent" means that we do not assume a particular type of dependency among the variables of a domain but instead work in a general probabilistic setting. In a Bayesian network, for example, you may use arbitrary tables to specify conditional dependencies and thus run MFT in any Bayesian network. Furthermore, the general mean field equations derived here shed light on the essence of MFT. MFT can be interpreted as a local iteration scheme which relaxes into a consistent state (a solution of the mean field equations). Iterating the mean field equations means propagating information through the network. In general, however, there are multiple solutions to the mean field equations. We show that improved approximations can be obtained by forming a weighted mixture of the multiple mean field solutions. Simple approximate expressions for the mixture weig...
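The description of MFT as a local iteration scheme with several possible fixed points can be illustrated with a minimal sketch. The model below is an assumption made for illustration only: a two-variable pairwise model with ±1 states, for which the mean field equations take the familiar form m_i = tanh(h_i + sum_j J_ij m_j). It is not the general formulation of the paper, and the names `mean_field_iterate`, `J`, and `h` are hypothetical.

```python
import math


def mean_field_iterate(J, h, m0, n_sweeps=200, tol=1e-10):
    """Iterate m_i = tanh(h_i + sum_j J[i][j] * m_j) until the means settle.

    J: symmetric coupling matrix (list of lists), h: local fields,
    m0: initial guess for the means. All are illustrative assumptions.
    """
    m = list(m0)
    for _ in range(n_sweeps):
        max_change = 0.0
        for i in range(len(m)):
            # Local update: variable i only looks at the current means of the others.
            field = h[i] + sum(J[i][j] * m[j] for j in range(len(m)) if j != i)
            new_mi = math.tanh(field)
            max_change = max(max_change, abs(new_mi - m[i]))
            m[i] = new_mi
        if max_change < tol:
            break  # consistent state reached: a solution of the mean field equations
    return m


# Two positively coupled variables with no external field (hypothetical values).
J = [[0.0, 1.5], [1.5, 0.0]]
h = [0.0, 0.0]

m_up = mean_field_iterate(J, h, [0.5, 0.5])      # relaxes to a positive solution
m_down = mean_field_iterate(J, h, [-0.5, -0.5])  # relaxes to the mirrored negative solution
print(m_up, m_down)
```

The two starting points relax into different consistent states, which is the multiplicity of solutions the abstract refers to. How such solutions would be weighted in a mixture is not sketched here, since the abstract is truncated before giving the expressions.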
Variational Bayesian learning of cooperative vector quantizer model – theory
, 2002
Cited by 2 (2 self)
This is the first part of a two-part report on the development of a statistical learning algorithm for a latent variable model referred to as the cooperative vector quantizer model. This part presents the theory and mathematical derivations of a variational Bayesian learning algorithm for the model. The model has general applications in the fields of machine learning and signal processing. For example, it can be used to solve the problem of blind source separation or image separation. Our special interest is in its potential biological application, in that we can use the model to simulate signal transduction components regulating gene expression as latent variables. The algorithm is capable of automatically and efficiently determining the number of latent variables of the model and estimating the distributions of the parameters and latent variables. Thus, we can use the model to address the following biological questions regarding gene expression regulation: (1) What are the key signal transduction components regulating gene expression in a given kind of cell? (2) How many key components are needed to efficiently encode information for gene expression regulation? (3) What are the states of the key components for a given gene expression data point? Such information will provide insight for understanding the mechanism of information organization of cells and the mechanisms of diseases and drug effect/toxicity.
Ensemble Coupled Hidden Markov Models for Joint Characterisation of Dynamic Signals
- In Ninth International Workshop on Artificial Intelligence and Statistics, 2002
Cited by 1 (1 self)
How does one model data with the aid of labels, when the labels themselves are noisy, unreliable and have their own dynamics? How does one measure interactions between variables that are so different in their nature that a direct comparison using, say, cross-correlations is meaningless? In this paper these problems are approached using Coupled Hidden Markov Models which are estimated in the Variational Bayesian framework. Signals can be diverse since each chain has its own observation model. Signals can have their own dynamics and may temporally lag or lead one another by allowing linking edges in the network topology to be estimated and chosen according to the most probable posterior model. Integrated feature extraction and modelling is accomplished by providing the Markov models with linear observation models. We derive Coupled Hidden Markov Model estimators, apply them, and compare them with sampling-based approaches found in the literature.
Ensemble Hidden Markov Models for Biosignal Analysis
- 14th International Conference on Digital Signal Processing, 2002
Cited by 1 (0 self)
Variational Learning theory allows the estimation of posterior probability distributions of model parameters, rather than the parameters themselves. We demonstrate the use of variational learning methods on Hidden Markov models with different observation models and apply the HMM to a range of biomedical signals, such as EEG, periodic breathing and RR-interval series.
Mean-field methods for a special class of Belief Networks
- Journal of Artificial Intelligence, 2001
Cited by 1 (0 self)
The chief aim of this paper is to propose mean-field approximations for a broad class of belief networks, of which sigmoid and noisy-or networks can be seen as special cases. The approximations are based on a powerful mean-field theory suggested by Plefka. We show that Saul, Jaakkola, and Jordan's approach is the first-order approximation in Plefka's approach, via a variational derivation. The application of Plefka's theory to belief networks is not computationally tractable. To tackle this problem we propose new approximations based on Taylor series. Small-scale experiments show that the proposed schemes are attractive.
Controlled hierarchical filtering: Model of neocortical sensory processing
- http://www.arxiv.org/abs/cs.NE/0308025, 2003
Cited by 1 (1 self)
A model of sensory information processing is presented. The model assumes that learning of internal (hidden) generative models, which can predict the future and evaluate the precision of that prediction, is of central importance for information extraction. Furthermore, the model makes a bridge to goal-oriented systems and builds upon the structural similarity between the architecture of a robust controller and that of the hippocampal entorhinal loop. This generative control architecture is mapped to the neocortex and to the hippocampal entorhinal loop. Implicit memory phenomena (priming and prototype learning) are emerging features of the model. Mathematical theorems ensure stability and attractive learning properties of the architecture. Connections to reinforcement learning are also established: both the control network and the network with a hidden model converge to a (near) optimal policy under suitable conditions. Falsifying predictions, including the role of the feedback connections between neocortical areas, are made.
On Similarities between Inference in Game Theory and Machine Learning
Cited by 1 (0 self)
In this paper, we elucidate the equivalence between inference in game theory and machine learning. Our aim in so doing is to establish an equivalent vocabulary between the two domains so as to facilitate developments at the intersection of both fields, and as proof of the usefulness of this approach, we use recent developments in each field to make useful improvements to the other. More specifically, we consider the analogies between smooth best responses in fictitious play and Bayesian inference methods. Initially, we use these insights to develop and demonstrate an improved algorithm for learning in games based on probabilistic moderation. That is, by integrating over the distribution of opponent strategies (a Bayesian approach within machine learning) rather than taking a simple empirical average (the approach used in standard fictitious play), we derive a novel moderated fictitious play algorithm and show that it is more likely than standard fictitious play to converge to a payoff-dominant but risk-dominated Nash equilibrium in a simple coordination game. Furthermore, we consider the converse case, and show how insights from game theory can be used to derive two improved mean field variational learning algorithms. We
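For readers unfamiliar with the baseline, the "simple empirical average" used in standard fictitious play can be sketched as follows. This is only an illustration of the standard scheme with a hard (not smooth) best response; it does not implement the moderated algorithm of the paper. The payoff matrix is a hypothetical stag-hunt-style coordination game, and the function names and initial counts are assumptions.

```python
def best_response(payoff, opp_counts):
    """Best reply to the empirical frequency of the opponent's past actions."""
    total = sum(opp_counts)
    belief = [c / total for c in opp_counts]  # simple empirical average
    expected = [
        sum(payoff[a][b] * belief[b] for b in range(len(belief)))
        for a in range(len(payoff))
    ]
    return max(range(len(expected)), key=lambda a: expected[a])


def fictitious_play(payoff, init_counts_1, init_counts_2, rounds=50):
    """Standard fictitious play in a symmetric two-player game.

    payoff[a][b] is a player's payoff for playing a against b.
    init_counts_k are player k's initial (prior) counts of the opponent's actions.
    """
    counts_about_2 = list(init_counts_1)  # player 1's record of player 2's actions
    counts_about_1 = list(init_counts_2)  # player 2's record of player 1's actions
    history = []
    for _ in range(rounds):
        a1 = best_response(payoff, counts_about_2)
        a2 = best_response(payoff, counts_about_1)
        counts_about_2[a2] += 1
        counts_about_1[a1] += 1
        history.append((a1, a2))
    return history


# Hypothetical payoffs: action 0 is payoff-dominant (4, 4) but risky,
# action 1 is the safe, risk-dominant choice (3 regardless of the opponent).
payoff = [[4, 0], [3, 3]]

history_low = fictitious_play(payoff, [1, 1], [1, 1])   # uniform initial counts
history_high = fictitious_play(payoff, [4, 1], [4, 1])  # initial counts favouring action 0
print(history_low[-1], history_high[-1])
```

With these assumed payoffs, play started from uniform counts settles on the risk-dominant pair, while play started from counts that strongly favour action 0 settles on the payoff-dominant pair, which shows how much the outcome depends on the empirical belief.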

