Results 1–10 of 14
Variational Approximations between Mean Field Theory and the Junction Tree Algorithm
In Uncertainty in Artificial Intelligence, 2000
Cited by 48 (1 self)
Abstract:
Recently, variational approximations such as the mean field approximation have received much interest. We extend the standard mean field method by using an approximating distribution that factorises into cluster potentials. This includes undirected graphs, directed acyclic graphs and junction trees. We derive generalised mean field equations to optimise the cluster potentials. We show that the method bridges the gap between the standard mean field approximation and the exact junction tree algorithm. In addition, we address the problem of how to choose the structure and the free parameters of the approximating distribution. From the generalised mean field equations we derive rules to simplify the approximation in advance without affecting the potential accuracy of the model class. We also show how the method fits into some other variational approximations that are currently popular. 1 INTRODUCTION Graphical models, such as Bayesian networks, Markov fields, and Bolt...
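The standard (fully factorised) mean field method that this paper generalises can be sketched as a fixed-point iteration over single-site marginals. Below is a minimal illustration for a hypothetical pairwise binary model with couplings `J` and biases `h` (names assumed, not from the paper); the cluster-potential extension described above would update whole cluster marginals instead of single-site means.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mean_field(J, h, iters=100):
    """Fixed-point iteration for mean parameters m_i = E_q[x_i], x_i in {0, 1}.

    Each sweep sets m_i = sigmoid(h_i + sum_j J_ij * m_j), the standard
    fully factorised mean field update for a pairwise binary model.
    """
    m = np.full(len(h), 0.5)
    for _ in range(iters):
        for i in range(len(h)):
            # exclude the (zero) self-coupling for clarity
            m[i] = sigmoid(h[i] + J[i] @ m - J[i, i] * m[i])
    return m

# toy two-variable model with a positive coupling
J = np.array([[0.0, 1.0],
              [1.0, 0.0]])
h = np.array([0.2, -0.1])
m = mean_field(J, h)
```

At a fixed point, each `m[i]` is consistent with the means of its neighbours, which is the self-consistency property the generalised cluster equations extend.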
A variational approach for approximating bayesian networks by edge deletion
 In Proceedings of the Twenty-Second Conference on Uncertainty in Artificial Intelligence (UAI’06)
Cited by 13 (4 self)
Abstract:
We consider in this paper the formulation of approximate inference in Bayesian networks as a problem of exact inference on an approximate network that results from deleting edges (to reduce treewidth). We have shown in earlier work that deleting edges calls for introducing auxiliary network parameters to compensate for lost dependencies, and proposed intuitive conditions for determining these parameters. We have also shown that our earlier method corresponds to Iterative Belief Propagation (IBP) when enough edges are deleted to yield a polytree, and corresponds to some generalizations of IBP when fewer edges are deleted. In this paper, we propose a different criterion for determining auxiliary parameters based on optimizing the KL-divergence between the original and approximate networks. We discuss the relationship between the two methods for selecting parameters, shedding new light on IBP and its generalizations. We also discuss the application of our new method to approximating inference problems which are exponential in constrained treewidth, including MAP and nonmyopic value of information.
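The KL-divergence criterion mentioned above can be illustrated in miniature (this is a toy computation, not the paper's parameter-selection algorithm): given the exact joint `p` and an approximate joint `q` over the same discrete states, KL(p || q) quantifies how much is lost by the approximation.

```python
import numpy as np

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions given as probability vectors.

    Terms where p is zero contribute nothing (0 * log 0 := 0).
    """
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

p = np.array([0.5, 0.3, 0.2])  # "exact" joint over three states
q = np.array([0.4, 0.4, 0.2])  # candidate approximation
```

Minimising this quantity over the auxiliary parameters of the approximate network is the spirit of the criterion; KL(p || q) is zero exactly when the two distributions coincide.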
Positional Entropy During Pigeon Homing I: Application Of Bayesian Latent State Modelling
 J. Theor. Biol
Cited by 4 (0 self)
Abstract:
Running headline: Positional Entropy in Bird Navigation I.
Mean Field Inference in a General Probabilistic Setting
 In Proceedings of the 7th International Workshop on Artificial Intelligence and Statistics, 1999
Cited by 2 (0 self)
Abstract:
We present a systematic, model-independent formulation of mean field theory (MFT) as an inference method in probabilistic models. "Model-independent" means that we do not assume a particular type of dependency among the variables of a domain but instead work in a general probabilistic setting. In a Bayesian network, for example, you may use arbitrary tables to specify conditional dependencies and thus run MFT in any Bayesian network. Furthermore, the general mean field equations derived here shed light on the essence of MFT. MFT can be interpreted as a local iteration scheme which relaxes to a consistent state (a solution of the mean field equations). Iterating the mean field equations means propagating information through the network. In general, however, there are multiple solutions to the mean field equations. We show that improved approximations can be obtained by forming a weighted mixture of the multiple mean field solutions. Simple approximate expressions for the mixture weig...
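One natural weighting scheme for mixing several mean field solutions (a hedged sketch; the paper's own approximate weight expressions are truncated above, so the exponential-of-bound rule here is an assumption) uses each fixed point's variational lower bound L_k on the log partition function, weighting solutions by exp(L_k) and normalising.

```python
import numpy as np

def mixture_weights(bounds):
    """Normalised weights proportional to exp(L_k) for lower bounds L_k.

    Subtracting the maximum before exponentiating keeps the computation
    numerically stable for large-magnitude bounds.
    """
    b = np.asarray(bounds, float)
    w = np.exp(b - b.max())
    return w / w.sum()

# three hypothetical mean field solutions with different lower bounds on log Z
weights = mixture_weights([-10.2, -10.9, -12.0])
```

Solutions with tighter bounds dominate the mixture, but weaker solutions still contribute, which is how the mixture can improve on any single fixed point.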
Variational Bayesian learning of cooperative vector quantizer model – theory
, 2002
Cited by 2 (2 self)
Abstract:
This is the first part of a two-part report on the development of a statistical learning algorithm for a latent variable model referred to as the cooperative vector quantizer model. This part presents the theory and mathematical derivations of a variational Bayesian learning algorithm for the model. The model has general applications in the field of machine learning and signal processing. For example, it can be used to solve the problem of blind source separation or image separation. Our special interest is in its potential biological application in that we can use the model to simulate signal transduction components regulating gene expression as latent variables. The algorithm is capable of automatically and efficiently determining the number of latent variables of the model, estimating the distribution of the parameters and latent variables. Thus, we can use the model to address the following biological questions regarding gene expression regulation: (1) What are the key signal transduction components regulating gene expression in a given kind of cell; (2) How many key components are needed to efficiently encode information for gene expression regulation; (3) What are the states of the key components for a given gene expression data point. Such information will provide insight for understanding the mechanism of information organization of cells, mechanism of diseases and drug effect/toxicity.
Mean field inference in dependency networks: An empirical study
 In Proceedings of the Twenty-Fifth National Conference on Artificial Intelligence
, 2011
Cited by 2 (2 self)
Abstract:
Dependency networks are a compelling alternative to Bayesian networks for learning joint probability distributions from data and using them to compute probabilities. A dependency network consists of a set of conditional probability distributions, each representing the probability of a single variable given its Markov blanket. Running Gibbs sampling with these conditional distributions produces a joint distribution that can be used to answer queries, but suffers from the traditional slowness of sampling-based inference. In this paper, we observe that the mean field update equation can be applied to dependency networks, even though the conditional probability distributions may be inconsistent with each other. In experiments with learning and inference on 12 datasets, we demonstrate that mean field inference in dependency networks offers similar accuracy to Gibbs sampling but with orders of magnitude improvements in speed. Compared to Bayesian networks learned on the same data, dependency networks offer higher accuracy at greater amounts of evidence. Furthermore, mean field inference is consistently more accurate in dependency networks than in Bayesian networks learned on the same data.
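A mean field sweep built directly from a dependency network's conditionals can be sketched as follows. This is a minimal illustration under assumed interfaces (the `conditional(x_i, mb)` functions, the binary-variable restriction, and the update rule q_i(x_i) ∝ exp(E_q[log P(x_i | MB_i)]) are all assumptions, not the paper's code):

```python
from itertools import product

import numpy as np

def mean_field_step(q, conditionals, blankets):
    """One sweep of mean field updates for binary variables.

    q:            array of current marginals q_i(X_i = 1).
    conditionals: list of functions cond_i(x_i, mb) -> P(X_i = x_i | blanket = mb).
    blankets:     blankets[i] is the tuple of variable indices in i's Markov blanket.
    """
    q = q.copy()
    for i, cond in enumerate(conditionals):
        log_q = np.zeros(2)
        # expectation of log-conditional under current blanket marginals
        for mb in product([0, 1], repeat=len(blankets[i])):
            w = np.prod([q[j] if v else 1 - q[j]
                         for j, v in zip(blankets[i], mb)])
            for xi in (0, 1):
                log_q[xi] += w * np.log(cond(xi, mb))
        p = np.exp(log_q - log_q.max())  # normalise stably
        q[i] = p[1] / p.sum()
    return q

# toy two-variable network: each variable prefers to agree with the other
def agree(x, mb):
    return 0.7 if x == mb[0] else 0.3

q_new = mean_field_step(np.array([0.5, 0.5]), [agree, agree], [(1,), (0,)])
```

Note that the sweep never asks whether the conditionals are consistent with a single joint distribution, which is exactly why the update remains applicable to dependency networks.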
On Similarities between Inference in Game Theory and Machine Learning
Cited by 2 (0 self)
Abstract:
In this paper, we elucidate the equivalence between inference in game theory and machine learning. Our aim in so doing is to establish an equivalent vocabulary between the two domains so as to facilitate developments at the intersection of both fields, and as proof of the usefulness of this approach, we use recent developments in each field to make useful improvements to the other. More specifically, we consider the analogies between smooth best responses in fictitious play and Bayesian inference methods. Initially, we use these insights to develop and demonstrate an improved algorithm for learning in games based on probabilistic moderation. That is, by integrating over the distribution of opponent strategies (a Bayesian approach within machine learning) rather than taking a simple empirical average (the approach used in standard fictitious play) we derive a novel moderated fictitious play algorithm and show that it is more likely than standard fictitious play to converge to a payoff-dominant but risk-dominated Nash equilibrium in a simple coordination game. Furthermore we consider the converse case, and show how insights from game theory can be used to derive two improved mean field variational learning algorithms.
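The smooth best response in standard fictitious play, against which the moderated variant above is compared, can be sketched as a softmax ("logit") response to expected payoffs under the opponent's empirical action frequencies. This is an illustrative sketch of the standard construction, not the paper's moderated algorithm; the payoff matrix and temperature `beta` are made-up examples.

```python
import numpy as np

def smooth_best_response(payoff, opponent_freq, beta=5.0):
    """Mixed strategy: softmax over expected payoffs.

    payoff[a, b]   = payoff for playing action a when the opponent plays b.
    opponent_freq  = empirical distribution over the opponent's actions.
    beta           = inverse temperature; large beta approaches the exact best response.
    """
    expected = payoff @ opponent_freq
    z = np.exp(beta * (expected - expected.max()))  # stable softmax
    return z / z.sum()

# simple 2x2 coordination game: both players want to match actions
payoff = np.array([[2.0, 0.0],
                   [0.0, 1.0]])
strategy = smooth_best_response(payoff, np.array([0.9, 0.1]))
```

Replacing the point estimate `opponent_freq` with an integral over a posterior on opponent strategies is the Bayesian step that yields the moderated variant described above.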
Mean-field methods for a special class of Belief Networks
 Journal of Artificial Intelligence
, 2001
Cited by 2 (0 self)
Abstract:
The chief aim of this paper is to propose mean-field approximations for a broad class of belief networks, of which sigmoid and noisy-or networks can be seen as special cases. The approximations are based on a powerful mean-field theory suggested by Plefka. We show that Saul, Jaakkola, and Jordan's approach is the first-order approximation in Plefka's approach, via a variational derivation. The application of Plefka's theory to belief networks is not computationally tractable. To tackle this problem we propose new approximations based on Taylor series. Small scale experiments show that the proposed schemes are attractive.
Ensemble Coupled Hidden Markov Models for Joint Characterisation of Dynamic Signals
 In Ninth International Workshop on Artificial Intelligence and Statistics
, 2002
Cited by 1 (1 self)
Abstract:
How does one model data with the aid of labels, when the labels themselves are noisy, unreliable and have their own dynamics? How does one measure interactions between variables that are so different in their nature that a direct comparison using, say, cross-correlations, is meaningless? In this paper these problems are approached using Coupled Hidden Markov Models which are estimated in the Variational Bayesian framework. Signals can be diverse since each chain has its own observation model. Signals can have their own dynamics and may temporally lag or lead one another by allowing linking edges in the network topology to be estimated and chosen according to the most probable posterior model. Integrated feature extraction and modelling is accomplished by providing the Markov models with linear observation models. We derive Coupled Hidden Markov Model estimators, apply and compare them with sampling-based approaches found in the literature.
Ensemble Hidden Markov Models for Biosignal Analysis
 In 14th International Conference on Digital Signal Processing
, 2002
Cited by 1 (0 self)
Abstract:
Variational learning theory allows the estimation of posterior probability distributions of model parameters, rather than the parameters themselves. We demonstrate the use of variational learning methods on Hidden Markov Models with different observation models and apply the HMMs to a range of biomedical signals, such as EEG, periodic breathing and RR-interval series.