A Guide to the Literature on Learning Probabilistic Networks From Data
, 1996
Cited by 172 (0 self)
This literature review discusses different methods under the general rubric of learning Bayesian networks from data, and includes some overlapping work on more general probabilistic networks. Connections are drawn between the statistical, neural network, and uncertainty communities, and between the different methodological communities, such as Bayesian, description length, and classical statistics. Basic concepts for learning and Bayesian networks are introduced and methods are then reviewed. Methods are discussed for learning parameters of a probabilistic network, for learning the structure, and for learning hidden variables. The presentation avoids formal definitions and theorems, as these are plentiful in the literature, and instead illustrates key concepts with simplified examples. Keywords Bayesian networks, graphical models, hidden variables, learning, learning structure, probabilistic networks, knowledge discovery. I. Introduction Probabilistic networks or probabilistic gra...
Inferring Parameters and Structure of Latent Variable Models by Variational Bayes
, 1999
Cited by 136 (1 self)
Current methods for learning graphical models with latent variables and a fixed structure estimate optimal values for the model parameters. Whereas this approach usually produces overfitting and suboptimal generalization performance, carrying out the Bayesian program of computing the full posterior distributions over the parameters remains a difficult problem. Moreover, learning the structure of models with latent variables, for which the Bayesian approach is crucial, is yet a harder problem. In this paper I present the Variational Bayes framework, which provides a solution to these problems. This approach approximates full posterior distributions over model parameters and structures, as well as latent variables, in an analytical manner without resorting to sampling methods. Unlike in the Laplace approximation, these posteriors are generally non-Gaussian and no Hessian needs to be computed. The resulting algorithm generalizes the standard Expectation Maximization a...
Ensemble Learning For Independent Component Analysis
, 1999
Cited by 45 (4 self)
In this paper, a recently developed Bayesian method called ensemble learning is applied to independent component analysis (ICA). Ensemble learning is a computationally efficient approximation for exact Bayesian analysis. In general, the posterior probability density function (pdf) is a complex high-dimensional function whose exact treatment is difficult. In ensemble learning, the posterior pdf is approximated by a simpler function, and Kullback-Leibler information is used as the criterion for minimising the misfit between the actual posterior pdf and its parametric approximation. In this paper, the posterior pdf is approximated by a diagonal Gaussian pdf. According to the ICA model used in this paper, the measurements are generated by a linear mapping from mutually independent source signals whose distributions are mixtures of Gaussians. The measurements are also assumed to have additive Gaussian noise with diagonal covariance. The model structure and all parameters of the distribution...
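To make the misfit criterion in the abstract above concrete, here is a minimal sketch of the Kullback-Leibler divergence from a diagonal Gaussian approximation q to a target density p. It assumes, purely for illustration, that the target is itself a diagonal Gaussian so that a closed form exists; the true posterior in the paper is generally more complex. The function name `kl_diag_gaussians` and its parameters are hypothetical and not taken from the paper.

```python
import math

def kl_diag_gaussians(mean_q, std_q, mean_p, std_p):
    """KL(q || p) for two diagonal Gaussians.

    Assumption for illustration only: the target p is also a diagonal
    Gaussian, so the divergence is a sum of per-dimension closed forms.
    """
    total = 0.0
    for mq, sq, mp, sp in zip(mean_q, std_q, mean_p, std_p):
        # Per-dimension KL between N(mq, sq^2) and N(mp, sp^2).
        total += math.log(sp / sq) + (sq ** 2 + (mq - mp) ** 2) / (2.0 * sp ** 2) - 0.5
    return total

if __name__ == "__main__":
    # The misfit is zero when q matches p and grows as q moves away.
    print(kl_diag_gaussians([0.0, 0.0], [1.0, 1.0], [0.0, 0.0], [1.0, 1.0]))
    print(kl_diag_gaussians([0.0, 0.0], [1.0, 1.0], [1.0, 0.0], [1.0, 1.0]))
```

Because the approximation is diagonal, the divergence decomposes into independent per-dimension terms, which is one reason this family of approximations is cheap to evaluate.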
Computing Upper and Lower Bounds on Likelihoods in Intractable Networks
, 1996
Cited by 41 (10 self)
We present techniques for computing upper and lower bounds on the likelihoods of partial instantiations of variables in sigmoid and noisy-OR networks. The bounds determine confidence intervals for the desired likelihoods and become useful when the size of the network (or clique size) precludes exact computations.
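As a rough illustration of why exact computation becomes costly, the sketch below uses the standard noisy-OR gate and computes the marginal probability of one child by brute-force enumeration of its parents. It does not implement the bounding techniques of the paper. The names `noisy_or` and `exact_child_marginal`, the leak parameter, and the assumption of independent binary parents are all illustrative choices.

```python
from itertools import product

def noisy_or(parent_states, activation_probs, leak=0.0):
    """P(child = 1 | parents) under a noisy-OR gate.

    Each active parent i independently turns the child on with
    probability activation_probs[i]; `leak` is a background cause.
    """
    prob_off = 1.0 - leak
    for state, q in zip(parent_states, activation_probs):
        if state:
            prob_off *= 1.0 - q
    return 1.0 - prob_off

def exact_child_marginal(parent_priors, activation_probs, leak=0.0):
    """P(child = 1) by summing over all 2**n parent configurations.

    Assumes independent binary parents with P(parent_i = 1) = parent_priors[i].
    """
    total = 0.0
    for states in product((0, 1), repeat=len(parent_priors)):
        weight = 1.0
        for state, prior in zip(states, parent_priors):
            weight *= prior if state else 1.0 - prior
        total += weight * noisy_or(states, activation_probs, leak)
    return total

if __name__ == "__main__":
    print(exact_child_marginal([0.2, 0.5, 0.1], [0.9, 0.3, 0.6], leak=0.05))
```

The enumeration visits 2**n configurations for n parents. For this single-child case a factorized closed form exists, but in larger networks with many observed findings no such shortcut is available in general, which is the setting where bounds on the likelihood become attractive.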
Learning linear, sparse, factorial codes
 A.I. Memo 1580, MIT Artificial Intelligence Lab
, 1996
Cited by 24 (1 self)
In previous work (Olshausen & Field 1996), an algorithm was described for learning linear sparse codes which, when trained on natural images, produces a set of basis functions that are spatially localized, oriented, and bandpass (i.e., wavelet-like). This note shows how the algorithm may be interpreted within a maximum-likelihood framework. Several useful insights emerge from this connection: it makes explicit the relation to statistical independence (i.e., factorial coding), it shows a formal relationship to the algorithm of Bell and Sejnowski (1995), and it suggests how to adapt parameters that were previously fixed.
Learning Generative Models with the Up-Propagation Algorithm
, 1998
Cited by 6 (1 self)
Up-propagation is an algorithm for inverting and learning neural network generative models. Sensory input is processed by inverting a model that generates patterns from hidden variables using top-down connections. The inversion process is iterative, utilizing a negative feedback loop that depends on an error signal propagated by bottom-up connections. The error signal is also used to learn the generative model from examples. The algorithm is benchmarked against principal component analysis in experiments on images of handwritten digits. In his doctrine of unconscious inference, Helmholtz argued that perceptions are formed by the interaction of bottom-up sensory data with top-down expectations. According to one interpretation of this doctrine, perception is a procedure of sequential hypothesis testing. We propose a new algorithm, called up-propagation, that realizes this interpretation in layered neural networks. It uses top-down connections to generate hypotheses, and bottom-up connect...
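The iterative inversion described above can be sketched in a heavily simplified form. The example below assumes a single linear layer, which is not the layered nonlinear network of the paper: a top-down pass generates a prediction from the hidden variables, and the prediction error is fed back through the transposed weights to correct them. The function name `invert_linear_generator`, the step size, and the iteration count are hypothetical.

```python
def invert_linear_generator(weights, observed, step=0.1, num_iters=500):
    """Infer hidden variables for a linear generative model x ~ W h.

    Simplifying assumption: one linear layer, with weights[i][j] the
    top-down weight from hidden unit j to visible unit i.
    """
    num_visible = len(weights)
    num_hidden = len(weights[0])
    hidden = [0.0] * num_hidden
    for _ in range(num_iters):
        # Top-down pass: generate a prediction of the input.
        predicted = [sum(w * h for w, h in zip(row, hidden)) for row in weights]
        # Error signal: difference between the input and the prediction.
        error = [x - p for x, p in zip(observed, predicted)]
        # Bottom-up pass: feed the error back through the transposed weights.
        for j in range(num_hidden):
            hidden[j] += step * sum(weights[i][j] * error[i] for i in range(num_visible))
    return hidden

if __name__ == "__main__":
    weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    observed = [2.0, -1.0, 1.0]  # generated from hidden values [2, -1]
    print(invert_linear_generator(weights, observed))
```

In this linear setting the loop is plain gradient descent on the squared reconstruction error, so it converges only when the step size is small relative to the scale of the weights.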
Recognition in hierarchical models
 Foundations of Computational Mathematics
, 1997
Cited by 3 (1 self)
Various proposals have recently been made which cast cortical processing in terms of hierarchical statistical generative models
Towards the Understanding of Information Dynamics in Large Scale Networked Systems
Cited by 1 (0 self)
Large networks of human and machine systems have staggeringly complex properties which make them difficult to analyze. This resistance to characterization is due to the fact that the number of possible interactions between the system nodes is exponential in the number of nodes. This combinatorial complexity makes such systems resistant to both formal analysis and empirical exploration. The goal of this work is to analyze a particular complex system, a system of agents that fuse information to develop shared beliefs. Primarily, we seek to understand the detrimental emergent effects that might result in convergence to an incorrect belief, due to the large-scale interactions of individuals. We achieve this through a two-stage approach that combines the formal analysis of an abstracted version of the system with direct simulation of the system itself. The analysis of the abstraction gives us a qualitative description of the system state space, which can be used to guide and limit the parameter ranges over which the empirical evaluation is conducted. The particular abstraction that we use is a mean field description of the system. Specifically, we assume that the influence of the remainder of the system on an individual can be replaced, for analysis purposes, with a system-wide average influence. In our information propagation and fusion model, the team is connected via a network, with some team members having access to sensors and others relying solely on neighbors in the network to inform their beliefs. Each agent uses Bayesian reasoning to maintain a belief about a single fact which can be true, false or unknown. Through our analysis, we found that for certain parameter values the system can, due to emergent effects, converge to an incorrect belief despite primarily accurate information.
Approximating Methods for
This thesis investigates various methods for carrying out approximate inference in intractable probabilistic models. By capturing the relationships between random variables, the framework of graphical models hints at which sets of random variables pose a problem to the inferential step. The approximating techniques used in this thesis originate from the field of statistical physics, which for decades has been facing the same type of intractable computations when analyzing large systems of interacting variables, e.g. magnetic spin systems. In general, these approximating techniques are known as mean field methods.
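For readers unfamiliar with the physics origin mentioned above, the following sketch shows the textbook mean field treatment of a ferromagnetic spin system. Each spin is assumed to feel only the average magnetization of its neighbors, which gives the self-consistency equation m = tanh(beta * (J * z * m + h)). This is the standard textbook case, not a method taken from the thesis, and the function and parameter names are illustrative.

```python
import math

def mean_field_magnetization(beta, coupling, num_neighbors, field=0.0,
                             m0=0.5, tol=1e-10, max_iter=10000):
    """Solve m = tanh(beta * (coupling * num_neighbors * m + field)).

    Fixed-point iteration starting from m0. Each spin interacts only
    with the average magnetization m of its neighbors, instead of
    with their individual states.
    """
    m = m0
    for _ in range(max_iter):
        m_new = math.tanh(beta * (coupling * num_neighbors * m + field))
        if abs(m_new - m) < tol:
            return m_new
        m = m_new
    return m

if __name__ == "__main__":
    # High temperature (small beta): magnetization decays to zero.
    print(mean_field_magnetization(beta=0.1, coupling=1.0, num_neighbors=4))
    # Low temperature (large beta): a nonzero magnetization survives.
    print(mean_field_magnetization(beta=1.0, coupling=1.0, num_neighbors=4))
```

Replacing the neighbors by their average reduces a sum over exponentially many joint spin configurations to one scalar equation, at the price of ignoring correlations between spins.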