Results 1 - 10
of
12
Continuous Sigmoidal Belief Networks Trained Using Slice Sampling
- Advances in Neural Information Processing Systems 9
"... Real-valued random hidden variables can be useful for modelling latent structure that explains correlations among observed variables. I propose a simple unit that adds zero-mean Gaussian noise to its input before passing it through a sigmoidal squashing function. Such units can produce a variety ..."
Abstract
-
Cited by 9 (2 self)
- Add to MetaCart
Real-valued random hidden variables can be useful for modelling latent structure that explains correlations among observed variables. I propose a simple unit that adds zero-mean Gaussian noise to its input before passing it through a sigmoidal squashing function. Such units can produce a variety of useful behaviors, ranging from deterministic to binary stochastic to continuous stochastic. I show how "slice sampling" can be used for inference and learning in top-down networks of these units and demonstrate learning on two simple problems. 1 Introduction A variety of unsupervised connectionist models containing discrete-valued hidden units have been developed. These include Boltzmann machines (Hinton and Sejnowski 1986), binary sigmoidal belief networks (Neal 1992) and Helmholtz machines (Hinton et al. 1995; Dayan et al. 1995). However, some hidden variables, such as translation or scaling in images of shapes, are best represented using continuous values. Continuous-valued Bolt...
Mix-nets: Factored Mixtures of Gaussians in Bayesian Networks with Mixed Continuous And Discrete Variables
, 2000
"... Recently developed techniques have made it possible to quickly learn accurate probability density functions from data in low-dimensional continuous spaces. In particular, mixtures of Gaussians can be fitted to data very quickly using an accelerated EM algorithm that employs multiresolution kd-trees ..."
Abstract
-
Cited by 7 (2 self)
- Add to MetaCart
Recently developed techniques have made it possible to quickly learn accurate probability density functions from data in low-dimensional continuous spaces. In particular, mixtures of Gaussians can be fitted to data very quickly using an accelerated EM algorithm that employs multiresolution kd-trees (Moore, 1999). In this paper, we propose a kind of Bayesian network in which low-dimensional mixtures of Gaussians over different subsets of the domain’s variables are combined into a coherent joint probability model over the entire domain. The network is also capable of modeling complex dependencies between discrete variables and continuous variables without requiring discretization of the continuous variables. We present efficient heuristic algorithms for automatically learning these networks from data, and perform comparative experiments illustrating how well these networks model real scientific data and synthetic data. We also briefly discuss some possible improvements to the networks, as well as possible applications.
Representing Probabilistic Rules with Networks of Gaussian Basis Functions
- MACHINE LEARNING
, 1995
"... There is great interest in understanding the intrinsic knowledge neural networks have acquired during training. Most work in this direction is focussed on the multi-layer perceptron architecture. The topic of this paper is networks of Gaussian basis functions which are used extensively as learning s ..."
Abstract
-
Cited by 5 (0 self)
- Add to MetaCart
There is great interest in understanding the intrinsic knowledge neural networks have acquired during training. Most work in this direction is focussed on the multi-layer perceptron architecture. The topic of this paper is networks of Gaussian basis functions which are used extensively as learning systems in neural computation. We show that networks of Gaussian basis functions can be generated from simple probabilistic rules. Also, if appropriate learning rules are used, probabilistic rules can be extracted from trained networks. We present methods for the reduction of network complexity with the goal of obtaining concise and meaningful rules. We show how prior knowledge can be refined or supplemented using data by employing either a Bayesian approach, by a weighted combination of knowledge bases, or by generating artificial training data representing the prior knowledge. We validate our approach using a standard statistical data set.
Learning Bayesian belief networks with neural network estimators
- In Neural Information Processing Systems 9
, 1997
"... In this paper we propose a method for learning Bayesian belief networks from data. The method uses artificial neural networks as probability estimators, thus avoiding the need for making prior assumptions on the nature of the probability distributions governing the relationships among the participat ..."
Abstract
-
Cited by 5 (2 self)
- Add to MetaCart
In this paper we propose a method for learning Bayesian belief networks from data. The method uses artificial neural networks as probability estimators, thus avoiding the need for making prior assumptions on the nature of the probability distributions governing the relationships among the participating variables. This new method has the potential for being applied to domains containing both discrete and continuous variables arbitrarily distributed. We compare the learning performance of this new method with the performance of the method proposed by Cooper and Herskovits in [10]. The experimental results show that, although the learning scheme based on the use of ANN estimators is slower, the learning accuracy of the two methods is comparable. y To appear in Advances in Neural Information Processing Systems, 1996. 1 Introduction Bayesian belief networks (BBN), often referred to as probabilistic networks, are a powerful formalism for representing and reasoning under uncertainty. This...
Gene Expression Data Analysis and Modeling
- Pacific Symposium on Biocomputing ’99 (PSB’99
, 1999
"... computational models may serve as a test bed for the development of these inference techniques. Only in such models can the dynamic behavior of many elements (trajectories, attractors) be unequivocally linked to a selected network architecture (wiring and rules). Beyond cluster analysis lies the mor ..."
Abstract
-
Cited by 3 (0 self)
- Add to MetaCart
computational models may serve as a test bed for the development of these inference techniques. Only in such models can the dynamic behavior of many elements (trajectories, attractors) be unequivocally linked to a selected network architecture (wiring and rules). Beyond cluster analysis lies the more ambitious realm of genetic network inference: complete reverse D'HAESELEER, LIANG, SOMOGYI PSB99 TUTORIAL: GENE EXPRESSION ANALYSIS AND MODELING 3 engineering of the underlying regulatory interactions from the expression data, either using idealized models such as Boolean networks, or using more realistic continuous models. 1. Introduction to principles of network behavior in the Boolean network model 1.1. Multigenic & pleiotropic regulation: the basis of genetic networks From a strictly reductionist viewpoint, we may begin by asking "which gene underlies this disease?" or "with which molecule does this protein interact?". The resulting investigations have shown us that more often tha...
Fast Factored Density Estimation and Compression with Bayesian Networks
, 2002
"... my family-- especially my father, Donald. iv Abstract Many important data analysis tasks can be addressed by formulating them as probability estimation problems. For example, a popular general approach to automatic classification problems is to learn a probabilistic model of each class from data in ..."
Abstract
-
Cited by 3 (1 self)
- Add to MetaCart
my family-- especially my father, Donald. iv Abstract Many important data analysis tasks can be addressed by formulating them as probability estimation problems. For example, a popular general approach to automatic classification problems is to learn a probabilistic model of each class from data in which the classes are known, and then use Bayes's rule with these models to predict the correct classes of other data for which they are not known. Anomaly detection and scientific discovery tasks can often be addressed by learning probability models over possible events and then looking for events to which these models assign low probabilities. Many data compression algorithms such as Huffman coding and arithmetic coding rely on probabilistic models of the data stream in order achieve high compression rates.
Multi-View 3-D Object Description with Uncertain Reasoning and Machine Learning
, 2001
"... xi Chapter 1. ..."
Variational inference for continuous sigmoidal Bayesian networks
- In Sixth International Workshop on Artificial Intelligence and Statistics
, 1996
"... Latent random variables can be useful for modelling covariance relationships between observed variables. The choice of whether specific latent variables ought to be continuous or discrete is often an arbitrary one. In a previous paper, I presented a "unit" that could adapt to be continuous or binary ..."
Abstract
-
Cited by 2 (2 self)
- Add to MetaCart
Latent random variables can be useful for modelling covariance relationships between observed variables. The choice of whether specific latent variables ought to be continuous or discrete is often an arbitrary one. In a previous paper, I presented a "unit" that could adapt to be continuous or binary, as appropriate for the current problem, and showed how a Markov chain Monte Carlo method could be used for inference and parameter estimation in Bayesian networks of these units. In this paper, I develop a variational inference technique in the hope that it will prove to be more computationally efficient than Monte Carlo methods. After presenting promising inference results on a toy problem, I discuss why the variational technique does not work well for parameter estimation as compared to Monte Carlo.
Inference in Markov Blanket Networks
, 2000
"... Bayesian networks have been successfully used to model joint probabilities in many cases. When dealing with continuous variables and nonlinear relationships neural networks can be used to model conditional densities as part of a Bayesian network. However, doing inference can then be computational ..."
Abstract
-
Cited by 2 (0 self)
- Add to MetaCart
Bayesian networks have been successfully used to model joint probabilities in many cases. When dealing with continuous variables and nonlinear relationships neural networks can be used to model conditional densities as part of a Bayesian network. However, doing inference can then be computationally expensive. Also, information is implicitly passed backwards through neural networks, i.e. from their output to the input. Used in this "inverse" mode neural networks often perform suboptimal. We suggest a different type of model called Markov blanket model (MBM). Here the neural networks are used in the forward direction only. This gives advantages in speed and guarantees to match the performance of the underlying neural network on complete data. 1 Introduction Bayes nets (e.g. Heckerman (1995)) are models of the joint probability distribution of a set of variables fx i g N i=1 of the form p(x) = N Y i=1 p(x i jP i ): (1) where P i ` fx 1 ; : : : ; x i\Gamma1 g are the par...
Interpolating conditional density trees
- A. Darwiche, N. Friedman (Eds.), Uncertainty in Artificial Intelligence
, 2002
"... Joint distributions over many variables are frequently modeled by decomposing them into products of simpler, lower-dimensional conditional distributions, such as in sparsely connected Bayesian networks. However, automatically learning such models can be very computationally expensive when there are ..."
Abstract
-
Cited by 2 (0 self)
- Add to MetaCart
Joint distributions over many variables are frequently modeled by decomposing them into products of simpler, lower-dimensional conditional distributions, such as in sparsely connected Bayesian networks. However, automatically learning such models can be very computationally expensive when there are many datapoints and many continuous variables with complex nonlinear relationships, particularly when no good ways of decomposing the joint distribution are known a priori. In such situations, previous research has generally focused on the use of discretization techniques in which each continuous variable has a single discretization that is used throughout the entire network. In this paper, we present and compare a wide variety of tree-based algorithms for learning and evaluating conditional density estimates over continuous variables. These trees can be thought of as discretizations that vary according to the particular interactions being modeled; however, the density within a given leaf of the tree need not be assumed constant, and we show that such nonuniform leaf densities lead to more accurate density estimation. We have developed Bayesian network structure-learning algorithms that employ these tree-based conditional density representations, and we show that they can be used to practically learn complex joint probability models over dozens of continuous variables from thousands of datapoints. We focus on nding models that are simultaneously accurate, fast to learn, and fast to evaluate once they are learned.

