Results 11 - 20 of 110
Local Learning in Probabilistic Networks With Hidden Variables
, 1995
Cited by 68 (4 self)
Probabilistic networks, which provide compact descriptions of complex stochastic relationships among several random variables, are rapidly becoming the tool of choice for uncertain reasoning in artificial intelligence. We show that networks with fixed structure containing hidden variables can be learned automatically from data using a gradient-descent mechanism similar to that used in neural networks. We also extend the method to networks with intensionally represented distributions, including networks with continuous variables and dynamic probabilistic networks. Because probabilistic networks provide explicit representations of causal structure, human experts can easily contribute prior knowledge to the training process, thereby significantly improving the learning rate. Adaptive probabilistic networks (APNs) may soon compete directly with neural networks as models in computational neuroscience as well as in industrial and financial applications. 1 Introduction Intelligent systems, ...
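As a rough illustration of the gradient-based learning described in the abstract above, the sketch below computes the partial derivatives of the log-likelihood with respect to conditional probability table (CPT) entries in a tiny network with one hidden root H and observed children. The network shape, the names `record_likelihood`, `log_likelihood`, and `cpt_gradient`, and the list-based CPT layout are all assumptions made for this example; the paper's own algorithm and notation may differ. Each derivative is the posterior probability of the hidden state given the record, divided by the CPT entry, summed over records.

```python
import math

def record_likelihood(prior, cpts, record):
    """P(record) = sum_h P(h) * prod_i P(x_i | h), with H hidden."""
    return sum(
        prior[h] * math.prod(cpts[i][h][x] for i, x in enumerate(record))
        for h in range(len(prior))
    )

def log_likelihood(prior, cpts, data):
    """Log-likelihood of independent records under the hypothetical network."""
    return sum(math.log(record_likelihood(prior, cpts, r)) for r in data)

def cpt_gradient(prior, cpts, data):
    """Unconstrained partial derivatives of the log-likelihood
    with respect to each entry cpts[i][h][x] = P(X_i = x | H = h)."""
    grad = [[[0.0 for _ in row] for row in cpt] for cpt in cpts]
    for record in data:
        p_record = record_likelihood(prior, cpts, record)
        for h in range(len(prior)):
            joint = prior[h] * math.prod(
                cpts[i][h][x] for i, x in enumerate(record)
            )
            posterior_h = joint / p_record  # P(H = h | record)
            for i, x in enumerate(record):
                # Only the entry matching the observed value contributes.
                grad[i][h][x] += posterior_h / cpts[i][h][x]
    return grad
```

A gradient step would then move each CPT entry along `grad` and renormalize each row so that it remains a probability distribution; that projection step is omitted here.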
Ancestral Graph Markov Models
, 2002
Cited by 58 (16 self)
This paper introduces a class of graphical independence models that is closed under marginalization and conditioning but that contains all DAG independence models. This class of graphs, called maximal ancestral graphs, has two attractive features: there is at most one edge between each pair of vertices; every missing edge corresponds to an independence relation. These features lead to a simple parameterization of the corresponding set of distributions in the Gaussian case.
Stable Local Computation with Conditional Gaussian Distributions
- Statistics and Computing
, 1999
Cited by 48 (0 self)
This article describes a propagation scheme for Bayesian networks with conditional Gaussian distributions that does not have the numerical weaknesses of the scheme derived in Lauritzen (1992). The propagation architecture is that of Lauritzen and Spiegelhalter (1988). In addition to the means and variances provided by the previous algorithm, the new propagation scheme yields full local marginal distributions. The new scheme also handles linear deterministic relationships between continuous variables in the network specification. The new propagation scheme is in many ways faster and simpler than previous schemes and the method has been implemented in the most recent version of the HUGIN software. Key words: Artificial intelligence, Bayesian networks, CG distributions, Gaussian mixtures, probabilistic expert systems, propagation of evidence. 1 Introduction Bayesian networks have developed into an important tool for building systems for decision support in environments characterized by...
Nonuniform Dynamic Discretization in Hybrid Networks
- In Proc. UAI
, 1997
Cited by 47 (3 self)
We consider probabilistic inference in general hybrid networks, which include continuous and discrete variables in an arbitrary topology. We reexamine the question of variable discretization in a hybrid network aiming at minimizing the information loss induced by the discretization. We show that a nonuniform partition across all variables as opposed to uniform partition of each variable separately reduces the size of the data structures needed to represent a continuous function. We also provide a simple but efficient procedure for nonuniform partition. To represent a nonuniform discretization in the computer memory, we introduce a new data structure, which we call a Binary Split Partition (BSP) tree. We show that BSP trees can be an exponential factor smaller than the data structures in the standard uniform discretization in multiple dimensions and show how the BSP trees can be used in the standard join tree algorithm. We show that the accuracy of the inference process can be significa...
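To make the idea of a tree-structured nonuniform partition more concrete, here is a minimal sketch of a binary split tree that stores a piecewise-constant function over several continuous variables. The class names `Leaf` and `Split`, the fields, and the helpers `lookup` and `count_leaves` are hypothetical and chosen for illustration only; the abstract does not specify the layout of the BSP tree or the splitting procedure, so none of this should be read as the authors' data structure.

```python
from dataclasses import dataclass
from typing import Sequence, Union

@dataclass
class Leaf:
    value: float  # constant approximation of the function on this cell

@dataclass
class Split:
    dim: int          # index of the variable this node splits on
    threshold: float  # points with point[dim] < threshold go left
    left: "Node"
    right: "Node"

Node = Union[Leaf, Split]

def lookup(node: Node, point: Sequence[float]) -> float:
    """Return the stored value of the cell that contains the point."""
    while isinstance(node, Split):
        node = node.left if point[node.dim] < node.threshold else node.right
    return node.value

def count_leaves(node: Node) -> int:
    """Number of cells in the partition represented by the tree."""
    if isinstance(node, Leaf):
        return 1
    return count_leaves(node.left) + count_leaves(node.right)
```

Because a split refines only the region below it, cells can be small where the function changes quickly and large elsewhere, whereas a uniform grid must use the finest resolution everywhere.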
Stratified Exponential Families: Graphical Models and Model Selection
- Annals of Statistics
, 1998
Cited by 41 (3 self)
We provide a classification of graphical models according to their representation as exponential families. Undirected graphical models with no hidden variables are linear exponential families (LEFs), directed acyclic graphical (DAG) models and chain graphs with no hidden variables, including DAG models with several families of local distributions, are curved exponential families (CEFs) and graphical models with hidden variables are stratified exponential families (SEFs). A SEF is a finite union of CEFs of various dimensions satisfying some regularity conditions. The main results of this paper are that graphical models are SEFs and that many graphical models are not CEFs. That is, roughly speaking, graphical models when viewed as exponential families correspond to a set of smooth manifolds of various dimensions and usually not to a single smooth manifold. These results are discussed in the context of model selection. Keywords : Bayesian networks, graphical models, hidden variables, cur...
A variational approximation for Bayesian networks with discrete and continuous latent variables
- In UAI
, 1999
Cited by 39 (6 self)
We show how to use a variational approximation to the logistic function to perform approximate inference in Bayesian networks containing discrete nodes with continuous parents. Essentially, we convert the logistic function to a Gaussian, which facilitates exact inference, and then iteratively adjust the variational parameters to improve the quality of the approximation. We demonstrate experimentally that this approximation is much faster than sampling, but comparable in accuracy. We also introduce a simple new technique for handling evidence, which allows us to handle arbitrary distributions on observed nodes, as well as achieving a significant speedup in networks with discrete variables of large cardinality.
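One widely used variational approximation of this kind is the quadratic lower bound on the logistic function due to Jaakkola and Jordan; whether the paper uses exactly this form is an assumption here, and the function names below are illustrative. The bound is the exponential of a quadratic in x, which is what makes it Gaussian-shaped, and it touches the logistic function at x = ±xi, where xi is the variational parameter.

```python
import math

def sigmoid(x):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-x))

def lam(xi):
    """lambda(xi) = tanh(xi / 2) / (4 * xi), with limit 1/8 as xi -> 0."""
    if abs(xi) < 1e-8:
        return 0.125
    return math.tanh(xi / 2.0) / (4.0 * xi)

def logistic_lower_bound(x, xi):
    """Quadratic-exponent lower bound on sigmoid(x), tight at x = +/- xi."""
    return sigmoid(xi) * math.exp(
        (x - xi) / 2.0 - lam(xi) * (x * x - xi * xi)
    )
```

Adjusting xi moves the points where the bound is tight, which is the sense in which the variational parameter can be tuned iteratively to improve the approximation.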
An Alternative Markov Property for Chain Graphs
- Scand. J. Statist
, 1996
Cited by 36 (4 self)
Graphical Markov models use graphs, either undirected, directed, or mixed, to represent possible dependences among statistical variables. Applications of undirected graphs (UDGs) include models for spatial dependence and image analysis, while acyclic directed graphs (ADGs), which are especially convenient for statistical analysis, arise in such fields as genetics and psychometrics and as models for expert systems and Bayesian belief networks. Lauritzen, Wermuth, and Frydenberg (LWF) introduced a Markov property for chain graphs, which are mixed graphs that can be used to represent simultaneously both causal and associative dependencies and which include both UDGs and ADGs as special cases. In this paper an alternative Markov property (AMP) for chain graphs is introduced, which in some ways is a more direct extension of the ADG Markov property than is the LWF property for chain graphs.
Chain Graph Models and their Causal Interpretations
- B
, 2001
Cited by 32 (4 self)
Chain graphs are a natural generalization of directed acyclic graphs (DAGs) and undirected graphs. However, the apparent simplicity of chain graphs belies the subtlety of the conditional independence hypotheses that they represent. There are a number of simple and apparently plausible, but ultimately fallacious interpretations of chain graphs that are often invoked, implicitly or explicitly. These interpretations also lead to flawed methods for applying background knowledge to model selection. We present a valid interpretation by showing how the distribution corresponding to a chain graph may be generated as the equilibrium distribution of dynamic models with feedback. These dynamic interpretations lead to a simple theory of intervention, extending the theory developed for DAGs. Finally, we contrast chain graph models under this interpretation with simultaneous equation models which have traditionally been used to model feedback in econometrics. Keywords: Causal model; cha...
Chain Graphs for Learning
- In Uncertainty in Artificial Intelligence
, 1995
Cited by 24 (2 self)
Chain graphs combine directed and undirected graphs and their underlying mathematics combines properties of the two. This paper gives a simplified definition of chain graphs based on a hierarchical combination of Bayesian (directed) and Markov (undirected) networks. Examples of a chain graph are multivariate feed-forward networks, clustering with conditional interaction between variables, and forms of Bayes classifiers. Chain graphs are then extended using the notation of plates so that samples and data analysis problems can be represented in a graphical model as well. Implications for learning are discussed in the conclusion. 1 Introduction Probabilistic networks are a notational device that allows one to abstract forms of probabilistic reasoning without getting lost in the mathematical detail of the underlying equations. They offer a framework whereby many forms of probabilistic reasoning can be combined and performed on probabilistic models without careful hand programming. Efforts ...
Automated Rhythm Transcription
- In Proc. Int. Symposium on Music Inform. Retriev. (ISMIR)
, 2001
Cited by 23 (0 self)
We present a technique that, given a sequence of musical note onset times, performs simultaneous identification of the notated rhythm and the variable tempo associated with the times. Our formulation is probabilistic: We develop a stochastic model for the interconnected evolution of a rhythm process, a tempo process, and an observable process. This model allows the globally optimal identification of the most likely rhythm and tempo sequence, given the observed onset times. We demonstrate applications to a sequence of times derived from a sampled audio file and to MIDI data.

