Results 1-10 of 22
Operations for Learning with Graphical Models
 Journal of Artificial Intelligence Research
, 1994
Abstract

Cited by 247 (12 self)
This paper is a multidisciplinary review of empirical, statistical learning from a graphical model perspective. Well-known examples of graphical models include Bayesian networks, directed graphs representing a Markov chain, and undirected networks representing a Markov field. These graphical models are extended to model data analysis and empirical learning using the notation of plates. Graphical operations for simplifying and manipulating a problem are provided, including decomposition, differentiation, and the manipulation of probability models from the exponential family. Two standard algorithm schemas for learning are reviewed in a graphical framework: Gibbs sampling and the expectation-maximization algorithm. Using these operations and schemas, some popular algorithms can be synthesized from their graphical specification. This includes versions of linear regression, techniques for feed-forward networks, and learning Gaussian and discrete Bayesian networks from data. The paper conclu...
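The Gibbs sampling schema this review covers can be illustrated on a toy network. The sketch below estimates a conditional probability in the familiar four-variable "sprinkler" network by repeatedly resampling each unobserved variable from its full conditional (obtained by renormalizing the joint over that variable's two values). The network and CPT numbers are illustrative, not taken from the paper.

```python
import random

random.seed(0)

# Tiny "sprinkler" Bayesian network (illustrative CPTs):
# Cloudy -> Sprinkler, Cloudy -> Rain, (Sprinkler, Rain) -> WetGrass.
P_C = 0.5
P_S = {0: 0.5, 1: 0.1}             # P(Sprinkler=1 | Cloudy)
P_R = {0: 0.2, 1: 0.8}             # P(Rain=1 | Cloudy)
P_W = {(0, 0): 0.0, (0, 1): 0.9,   # P(WetGrass=1 | Sprinkler, Rain)
       (1, 0): 0.9, (1, 1): 0.99}

def joint(c, s, r, w):
    """Joint probability of one full assignment."""
    p = P_C if c else 1 - P_C
    p *= P_S[c] if s else 1 - P_S[c]
    p *= P_R[c] if r else 1 - P_R[c]
    p *= P_W[(s, r)] if w else 1 - P_W[(s, r)]
    return p

def gibbs(n_iters=20000, burn_in=1000, w_obs=1):
    """Estimate P(Rain=1 | WetGrass=w_obs) by Gibbs sampling."""
    c, s, r = 1, 0, 1              # arbitrary initial state
    rain_count = 0
    for it in range(n_iters):
        for var in ("c", "s", "r"):
            state = {"c": c, "s": s, "r": r}
            probs = []
            for v in (0, 1):       # unnormalized full conditional
                state[var] = v
                probs.append(joint(state["c"], state["s"], state["r"], w_obs))
            p1 = probs[1] / (probs[0] + probs[1])
            state[var] = 1 if random.random() < p1 else 0
            c, s, r = state["c"], state["s"], state["r"]
        if it >= burn_in:
            rain_count += r
    return rain_count / (n_iters - burn_in)

print(round(gibbs(), 2))  # close to the exact P(Rain=1 | WetGrass=1) ~ 0.708
```

Resampling from the full conditional only needs the variable's Markov blanket; here the whole joint is evaluated for brevity since the network is tiny.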
Student Assessment Using Bayesian Nets
 International Journal of Human-Computer Studies
, 1995
Abstract

Cited by 68 (9 self)
This paper will focus exclusively on the problem-solving activity. The other activities are described in Martin and VanLehn (1993, in press). This section describes OLAE's input (student behavior) and output (assessment presentation), and the way that OLAE uses the behavioral data to calculate the assessments.
Probabilistic conflicts in a search algorithm for estimating posterior probabilities in Bayesian networks
, 1996
Abstract

Cited by 23 (6 self)
This paper presents a search algorithm for estimating posterior probabilities in discrete Bayesian networks. It shows how conflicts (as used in consistency-based diagnosis) can be adapted to speed up the search. This algorithm is especially suited to the case where there are skewed distributions, although nothing about the algorithm or the definitions depends on skewness of distributions. The general idea is to forward-simulate the network, based on the 'normal' values for each variable (the value with high probability given its parents). When a predicted value is at odds with the observations, we analyse which variables were responsible for the expectation failure (these form a conflict) and continue forward simulation considering different values for these variables. This results in a set of possible worlds from which posterior probabilities, together with error bounds, can be derived. Empirical results with Bayesian networks having tens of thousands of nodes are presented.
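The first step of such a search, forward simulation with 'normal' values, can be sketched on a small chain A -> B -> C: commit each variable to its most probable value given its parents, then flag a conflict when a prediction disagrees with an observation. Only this step is shown; the subsequent branching on conflict variables is omitted, and the network and CPTs are illustrative, not from the paper.

```python
# Illustrative CPTs for a chain A -> B -> C (binary variables).
P_A = {1: 0.9, 0: 0.1}                               # P(A)
P_B = {1: {1: 0.8, 0: 0.2}, 0: {1: 0.3, 0: 0.7}}     # P(B | A)
P_C = {1: {1: 0.95, 0: 0.05}, 0: {1: 0.1, 0: 0.9}}   # P(C | B)

def forward_normal(observed):
    """Assign each variable its highest-probability ('normal') value given
    its parents; return the predicted world plus the set of conflict
    variables whose predictions contradict the observations."""
    world = {}
    world["A"] = max(P_A, key=P_A.get)
    world["B"] = max(P_B[world["A"]], key=P_B[world["A"]].get)
    world["C"] = max(P_C[world["B"]], key=P_C[world["B"]].get)
    conflicts = {v for v, val in observed.items() if world[v] != val}
    return world, conflicts

world, conflicts = forward_normal({"C": 0})
print(world)      # {'A': 1, 'B': 1, 'C': 1} -- the most 'normal' world
print(conflicts)  # {'C'}: its ancestors would be revisited with other values
```

In the full algorithm the variables responsible for each conflict are reconsidered with their less likely values, growing a set of possible worlds from which posteriors and error bounds are read off.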
Probabilistic Knowledge Bases
, 1992
Abstract

Cited by 22 (9 self)
We define a new fixpoint semantics for rule-based reasoning in the presence of imprecise information. We first demonstrate the need for such a rule-based semantics by showing a real-world application requiring such reasoning. We then define this semantics. Optimizations and approximations of the semantics are shown so as to make the semantics amenable to very large scale real-world applications. We finally prove that the semantics is probabilistic and reduces to the usual fixpoint semantics of stratified Datalog if all information is certain.
Index Terms: axiomatic probability theory, incomplete information, knowledge discovery in databases, logic programming, query optimization and approximation, stratified Datalog
1 Introduction. Many real-world problems cannot be described or solved by deterministic information because of inherent vagueness (e.g., see [4, 10]). We demonstrate the truth of this statement in a real-world application to which we have successfully applied th...
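The certain-information base case the abstract reduces to, the usual fixpoint semantics of Datalog, can be sketched as naive bottom-up evaluation: apply every rule to the known facts until no new fact is derived. The tiny ancestor program below is illustrative; the paper's probabilistic generalization is not attempted here.

```python
# Facts of a two-edge parent relation.
facts = {("parent", "ann", "bob"), ("parent", "bob", "cal")}

def step(known):
    """One application of the immediate-consequence operator T_P."""
    new = set(known)
    # anc(X, Y) :- parent(X, Y).
    for (rel, x, y) in known:
        if rel == "parent":
            new.add(("anc", x, y))
    # anc(X, Z) :- parent(X, Y), anc(Y, Z).
    for (r1, x, y) in known:
        for (r2, y2, z) in known:
            if r1 == "parent" and r2 == "anc" and y == y2:
                new.add(("anc", x, z))
    return new

known = facts
while True:                 # iterate to the least fixpoint
    nxt = step(known)
    if nxt == known:
        break
    known = nxt

print(("anc", "ann", "cal") in known)  # True: derived transitively
```

The loop terminates because `step` is monotone over a finite Herbrand base; the paper's semantics attaches probabilities to such derivations when the facts are imprecise.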
Probabilistic networks and explanatory coherence
 Cognitive Science Quarterly
, 2000
Abstract

Cited by 15 (0 self)
Causal reasoning can be understood qualitatively in terms of explanatory coherence or quantitatively in terms of probability theory. Comparison of these approaches can be done by looking at computational models, using my explanatory coherence networks and Pearl’s probabilistic ones. The explanatory coherence program ECHO can be given a probabilistic interpretation, but there are many conceptual and computational problems that make it difficult to replace coherence networks by probabilistic ones. On the other hand, ECHO provides a psychologically plausible and computationally efficient model of some kinds of probabilistic causal reasoning. Hence coherence theory need not give way to probability theory as the basis for epistemology and decision making.
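Coherence programs of the ECHO family settle a connectionist network: hypotheses are units, explanatory links are excitatory, competition is inhibitory, and activations are relaxed until stable. The two-hypothesis sketch below shows only that mechanism; the weights, decay, and example are illustrative, not ECHO's actual parameters.

```python
# Illustrative relaxation of a tiny coherence network: H1 explains the
# (clamped) evidence unit, H2 contradicts H1.
EXCIT, INHIB, DECAY = 0.04, -0.06, 0.05
units = {"EVIDENCE": 1.0, "H1": 0.01, "H2": 0.01}
links = [("H1", "EVIDENCE", EXCIT),     # explanatory, excitatory
         ("H2", "H1", INHIB)]           # competing, inhibitory (symmetric)

def settle(n_steps=200):
    for _ in range(n_steps):
        new = dict(units)
        for u in ("H1", "H2"):          # EVIDENCE stays clamped at 1.0
            net = sum(w * units[b] for a, b, w in links if a == u)
            net += sum(w * units[a] for a, b, w in links if b == u)
            if net > 0:
                delta = net * (1.0 - units[u])
            else:
                delta = net * (units[u] - (-1.0))
            new[u] = max(-1.0, min(1.0, units[u] * (1 - DECAY) + delta))
        units.update(new)
    return units

result = settle()
print(result["H1"] > 0 > result["H2"])  # True: the explaining hypothesis wins
```

A probabilistic reinterpretation would have to read these stable activations as degrees of belief, which is where the abstract locates the conceptual and computational difficulties.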
Propagating Imprecise Probabilities In Bayesian Networks
 Artificial Intelligence
, 1996
Abstract

Cited by 15 (5 self)
Often experts are incapable of providing 'exact' probabilities; likewise, samples on which the probabilities in networks are based must often be small and preliminary. In such cases the probabilities in the networks are imprecise. The imprecision can be handled by second-order probability distributions. It is convenient to use beta or Dirichlet distributions to express the uncertainty about probabilities. The problem of how to propagate point probabilities in a Bayesian network is now transformed into the problem of how to propagate Dirichlet distributions in Bayesian networks. It is shown that the propagation of Dirichlet distributions in Bayesian networks with incomplete data results in a system of probability mixtures of beta-binomial and Dirichlet distributions. Approximate first-order probabilities and their second-order probability density functions can be obtained by stochastic simulation. A number of properties of the propagation of imprecise probabilities are discuss...
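The second-order idea can be sketched in miniature: instead of a point value for P(X=1), keep a Beta(a, b) distribution over it and push samples of that parameter through a child's CPT, yielding a second-order distribution over P(Y=1). This is plain stochastic simulation with illustrative numbers, not the paper's propagation scheme.

```python
import random

random.seed(1)

a, b = 3, 2                       # e.g. Beta(1,1) prior updated by a tiny sample
P_Y_given_X = {1: 0.9, 0: 0.2}    # child CPT, assumed precise for simplicity

def sample_beta(a, b):
    """Beta(a, b) via two Gamma draws (random.gammavariate is stdlib)."""
    x = random.gammavariate(a, 1.0)
    y = random.gammavariate(b, 1.0)
    return x / (x + y)

draws = []
for _ in range(20000):
    p_x = sample_beta(a, b)                            # uncertain parameter
    p_y = p_x * P_Y_given_X[1] + (1 - p_x) * P_Y_given_X[0]
    draws.append(p_y)                                  # one draw of P(Y=1)

mean = sum(draws) / len(draws)
print(round(mean, 2))  # ~ 0.62: E[Beta(3,2)] = 0.6, so 0.6*0.9 + 0.4*0.2
```

The spread of `draws`, not just its mean, is the payoff: it quantifies how imprecise P(Y=1) is given the imprecision in P(X=1).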
The Posterior Probability of Bayes Nets with Strong Dependences
 Soft Computing
, 1999
Abstract

Cited by 14 (1 self)
Stochastic independence is an idealized relationship located at one end of a continuum of values measuring degrees of dependence. When modeling real-world systems, we are often not interested in the distinction between exact independence and any degree of dependence, but between weak, ignorable dependence and strong, substantial dependence. Good models map significant deviance from independence and neglect approximate independence or dependence weaker than a noise threshold. This intuition is applied to learning the structure of Bayes nets from data. We determine the conditional posterior probabilities of structures given that the degree of dependence at each of their nodes exceeds a critical noise level. Deviance from independence is measured by mutual information. Arc probabilities are determined by whether the amount of mutual information the neighbors contribute to a node is greater than a critical minimum deviance from independence. A χ² approximation for the probability density function of mutual info...
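The χ² connection the abstract invokes is the standard one: for an r x c contingency table of N samples, 2·N·MI (with MI in nats) is the G-statistic, approximately χ²-distributed with (r-1)(c-1) degrees of freedom under independence. The sketch below computes empirical mutual information for an illustrative 2x2 table and compares it against that threshold; it is not the paper's posterior computation.

```python
import math

table = [[30, 10],
         [10, 50]]   # illustrative counts for two binary variables

N = sum(sum(row) for row in table)
row_m = [sum(row) / N for row in table]
col_m = [sum(table[i][j] for i in range(2)) / N for j in range(2)]

mi = 0.0
for i in range(2):
    for j in range(2):
        p = table[i][j] / N
        if p > 0:                         # empirical MI in nats
            mi += p * math.log(p / (row_m[i] * col_m[j]))

G = 2 * N * mi
CRITICAL = 3.841   # chi-squared critical value, df = 1, alpha = 0.05
print(round(G, 1), G > CRITICAL)  # well above threshold: keep the arc
```

An arc whose mutual information fails this kind of noise-level test is the "weak, ignorable dependence" the abstract says a good model should neglect.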
Markov Chain Monte Carlo Algorithms for the Calculation of Dempster-Shafer Belief, technical report, in preparation
, 1994
Abstract

Cited by 10 (6 self)
A simple Monte Carlo algorithm can be used to calculate Dempster-Shafer belief very efficiently unless the conflict between the pieces of evidence is very high. This paper introduces and explores Markov chain Monte Carlo algorithms for calculating Dempster-Shafer belief that can also work well when the conflict is high.
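The 'simple' algorithm the abstract contrasts with can be sketched directly: sample one focal element from each mass function, discard the trial when the intersection is empty (a conflict), and estimate Bel(A) as the fraction of surviving trials whose intersection lies inside A. High conflict hurts precisely because most trials are discarded. The two mass functions below are illustrative.

```python
import random

random.seed(2)

# Two illustrative mass functions over the frame {a, b, c}.
m1 = [(frozenset({"a"}), 0.6), (frozenset({"a", "b", "c"}), 0.4)]
m2 = [(frozenset({"a", "b"}), 0.7), (frozenset({"c"}), 0.3)]

def draw(mass):
    """Sample a focal element with probability proportional to its mass."""
    u, acc = random.random(), 0.0
    for focal, w in mass:
        acc += w
        if u <= acc:
            return focal
    return mass[-1][0]

def mc_belief(A, n=50000):
    hits = kept = 0
    for _ in range(n):
        inter = draw(m1) & draw(m2)
        if not inter:
            continue          # conflicting pair: rejected, wasted work
        kept += 1
        if inter <= A:
            hits += 1
    return hits / kept

print(round(mc_belief(frozenset({"a"})), 2))  # close to exact 0.42/0.82 ~ 0.512
```

Here the conflict mass is only 0.18, so rejection is cheap; when it approaches 1, almost every trial is wasted, which is the regime the paper's MCMC algorithms target.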
Time Series Learning with Probabilistic Network Composites
 University of Illinois
, 1998
Abstract

Cited by 9 (9 self)
The purpose of this research is to extend the theory of uncertain reasoning over time through integrated, multi-strategy learning. Its focus is on decomposable concept learning problems for classification of spatiotemporal sequences. Systematic methods of task decomposition using attribute-driven methods, especially attribute partitioning, are investigated. This leads to a novel and important type of unsupervised learning in which the feature construction (or extraction) step is modified to account for multiple sources of data and to systematically search for embedded temporal patterns. This modified technique is combined with traditional cluster definition methods to provide an effective mechanism for decomposition of time series learning problems. The decomposition process interacts with model selection from a collection of probabilistic models, such as temporal artificial neural networks and temporal Bayesian networks. Models are chosen using a new quantitative (metric-based) approach that estimates the expected performance of a learning architecture, algorithm, and mixture model on a newly defined subproblem. By mapping subproblems to customized configurations of probabilistic networks for time series learning, a hierarchical, supervised learning system with enhanced generalization quality can be automatically built. The system can improve data fusion ...
Optimal Monte Carlo Estimation of Belief Network Inference
 In Proceedings of the 12th Conference on Uncertainty in Artificial Intelligence
, 1996
Abstract

Cited by 9 (1 self)
We present two Monte Carlo sampling algorithms for probabilistic inference that guarantee polynomial-time convergence for a larger class of networks than current sampling algorithms provide. These new methods are variants of the known likelihood weighting algorithm. We use recent advances in the theory of optimal stopping rules for Monte Carlo simulation to obtain an inference approximation with relative error and a small failure probability. We present an empirical evaluation of the algorithms which demonstrates their improved performance.
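Plain likelihood weighting, the base algorithm these variants build on, can be sketched on a two-node network A -> B: sample unobserved nodes forward from their CPTs, fix evidence nodes to their observed values, and weight each sample by the likelihood of that evidence. The network and numbers are illustrative, and the optimal stopping rules are omitted.

```python
import random

random.seed(3)

P_A = 0.3                      # P(A = 1)
P_B = {1: 0.9, 0: 0.2}         # P(B = 1 | A)

def likelihood_weighting(b_obs=1, n=100000):
    """Estimate P(A = 1 | B = b_obs) by likelihood weighting."""
    num = den = 0.0
    for _ in range(n):
        a = 1 if random.random() < P_A else 0        # sample A from its prior
        w = P_B[a] if b_obs == 1 else 1 - P_B[a]     # weight by the evidence
        num += w * a
        den += w
    return num / den

print(round(likelihood_weighting(), 2))  # close to exact 0.27/0.41 ~ 0.659
```

The paper's contribution sits on top of this loop: a stopping rule decides n adaptively so that the estimate meets a relative-error target with a bounded failure probability.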