Results 11  20
of
51
Asymptotic model selection for directed networks with hidden variables
, 1996
"... We extend the Bayesian Information Criterion (BIC), an asymptotic approximation for the marginal likelihood, to Bayesian networks with hidden variables. This approximation can be used to select models given large samples of data. The standard BIC as well as our extension punishes the complexity of a ..."
Abstract

Cited by 46 (13 self)
 Add to MetaCart
We extend the Bayesian Information Criterion (BIC), an asymptotic approximation for the marginal likelihood, to Bayesian networks with hidden variables. This approximation can be used to select models given large samples of data. The standard BIC as well as our extension punishes the complexity of a model according to the dimension of its parameters. We argue that the dimension of a Bayesian network with hidden variables is the rank of the Jacobian matrix of the transformation between the parameters of the network and the parameters of the observable variables. We compute the dimensions of several networks including the naive Bayes model with a hidden root node. 1
Learning Bayesian Networks: A unification for discrete and Gaussian domains
 PROCEEDINGS OF ELEVENTH CONFERENCE ON UNCERTAINTY INARTI CIAL INTELLIGENCE
, 1995
"... We examine Bayesian methods for learning Bayesian networks from a combination of prior knowledge and statistical data. In particular, we unify the approaches we presented at last year's conference for discrete and Gaussian domains. We derive a general Bayesian scoring metric, appropriate for both do ..."
Abstract

Cited by 43 (5 self)
 Add to MetaCart
We examine Bayesian methods for learning Bayesian networks from a combination of prior knowledge and statistical data. In particular, we unify the approaches we presented at last year's conference for discrete and Gaussian domains. We derive a general Bayesian scoring metric, appropriate for both domains. We then use this metric in combination with wellknown statistical facts about the Dirichlet and normal{Wishart distributions to derive our metrics for discrete and Gaussian domains.
Active Learning of Causal Bayes Net Structure
, 2001
"... We propose a decision theoretic approach for deciding which interventions to perform so as to learn the causal structure of a model as quickly as possible. Without such interventions, it is impossible to distinguish between Markov equivalent models, even given infinite data. We perform online MCMC t ..."
Abstract

Cited by 36 (2 self)
 Add to MetaCart
We propose a decision theoretic approach for deciding which interventions to perform so as to learn the causal structure of a model as quickly as possible. Without such interventions, it is impossible to distinguish between Markov equivalent models, even given infinite data. We perform online MCMC to estimate the posterior over graph structures, and use importance sampling to find the best action to perform at each step. We assume the data is discretevalued and fully observed.
Userexpertise modeling with empirically derived probabilistic implication networks
, 1996
"... ..."
Theorybased causal inference
 In
, 2003
"... People routinely make sophisticated causal inferences unconsciously, effortlessly, and from very little data – often from just one or a few observations. We argue that these inferences can be explained as Bayesian computations over a hypothesis space of causal graphical models, shaped by strong top ..."
Abstract

Cited by 27 (2 self)
 Add to MetaCart
People routinely make sophisticated causal inferences unconsciously, effortlessly, and from very little data – often from just one or a few observations. We argue that these inferences can be explained as Bayesian computations over a hypothesis space of causal graphical models, shaped by strong topdown prior knowledge in the form of intuitive theories. We present two case studies of our approach, including quantitative models of human causal judgments and brief comparisons with traditional bottomup models of inference. 1
Likelihoods and Parameter Priors for Bayesian Networks
, 1995
"... We develop simple methods for constructing likelihoods and parameter priors for learning about the parameters and structure of a Bayesian network. In particular, we introduce several assumptions that permit the construction of likelihoods and parameter priors for a large number of Bayesiannetwork s ..."
Abstract

Cited by 23 (0 self)
 Add to MetaCart
We develop simple methods for constructing likelihoods and parameter priors for learning about the parameters and structure of a Bayesian network. In particular, we introduce several assumptions that permit the construction of likelihoods and parameter priors for a large number of Bayesiannetwork structures from a small set of assessments. The most notable assumption is that of likelihood equivalence, which says that data can not help to discriminate network structures that encode the same assertions of conditional independence. We describe the constructions that follow from these assumptions, and also present a method for directly computing the marginal likelihood of a random sample with no missing observations. Also, we show how these assumptions lead to a general framework for characterizing parameter priors of multivariate distributions. Keywords: Bayesian network, learning, likelihood equivalence, Dirichlet, normalWishart. 1 Introduction A Bayesian network is a graphical repres...
Learning Causal Networks from Data: A survey and a new algorithm for recovering possibilistic causal networks
, 1997
"... Introduction Reasoning in terms of cause and effect is a strategy that arises in many tasks. For example, diagnosis is usually defined as the task of finding the causes (illnesses) from the observed effects (symptoms). Similarly, prediction can be understood as the description of a future plausible ..."
Abstract

Cited by 19 (5 self)
 Add to MetaCart
Introduction Reasoning in terms of cause and effect is a strategy that arises in many tasks. For example, diagnosis is usually defined as the task of finding the causes (illnesses) from the observed effects (symptoms). Similarly, prediction can be understood as the description of a future plausible situation where observed effects will be in accordance with the known causal structure of the phenomenon being studied. Causal models are a summary of the knowledge about a phenomenon expressed in terms of causation. Many areas of the ap # This work has been partially supported by the Spanish Comission Interministerial de Ciencia y Tecnologia Project CICYTTIC96 0878. plied sciences (econometry, biomedics, engineering, etc.) have used such a term to refer to models that yield explanations, allow for prediction and facilitate planning and decision making. Causal reasoning can be viewed as inference guided by a causation theory. That kind of inference can be further specialised into induc
The markov assumption in spoken dialogue management
 In 6th SIGDial Workshop on Discourse and Dialogue
, 2005
"... The goal of dialogue management in a spoken dialogue system is to take actions based on observations and inferred beliefs. To ensure that the actions optimize the performance or robustness of the system, researchers have turned to reinforcement learning methods to learn policies for action selection ..."
Abstract

Cited by 13 (0 self)
 Add to MetaCart
The goal of dialogue management in a spoken dialogue system is to take actions based on observations and inferred beliefs. To ensure that the actions optimize the performance or robustness of the system, researchers have turned to reinforcement learning methods to learn policies for action selection. To derive an optimal policy from data, the dynamics of the system is often represented as a Markov Decision Process (MDP), which assumes that the state of the dialogue depends only on the previous state and action. In this paper, we investigate whether constraining the state space by the Markov assumption, especially when the structure of the state space may be unknown, truly affords the highest reward. In a simulation experiment conducted in the context of a dialogue system for interacting with a speechenabled web browser, models under the Markov assumption did not perform as well as an alternative model which attempts to classify the total reward with accumulating features. We discuss the implications of the study as well as limitations. 1
Knowing and reasoning in
 in College: Gender Related Patterns in Student’s Intellectual Development
, 1992
"... Modelling a decision support system for ..."
Fusion of Domain Knowledge with Data for Structural Learning in Object Oriented Domains
, 2003
"... When constructing a Bayesian network, it can be advantageous to employ structural learning algorithms to combine knowledge captured in databases with prior information provided by domain experts. Unfortunately, conventional learning algorithms do not easily incorporate prior information, if this inf ..."
Abstract

Cited by 9 (0 self)
 Add to MetaCart
When constructing a Bayesian network, it can be advantageous to employ structural learning algorithms to combine knowledge captured in databases with prior information provided by domain experts. Unfortunately, conventional learning algorithms do not easily incorporate prior information, if this information is too vague to be encoded as properties that are local to families of variables. For instance, conventional algorithms do not exploit prior information about repetitive structures, which are often found in object oriented domains such as computer networks, large pedigrees and genetic analysis.