A Minimum Relative Entropy Principle for Learning and Acting
 J. Artif. Intell. Res. 2010
This paper proposes a method to construct an adaptive agent that is universal with respect to a given class of experts, where each expert is designed specifically for a particular environment. This adaptive control problem is formalized as the problem of minimizing the relative entropy of the adaptive agent from the expert that is most suitable for the unknown environment. If the agent is a passive observer, then the optimal solution is the well-known Bayesian predictor. However, if the agent is active, then its past actions need to be treated as causal interventions on the I/O stream rather than normal probability conditions. Here it is shown that the solution to this new variational problem is given by a stochastic controller called the Bayesian control rule, which implements adaptive behavior as a mixture of experts. Furthermore, it is shown that under mild assumptions, the Bayesian control rule converges to the control law of the most suitable expert.
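For the passive-observer case mentioned in this abstract, the Bayesian predictor can be sketched as a posterior-weighted mixture of experts. The example below is a minimal illustration and not the paper's construction: the two Bernoulli experts, their parameters, and the names `update_posterior`, `mixture_prediction` and `run` are assumptions made for the sketch. It does not cover the active case, where past actions must be treated as interventions.

```python
from typing import List, Sequence

# Hypothetical experts: each one predicts P(observation = 1) with a fixed bias.
EXPERT_PROBS = [0.2, 0.8]


def update_posterior(posterior: Sequence[float], likelihoods: Sequence[float]) -> List[float]:
    """Bayes' rule: reweight each expert by the likelihood it assigned to the observation."""
    unnormalized = [p * l for p, l in zip(posterior, likelihoods)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def mixture_prediction(posterior: Sequence[float], expert_probs: Sequence[float]) -> float:
    """Probability that the next observation is 1 under the posterior-weighted mixture."""
    return sum(w * q for w, q in zip(posterior, expert_probs))


def run(observations: Sequence[int], prior: Sequence[float] = (0.5, 0.5)) -> List[float]:
    """Process a binary observation sequence and return the posterior over experts."""
    posterior = list(prior)
    for x in observations:
        likelihoods = [q if x == 1 else 1.0 - q for q in EXPERT_PROBS]
        posterior = update_posterior(posterior, likelihoods)
    return posterior


posterior = run([1, 1, 0, 1, 1])
prediction = mixture_prediction(posterior, EXPERT_PROBS)
print(posterior, prediction)
```

With mostly-ones data the posterior mass moves toward the expert with bias 0.8, and the mixture prediction moves with it.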
Graphical models for inference under outcome-dependent sampling
 STAT SCI 2010;25:368–87
We consider situations where data have been collected such that the sampling depends on the outcome of interest and possibly further covariates, as for instance in case-control studies. Graphical models represent assumptions about the conditional independencies among the variables. By including a node for the sampling indicator, assumptions about sampling processes can be made explicit. We demonstrate how to read off from such graphs whether consistent estimation of the association between exposure and outcome is possible. Moreover, we give sufficient graphical conditions for testing and estimating the causal effect of exposure on outcome. The practical use is illustrated with a number of examples.
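A standard textbook illustration of why outcome-dependent sampling does not always rule out consistent estimation is the odds ratio in a case-control design: when selection depends only on the outcome, the exposure-outcome odds ratio computed from the sample matches that of the population, while other measures such as the risk difference do not. The sketch below checks this on expected counts. The population counts and sampling fractions are made-up numbers, and `odds_ratio`, `risk_difference` and `sample_by_outcome` are hypothetical helper names, not taken from the paper.

```python
def odds_ratio(table):
    """Odds ratio of a 2x2 table given as {(exposure, outcome): count}."""
    return (table[(1, 1)] * table[(0, 0)]) / (table[(1, 0)] * table[(0, 1)])


def risk_difference(table):
    """P(outcome | exposed) - P(outcome | unexposed), computed from the table."""
    risk_exposed = table[(1, 1)] / (table[(1, 1)] + table[(1, 0)])
    risk_unexposed = table[(0, 1)] / (table[(0, 1)] + table[(0, 0)])
    return risk_exposed - risk_unexposed


def sample_by_outcome(table, frac_cases, frac_controls):
    """Expected counts when the selection probability depends only on the outcome."""
    frac = {1: frac_cases, 0: frac_controls}
    return {(e, d): n * frac[d] for (e, d), n in table.items()}


# Hypothetical population counts, keyed by (exposure, outcome).
population = {(1, 1): 300, (1, 0): 9700, (0, 1): 200, (0, 0): 89800}

# Case-control style sampling: keep every case but only 1% of controls.
sample = sample_by_outcome(population, frac_cases=1.0, frac_controls=0.01)

print(odds_ratio(population), odds_ratio(sample))            # agree up to rounding
print(risk_difference(population), risk_difference(sample))  # differ substantially
```

The sampling fractions cancel in the odds ratio because each appears once in the numerator and once in the denominator; no such cancellation happens for the risk difference.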
Linking Granger Causality and the Pearl Causal Model with Settable Systems
The causal notions embodied in the concept of Granger causality have been argued to belong to a different category than those of Judea Pearl’s Causal Model, and so far their relation has remained obscure. Here, we demonstrate that these concepts are in fact closely linked by showing how each relates to straightforward notions of direct causality embodied in settable systems, an extension and refinement of the Pearl Causal Model designed to accommodate optimization, equilibrium, and learning. We then provide straightforward practical methods to test for direct causality using tests for Granger causality.
Exact Estimation of Multiple Directed Acyclic Graphs
Probability models based on directed acyclic graphs (DAGs) are widely used to make inferences and predictions concerning interplay in multivariate systems. In many applications, data are collected from related but nonidentical units whose DAGs may differ but are likely to share many features. Statistical estimation for multiple related DAGs appears extremely challenging since all graphs must be simultaneously acyclic. Recent work by Oyen and Lane (2013) avoids this problem by making the strong assumption that all units share a common ordering of the variables. In this paper we propose a novel Bayesian formulation for multiple DAGs and, requiring no assumptions on any ordering of the variables, we prove that the maximum a posteriori estimate is characterised as the solution to an integer linear program (ILP). Consequently exact estimation may be achieved using highly optimised techniques for ILP instances, including constraint propagation and cutting plane algorithms. Our framework permits a complex dependency structure on the collection of units, including group and subgroup structure. This dependency structure can itself be efficiently learned from data and a special case of our methodology provides a novel analogue of k-means clustering for DAGs. Results on simulated data and fMRI data obtained from multiple subjects are presented.
Causal learning without DAGs
Causal learning methods are often evaluated in terms of their ability to discover a true underlying directed acyclic graph (DAG) structure. However, in general the true structure is unknown and may not be a DAG structure. We therefore consider evaluating causal learning methods in terms of predicting the effects of interventions on unseen test data. Given this task, we show that there exist a variety of approaches to modeling causality, generalizing DAG-based methods. Our experiments on synthetic and biological data indicate that some non-DAG models perform as well as or better than DAG-based methods at causal prediction tasks.
Logic, Reasoning under Uncertainty and Causality
 2010
A simple framework for reasoning under uncertainty and intervention is introduced. This is achieved in three steps. First, logic is restated in set-theoretic terms to obtain a framework for reasoning under certainty. Second, this framework is extended to model reasoning under uncertainty. Finally, causal spaces are introduced, and it is shown how they provide enough information to model knowledge containing causal information about the world.
1 Bayesian Probability Theory
It is advantageous to endow plausibilities with an explanatory framework that has a logically intuitive appeal. Such a framework is Bayesian probability theory. Simply put, Bayesian probability theory is a framework that extends logic for reasoning under uncertainty.
1.1 Reasoning under Certainty
Logic is the most important framework of reasoning (under certainty). Here, it is rephrased in set-theoretic terms. As will be seen, this facilitates its extension to a framework for reasoning under uncertainty. Let Ω be a set of outcomes, which is assumed to be finite for simplicity. A subset A ⊂ Ω is an event. Let ᶜ, ∪ and ∩ be the set operations of complement, union and intersection, respectively. Let F be an algebra, i.e. a set of events obeying the axioms
Causal learning without DAGs
 JMLR Workshop and Conference Proceedings 6:177–190, NIPS 2008 Workshop on Causality
Intelligence
In a criminal trial, a judge or jury needs to reach a conclusion about ‘what happened’ based on the available evidence. Often this includes probabilistic evidence. Whereas Bayesian networks form a good tool for analysing evidence probabilistically, simply presenting the outcome of the network to a judge or jury does not allow them to make an informed decision. In this paper, we propose to combine Bayesian networks with a narrative approach to reasoning with legal evidence, the result of which allows a juror to reason with alternative scenarios while also incorporating probabilistic information. The proposed method aids both the construction and the understanding of Bayesian networks, using scenario schemes. We make three distinct contributions: (1) we propose to use scenario schemes to aid the construction of Bayesian networks, (2) we propose a method for producing scenarios in text form from the resulting networks and (3) we propose a format for reporting the alternative scenarios and their relations to the evidence (including strength).
Extracting Legal Arguments from Forensic Bayesian Networks
Recent developments in the forensic sciences have confronted the field of legal reasoning with the new challenge of reasoning under uncertainty. Forensic results come with uncertainty and are described in terms of likelihood ratios and random match probabilities. The legal field is unfamiliar with numerical valuations of evidence, which has led to confusion and in some cases to serious miscarriages of justice. The cases of Lucia de B. in the Netherlands and Sally Clark in the UK are infamous examples where probabilistic reasoning has gone wrong with dramatic consequences. One way of structuring probabilistic information is in Bayesian networks (BNs). In this paper we explore a new method to identify legal arguments in forensic BNs. This establishes a formal connection between probabilistic and argumentative reasoning. Developing such a method is ultimately aimed at supporting legal experts in their decision-making process.
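The likelihood ratios and random match probabilities mentioned in this abstract enter a Bayesian analysis through the odds form of Bayes' rule: posterior odds equal prior odds times the likelihood ratio. The sketch below works through one such update. The prior, the random match probability, and the assumption that a true source always matches are invented for illustration, and `posterior_probability` is a hypothetical helper name, not part of the paper's method. It also illustrates one well-known source of confusion: a small random match probability is not the probability that the suspect is not the source.

```python
def posterior_probability(prior: float, likelihood_ratio: float) -> float:
    """Update a prior probability with a likelihood ratio via the odds form of Bayes' rule."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical numbers, chosen only for illustration.
random_match_probability = 1e-4   # P(match | suspect is not the source)
match_given_source = 1.0          # assumption: a true source always matches
likelihood_ratio = match_given_source / random_match_probability

prior = 1.0 / 100_000             # assumption: one of 100,000 equally likely candidates

posterior = posterior_probability(prior, likelihood_ratio)
print(posterior)
```

Under these assumed numbers the posterior probability that the suspect is the source is roughly 9%, far from the 99.99% one might read off by mistaking the random match probability for the probability of innocence.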