### A constraint based algorithm for learning Bayesian network structure from distributed data with overlapping variables

Abstract

While there has been considerable research in learning Bayesian network structure from data, until recently most of this research assumed that every variable of interest may be jointly measured in a single dataset. In practice, however, it is often the case that researchers only have access to data that is distributed across multiple datasets, which share some variables, but have other unique variables. Tillman et al. [2008] proposed the ION algorithm for learning causal structure in these scenarios. The space complexity of ION, however, prevents its use in many cases where the number of variables of interest is not relatively small. We present the Distributed Causal Inference (DCI) algorithm, which is asymptotically correct and displays similar performance in practice, but has space complexity that is bounded by the number of structures that will be output, which no asymptotically correct algorithm can beat, and is thus scalable to a wider variety of distributed data cases.
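To make the overlapping-variables setting concrete, the following is a minimal sketch in Python. It is not the ION or DCI algorithm. It only shows which variable pairs are measured together in at least one dataset, and which are never jointly measured and so cannot be tested for dependence directly in any single dataset. The function name `co_measured_pairs` and the dataset names `D1` and `D2` are hypothetical and chosen for illustration.

```python
from itertools import combinations


def co_measured_pairs(datasets):
    """Map each variable pair to the names of the datasets that measure both.

    `datasets` maps a dataset name to the set of variables it records.
    (Hypothetical helper for illustration; not taken from the paper.)
    """
    all_vars = sorted(set().union(*datasets.values()))
    coverage = {pair: [] for pair in combinations(all_vars, 2)}
    for name, variables in datasets.items():
        for pair in combinations(sorted(variables), 2):
            coverage[pair].append(name)
    return coverage


# Assumed toy example: two datasets that overlap only on the variable Y.
datasets = {"D1": {"X", "Y"}, "D2": {"Y", "Z"}}
coverage = co_measured_pairs(datasets)

# Pairs that no single dataset measures jointly.
never_joint = [pair for pair, names in coverage.items() if not names]
print(never_joint)
```

In this toy case X and Z never appear together, so any statement about their relationship has to be inferred by combining what each dataset says about the shared variable Y. That is the situation the abstract describes.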