## Structure learning with independent non-identically distributed data


Citations: 3 (2 self)

### BibTeX

```
@MISC{Tillman_structurelearning,
  author = {Robert E. Tillman},
  title  = {Structure learning with independent non-identically distributed data},
  year   = {}
}
```


### Abstract

There are well-known algorithms for learning the structure of directed and undirected graphical models from data, but nearly all assume that the data consist of a single i.i.d. sample. In contexts such as fMRI analysis, data may consist of an ensemble of independent samples from a common data-generating mechanism that do not necessarily have identical distributions. Pooling such data can result in a number of well-known statistical problems, so each sample must be analyzed individually, which forgoes the increase in power that multiple samples should provide. We show how existing constraint-based methods can be modified to learn structure from the aggregate of such data in a statistically sound manner. The prescribed method is simple to implement and based on existing statistical methods employed in meta-analysis and other areas, but works surprisingly well in this context, where there are increased concerns due to issues such as retesting. We report results for directed models, but the method is just as applicable to undirected models.

### Citations

1285 | Causality: Models, Reasoning, and Inference - Pearl
Citation context: "... have been proposed which learn the complete set of structures that are consistent with every dataset. Since sets of independencies can imply further independencies, e.g. through the graphoid axioms (Pearl, 2000), these algorithms can exclude many more structures than only those that are inconsistent with the conditional independencies common to every dataset. Due to statistical errors, conditional independe..."

681 | Statistical Methods for Research Workers - Fisher - 1925
Citation context: "... been proposed for combining information from multiple p-values that account for the independent sources of evidence in a theoretically sound manner. One of the most well known is due to R.A. Fisher (Fisher, 1950). Fisher's method requires computing the statistic TF which, as shown below, has a χ² distribution with 2k degrees of freedom under the null hypothesis, where k is the number of p-values combined. TF..."
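Fisher's method, as described in the excerpt above, combines k independent p-values via T_F = -2 Σ ln p_i, which follows a χ² distribution with 2k degrees of freedom under the null. A minimal illustrative sketch follows; this is not code from the paper, and the function name is our own. Because the degrees of freedom 2k are always even, the χ² survival function has a closed form, so no statistics library is needed:

```python
import math

def fisher_combined_pvalue(pvalues):
    """Combine independent p-values with Fisher's method.

    T_F = -2 * sum(ln p_i) follows a chi-square distribution with
    2k degrees of freedom under the null hypothesis. Since 2k is
    even, the chi-square survival function reduces to the closed
    form exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    Assumes every p-value is strictly positive.
    """
    k = len(pvalues)
    t_f = -2.0 * sum(math.log(p) for p in pvalues)
    half = t_f / 2.0
    # Accumulate the Poisson-series form of the survival function.
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total

# With a single p-value the combination is the identity:
# fisher_combined_pvalue([0.5]) returns 0.5 exactly.
```

In the setting of the paper, each p-value would come from running the same conditional-independence test on a different independent sample, and the combined p-value drives the constraint-based search.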

540 | Causation, Prediction, and Search - Spirtes, Glymour, et al. - 1993
Citation context: "... such as BIC. The well known constraint based PC algorithm is guaranteed to learn the correct directed acyclic graphical model (DAG) from data under reasonable assumptions (Spirtes et al., 2000) and is useful for datasets with hundreds of variables in general or thousands with an additional sparsity assumption (Kalisch & Bühlmann, 2007). The GES algorithm is a score-based alternative with t..."

176 | Optimal structure identification with greedy search - Chickering
Citation context: "...th hundreds of variables in general or thousands with an additional sparsity assumption (Kalisch & Bühlmann, 2007). The GES algorithm is a score-based alternative with the same asymptotic guarantees (Chickering, 2002), but in practice is useful only with datasets that have far fewer variables. Most existing structure learning algorithms assume that the data consists of a single i.i.d. sample. In some cases, howev..."

61 | Some limit theorems in statistics - Bahadur - 1971
Citation context: "...d has in general been more reliable than other similar methods for combining p-values (Lazar et al., 2002). In addition, Fisher's method satisfies an optimality criterion known as Bahadur efficiency (Bahadur, 1971), which is related to the effective use of data as the number of samples increases (Lazar et al., 2002). We now briefly describe some of the most common competing methods. Tippett (1950) and Worsley ..."

53 | Estimating high-dimensional directed acyclic graphs with the PC-algorithm - Kalisch, Bühlmann - 2007
Citation context: "...aphical model (DAG) from data under reasonable assumptions (Spirtes et al., 2000) and is useful for datasets with hundreds of variables in general or thousands with an additional sparsity assumption (Kalisch & Bühlmann, 2007). The GES algorithm is a score-based alternative with the same asymptotic guarantees (Chickering, 2002), but in practice is useful only with datasets that have far fewer variables. Most existing stru..."

45 | Conditional independence relations have no finite complete characterization - Studený - 1992

26 | Combining brains: a survey of methods for statistical pooling of information - Lazar, Luna, et al. - 2002
Citation context: "...e of the data from all of the studies. We refer to such p-values as combined p-values. Such methods are used regularly in meta-analysis (Sutton et al., 2000) and also for certain tasks in neuroimaging (Lazar et al., 2002). A significant advantage of these methods over some other data aggregation procedures is that they are not affected by differences in distribution across the different experiments, since they rely o..."

21 | The method of statistics - Tippett - 1931

17 | Methods for meta-analysis in medical research - Sutton, Abrams, et al. - 2000
Citation context: "...in a new p-value using the observed value, which is representative of the data from all of the studies. We refer to such p-values as combined p-values. Such methods are used regularly in meta-analysis (Sutton et al., 2000) and also for certain tasks in neuroimaging (Lazar et al., 2002). A significant advantage of these methods over some other data aggregation procedures is that they are not affected by differences in ..."

11 | Random generation of DAGs for graph drawing - Melançon, Dutour, et al. - 2000

9 | Integrating locally learned causal structures with overlapping variables - Tillman, Danks, et al.

5 | The logit method for combining probabilities - Mudholkar, George - 1979

4 | The American soldier: Vol. 1. Adjustment during army life - Stouffer, Suchman, et al. - 1949

1 | Learning Bayesian network structure from distributed data with overlapping variables (Technical Report) - Tillman - 2008
Citation context: "...ests become less reliable. As a result, the algorithms have thus far only been useful for datasets with a few variables. We found that with 10 variables and N = 2500, the DCI algorithm, described in (Tillman, 2008), returns structures consistent with the data less than half of the time (out of 100 examples). Figure 8 plots the trend. We combined the Fisher combined p-value test with ..."