## Learning the Structure of Linear Latent Variable Models (2006)

Venue: Journal of Machine Learning Research 7 (2006) 191--246

Citations: 41 (13 self)

### BibTeX

@ARTICLE{SIlva06learningthe,
  author = {Ricardo Silva and Richard Scheines and Clark Glymour and Peter Spirtes},
  title = {Learning the Structure of Linear Latent Variable Models},
  journal = {Journal of Machine Learning Research},
  year = {2006},
  volume = {7},
  pages = {191--246}
}

### Abstract

We describe anytime search procedures that (1) find disjoint subsets of recorded variables for which the members of each subset are d-separated by a single common unrecorded cause, if such exists; (2) return information about the causal relations among the latent factors so identified. We prove the procedure is point-wise consistent assuming (a) the causal relations can be represented by a directed acyclic graph (DAG) satisfying the Markov Assumption and the Faithfulness Assumption; (b) unrecorded variables are not caused by recorded variables; and (c) dependencies are linear. We compare the procedure with standard approaches over a variety of simulated structures and sample sizes, and illustrate its practical value with brief studies of social science data sets. Finally, we consider generalizations for non-linear systems.

### Citations

1117 | Causality: Models, Reasoning, and Inference - Pearl - 2000 |
Citation Context ...likelihood function might be very large. While, for instance, a Markov equivalence class for models with no latent variables can be neatly represented by a single graphical object known as “pattern” (Pearl, 2000; Spirtes et al., 2000), the same is not true for latent 1. Assuming T1 in this Figure is the true latent that entails the same conditional independencies. In Figure 3(b), T1 should correspond to L2. ... |

496 | Causation, Prediction, and Search - Spirtes, Glymour, et al. - 1993 |
Citation Context ...sis: the usual focus of automated search procedures for causal Bayes nets is on relations among observed variables. Loehlin’s comment overlooks Bayes net search procedures robust to latent variables (Spirtes et al., 2000) and heuristic approaches for learning networks with hidden nodes (Elidan et al., 2000), but the general sense of his comment is correct. For a kind of model widely used in applied sciences − “multip... |

227 | Latent Variable Models and Factor Analysis - Bartholomew - 1987 |
Citation Context ...ependencies. Better methods are needed. Yet the common view is that solving this problem is actually impossible, as illustrated by the closing words of a popular textbook on latent variable modeling (Bartholomew and Knott, 1999): When we come to models for relationships between latent variables we have reached a point where so much has to be assumed that one might justly conclude that the limits of scientific usefulness hav... |

220 | The Bayesian Structural EM Algorithm - Friedman - 1998 |
Citation Context ... the proper complexity penalization for a candidate model (Spirtes et al., 2000). We suggest using the Bayesian Information Criterion (BIC) function as a score function. Using BIC with STRUCTURAL EM (Friedman, 1998) and GES results in a computationally efficient way of learning structural models, where the measurement model is fixed and GES is restricted to modify edges among latents only. Assuming a Gaussian d... |
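The BIC score mentioned in this context can be sketched as a penalized log-likelihood. The snippet below is a minimal illustration, not the authors' implementation: the function name `bic_score`, the sign convention (higher is better), and the example numbers are assumptions introduced here, and the paper's way of counting free parameters is not reproduced.

```python
import math


def bic_score(log_likelihood, num_free_params, sample_size):
    """BIC in the 'higher is better' convention (an assumed convention):
    maximized log-likelihood minus (k / 2) * log(N)."""
    return log_likelihood - 0.5 * num_free_params * math.log(sample_size)


# Hypothetical comparison of two candidate models fitted to the same 500 samples.
# The denser model fits slightly better but pays a larger complexity penalty.
sparse_model = bic_score(log_likelihood=-1000.0, num_free_params=10, sample_size=500)
dense_model = bic_score(log_likelihood=-998.0, num_free_params=20, sample_size=500)
print("prefer sparse model:", sparse_model > dense_model)
```

A score-based search would call such a function on each candidate structure and keep the change that most improves the score.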

159 | Optimal structure identification with greedy search - Chickering - 2002 |

98 | Latent Variable Models: An Introduction to Factor, Path, and Structural Analysis - Loehlin - 1998 |

79 | The mind’s arrows: Bayes nets and graphical causal models in psychology - Glymour - 2001 |
Citation Context ... variable is determined informally based on the magnitude of the coefficients relating each observed variable to each latent. This is, by far, the most common method used in several applied sciences (Glymour, 2002). Social science methodology also contains various beam searches that begin with an initial latent variable model and iteratively add or delete dependencies in a greedy search guided by significance ... |

44 | Graphical models: Selecting causal and statistical models - Meek - 1997 |
Citation Context ...nce class described in Theorem 15 and the Markov equivalence class of the structural model. 6.2 Score-Based Search Score-based approaches for learning the structure of Bayesian networks, such as GES (Meek, 1997; Chickering, 2002) are usually more accurate than PC or FCI when there are no omitted common causes, or in other terms, when the set of recorded variables is causally sufficient. We know of 7. One wa... |

42 | Beyond independent components: Trees and clusters - Bach, Jordan - 2003 |

40 | Discovering hidden variables: A structure-based approach - Elidan, Lotner, et al. - 2000 |

36 | Structural equation models with latent variables - Bollen - 1989 |
Citation Context ...ed to denote both a linear latent variable model and the corresponding latent variable graph. Linear latent variable models are ubiquitous in econometric, psychometric, and social scientific studies (Bollen, 1989), where they are usually known as structural equation models. Definition 2 (Measurement model) Given a linear latent variable model G, with vertex set V, the subgraph containing all vertices in V, an... |

28 | A discovery algorithm for directed cyclic graphs - Richardson - 1996 |

20 | The Analysis and Interpretation of Multivariate Data for Social Scientists. Chapman & Hall/CRC - Bartholomew, Steele, et al. - 2002 |
Citation Context ...uding latent variables), and then consider possibilities for generalization. 3. Related Work The traditional framework for discovering latent variables is factor analysis and its variants (see, e.g., Bartholomew et al., 2002). A number of factors is chosen based on some criterion such as the minimum number of factors that fit the data at a given significance level or the number that maximizes a score such as BIC. After f... |

18 | Probabilistic Reasoning in Expert Systems: Networks of Plausible Inference - Pearl - 1988 |

11 | Learning measurement models for unobserved variables - Silva, Scheines, et al. - 2003 |
Citation Context ...imators provide one possible causal model compatible with the data. To tackle issues of sound identifiability of causal structures, we previously developed an approach to learning measurement models (Silva et al., 2003). That procedure requires that the true underlying graph has a “pure” submodel with three measures for each latent variable, which is a strong and generally untestable assumption. That assumption is ... |

9 | Quantifier elimination for statistical problems - Geiger, Meek - 1999 |
Citation Context ...Bollen, 1990). Linear and non-linear models can imply other constraints on the correlation matrix, but general, feasible computational procedures to determine arbitrary constraints are not available (Geiger and Meek, 1999) nor are there any available statistical tests of good power for higher order constraints. Tetrad constraints therefore provide a practical way of distinguishing among possible candidate models, with... |

7 | Generalization of the Tetrad Representation Theorem - Shafer, Kogan, et al. - 1993 |

7 | Generalized measurement models - Silva, Scheines - 2004 |

7 | Sampling errors in the theory of two factors - Wishart - 1928 |

5 | Outlier screening and a distribution-free test for vanishing tetrads - Bollen - 1990 |
Citation Context ...4 (2) The constraints hold as well if covariances are substituted for correlations. Statistical tests for vanishing tetrad differences are available for a wide family of distributions (Wishart, 1928; Bollen, 1990). Linear and non-linear models can imply other constraints on the correlation matrix, but general, feasible computational procedures to determine arbitrary constraints are not available (Geiger and M... |
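The vanishing tetrad differences referred to in this context can be illustrated at the population level. The sketch below is an illustration under stated assumptions rather than the paper's procedure: the helper names, loadings, and noise variances are hypothetical, the covariance is computed exactly from the model as Λ Φ Λᵀ + Θ instead of being estimated from data, and no statistical test is performed.

```python
def implied_covariance(loadings, latent_cov, noise_vars):
    """Population covariance Lambda * Phi * Lambda^T + Theta of the measured variables.

    loadings:   n x m list of lists (measured variable i on latent a)
    latent_cov: m x m covariance matrix of the latents
    noise_vars: n independent error variances (the diagonal of Theta)
    """
    n, m = len(loadings), len(latent_cov)
    cov = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            cov[i][j] = sum(
                loadings[i][a] * latent_cov[a][b] * loadings[j][b]
                for a in range(m)
                for b in range(m)
            )
        cov[i][i] += noise_vars[i]
    return cov


def tetrad_differences(cov, w, x, y, z):
    """The three tetrad differences among the measured variables w, x, y, z."""
    return (
        cov[w][x] * cov[y][z] - cov[w][y] * cov[x][z],
        cov[w][x] * cov[y][z] - cov[w][z] * cov[x][y],
        cov[w][y] * cov[x][z] - cov[w][z] * cov[x][y],
    )


noise = [0.19, 0.36, 0.51, 0.64]  # hypothetical error variances

# Four indicators of a single latent with unit variance.
one_factor = implied_covariance([[0.9], [0.8], [0.7], [0.6]], [[1.0]], noise)

# Two indicators for each of two latents correlated at 0.5.
two_factor = implied_covariance(
    [[0.9, 0.0], [0.8, 0.0], [0.0, 0.7], [0.0, 0.6]],
    [[1.0, 0.5], [0.5, 1.0]],
    noise,
)

print(tetrad_differences(one_factor, 0, 1, 2, 3))
print(tetrad_differences(two_factor, 0, 1, 2, 3))
```

With a single common latent all three differences vanish up to floating-point rounding. With two correlated latents only the difference that pairs indicators across the two clusters vanishes, which is the kind of pattern that lets tetrad constraints separate candidate measurement models.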

5 | Social Statistics and Genuine Inquiry: Reflections on The Bell Curve - Glymour - 1997 |
Citation Context ... assumptions, of convergence to features of the true causal structure. The few simulation studies of the accuracy of these methods on finite samples with diverse causal structures are not reassuring (Glymour, 1997). The use of proxy scores with regression is demonstrably not consistent, and systematically overestimates dependencies. Better methods are needed. Yet the common view is that solving this problem is... |

5 | Stepwise variable selection in factor analysis - Kano, Harada |

5 | Automatic Discovery of Latent Variable Models - Silva - 2005 |

2 | Independent factor analysis. Graphical Models: foundations of neural computation - Attias - 1999 |
Citation Context ...influenced by multiple latent variables and by other measured variables. In non-Gaussian cases, the usual methods are variations of independent component analysis, such as independent factor analysis (Attias, 1999) and tree-based component analysis (Bach and Jordan, 2003). These methods severely constrain the dependency structure among the latent variables. That facilit... |

2 | New d-separation identification results for learning continuous latent variable models - Silva, Scheines - 2005 |
Citation Context ... called latent trait models in the literature (Bartholomew and Knott, 1999). Much larger sample sizes are required than for linear, Gaussian measured variables. In previous works (Silva et al., 2003; Silva and Scheines, 2005), we developed an approach to learn measurement models even when the functional relationships among latents are non-linear. In practice, that generality is of limited use because there are at present... |
