Results 1–10 of 15
A SINful approach to Gaussian graphical model selection
 Journal of Statistical Planning and Inference
Abstract

Cited by 25 (5 self)
Multivariate Gaussian graphical models are defined in terms of Markov properties, i.e., conditional independences associated with the underlying graph. Thus, model selection can be performed by testing these conditional independences, which are equivalent to specified zeroes among certain (partial) correlation coefficients. For concentration graphs, covariance graphs, acyclic directed graphs, and chain graphs (both LWF and AMP), we apply Fisher's z-transformation, Šidák's correlation inequality, and Holm's step-down procedure to simultaneously test the multiple hypotheses obtained from the Markov properties. This leads to a simple method for model selection that controls the overall error rate for incorrect edge inclusion. In practice, we advocate partitioning the simultaneous p-values into three disjoint sets: a significant set S, an indeterminate set I, and a non-significant set N. Our SIN model selection method then selects two graphs: a graph whose edges correspond to the union of S and I, and a more conservative graph whose edges correspond to S only. Prior information about the presence and/or absence of particular edges can be incorporated readily.
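The testing pipeline described in this abstract (partial correlations, Fisher's z-transform, Šidák-corrected simultaneous p-values, partition into S/I/N) can be sketched for the concentration-graph case roughly as follows. The function name, the cutoffs `s_cut`/`n_cut`, and the degrees-of-freedom adjustment are illustrative assumptions, not the paper's exact procedure (which additionally uses Holm's step-down refinement):

```python
import numpy as np
from scipy.stats import norm

def sin_edge_pvalues(data, s_cut=0.05, n_cut=0.25):
    """Sketch of SIN-style edge screening for a concentration graph.

    For each pair (i, j), test the partial correlation given all other
    variables via Fisher's z-transform, then apply the Sidak correction
    to obtain simultaneous p-values.  The cutoffs are illustrative.
    """
    n, p = data.shape
    prec = np.linalg.inv(np.cov(data, rowvar=False))
    d = np.sqrt(np.diag(prec))
    pcorr = -prec / np.outer(d, d)        # partial correlations (off-diagonal)
    m = p * (p - 1) // 2                  # number of simultaneous hypotheses
    edges = {}
    for i in range(p):
        for j in range(i + 1, p):
            r = pcorr[i, j]
            z = 0.5 * np.log((1 + r) / (1 - r))    # Fisher z-transform
            se = 1.0 / np.sqrt(n - (p - 2) - 3)    # conditioning-set adjustment
            p_raw = 2 * norm.sf(abs(z) / se)
            p_sim = 1 - (1 - p_raw) ** m           # Sidak simultaneous p-value
            label = "S" if p_sim < s_cut else ("N" if p_sim > n_cut else "I")
            edges[(i, j)] = (p_sim, label)
    return edges
```

The "union of S and I" graph then keeps every pair labeled S or I as an edge; the conservative graph keeps only the S pairs.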
Iterative conditional fitting for Gaussian ancestral graph models
 In M. Chickering and J. Halpern (Eds.), Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence
, 2004
Abstract

Cited by 18 (6 self)
Ancestral graph models, introduced by Richardson and Spirtes (2002), generalize both Markov random fields and Bayesian networks to a class of graphs with a global Markov property that is closed under conditioning and marginalization. By design, ancestral graphs encode precisely the conditional independence structures that can arise from Bayesian networks with selection and unobserved (hidden/latent) variables. Thus, ancestral graph models provide a potentially very useful framework for exploratory model selection when unobserved variables might be involved in the data-generating process but no particular hidden structure can be specified. In this paper, we present the Iterative Conditional Fitting (ICF) algorithm for maximum likelihood estimation in Gaussian ancestral graph models. The name reflects that in each step of the procedure a conditional distribution is estimated, subject to constraints, while a marginal distribution is held fixed. This approach is dual to the well-known Iterative Proportional Fitting algorithm, in which marginal distributions are fitted while conditional distributions are held fixed.
Binary models for marginal independence
 JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B
, 2005
Abstract

Cited by 16 (2 self)
A number of authors have considered multivariate Gaussian models for marginal independence. In this paper we develop models for binary data with the same independence structure. The models can be parameterized based on Möbius inversion, and maximum likelihood estimation can be performed using a version of the Iterative Conditional Fitting algorithm. The approach is illustrated on a simple example. Relations to multivariate logistic and dependence ratio models are discussed.
Covariance Chains
 Bernoulli
, 2006
Abstract

Cited by 12 (8 self)
Covariance matrices which can be arranged in tridiagonal form are called covariance chains. They are used to clarify some issues of parameter equivalence and of independence equivalence for linear models in which a set of latent variables influences a set of observed variables. For this purpose, orthogonal decompositions for covariance chains are first derived in explicit form. Covariance chains are also contrasted with concentration chains, for which estimation is explicit and simple. For this purpose, maximum-likelihood equations are first derived for exponential families when some parameters satisfy zero-value constraints. From these equations explicit estimates are obtained, which are asymptotically efficient, and they are applied to covariance chains. Simulation results confirm the satisfactory behaviour of the explicit covariance chain estimates also in moderate-size samples.
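The contrast drawn in this abstract can be checked on a small numerical example (the matrix below is an arbitrary illustration, not from the paper): a tridiagonal covariance matrix is a covariance chain whose zeros encode marginal independences, while its inverse, the concentration matrix, is generally not tridiagonal, which is one reason estimation is explicit for concentration chains but not for covariance chains:

```python
import numpy as np

# A covariance chain: tridiagonal covariance matrix for 4 variables.
# Zero entries encode MARGINAL independences, e.g. Cov(X1, X3) = 0.
sigma = np.array([
    [2.0, 0.8, 0.0, 0.0],
    [0.8, 2.0, 0.7, 0.0],
    [0.0, 0.7, 2.0, 0.6],
    [0.0, 0.0, 0.6, 2.0],
])
assert np.all(np.linalg.eigvalsh(sigma) > 0)  # valid (positive definite)

# A concentration chain is instead tridiagonal in the INVERSE covariance,
# encoding conditional independences given all remaining variables.
# The inverse of a covariance chain is in general NOT tridiagonal:
omega = np.linalg.inv(sigma)
print(np.round(omega, 3))  # off-tridiagonal entries are nonzero
```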
The hidden life of latent variables: Bayesian learning with mixed graph models
, 2008
Abstract

Cited by 7 (3 self)
Directed acyclic graphs (DAGs) have been widely used as a representation of conditional independence in machine learning and statistics. Moreover, hidden or latent variables are often an important component of graphical models. However, DAG models suffer from an important limitation: the family of DAGs is not closed under marginalization of hidden variables. This means that in general we cannot use a DAG to represent the independencies over a subset of variables in a larger DAG. Directed mixed graphs (DMGs) are a representation that includes DAGs as a special case, and overcomes this limitation. This paper introduces algorithms for performing Bayesian inference in Gaussian and probit DMG models. An important requirement for inference is the characterization of the distribution over parameters of the models. We introduce a new distribution for covariance matrices of Gaussian DMGs. We discuss and illustrate how several Bayesian machine learning tasks can benefit from the principle presented here: the power to model dependencies that are generated from hidden variables, but without necessarily modelling such variables explicitly.
Graphical methods for efficient likelihood inference in Gaussian covariance models
 Journal of Machine Learning Research
, 2008
Abstract

Cited by 7 (2 self)
In graphical modelling, a bidirected graph encodes marginal independences among random variables that are identified with the vertices of the graph. We show how to transform a bidirected graph into a maximal ancestral graph that (i) represents the same independence structure as the original bidirected graph, and (ii) minimizes the number of arrowheads among all ancestral graphs satisfying (i). Here the number of arrowheads of an ancestral graph is the number of directed edges plus twice the number of bidirected edges. In Gaussian models, this construction can be used for more efficient iterative maximization of the likelihood function and to determine when maximum likelihood estimates are equal to empirical counterparts.
Mixed Cumulative Distribution Networks
Abstract

Cited by 3 (1 self)
Directed acyclic graphs (DAGs) are a popular framework for expressing multivariate probability distributions. Acyclic directed mixed graphs (ADMGs) are generalizations of DAGs that can succinctly capture much richer sets of conditional independencies, and are especially useful for modeling the effects of latent variables implicitly. Unfortunately, there are currently no parameterizations of general ADMGs. In this paper, we apply recent work on cumulative distribution networks and copulas to propose one general construction for ADMG models. We consider a simple parameter estimation approach, and report some encouraging experimental results. Reading off independence constraints from an ADMG can be done with a procedure essentially identical to d-separation (Pearl, 1988; Richardson and Spirtes, 2002). Given a graphical structure, the challenge is to provide a procedure to parameterize models that correspond to the independence constraints of the graph. For example, bidirected edges correspond to some hidden common parent that has been marginalized; in the Gaussian case, this has an easy interpretation as constraints in the marginal covariance matrix of the remaining variables.
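The latent-parent reading of bidirected edges mentioned at the end of this abstract can be verified numerically; the coefficients below are arbitrary illustrative values. Marginalizing a Gaussian latent parent H of X1 and X2 yields a covariance matrix whose only nonzero off-diagonal entry among the observed variables is Cov(X1, X2), matching the graph X1 <-> X2 with X3 isolated:

```python
import numpy as np

# Linear Gaussian model: latent H -> X1 and H -> X2; X3 independent of H.
b1, b2 = 0.9, 0.7          # illustrative edge coefficients (assumed)
var_h, noise = 1.0, 0.5    # latent variance and residual noise variance

# Joint covariance of (H, X1, X2, X3); marginalize H by dropping row/col 0.
full = np.array([
    [var_h,      b1 * var_h,             b2 * var_h,             0.0],
    [b1 * var_h, b1**2 * var_h + noise,  b1 * b2 * var_h,        0.0],
    [b2 * var_h, b1 * b2 * var_h,        b2**2 * var_h + noise,  0.0],
    [0.0,        0.0,                    0.0,                    noise],
])
marginal = full[1:, 1:]
print(marginal)  # Cov(X1, X2) = b1*b2*var_h; all entries against X3 are zero
```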
Clique Matrices for Statistical Graph Decomposition and Parameterising Restricted Positive Definite Matrices
Abstract

Cited by 2 (1 self)
We introduce Clique Matrices as an alternative representation of undirected graphs, a generalisation of the incidence matrix representation. Here we use clique matrices to decompose a graph into a set of possibly overlapping clusters, defined as well-connected subsets of vertices. The decomposition is based on a statistical description which encourages clusters to be well connected and few in number. Inference is carried out using a variational approximation. Clique matrices also play a natural role in parameterising positive definite matrices under zero constraints on elements of the matrix. We show that clique matrices can parameterise all positive definite matrices restricted according to a decomposable graph and form a structured Factor Analysis approximation in the non-decomposable case.
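The parameterisation role described in this abstract admits a minimal numerical sketch (the graph, clique matrix, and weights below are illustrative assumptions, not the paper's construction in full): with a 0/1 clique matrix Z whose rows index vertices and columns index cliques, matrices of the form (Z ∘ W)(Z ∘ W)ᵀ plus a positive diagonal are positive definite and carry structural zeros exactly at pairs of vertices sharing no clique:

```python
import numpy as np

# Undirected graph on vertices {0,1,2,3} with maximal cliques {0,1,2}
# and {2,3}.  Pairs (0,3) and (1,3) are non-adjacent.
Z = np.array([[1, 0],
              [1, 0],
              [1, 1],
              [0, 1]], dtype=float)

W = np.random.default_rng(1).normal(size=Z.shape)  # free (vertex, clique) weights
S = (Z * W) @ (Z * W).T + np.eye(4)                # positive definite by construction

assert np.all(np.linalg.eigvalsh(S) > 0)
# Non-adjacent pairs receive exact structural zeros:
assert S[0, 3] == 0.0 and S[1, 3] == 0.0
```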
Identifying Graph Clusters using Variational Inference and links to Covariance Parameterisation
Abstract

Cited by 1 (0 self)
Finding clusters of well-connected nodes in a graph is useful in many domains, including social network, Web, and molecular interaction analyses. From a computational viewpoint, finding these clusters or graph communities is a difficult problem. We consider the framework of clique matrices to decompose a graph into a set of possibly overlapping clusters, defined as well-connected subsets of vertices. The decomposition is based on a statistical description which encourages clusters to be well connected and few in number. The formal intractability of inferring the clusters is addressed using a variational approximation which has links to mean-field theories in statistical mechanics. Clique matrices also play a natural role in parameterising positive definite matrices under zero constraints on elements of the matrix. We show that clique matrices can parameterise all positive definite matrices restricted according to a decomposable graph and form a structured Factor Analysis approximation in the non-decomposable case.