Results 1 - 10 of 55
High dimensional graphs and variable selection with the Lasso
Annals of Statistics, 2006
Cited by 399 (21 self)
The pattern of zero entries in the inverse covariance matrix of a multivariate normal distribution corresponds to conditional independence restrictions between variables. Covariance selection aims at estimating those structural zeros from data. We show that neighborhood selection with the Lasso is a computationally attractive alternative to standard covariance selection for sparse high-dimensional graphs. Neighborhood selection estimates the conditional independence restrictions separately for each node in the graph and is hence equivalent to variable selection for Gaussian linear models. We show that the proposed neighborhood selection scheme is consistent for sparse high-dimensional graphs. Consistency hinges on the choice of the penalty parameter. The oracle value for optimal prediction does not lead to a consistent neighborhood estimate. Controlling instead the probability of falsely joining some distinct connectivity components of the graph, consistent estimation for sparse graphs is achieved (with exponential rates), even when the number of variables grows as the number of observations raised to an arbitrary power.
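The node-by-node procedure described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the covariance-form Lasso objective, the "AND" rule for combining neighborhoods, the three-variable chain covariance, and the penalty value 0.2 are all assumptions chosen for illustration. In particular, the penalty is not the consistency-preserving choice the paper analyzes.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the Lasso proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_neighborhood(cov, node, lam, sweeps=200):
    """Lasso-regress `node` on all other variables using second moments only.

    Minimizes 0.5 * t'Ct - c't + lam * ||t||_1 by coordinate descent, where C
    is the covariance among the predictors and c their covariance with `node`.
    Returns the set of predictors whose coefficient is nonzero.
    """
    others = [j for j in range(len(cov)) if j != node]
    theta = {j: 0.0 for j in others}
    for _ in range(sweeps):
        for j in others:
            # Covariance of predictor j with the current partial residual.
            partial = cov[j][node] - sum(
                cov[j][k] * theta[k] for k in others if k != j
            )
            theta[j] = soft_threshold(partial, lam) / cov[j][j]
    return {j for j in others if theta[j] != 0.0}


def select_graph(cov, lam):
    """Keep edge (a, b) only if each node selects the other (AND rule)."""
    p = len(cov)
    nbrs = [lasso_neighborhood(cov, a, lam) for a in range(p)]
    return {
        (a, b)
        for a in range(p)
        for b in range(a + 1, p)
        if b in nbrs[a] and a in nbrs[b]
    }


# Hypothetical covariance of a chain 0 - 1 - 2 with unit variances and
# correlation 0.6 between neighbors, so variables 0 and 2 are conditionally
# independent given variable 1.
CHAIN_COV = [
    [1.0, 0.6, 0.36],
    [0.6, 1.0, 0.6],
    [0.36, 0.6, 1.0],
]

edges = select_graph(CHAIN_COV, lam=0.2)
print(sorted(edges))
```

With this input the sketch keeps the two chain edges and drops the pair (0, 2), whose marginal correlation of 0.36 is explained entirely by variable 1.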
Model selection through sparse maximum likelihood estimation
Journal of Machine Learning Research, 2008
Cited by 158 (1 self)
We consider the problem of estimating the parameters of a Gaussian or binary distribution in such a way that the resulting undirected graphical model is sparse. Our approach is to solve a maximum likelihood problem with an added ℓ1-norm penalty term. The problem as formulated is convex but the memory requirements and complexity of existing interior point methods are prohibitive for problems with more than tens of nodes. We present two new algorithms for solving problems with at least a thousand nodes in the Gaussian case. Our first algorithm uses block coordinate descent, and can be interpreted as recursive ℓ1-norm penalized regression. Our second algorithm, based on Nesterov’s first order method, yields a complexity estimate with a better dependence on problem size than existing interior point methods. Using a log determinant relaxation of the log partition function (Wainwright and Jordan, 2006), we show that these same algorithms can be used to solve an approximate sparse maximum likelihood problem for the binary case. We test our algorithms on synthetic data, as well as on gene expression and senate voting records data.
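For the Gaussian case, the penalized maximum likelihood problem mentioned in this abstract is commonly written as below. The symbols are not taken from the listing and are assumptions of this note: S is the sample covariance matrix, X the candidate inverse covariance, and λ > 0 the penalty weight.

```latex
\hat{X} \;=\; \arg\max_{X \succ 0}\;
  \log\det X \;-\; \operatorname{tr}(S X) \;-\; \lambda \lVert X \rVert_1,
\qquad
\lVert X \rVert_1 \;=\; \sum_{i,j} \lvert X_{ij} \rvert .
```

Zero entries of the maximizer correspond to missing edges in the estimated graph, so a larger λ gives a sparser model.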
Thin Junction Tree Filters for Simultaneous Localization and Mapping
In Intl. Joint Conf. on Artificial Intelligence (IJCAI), 2003
Cited by 126 (1 self)
Simultaneous Localization and Mapping (SLAM) is a fundamental problem in mobile robotics: while a robot navigates in an unknown environment, it must incrementally build a map of its surroundings and localize itself within that map. Traditional approaches to the problem are based upon Kalman filters, but suffer from complexity issues: the size of the belief state and the time complexity of the filtering operation grow quadratically in the size of the map. This paper presents a filtering technique that maintains a tractable approximation of the filtered belief state as a thin junction tree. The junction tree grows under measurement and motion updates and is periodically "thinned" to remain tractable via efficient maximum likelihood projections. When applied to the SLAM problem, these thin junction tree filters have a linear-space belief state representation, and use a linear-time filtering operation. Further approximation can yield a constant-time filtering operation, at the expense of delaying the incorporation of observations into the majority of the map. Experiments on a suite of SLAM problems validate the approach.
Walk-Sums and Belief Propagation in Gaussian Graphical Models
Journal of Machine Learning Research, 2006
Cited by 66 (14 self)
We present a new framework based on walks in a graph for analysis and inference in Gaussian graphical models. The key idea is to decompose the correlation between each pair of variables as a sum over all walks between those variables in the graph. The weight of each walk is given by a product of edge-wise partial correlation coefficients. This representation holds for a large class of Gaussian graphical models which we call walk-summable. We give a precise characterization of this class of models, and relate it to other classes including diagonally dominant, attractive, non-frustrated, and pairwise-normalizable. We provide a walk-sum interpretation of Gaussian belief propagation in trees and of the approximate method of loopy belief propagation in graphs with cycles. The walk-sum perspective leads to a better understanding of Gaussian belief propagation and to stronger results for its convergence in loopy graphs.
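The walk-sum decomposition in this abstract can be checked numerically on a toy model. The sketch below assumes an information matrix normalized to J = I - R, where R holds the edge-wise partial correlations; the three-node chain, the value 0.3, and all function names are hypothetical. Summing the powers of R adds up the weights of walks of every length, which should reproduce the inverse of J when the model is walk-summable.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def walk_sum_covariance(r, terms=60):
    """Sum R^k for k = 0 .. terms-1.

    Entry (i, j) of R^k is the total weight of all length-k walks from i to j,
    so the running sum accumulates walk weights of every length.
    """
    n = len(r)
    total = [[0.0] * n for _ in range(n)]
    power = identity(n)
    for _ in range(terms):
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
        power = matmul(power, r)
    return total


# Hypothetical partial correlations on the chain 0 - 1 - 2.
R = [
    [0.0, 0.3, 0.0],
    [0.3, 0.0, 0.3],
    [0.0, 0.3, 0.0],
]
J = [[identity(3)[i][j] - R[i][j] for j in range(3)] for i in range(3)]

P = walk_sum_covariance(R)
check = matmul(J, P)  # should be close to the identity matrix
max_error = max(
    abs(check[i][j] - identity(3)[i][j]) for i in range(3) for j in range(3)
)
print(max_error < 1e-9)
```

The series converges here because the spectral radius of the entrywise absolute value of R is below one, which is the walk-summability condition.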
Exactly sparse extended information filters for feature-based SLAM
Proceedings of the IJCAI Workshop on Reasoning with Uncertainty in Robotics, 2001
Cited by 48 (5 self)
Recent research concerning the Gaussian canonical form for Simultaneous Localization and Mapping (SLAM) has given rise to a handful of algorithms that attempt to solve the SLAM scalability problem for arbitrarily large environments. One such estimator that has received due attention is the Sparse Extended Information Filter (SEIF) by Thrun et al., which is reported to be nearly constant time, irrespective of the size of the map. The key to the SEIF’s scalability is to prune weak links in what is a dense information (inverse covariance) matrix to achieve a sparse approximation that allows for efficient, scalable SLAM. We demonstrate that the SEIF sparsification strategy yields error estimates that are overconfident when expressed in the global reference frame, while empirical results show that relative map consistency is maintained. In this paper, we propose an alternative scalable estimator based in the information form that maintains sparsity while preserving consistency. The paper describes a method for controlling the population of the information matrix, whereby we track a modified version of the SLAM posterior, essentially by ignoring a small fraction of temporal measurements. In this manner, the Exactly Sparse Extended Information Filter (ESEIF) performs inference over a model that is conservative relative to the standard Gaussian distribution. We compare our algorithm to the SEIF and standard EKF both in simulation and on two nonlinear datasets. The results convincingly show that our method yields conservative estimates for the robot pose and map that are nearly identical to those of the EKF.
Embedded Trees: Estimation of Gaussian Processes on Graphs with Cycles
IEEE Transactions on Signal Processing, 2002
Cited by 36 (13 self)
Graphical models provide a powerful general framework for encoding the structure of large-scale estimation problems. However, the graphs describing typical real-world phenomena contain many cycles, making direct estimation procedures prohibitively costly. In this paper, we develop an iterative inference algorithm for general Gaussian graphical models. It operates by exactly solving a series of modified estimation problems on spanning trees embedded within the original cyclic graph. When these subproblems are suitably chosen, the algorithm converges to the correct conditional means. Moreover, and in contrast to many other iterative methods, the tree-based procedures we propose can also be used to calculate exact error variances. Although the conditional mean iteration is effective for quite densely connected graphical models, the error variance computation is most efficient for sparser graphs. In this context, we present a modeling example which suggests that very sparsely connected graphs with cycles may provide significant advantages relative to their tree-structured counterparts, thanks both to the expressive power of these models and to the efficient inference algorithms developed herein.
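The conditional-mean iteration described in this abstract can be sketched on a four-node cycle. This is a simplified illustration under stated assumptions, not the paper's algorithm in full: the information matrix J, the potential vector h, the two choices of cut edge, and every function name are hypothetical, and a dense Gaussian elimination stands in for the tree-structured solver that makes each step cheap in practice. Each step splits J = J_T - K, where J_T lives on a spanning tree, and solves J_T x_new = K x_old + h.

```python
def solve(a, b):
    """Gaussian elimination with partial pivoting (stand-in for a tree solver)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def cut_edge(j, edge):
    """Split J = J_T - K, where J_T is J with `edge` removed."""
    a, b = edge
    j_tree = [row[:] for row in j]
    k = [[0.0] * len(j) for _ in j]
    k[a][b] = k[b][a] = -j[a][b]
    j_tree[a][b] = j_tree[b][a] = 0.0
    return j_tree, k


def embedded_trees(j, h, cuts, iterations=60):
    """Cycle through spanning trees, solving J_T x_new = K x_old + h each step."""
    splits = [cut_edge(j, edge) for edge in cuts]
    n = len(h)
    x = [0.0] * n
    for step in range(iterations):
        j_tree, k = splits[step % len(splits)]
        rhs = [h[i] + sum(k[i][c] * x[c] for c in range(n)) for i in range(n)]
        x = solve(j_tree, rhs)
    return x


# Hypothetical information matrix of the cycle 0 - 1 - 2 - 3 - 0.
J = [
    [1.0, -0.2, 0.0, -0.2],
    [-0.2, 1.0, -0.2, 0.0],
    [0.0, -0.2, 1.0, -0.2],
    [-0.2, 0.0, -0.2, 1.0],
]
H = [1.0, 0.0, 0.0, 0.0]

# Removing either edge leaves a chain, which is a spanning tree of the cycle.
x = embedded_trees(J, H, cuts=[(0, 3), (1, 2)])
residual = max(
    abs(sum(J[i][c] * x[c] for c in range(4)) - H[i]) for i in range(4)
)
print(residual < 1e-9)
```

A fixed point of the iteration satisfies J x = h, so a small residual indicates that the iterates have reached the conditional mean of this toy model. The error-variance computation mentioned in the abstract is not covered by this sketch.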
Latent Variable Graphical Model Selection via Convex Optimization
2010
Cited by 22 (2 self)
Suppose we have samples of a subset of a collection of random variables. No additional information is provided about the number of latent variables, nor of the relationship between the latent and observed variables. Is it possible to discover the number of hidden components, and to learn a statistical model over the entire collection of variables? We address this question in the setting in which the latent and observed variables are jointly Gaussian, with the conditional statistics of the observed variables conditioned on the latent variables being specified by a graphical model. As a first step we give natural conditions under which such latent-variable Gaussian graphical models are identifiable given marginal statistics of only the observed variables. Essentially these conditions require that the conditional graphical model among the observed variables is sparse, while the effect of the latent variables is “spread out” over most of the observed variables. Next we propose a tractable convex program based on regularized maximum-likelihood for model selection in this latent-variable setting; the regularizer uses both the ℓ1 norm and the nuclear norm. Our modeling framework can be viewed as a combination of dimensionality reduction (to identify latent variables) and graphical modeling (to capture remaining statistical structure not attributable to the latent variables), and it consistently estimates both the number of hidden components and the conditional graphical model structure among the observed variables. These results are applicable in the high-dimensional setting in which the number of latent/observed variables grows with the number of samples of the observed variables. The geometric properties of the algebraic varieties of sparse matrices and of low-rank matrices play an important role in our analysis.
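One common way to write a convex program of the kind described in this abstract is shown below. The notation is an assumption of this note rather than something stated in the listing: Σ_n is the sample covariance of the observed variables, S the sparse conditional precision matrix among them, L the low-rank term induced by marginalizing the latent variables, and λ_n, γ > 0 tuning parameters.

```latex
(\hat{S}, \hat{L}) \;=\; \arg\min_{S,\,L}\;
  -\log\det(S - L) \;+\; \operatorname{tr}\!\big((S - L)\,\Sigma_n\big)
  \;+\; \lambda_n \big( \gamma \lVert S \rVert_1 + \operatorname{tr}(L) \big)
\quad \text{subject to} \quad S - L \succ 0,\;\; L \succeq 0 .
```

Because L is constrained to be positive semidefinite, its trace equals its nuclear norm, so the penalty combines the ℓ1 norm and the nuclear norm as the abstract states. The rank of the estimated L then serves as the estimate of the number of hidden components.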
Lagrangian relaxation for MAP estimation in graphical models
In: 45th Annual Allerton Conference on Communication, Control and Computing, 2007
Cited by 21 (1 self)
We develop a general framework for MAP estimation in discrete and Gaussian graphical models using Lagrangian relaxation techniques. The key idea is to reformulate an intractable estimation problem as one defined on a more tractable graph, but subject to additional constraints. Relaxing these constraints gives a tractable dual problem, one defined by a thin graph, which is then optimized by an iterative procedure. When this iterative optimization leads to a consistent estimate, one which also satisfies the constraints, then it corresponds to an optimal MAP estimate of the original model. Otherwise there is a “duality gap”, and we obtain a bound on the optimal solution. Thus, our approach combines convex optimization with dynamic programming techniques applicable for thin graphs. The popular tree-reweighted max-product (TRMP) method may be seen as solving a particular class of such relaxations, where the intractable graph is relaxed to a set of spanning trees. We also consider relaxations to a set of small induced subgraphs, thin subgraphs (e.g. loops), and a connected tree obtained by “unwinding” cycles. In addition, we propose a new class of multiscale relaxations that introduce “summary” variables. The potential benefits of such generalizations include: reducing or eliminating the “duality gap” in hard problems, reducing the number of Lagrange multipliers in the dual problem, and accelerating convergence of the iterative optimization procedure.
From association to causation via regression
Indiana: University of Notre Dame, 1997
Cited by 16 (6 self)
For nearly a century, investigators in the social sciences have used regression models to deduce cause-and-effect relationships from patterns of association. Path models and automated search procedures are more recent developments. In my view, this enterprise has not been successful. The models tend to neglect the difficulties in establishing causal relations, and the mathematical complexities tend to obscure rather than clarify the assumptions on which the analysis is based. Formal statistical inference is, by its nature, conditional. If maintained hypotheses A, B, C,... hold, then H can be tested against the data. However, if A, B, C,... remain in doubt, so must inferences about H. Careful scrutiny of maintained hypotheses should therefore be a critical part of empirical work, a principle honored more often in the breach than the observance.