Results 1 - 9 of 9
Greedy sparsity-constrained optimization
in Signals, Systems and Computers (ASILOMAR), 2011 Conference Record of the Forty Fifth Asilomar Conference on, IEEE, 2011
Cited by 3 (1 self)
Finding optimal sparse solutions to estimation problems, particularly in underdetermined regimes, has recently gained much attention. Most of the existing literature studies linear models in which the squared error is used as the measure of discrepancy to be minimized. However, in many applications discrepancy is measured in more general forms such as log-likelihood. Regularization by the ℓ1-norm has been shown to induce sparse solutions, but their sparsity level can be merely suboptimal. In this paper we present a greedy algorithm, dubbed Gradient Support Pursuit (GraSP), for sparsity-constrained optimization. Quantifiable guarantees are provided for GraSP when cost functions have the “Stable Hessian Property”.
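To make the idea of sparsity-constrained optimization concrete, the sketch below minimizes a squared-error cost subject to at most s nonzero entries. It is not GraSP: it uses the simpler iterative hard thresholding scheme (a gradient step followed by keeping the s largest-magnitude entries), swapped in purely for illustration. The function names, the toy matrix A, the step size, and the iteration count are all assumptions and do not come from the paper.

```python
def hard_threshold(x, s):
    """Keep the s largest-magnitude entries of x and zero out the rest."""
    keep = sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)[:s]
    out = [0.0] * len(x)
    for i in keep:
        out[i] = x[i]
    return out


def gradient_least_squares(A, y, x):
    """Gradient of 0.5 * ||A x - y||^2, which is A^T (A x - y)."""
    residual = [sum(a * xi for a, xi in zip(row, x)) - yi
                for row, yi in zip(A, y)]
    return [sum(A[r][j] * residual[r] for r in range(len(A)))
            for j in range(len(x))]


def iterative_hard_thresholding(A, y, s, step=0.5, iters=200):
    """Illustrative solver (an assumption, not GraSP): gradient step, then
    projection onto the set of s-sparse vectors."""
    x = [0.0] * len(A[0])
    for _ in range(iters):
        g = gradient_least_squares(A, y, x)
        x = hard_threshold([xi - step * gi for xi, gi in zip(x, g)], s)
    return x


# Toy underdetermined problem (3 equations, 4 unknowns); values are made up.
A = [[1.0, 0.0, 0.0, 0.5],
     [0.0, 1.0, 0.0, 0.5],
     [0.0, 0.0, 1.0, 0.5]]
y = [0.0, 2.0, 0.0]          # generated by the 1-sparse vector [0, 2, 0, 0]
x_hat = iterative_hard_thresholding(A, y, s=1)
```

On this toy instance the iterate keeps only the second coordinate and moves it toward 2. Swapping `gradient_least_squares` for the gradient of another smooth cost (for example a negative log-likelihood) leaves the rest of the loop unchanged, which is the sense in which the constraint is decoupled from the choice of discrepancy measure.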
High-dimensional Sparse Inverse Covariance Estimation using Greedy Methods
Cited by 2 (0 self)
In this paper we consider the task of estimating the nonzero pattern of the sparse inverse covariance matrix of a zero-mean Gaussian random vector from a set of i.i.d. samples. Note that this is also equivalent to recovering the underlying graph structure of a sparse Gaussian Markov Random Field (GMRF). We present two novel greedy approaches to solving this problem. The first estimates the nonzero covariates of the overall inverse covariance matrix using a series of global forward and backward greedy steps. The second estimates the neighborhood of each node in the graph separately, again using greedy forward and backward steps, and combines the intermediate neighborhoods to form an overall estimate. The principal contribution of this paper is a rigorous analysis of the sparsistency of these two greedy procedures, that is, their consistency in recovering the sparsity pattern of the inverse covariance matrix. Surprisingly, we show that both the local and global greedy methods learn the full structure of the model with high probability given just O(d log(p)) samples, which is a significant improvement over the state-of-the-art ℓ1-regularized Gaussian MLE (Graphical Lasso), which requires O(d^2 log(p)) samples. Moreover, the restricted eigenvalue and smoothness conditions imposed by our greedy methods are much weaker than the strong irrepresentable conditions required by the ℓ1-regularization-based methods. We corroborate our results with extensive simulations and examples, comparing our local and
Learning Mixtures of Tree Graphical Models
Cited by 1 (0 self)
We consider unsupervised estimation of mixtures of discrete graphical models, where the class variable is hidden and each mixture component can have a potentially different Markov graph structure and parameters over the observed variables. We propose a novel method for estimating the mixture components with provable guarantees. Our output is a tree-mixture model which serves as a good approximation to the underlying graphical model mixture. The sample and computational requirements for our method scale as poly(p, r), for an r-component mixture of p-variate graphical models, for a wide class of models which includes tree mixtures and mixtures over bounded degree graphs. Keywords: Graphical models, mixture models, spectral methods, tree approximation.
Greedy Structure Learning of Markov Random Fields
I would like to thank my advisor, Pradeep Ravikumar, for inspiration, guidance, and encouragement on this work. In addition, I would like to thank Ali Jalali for his collaboration and work on the proof techniques and theoretical analysis used in this paper. I would also like to thank Inderjit Dhillon and the students of his lab for motivation and many stimulating conversations regarding Machine Learning, Data Mining, and Statistics. Finally, I would like to thank my friends and family for their faith and encouragement in my many late nights of research and writing. I couldn’t have finished this work without their support.
Greedy Learning of Graphical Models with Small Girth
This paper develops two new greedy algorithms for learning the Markov graph of discrete probability distributions from samples thereof. For finding the neighborhood of a node (i.e., variable), the simple, naive greedy algorithm iteratively adds the new node that gives the biggest improvement in prediction performance over the existing set. While fast to implement, this can yield incorrect graphs when there are many short cycles, as now the single node that gives the best prediction can be outside the neighborhood. Our new algorithms get around this in two different ways. The forward-backward greedy algorithm includes a deletion step, which goes back and prunes incorrect nodes that may have initially been added. The recursive greedy algorithm uses forward steps in a two-level process, running greedy iterations in an inner loop, but only including the final node. We show, both analytically and empirically, that these algorithms can learn graphs with small girth that other algorithms, both greedy and those based on convex optimization, cannot.
Convergence Rates of Biased Stochastic Optimization for Learning Sparse Ising Models
We study the convergence rate of stochastic optimization of exact (NP-hard) objectives, for which only biased estimates of the gradient are available. We motivate this problem in the context of learning the structure and parameters of Ising models. We first provide a convergence-rate analysis of deterministic errors for forward-backward splitting (FBS). We then extend our analysis to biased stochastic errors, by first characterizing a family of samplers and providing a high probability bound that allows understanding not only FBS, but also proximal gradient (PG) methods. We derive some interesting conclusions: FBS requires only a logarithmically increasing number of random samples in order to converge (although at a very low rate); the required number of random samples is the same for the deterministic and the biased stochastic setting for FBS and basic PG; accelerated PG is not guaranteed to converge in the biased stochastic setting.
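As a reminder of what a forward-backward splitting iteration looks like, the sketch below alternates a forward (gradient) step on a smooth term with a backward (proximal) step on an ℓ1 penalty, whose proximal operator is soft-thresholding. It assumes exact gradients and a toy quadratic smooth term, so it does not model the biased stochastic gradients or the Ising objective studied in the paper. The function names, the vector b, the penalty weight, and the step size are illustrative assumptions.

```python
def soft_threshold(v, tau):
    """Proximal operator of tau * |.|: shrink v toward zero by tau."""
    if v > tau:
        return v - tau
    if v < -tau:
        return v + tau
    return 0.0


def forward_backward_splitting(grad, x0, lam, step, iters=100):
    """Minimize f(x) + lam * ||x||_1 given grad, the gradient of the smooth f."""
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        # Forward step: plain gradient descent on the smooth term.
        forward = [xi - step * gi for xi, gi in zip(x, g)]
        # Backward step: proximal map of the l1 penalty, scaled by the step.
        x = [soft_threshold(v, step * lam) for v in forward]
    return x


# Toy smooth term f(x) = 0.5 * ||x - b||^2 (an assumption for illustration).
b = [3.0, -0.5, 1.5]
grad = lambda x: [xi - bi for xi, bi in zip(x, b)]
x_hat = forward_backward_splitting(grad, [0.0, 0.0, 0.0], lam=1.0, step=0.5)
```

For this toy objective the minimizer is the soft-thresholded vector b, so the iterates approach [2, 0, 0.5] and the second coordinate is set exactly to zero. Replacing `grad` with a noisy or biased estimator would give the kind of setting the abstract analyzes, though nothing here reproduces that analysis.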
Where
, 2012
Sparse: # of off-diagonal nonzeros in inverse covariance matrix is small. $\Theta^* = \Sigma^{-1}$, $X_1, \ldots, X_p \sim N(0, \Sigma)$
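A small numerical illustration of this sparsity notion: the sketch below takes a hypothetical 3 x 3 inverse covariance matrix with a chain structure (the matrix is an assumption chosen for illustration, not data from the source), inverts it with a plain Gauss-Jordan routine, and counts off-diagonal nonzeros in both matrices. The helper names are likewise illustrative.

```python
def invert(M):
    """Invert a small square matrix by Gauss-Jordan elimination
    with partial pivoting."""
    n = len(M)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc
                          for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def offdiag_nonzeros(M, tol=1e-12):
    """Count off-diagonal entries whose magnitude exceeds tol."""
    n = len(M)
    return sum(1 for i in range(n) for j in range(n)
               if i != j and abs(M[i][j]) > tol)


# Hypothetical chain-structured inverse covariance: X1 - X2 - X3.
theta = [[2, -1, 0],
         [-1, 2, -1],
         [0, -1, 2]]
sigma = invert(theta)  # the corresponding covariance matrix
```

Here `theta` has only four off-diagonal nonzeros (two undirected edges), while every entry of `sigma` is nonzero. The sparsity assumption is therefore placed on the inverse covariance matrix rather than on the covariance itself.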
Latent Graphical Model Selection: Efficient Methods for Locally Tree-like Graphs
Graphical model selection refers to the problem of estimating the unknown graph structure given observations at the nodes in the model. We consider a challenging instance of this problem when some of the nodes are latent or hidden. We characterize conditions for tractable graph estimation and develop efficient methods with provable guarantees. We consider the class of Ising models Markov on locally tree-like graphs, which are in the regime of correlation decay. We propose an efficient method for graph estimation, and establish its structural consistency when the number of samples n scales as $n = \Omega(\theta_{\min}^{-\delta\eta(\eta+1)-2} \log p)$, where $\theta_{\min}$ is the minimum edge potential, δ is the depth (i.e., distance from a hidden node to the nearest observed nodes), and η is a parameter which depends on the minimum and maximum node and edge potentials in the Ising model. The proposed method is practical to implement and provides flexibility to control the number of latent variables and the cycle lengths in the output graph. We also present necessary conditions for graph estimation by any method and show that our method nearly matches the lower bound on sample requirements.