Results 1–10 of 12
Greedy sparsity-constrained optimization
In Conference Record of the Forty-Fifth Asilomar Conference on Signals, Systems and Computers (ASILOMAR), IEEE, 2011
Cited by 4 (1 self)
Abstract—Finding optimal sparse solutions to estimation problems, particularly in underdetermined regimes, has recently gained much attention. Most existing literature studies linear models in which the squared error is used as the measure of discrepancy to be minimized. However, in many applications discrepancy is measured in more general forms, such as the log-likelihood. Regularization by the ℓ1-norm has been shown to induce sparse solutions, but their sparsity level can be merely suboptimal. In this paper we present a greedy algorithm, dubbed Gradient Support Pursuit (GraSP), for sparsity-constrained optimization. Quantifiable guarantees are provided for GraSP when cost functions have the “Stable Hessian Property”.
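The GraSP scheme summarized above alternates support identification from the gradient with hard thresholding. The following toy sketch substitutes a plain restricted gradient step for the paper's inner minimization; the function names, step size, and iteration count are illustrative assumptions, not the authors' code:

```python
import numpy as np

def grasp(grad, s, dim, step, iters):
    """Sketch of Gradient Support Pursuit (GraSP): merge the 2s
    largest-magnitude gradient coordinates with the current support,
    take a restricted gradient step (in place of the paper's inner
    minimization), then hard-threshold back to s nonzeros."""
    x = np.zeros(dim)
    for _ in range(iters):
        g = grad(x)
        Z = np.argsort(np.abs(g))[-2 * s:]    # 2s largest gradient coords
        T = np.union1d(Z, np.flatnonzero(x))  # merge with current support
        b = x.copy()
        b[T] -= step * g[T]                   # restricted gradient step
        keep = np.argsort(np.abs(b))[-s:]     # hard-threshold to s entries
        x = np.zeros(dim)
        x[keep] = b[keep]
    return x
```

On a well-conditioned least-squares toy problem (the squared-error special case the abstract contrasts with), the iteration recovers a sparse signal exactly.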
Learning Mixtures of Tree Graphical Models
Cited by 3 (0 self)
We consider unsupervised estimation of mixtures of discrete graphical models, where the class variable is hidden and each mixture component can have a potentially different Markov graph structure and parameters over the observed variables. We propose a novel method for estimating the mixture components with provable guarantees. Our output is a tree-mixture model which serves as a good approximation to the underlying graphical model mixture. The sample and computational requirements for our method scale as poly(p, r) for an r-component mixture of p-variate graphical models, for a wide class of models which includes tree mixtures and mixtures over bounded-degree graphs. Keywords: Graphical models, mixture models, spectral methods, tree approximation.
Greedy Learning of Graphical Models with Small Girth
Abstract—This paper develops two new greedy algorithms for learning the Markov graph of discrete probability distributions from samples thereof. For finding the neighborhood of a node (i.e. variable), the simple, naive greedy algorithm iteratively adds the new node that gives the biggest improvement in prediction performance over the existing set. While fast to implement, this can yield incorrect graphs when there are many short cycles, as the single node that gives the best prediction can then lie outside the neighborhood. Our new algorithms get around this in two different ways. The forward-backward greedy algorithm includes a deletion step, which goes back and prunes incorrect nodes that may have initially been added. The recursive greedy algorithm uses forward steps in a two-level process, running greedy iterations in an inner loop but only including the final node. We show, both analytically and empirically, that these algorithms can learn graphs with small girth which other algorithms, both greedy and those based on convex optimization, cannot.
Forward-Backward Greedy Algorithms for General Convex Smooth Functions over a Cardinality Constraint
We consider forward-backward greedy algorithms for solving sparse feature selection problems with general convex smooth functions. A state-of-the-art greedy method, the Forward-Backward greedy algorithm (FoBa-obj), requires solving a large number of optimization problems, so it is not scalable to large problems. The FoBa-gdt algorithm, which uses gradient information for feature selection at each forward iteration, significantly improves the efficiency of FoBa-obj. In this paper, we systematically analyze the theoretical properties of both algorithms. Our main contributions are: 1) we derive better theoretical bounds than existing analyses of FoBa-obj for general smooth convex functions; 2) we show that FoBa-gdt achieves the same theoretical performance as FoBa-obj under the same condition, the restricted strong convexity condition; our new bounds are consistent with the bounds of a special case (least squares) and fill a previously existing theoretical gap for general convex smooth functions; 3) we show that the restricted strong convexity condition is satisfied if the number of independent samples exceeds k̄ log d, where k̄ is the sparsity level and d is the dimension of the variable; 4) we apply FoBa-gdt (with a conditional random field objective) to the sensor selection problem for human indoor activity recognition, and our results show that FoBa-gdt outperforms other methods based on forward greedy selection.
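The efficiency gain of FoBa-gdt over FoBa-obj described above comes from the forward step: one gradient evaluation picks the next coordinate, instead of solving one subproblem per candidate. A least-squares sketch of this idea, with all names and thresholds as assumptions rather than the paper's notation:

```python
import numpy as np

def foba_gdt_ls(A, y, eps, nu=0.5):
    """FoBa-gdt sketch for f(x) = 0.5*||Ax - y||^2.
    Forward: add the coordinate with the largest gradient magnitude,
    then re-optimize on the enlarged support. Backward: drop any
    coordinate whose removal barely increases the objective."""
    d = A.shape[1]
    def solve(S):
        x = np.zeros(d)
        if S:
            x[S], *_ = np.linalg.lstsq(A[:, S], y, rcond=None)
        return x
    def f(x):
        return 0.5 * np.sum((A @ x - y) ** 2)
    S, x = [], np.zeros(d)
    while True:
        g = A.T @ (A @ x - y)          # gradient of f at x
        g[S] = 0.0                     # ignore coords already selected
        j = int(np.argmax(np.abs(g)))  # gradient-based forward pick
        x_new = solve(S + [j])
        if f(x) - f(x_new) < eps:      # forward gain too small: stop
            break
        S, x = S + [j], x_new
        while len(S) > 1:              # backward pruning
            losses = [f(solve([u for u in S if u != v])) for v in S]
            i = int(np.argmin(losses))
            if losses[i] - f(x) < nu * eps:
                S.pop(i)
                x = solve(S)
            else:
                break
    return x, sorted(S)
```

On a noiseless sparse least-squares instance the forward steps select exactly the true support and the loop stops once the fit is exact.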
Latent Graphical Model Selection: Efficient Methods for Locally Tree-like Graphs
Graphical model selection refers to the problem of estimating the unknown graph structure given observations at the nodes of the model. We consider a challenging instance of this problem in which some of the nodes are latent or hidden. We characterize conditions for tractable graph estimation and develop efficient methods with provable guarantees. We consider the class of Ising models Markov on locally tree-like graphs, which are in the regime of correlation decay. We propose an efficient method for graph estimation, and establish its structural consistency when the number of samples n scales as n = Ω(θ_min^(−δη(η+1)−2) log p), where θ_min is the minimum edge potential, δ is the depth (i.e. the distance from a hidden node to its nearest observed nodes), and η is a parameter that depends on the minimum and maximum node and edge potentials in the Ising model. The proposed method is practical to implement and provides flexibility to control the number of latent variables and the cycle lengths in the output graph. We also present necessary conditions for graph estimation by any method and show that our method nearly matches the lower bound on sample requirements.
Convergence Rates of Biased Stochastic Optimization for Learning Sparse Ising Models
We study the convergence rate of stochastic optimization of exact (NP-hard) objectives for which only biased estimates of the gradient are available. We motivate this problem in the context of learning the structure and parameters of Ising models. We first provide a convergence-rate analysis of deterministic errors for forward-backward splitting (FBS). We then extend our analysis to biased stochastic errors by first characterizing a family of samplers and providing a high-probability bound that covers not only FBS but also proximal gradient (PG) methods. We derive some interesting conclusions: FBS requires only a logarithmically increasing number of random samples in order to converge (although at a very low rate); the required number of random samples is the same for the deterministic and the biased stochastic setting for FBS and basic PG; accelerated PG is not guaranteed to converge in the biased stochastic setting.
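Forward-backward splitting, the method analyzed above, alternates a gradient ("forward") step, which may be a biased stochastic estimate in the paper's setting, with a proximal ("backward") step; for an ℓ1 regularizer the proximal step is soft thresholding. A minimal deterministic sketch, with all names and the step size as assumptions:

```python
import numpy as np

def fbs(grad, lam, dim, step, iters):
    """Sketch of forward-backward splitting for min_x f(x) + lam*||x||_1.
    `grad` returns (a possibly biased estimate of) the gradient of f."""
    x = np.zeros(dim)
    for _ in range(iters):
        z = x - step * grad(x)  # forward: gradient step on the smooth part
        # backward: prox of step*lam*||.||_1 is componentwise soft thresholding
        x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return x
```

For the separable quadratic f(x) = 0.5*||x − b||², the iterates converge to the soft-thresholded vector ST(b, lam), which is the known closed-form minimizer.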
Gradient Hard Thresholding Pursuit for Sparsity-Constrained Optimization
Hard Thresholding Pursuit (HTP) is an iterative greedy selection procedure for finding sparse solutions of underdetermined linear systems. This method has been shown to have strong theoretical guarantees and impressive numerical performance. In this paper, we generalize HTP from compressed sensing to a generic problem setup of sparsity-constrained convex optimization. The proposed algorithm iterates between a standard gradient descent step and a hard truncation step, with or without debiasing. We prove that our method enjoys strong guarantees analogous to those of HTP in terms of rate of convergence and parameter estimation accuracy. Numerical evidence shows that our method is superior to state-of-the-art greedy selection methods when applied to learning tasks of sparse logistic regression and sparse support vector machines.
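The iteration this abstract describes, a gradient descent step followed by hard truncation to the s largest-magnitude entries (shown here without the optional debiasing step), can be sketched as follows; function name, step size, and iteration count are illustrative assumptions:

```python
import numpy as np

def gradient_hard_thresholding(grad, s, dim, step, iters):
    """Sketch of gradient hard thresholding for sparsity-constrained
    convex optimization: full gradient step, then keep only the s
    largest-magnitude entries (the debiasing variant would
    additionally re-minimize over the selected support)."""
    x = np.zeros(dim)
    for _ in range(iters):
        z = x - step * grad(x)             # standard gradient descent step
        keep = np.argsort(np.abs(z))[-s:]  # hard truncation to s entries
        x = np.zeros(dim)
        x[keep] = z[keep]
    return x
```

For the separable quadratic f(x) = 0.5*||x − b||² with unit step, one iteration already lands on the best s-term approximation of b.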