Results 11 - 20 of 681
Coordinate Descent for mixed-norm NMF
2013
"... Nonnegative matrix factorization (NMF) is widely used in a variety of machine learning tasks involving speech, documents and images. Being able to specify the structure of the matrix factors is crucial in incorporating prior information. The factors correspond to the feature matrix and the learnt re ..."
Abstract
- Add to MetaCart
representation. In particular, we allow an user-friendly specification of sparsity on the groups of features using the L1/L2 measure. Also, we propose a pairwise coordinate descent algorithm to minimize the objective. Experimental evidence of the efficacy of this approach is provided on the ORL faces dataset.
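The snippet does not define the L1/L2 measure; a common ratio-based choice in the sparse-NMF literature is a Hoyer-style sparseness score, which maps a vector to [0, 1]. The sketch below assumes that (or a similar) definition and applies it per column (group) of a hypothetical feature matrix W; the function name, matrix, and sizes are illustrative only.

```python
import numpy as np

def hoyer_sparseness(v, eps=1e-12):
    """Hoyer-style sparseness of a vector, based on the L1/L2 ratio.

    Returns 1.0 for a vector with a single nonzero entry and 0.0 for a
    vector whose entries are all equal (the densest case).
    """
    v = np.abs(np.asarray(v, dtype=float))
    n = v.size
    l1 = v.sum()
    l2 = np.sqrt((v ** 2).sum()) + eps
    return (np.sqrt(n) - l1 / l2) / (np.sqrt(n) - 1.0)

# Per-group sparsity of the columns (features) of a factor matrix W.
rng = np.random.default_rng(0)
W = np.abs(rng.normal(size=(64, 10)))   # hypothetical 64 x 10 feature matrix
group_sparsity = [hoyer_sparseness(W[:, k]) for k in range(W.shape[1])]
print(np.round(group_sparsity, 3))
```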
On the complexity analysis of randomized block-coordinate descent methods
- Mathematical Programming, 2014
"... Abstract In this paper we analyze the randomized block-coordinate descent (RBCD) methods proposed in ..."
Cited by 31 (2 self)
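The snippet stops before describing the methods being analyzed; purely as a generic illustration, here is a minimal randomized block-coordinate descent step for a smooth least-squares objective, sampling one block uniformly at random and stepping with that block's Lipschitz constant (the quantity such complexity analyses typically track). The function, block partition, and data are assumptions for illustration, not the paper's setting.

```python
import numpy as np

def rbcd_least_squares(A, b, blocks, iters=2000, seed=0):
    """Randomized block-coordinate descent on f(x) = 0.5 * ||Ax - b||^2.

    `blocks` is a list of index arrays partitioning the coordinates.
    Each iteration picks one block uniformly at random and takes a
    gradient step scaled by that block's Lipschitz constant.
    """
    rng = np.random.default_rng(seed)
    x = np.zeros(A.shape[1])
    # Block Lipschitz constants: largest eigenvalue of A_S^T A_S per block S.
    L = [np.linalg.norm(A[:, S], 2) ** 2 for S in blocks]
    for _ in range(iters):
        k = rng.integers(len(blocks))
        S = blocks[k]
        grad_S = A[:, S].T @ (A @ x - b)   # block of the full gradient
        x[S] -= grad_S / L[k]
    return x

# Tiny usage example with two blocks (all values here are illustrative).
rng = np.random.default_rng(1)
A, b = rng.normal(size=(50, 6)), rng.normal(size=50)
blocks = [np.arange(0, 3), np.arange(3, 6)]
print(np.round(rbcd_least_squares(A, b, blocks), 3))
```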
A dual coordinate descent method for large-scale linear SVM.
- In ICML, 2008
"... Abstract In many applications, data appear with a huge number of instances as well as features. Linear Support Vector Machines (SVM) is one of the most popular tools to deal with such large-scale sparse data. This paper presents a novel dual coordinate descent method for linear SVM with L1-and L2-l ..."
Cited by 207 (20 self)
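For context, here is a minimal sketch of a dual coordinate descent update of this kind for the L1-loss case, maintaining the primal vector w so that each coordinate update is cheap. Shrinking heuristics and the bias term are omitted, and the data and parameter values are illustrative only.

```python
import numpy as np

def dual_cd_linear_svm(X, y, C=1.0, epochs=20, seed=0):
    """Dual coordinate descent sketch for an L1-loss linear SVM.

    Dual: min_a 0.5 * a^T Q a - sum(a), 0 <= a_i <= C, with
    Q_ij = y_i y_j x_i^T x_j. The primal vector w = sum_i a_i y_i x_i is
    kept up to date so each coordinate gradient is cheap to evaluate.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    a = np.zeros(n)
    w = np.zeros(d)
    Qii = (X ** 2).sum(axis=1)             # diagonal of Q
    for _ in range(epochs):
        for i in rng.permutation(n):       # random order each epoch
            if Qii[i] == 0.0:
                continue
            G = y[i] * (w @ X[i]) - 1.0    # gradient of the dual in a_i
            a_new = min(max(a[i] - G / Qii[i], 0.0), C)
            w += (a_new - a[i]) * y[i] * X[i]
            a[i] = a_new
    return w

# Toy usage on a (nearly) linearly separable problem, labels in {-1, +1}.
rng = np.random.default_rng(2)
X = rng.normal(size=(200, 5))
y = np.sign(X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]))
w = dual_cd_linear_svm(X, y)
print("training accuracy:", np.mean(np.sign(X @ w) == y))
```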
Perceptron learning with random coordinate descent
- California Institute of Technology, 2005
"... Abstract. A perceptron is a linear threshold classifier that separates examples with a hyperplane. It is perhaps the simplest learning model that is used standalone. In this paper, we propose a family of random coordinate descent algorithms for perceptron learning on binary classification problems. ..."
Cited by 6 (3 self)
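The snippet does not say how each coordinate is updated; the sketch below shows one plausible reading, in which a randomly chosen weight coordinate is re-optimized over a small candidate grid to reduce the number of training errors. The grid search and all parameter values are simplifications introduced here for illustration, not the paper's procedure.

```python
import numpy as np

def random_cd_perceptron(X, y, iters=300, seed=0):
    """Random coordinate descent sketch for a linear threshold classifier.

    Each iteration picks one weight coordinate at random and replaces it
    with the value from a small candidate grid that minimizes the number
    of training errors (a crude stand-in for an exact 1-D search).
    """
    rng = np.random.default_rng(seed)
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])   # append a bias feature
    w = np.zeros(Xb.shape[1])
    errors = lambda v: np.sum(np.sign(Xb @ v) != y)
    for _ in range(iters):
        j = rng.integers(Xb.shape[1])
        best_v, best_err = w[j], errors(w)
        for v in np.linspace(-3.0, 3.0, 61):        # candidate values for w_j
            w_try = w.copy()
            w_try[j] = v
            err = errors(w_try)
            if err < best_err:
                best_v, best_err = v, err
        w[j] = best_v
    return w

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 3))
y = np.sign(X @ np.array([2.0, -1.0, 0.5]) + 0.3)
w = random_cd_perceptron(X, y)
Xb = np.hstack([X, np.ones((100, 1))])
print("training errors:", np.sum(np.sign(Xb @ w) != y))
```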
Coordinate descent algorithms for lasso penalized regression
- Ann. Appl. Stat., 2008
"... Imposition of a lasso penalty shrinks parameter estimates toward zero and performs continuous model selection. Lasso penalized regression is capable of handling linear regression problems where the number of predictors far exceeds the number of cases. This paper tests two exceptionally fast algorith ..."
Abstract
-
Cited by 109 (3 self)
- Add to MetaCart
algorithms for estimating regression coefficients with a lasso penalty. The previously known ℓ2 algorithm is based on cyclic coordinate descent. Our new ℓ1 algorithm is based on greedy coordinate descent and Edgeworth’s algorithm for ordinary ℓ1 regression. Each algorithm relies on a tuning constant that can
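For concreteness, here is a minimal cyclic coordinate descent sketch for the lasso objective itself, using the standard soft-thresholding update for each coefficient. This is the generic textbook scheme, not necessarily either of the two algorithms tested in the paper; the data and tuning value are illustrative.

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_cd(X, y, lam, sweeps=100):
    """Cyclic coordinate descent for 0.5 * ||y - X b||^2 + lam * ||b||_1.

    Each coordinate is minimized exactly via soft-thresholding while the
    others are held fixed; the residual is updated incrementally.
    """
    n, p = X.shape
    b = np.zeros(p)
    r = y - X @ b                       # current residual
    col_sq = (X ** 2).sum(axis=0)       # x_j^T x_j for each column
    for _ in range(sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            rho = X[:, j] @ r + col_sq[j] * b[j]   # fit to partial residual
            b_new = soft_threshold(rho, lam) / col_sq[j]
            r += X[:, j] * (b[j] - b_new)          # keep residual in sync
            b[j] = b_new
    return b

rng = np.random.default_rng(4)
X = rng.normal(size=(80, 20))
true_b = np.zeros(20)
true_b[:3] = [3.0, -2.0, 1.5]
y = X @ true_b + 0.1 * rng.normal(size=80)
print(np.round(lasso_cd(X, y, lam=5.0), 2))
```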
SparseNet: Coordinate Descent with Non-Convex Penalties
2009
"... We address the problem of sparse selection in linear models. A number of non-convex penalties have been proposed for this purpose, along with a variety of convex-relaxation algorithms for finding good solutions. In this paper we pursue the coordinate-descent approach for optimization, and study its ..."
Cited by 71 (0 self)
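As a small illustration of what changes relative to the lasso, the one-dimensional subproblem under the MC+ (minimax concave) penalty, one of the non-convex families studied in this line of work, still has a closed-form thresholding solution. The sketch below assumes a standardized predictor (unit squared column norm) and shows the standard MCP rule, not necessarily SparseNet's full recipe (which also involves penalty calibration and warm starts).

```python
import numpy as np

def mcp_threshold(z, lam, gamma):
    """Closed-form minimizer of 0.5 * (z - b)^2 + MCP(b; lam, gamma), gamma > 1.

    Inside |z| <= gamma * lam the solution is a rescaled soft-threshold;
    outside that region the penalty is flat and the solution is just z.
    """
    if abs(z) <= gamma * lam:
        return np.sign(z) * max(abs(z) - lam, 0.0) / (1.0 - 1.0 / gamma)
    return z

# Compared with the lasso's soft-threshold, large coefficients are left
# (nearly) unshrunk while small ones are still set exactly to zero.
for z in [0.5, 1.5, 4.0]:
    print(z, "->", round(mcp_threshold(z, lam=1.0, gamma=3.0), 3))
```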
Tree Block Coordinate Descent for MAP in Graphical Models
"... A number of linear programming relaxations have been proposed for finding most likely settings of the variables (MAP) in large probabilistic models. The relaxations are often succinctly expressed in the dual and reduce to different types of reparameterizations of the original model. The dual objecti ..."
Abstract
-
Cited by 39 (3 self)
- Add to MetaCart
objectives are typically solved by performing local block coordinate descent steps. In this work, we show how to perform block coordinate descent on spanning trees of the graphical model. We also show how all of the earlier dual algorithms are related to each other, giving transformations from one type
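Spanning trees are a natural block because MAP over a tree-structured model can be solved exactly by dynamic programming. The sketch below runs exact max-sum MAP on a simple chain (a special case of a tree) with tabulated potentials, purely to illustrate the kind of subroutine such tree-block updates rely on; the potentials and sizes are made up.

```python
import numpy as np

def map_on_chain(node_pot, edge_pot):
    """Exact MAP assignment for a chain MRF by max-sum dynamic programming.

    node_pot: list of length-n arrays, node_pot[i][s] = theta_i(s).
    edge_pot: list of n-1 matrices, edge_pot[i][s, t] = theta_{i,i+1}(s, t).
    Maximizes sum_i theta_i(x_i) + sum_i theta_{i,i+1}(x_i, x_{i+1}).
    """
    n = len(node_pot)
    msgs, back = [node_pot[0]], []
    for i in range(1, n):
        # scores[s, t] = best score of a prefix ending with x_{i-1}=s, x_i=t
        scores = msgs[-1][:, None] + edge_pot[i - 1] + node_pot[i][None, :]
        back.append(scores.argmax(axis=0))
        msgs.append(scores.max(axis=0))
    x = [int(msgs[-1].argmax())]
    for i in range(n - 2, -1, -1):
        x.append(int(back[i][x[-1]]))
    return x[::-1]

# Toy 3-node chain with 2 states per node; the edge potential favors agreement.
node_pot = [np.array([0.0, 1.0]), np.array([0.5, 0.0]), np.array([0.0, 0.3])]
edge_pot = [np.array([[1.0, 0.0], [0.0, 1.0]])] * 2
print(map_on_chain(node_pot, edge_pot))
```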
On the non-asymptotic convergence of cyclic coordinate descent methods
- SIAM Journal on Optimization
"... Abstract. Cyclic coordinate descent is a classic optimization method that has witnessed a resurgence of interest in signal processing, statistics, and machine learning. Reasons for this renewed interest include the simplicity, speed, and stability of the method, as well as its competitive per-forman ..."
Cited by 18 (0 self)
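For reference, the classic method the abstract refers to is easy to state: sweep over the coordinates in a fixed order and minimize over one at a time. A minimal sketch on a smooth least-squares objective, where each one-dimensional step is exact and the objective decreases monotonically; the comparison against a direct least-squares solve is just a sanity check on illustrative data.

```python
import numpy as np

def cyclic_cd_least_squares(A, b, sweeps=50):
    """Cyclic coordinate descent on f(x) = 0.5 * ||A x - b||^2.

    Each pass exactly minimizes f over one coordinate at a time, updating
    the residual incrementally.
    """
    n, p = A.shape
    x = np.zeros(p)
    r = A @ x - b                          # current residual
    col_sq = (A ** 2).sum(axis=0)
    for _ in range(sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            step = (A[:, j] @ r) / col_sq[j]   # exact 1-D minimizer offset
            x[j] -= step
            r -= step * A[:, j]
    return x

rng = np.random.default_rng(5)
A, b = rng.normal(size=(60, 8)), rng.normal(size=60)
x_cd = cyclic_cd_least_squares(A, b)
x_direct = np.linalg.lstsq(A, b, rcond=None)[0]
print("max |CD - lstsq|:", np.max(np.abs(x_cd - x_direct)))
```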
Block Coordinate Descent for Sparse NMF
2013
"... Nonnegative matrix factorization (NMF) has become a ubiquitous tool for data analysis. An important variant is the sparse NMF problem which arises when we explicitly require the learnt features to be sparse. A natural measure of sparsity is the L0 norm, however its optimization is NP-hard. Mixed nor ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
Nonnegative matrix factorization (NMF) has become a ubiquitous tool for data analysis. An important variant is the sparse NMF problem which arises when we explicitly require the learnt features to be sparse. A natural measure of sparsity is the L0 norm, however its optimization is NP-hard. Mixed norms, such as L1/L2 measure, have been shown to model sparsity robustly, based on intuitive attributes that such measures need to satisfy. This is in contrast to computationally cheaper alternatives such as the plain L1 norm. However, present algorithms designed for optimizing the mixed norm L1/L2 are slow and other formulations for sparse NMF have been proposed such as those based on L1 and L0 norms. Our proposed algorithm allows us to solve the mixed norm sparsity constraints while not sacrificing computation time. We present experimental evidence on real-world datasets that shows our new algorithm performs an order of magnitude faster compared to the current state-of-the- art solvers optimizing the mixed norm and is suitable for large-scale datasets.
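The abstract does not reproduce the update rules; as background only, here is a minimal HALS-style block coordinate descent sketch for plain NMF, updating one rank-one factor pair at a time under nonnegativity. This is the generic scheme, not the paper's sparse variant (which would additionally enforce the mixed-norm constraint); all names, dimensions, and data are illustrative.

```python
import numpy as np

def nmf_hals(V, r, sweeps=200, seed=0, eps=1e-10):
    """Block coordinate descent (HALS-style) for plain NMF: V ~= W @ H, W, H >= 0.

    Each sweep updates the r rank-one factor pairs one at a time, solving
    each nonnegative least-squares block in closed form.
    """
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = np.abs(rng.normal(size=(m, r)))
    H = np.abs(rng.normal(size=(r, n)))
    for _ in range(sweeps):
        VHt, HHt = V @ H.T, H @ H.T
        for k in range(r):
            num = VHt[:, k] - W @ HHt[:, k]
            W[:, k] = np.maximum(0.0, W[:, k] + num / (HHt[k, k] + eps))
        WtV, WtW = W.T @ V, W.T @ W
        for k in range(r):
            num = WtV[k, :] - WtW[k, :] @ H
            H[k, :] = np.maximum(0.0, H[k, :] + num / (WtW[k, k] + eps))
    return W, H

# Toy usage: recover an approximate low-rank nonnegative factorization.
rng = np.random.default_rng(6)
V = np.abs(rng.normal(size=(30, 4))) @ np.abs(rng.normal(size=(4, 40)))
W, H = nmf_hals(V, r=4)
print("relative error:", np.linalg.norm(V - W @ H) / np.linalg.norm(V))
```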