Results 1–10 of 22
Sequential Quadratic Programming, 1995
Cited by 145 (4 self)
In this paper we examine the underlying ideas of the SQP method and the theory that establishes it as a framework from which effective algorithms can ...
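The core of an SQP iteration is solving a quadratic subproblem at each step; for equality constraints this reduces to one KKT linear system per iteration. The following is a minimal illustrative sketch (not the paper's algorithm): it uses the objective Hessian in place of the Lagrangian Hessian, and omits the line search and Hessian safeguards a practical SQP method needs.

```python
import numpy as np

def sqp_equality(f_grad, f_hess, c, c_jac, x0, iters=20):
    """Bare-bones SQP for min f(x) s.t. c(x) = 0.

    Each iteration solves the KKT system of the QP subproblem:
        [H  A^T] [dx ]   [-g]
        [A   0 ] [lam] = [-c]
    where g, H are the gradient/Hessian of f and A is the constraint
    Jacobian. Illustrative only: no globalization, H is assumed to
    make the KKT matrix nonsingular.
    """
    x = np.asarray(x0, dtype=float)
    lam = None
    for _ in range(iters):
        g, H = f_grad(x), f_hess(x)
        A, cv = c_jac(x), c(x)
        m, n = len(cv), len(x)
        K = np.block([[H, A.T], [A, np.zeros((m, m))]])
        sol = np.linalg.solve(K, np.concatenate([-g, -cv]))
        x = x + sol[:n]          # primal step dx
        lam = sol[n:]            # new multiplier estimate
    return x, lam
```

On a convex quadratic with a linear constraint, e.g. min x^2 + y^2 subject to x + y = 1, a single iteration already lands on the solution (0.5, 0.5).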
Fast Optimization Methods for L1 Regularization: A Comparative Study and Two New Approaches
"... Abstract. L1 regularization is effective for feature selection, but the resulting optimization is challenging due to the nondifferentiability of the 1norm. In this paper we compare stateoftheart optimization techniques to solve this problem across several loss functions. Furthermore, we propose ..."
Abstract

Cited by 77 (2 self)
 Add to MetaCart
(Show Context)
Abstract. L1 regularization is effective for feature selection, but the resulting optimization is challenging due to the nondifferentiability of the 1norm. In this paper we compare stateoftheart optimization techniques to solve this problem across several loss functions. Furthermore, we propose two new techniques. The first is based on a smooth (differentiable) convex approximation for the L1 regularizer that does not depend on any assumptions about the loss function used. The other technique is a new strategy that addresses the nondifferentiability of the L1regularizer by casting the problem as a constrained optimization problem that is then solved using a specialized gradient projection method. Extensive comparisons show that our newly proposed approaches consistently rank among the best in terms of convergence speed and efficiency by measuring the number of function evaluations required. 1
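One common way to smooth the L1 penalty, as the abstract's first technique does, is to replace |x| with a differentiable surrogate so plain gradient methods apply. The sketch below uses the surrogate sqrt(x^2 + eps) with fixed-step gradient descent on a least-squares loss; this is one standard choice for illustration, and the paper's actual approximation and solver differ in detail.

```python
import numpy as np

def l1_smooth_descent(A, b, lam, eps=1e-4, lr=0.01, iters=5000):
    """Gradient descent on the smoothed L1-regularized least squares
    objective  0.5*||Ax - b||^2 + lam * sum_i sqrt(x_i^2 + eps).

    sqrt(x^2 + eps) is a differentiable surrogate for |x| (illustrative
    choice); eps trades smoothness against fidelity to the 1-norm.
    """
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        # gradient of the smoothed penalty is lam * x / sqrt(x^2 + eps)
        grad = A.T @ (A @ x - b) + lam * x / np.sqrt(x**2 + eps)
        x -= lr * grad
    return x
```

With A = I this reduces to a smoothed soft-thresholding: large entries of b shrink by roughly lam, and entries below the threshold are driven close to zero rather than exactly to zero.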
Feature Selection via Mathematical Programming, 1997
Cited by 69 (22 self)
The problem of discriminating between two finite point sets in n-dimensional feature space by a separating plane that utilizes as few of the features as possible is formulated as a mathematical program with a parametric objective function and linear constraints. The step function that appears in the objective function can be approximated by a sigmoid or by a concave exponential on the nonnegative real line, or it can be treated exactly by considering the equivalent linear program with equilibrium constraints (LPEC). Computational tests of these three approaches on publicly available real-world databases have been carried out and compared with an adaptation of the optimal brain damage (OBD) method for reducing neural network complexity. One feature selection algorithm via concave minimization (FSV) reduced cross-validation error on a cancer prognosis database by 35.4% while reducing problem features from 32 to 4. Feature selection is an important problem in machine learning [18, 15, 1...
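The concave exponential mentioned above approximates the step function s(x) = 1 for x > 0, s(0) = 0 on the nonnegative reals, so that summing it over the magnitudes of a weight vector gives a smooth surrogate for the count of features used. A minimal sketch of this idea (the sharpness parameter alpha is illustrative, not a value from the paper):

```python
import numpy as np

def step_approx(x, alpha=5.0):
    """Concave exponential 1 - exp(-alpha*x), a smooth surrogate for
    the step function on x >= 0; larger alpha sharpens the approximation."""
    return 1.0 - np.exp(-alpha * np.asarray(x, dtype=float))

def approx_feature_count(w, alpha=5.0):
    """Smooth surrogate for the number of nonzero entries of w."""
    return step_approx(np.abs(w), alpha).sum()
```

For a weight vector with two clearly nonzero entries and two (near-)zero ones, the surrogate count sits close to 2, which is what makes it usable as a minimization term for feature suppression.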
Quadratically and Superlinearly Convergent Algorithms for the Solution of Inequality Constrained Minimization Problems, 1995
Cited by 31 (10 self)
In this paper some Newton and quasi-Newton algorithms for the solution of inequality constrained minimization problems are considered. All the algorithms described produce sequences {x^k} converging q-superlinearly to the solution. Furthermore, under mild assumptions, a q-quadratic convergence rate in x is also attained. Other features of these algorithms are that only the solution of linear systems of equations is required at each iteration, and that the strict complementarity assumption is never invoked. First the superlinear or quadratic convergence rate of a Newton-like algorithm is proved. Then a simpler version of this algorithm is studied and shown to be superlinearly convergent. Finally, quasi-Newton versions of the previous algorithms are considered and, provided the sequence defined by the algorithms converges, a characterization of superlinear convergence extending the result of Boggs, Tolle and Wang is given. Key Words: Inequality constrained optimization, New...
A Parallel Inexact Newton Method for Stochastic Programs with Recourse
Oper. Res., 1996
Cited by 8 (5 self)
A parallel inexact Newton method with a line search is proposed for two-stage quadratic stochastic programs with recourse. A lattice rule is used for the numerical evaluation of multidimensional integrals, and a parallel iterative method is used to solve the quadratic programming subproblems. Although the objective only has a locally Lipschitz gradient, global convergence and local superlinear convergence of the method are established. Furthermore, the method provides an error estimate which does not require much extra computation. The performance of the method is illustrated on a CM5 parallel computer. Keywords: Stochastic programming, inexact Newton method, parallel quadratic programming, numerical integration. Short title: Newton's method for stochastic programs. This work was supported by the Australian Research Council, and the numerical experiments were done on the Sydney Regional Centre for Parallel Computing CM5. 1 Introduction: Two-stage quadratic stochastic programs with re...
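A rank-1 lattice rule of the kind mentioned above approximates an integral over the unit cube by averaging the integrand over the n points {i*z/n mod 1}, i = 0..n-1, for a well-chosen generating vector z. The sketch below shows the mechanism only; the generating vector (a 2-D Fibonacci lattice) is an illustrative textbook choice, not the rule used in the paper.

```python
import numpy as np

def lattice_rule(f, z, n):
    """Rank-1 lattice rule on [0,1)^d: average f over the points
    { (i * z / n) mod 1 : i = 0, ..., n-1 }.

    z is the generating vector (length d); quality of the estimate
    depends entirely on choosing z well for the given n.
    """
    i = np.arange(n).reshape(-1, 1)
    pts = (i * np.asarray(z, dtype=float) / n) % 1.0  # n x d points
    return float(np.mean([f(p) for p in pts]))
```

With the Fibonacci pair n = 144, z = (1, 89), the rule integrates the smooth test function f(p) = p1*p2 (exact value 1/4 over the unit square) to within a few thousandths using only 144 evaluations.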
Mathematical Programming Approaches to Machine Learning and Data Mining, 1998
Cited by 6 (0 self)
Machine learning problems of supervised classification, unsupervised clustering and parsimonious approximation are formulated as mathematical programs. The feature selection problem arising in the supervised classification task is effectively addressed by calculating a separating plane that minimizes the separation error and the number of problem features utilized. The support vector machine approach is formulated using various norms to measure the margin of separation. The clustering problem of assigning m points in n-dimensional real space to k clusters is formulated as minimizing a piecewise-linear concave function over a polyhedral set. This problem is also formulated in a novel fashion by minimizing the sum of squared distances of data points to the nearest of the cluster planes characterizing the k clusters. The problem of obtaining a parsimonious solution to a linear system whose right-hand-side vector may be corrupted by noise is formulated as minimizing the system residual plus either the number of nonzero elements in the solution vector or the norm of the solution vector. The feature selection problem, the clustering problem and the parsimonious approximation problem can all be stated as the minimization of a concave function over a polyhedral region and are solved by a theoretically justifiable, fast and finite successive linearization algorithm. Numerical tests indicate the utility and efficiency of these formulations on real-world databases. In particular, the feature selection approach via concave minimization computes a separating-plane-based classifier that improves upon the generalization ability of a separating plane computed without feature suppression. This approach produces classifiers utilizing fewer original problem features than the support vector machin...
Analysis and Restructuring of a Method for the Direct Solution of Optimal Control Problems (The Theory of MUSCOD in a Nutshell), 1995
Cited by 5 (0 self)
MUSCOD (MUltiple Shooting COde for Direct Optimal Control) is the implementation of an algorithm for the direct solution of optimal control problems. The method is based on multiple shooting combined with a sequential quadratic programming (SQP) technique; its original version was developed in the early 1980s by Plitt under the supervision of Bock [Plitt81, Bock84]. The following report is intended to describe the basic aspects of the underlying theory in a concise but readable form. Such a description is not yet available: the paper by Bock and Plitt [Bock84] gives a good overview of the method but leaves out too many important details to be a complete reference, while the diploma thesis by Plitt [Plitt81], on the other hand, presents a fairly complete description but is rather difficult to read. Throughout the present document, emphasis is given to a clear presentation of the concepts upon which MUSCOD is based. An effort has been made to properly reflect the structure of the a...
Enlarging the Region of Convergence of Newton's Method for Constrained Optimization, 1982
Cited by 4 (0 self)
In this paper, we consider Newton's method for solving the system of necessary optimality conditions of optimization problems with equality and inequality constraints. The principal drawbacks of the method are the need for a good starting point, the inability to distinguish between local maxima and local minima, and, when inequality constraints are present, the necessity of solving a quadratic programming problem at each iteration. We show that all these drawbacks can be overcome to a great extent, without sacrificing the superlinear convergence rate, by making use of the exact differentiable penalty functions introduced by Di Pillo and Grippo (Ref. 1). We also show that there is a close relationship between the class of penalty functions of Di Pillo and Grippo and the class of Fletcher (Ref. 2), and that the region of convergence of a variation of Newton's method can be enlarged by making use of one of Fletcher's penalty functions.
Global convergence of an SQP method without boundedness assumptions on any of the iterative sequences
, 2009
A smoothing SQP method for solving degenerate nonsmooth constrained optimization problems with applications to bilevel programs
SIAM Journal on Optimization
Cited by 2 (2 self)
Abstract. We consider a degenerate nonsmooth and nonconvex optimization problem for which standard constraint qualifications such as the generalized Mangasarian–Fromovitz constraint qualification (GMFCQ) may not hold. We use smoothing functions with the gradient consistency property to approximate the nonsmooth functions and introduce a smoothing sequential quadratic programming (SQP) algorithm under the l∞ penalty framework. We show that any accumulation point of a selected subsequence of the iteration sequence generated by the smoothing SQP algorithm is a Clarke stationary point, provided that the sequence of multipliers and the sequence of penalty parameters are bounded. Furthermore, we propose a new condition called the weakly generalized Mangasarian–Fromovitz constraint qualification (WGMFCQ), which is weaker than the GMFCQ. We show that the extended version of the WGMFCQ guarantees the boundedness of the sequence of multipliers and the sequence of penalty parameters, and thus the global convergence of the smoothing SQP algorithm. We demonstrate that the WGMFCQ can be satisfied by bilevel programs for which the GMFCQ never holds. Preliminary numerical experiments show that the algorithm is efficient for solving degenerate nonsmooth optimization problems such as the simple bilevel program.