Results 1–10 of 63
First- and Second-Order Methods for Learning: Between Steepest Descent and Newton's Method
Neural Computation, 1992
"... Online first order backpropagation is sufficiently fast and effective for many largescale classification problems but for very high precision mappings, batch processing may be the method of choice. This paper reviews first and secondorder optimization methods for learning in feedforward neura ..."
Cited by 162 (7 self)
Abstract: Online first-order backpropagation is sufficiently fast and effective for many large-scale classification problems, but for very high precision mappings, batch processing may be the method of choice. This paper reviews first- and second-order optimization methods for learning in feedforward neural networks. The viewpoint is that of optimization: many methods can be cast in the language of optimization techniques, allowing the transfer to neural nets of detailed results about computational complexity and safety procedures to ensure convergence and to avoid numerical problems. The review is not intended to deliver detailed prescriptions for the most appropriate methods in specific applications, but to illustrate the main characteristics of the different methods and their mutual relations.
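As a hedged illustration (mine, not the paper's) of the span the title names, the sketch below contrasts many fixed-step steepest-descent iterations with a single Newton step on a toy quadratic; the matrix, step size, and iteration count are assumptions.

```python
# Illustrative only: steepest descent vs. one Newton step on a toy
# quadratic f(x) = 0.5 x^T A x - b^T x, whose unique minimizer solves Ax = b.
import numpy as np

A = np.array([[3.0, 0.5], [0.5, 1.0]])       # assumed SPD Hessian
b = np.array([1.0, -2.0])
grad = lambda x: A @ x - b

x_sd = np.zeros(2)
for _ in range(50):                          # first-order: many cheap steps
    x_sd = x_sd - 0.1 * grad(x_sd)           # assumed fixed step size

x_nt = -np.linalg.solve(A, grad(np.zeros(2)))  # one exact Newton step from 0

x_star = np.linalg.solve(A, b)
print(np.linalg.norm(x_sd - x_star), np.linalg.norm(x_nt - x_star))
```

On a quadratic the Newton step lands on the minimizer exactly, at the cost of forming and solving with the Hessian; that per-step cost versus per-step progress trade-off is the axis along which the review organizes the methods.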
Theory of Algorithms for Unconstrained Optimization
1992
"... this article I will attempt to review the most recent advances in the theory of unconstrained optimization, and will also describe some important open questions. Before doing so, I should point out that the value of the theory of optimization is not limited to its capacity for explaining the behavio ..."
Cited by 104 (1 self)
Abstract: In this article I will attempt to review the most recent advances in the theory of unconstrained optimization, and will also describe some important open questions. Before doing so, I should point out that the value of the theory of optimization is not limited to its capacity for explaining the behavior of the most widely used techniques. The question ...
An Analysis for the DIIS Acceleration Method used in Quantum Chemistry Calculations
2010
"... The consecutive numbering of the publications is determined by their chronological order. The aim of this preprint series is to make new research rapidly available for scientific discussion. Therefore, the responsibility for the contents is solely due to the authors. The publications will be distrib ..."
Cited by 76 (5 self)
Thorsten Rohwedder and Reinhold Schneider. Abstract: This work features an analysis of the acceleration technique DIIS that is standardly used in most of the important quantum chemistry codes, e.g. in DFT and Hartree-Fock calculations and in the Coupled Cluster method. Taking up results from [23], we show that for the general nonlinear case, DIIS corresponds to a projected quasi-Newton/secant method. For linear systems, we establish connections to the well-known GMRES solver and transfer the corresponding (positive as well as negative) convergence results to DIIS. In particular, we discuss the circumstances under which DIIS exhibits superlinear convergence behaviour. For the general nonlinear case, we then use these results to show that a DIIS step can be interpreted as a step of a quasi-Newton method in which the Jacobian used in the Newton step is approximated by finite differences and in which the corresponding linear system is solved by a GMRES procedure, and we give corresponding convergence estimates.
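Since the abstract turns on how DIIS recombines previous iterates, a minimal sketch of the classic Pulay scheme may help. It is an illustration under assumptions (a generic fixed-point map g, an unregularized Gram system, simple history truncation), not the authors' code.

```python
# Minimal DIIS (Pulay mixing) sketch for a fixed-point problem x = g(x).
# Real quantum chemistry codes add regularization and smarter history control.
import numpy as np

def diis_solve(g, x0, max_hist=8, tol=1e-10, max_iter=100):
    xs, rs = [], []                        # histories of g-images and residuals
    x = x0
    for _ in range(max_iter):
        gx = g(x)
        r = gx - x                         # residual of the fixed-point map
        xs.append(gx); rs.append(r)
        xs, rs = xs[-max_hist:], rs[-max_hist:]
        m = len(rs)
        # Minimize ||sum_i c_i r_i|| subject to sum_i c_i = 1 via the
        # standard bordered system with Gram matrix B_ij = <r_i, r_j>.
        B = np.zeros((m + 1, m + 1))
        B[:m, :m] = np.array([[ri @ rj for rj in rs] for ri in rs])
        B[m, :m] = 1.0
        B[:m, m] = 1.0
        rhs = np.zeros(m + 1); rhs[m] = 1.0
        c = np.linalg.solve(B, rhs)[:m]
        x = sum(ci * xi for ci, xi in zip(c, xs))   # extrapolated iterate
        if np.linalg.norm(r) < tol:
            break
    return x

# Toy usage: componentwise contraction g(x) = cos(x).
x = diis_solve(lambda v: np.cos(v), np.zeros(3))
```

The bordered system is the KKT system for minimizing the norm of the combined residual subject to the coefficients summing to one; the quasi-Newton and GMRES interpretations analyzed in the paper concern exactly this extrapolation step.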
UOBYQA: unconstrained optimization by quadratic approximation
2000
"... : UOBYQA is a new algorithm for general unconstrained optimization calculations, that takes account of the curvature of the objective function, F say, by forming quadratic models by interpolation. Therefore, because no first derivatives are required, each model is defined by 1 2 (n+1)(n+2) values ..."
Cited by 67 (3 self)
Abstract: UOBYQA is a new algorithm for general unconstrained optimization calculations that takes account of the curvature of the objective function, F say, by forming quadratic models by interpolation. Therefore, because no first derivatives are required, each model is defined by (n+1)(n+2)/2 values of F, where n is the number of variables, and the interpolation points must have the property that no nonzero quadratic polynomial vanishes at all of them. A typical iteration of the algorithm generates a new vector of variables, x̃_t say, either by minimizing the quadratic model subject to a trust region bound, or by a procedure that should improve the accuracy of the model. Then usually F(x̃_t) is obtained, and one of the interpolation points is replaced by x̃_t. Therefore the paper addresses the initial positions of the interpolation points, the adjustment of trust region radii, the calculation of x̃_t in the two cases that have been mentioned, and the selection of the point to be replaced.
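To make the counting concrete, here is a small sketch (my illustration, not Powell's code) of fitting a full quadratic model to exactly (n+1)(n+2)/2 interpolated function values. The sample points and toy objective are assumptions; a production method maintains a well-poised interpolation set rather than drawing random points.

```python
# Sketch: derivative-free quadratic model by interpolation, UOBYQA-style.
import numpy as np

def quad_features(x):
    # Monomial basis of a full quadratic in n variables:
    # 1, x_i, and x_i * x_j for i <= j  ->  (n+1)(n+2)/2 terms.
    n = len(x)
    feats = [1.0] + list(x)
    feats += [x[i] * x[j] for i in range(n) for j in range(i, n)]
    return np.array(feats)

def fit_quadratic(points, fvals):
    # Interpolation conditions m(y_k) = F(y_k): a square linear system,
    # solvable when no nonzero quadratic vanishes at all the points.
    A = np.vstack([quad_features(y) for y in points])
    return np.linalg.solve(A, fvals)

n = 2
m = (n + 1) * (n + 2) // 2                 # 6 interpolation points for n = 2
rng = np.random.default_rng(0)
pts = rng.standard_normal((m, n))
F = lambda x: (x[0] - 1.0) ** 2 + 2.0 * x[1] ** 2   # toy objective
coef = fit_quadratic(pts, np.array([F(y) for y in pts]))
```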
Inexact Newton Methods for Solving Nonsmooth Equations
Journal of Computational and Applied Mathematics, 1999
"... This paper investigates inexact Newton methods for solving systems of nonsmooth equations. We define two inexact Newton methods for locally Lipschitz functions and we prove local (linear and superlinear) convergence results under the assumptions of semismoothness and BDregularity at the solution. W ..."
Cited by 29 (9 self)
Abstract: This paper investigates inexact Newton methods for solving systems of nonsmooth equations. We define two inexact Newton methods for locally Lipschitz functions and we prove local (linear and superlinear) convergence results under the assumptions of semismoothness and BD-regularity at the solution. We introduce a globally convergent, inexact, iteration-function-based method. We discuss implementations and give some numerical examples.
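The defining feature, solving the inner Newton system only approximately, fits in a few lines. The sketch below is illustrative only: it uses a smooth toy F with a classical Jacobian and a crude gradient inner solver, whereas the paper works with locally Lipschitz F, generalized Jacobians, and semismoothness; the forcing term eta and the test problem are assumptions.

```python
# Sketch of an inexact Newton method for F(x) = 0: the system J s = -F is
# solved only until the standard forcing condition ||F + J s|| <= eta ||F||
# holds, here by Landweber (gradient) steps on the normal equations.
import numpy as np

def inexact_newton(F, J, x, eta=0.1, tol=1e-8, max_iter=50):
    for _ in range(max_iter):
        Fx = F(x)
        if np.linalg.norm(Fx) < tol:
            break
        A, s = J(x), np.zeros_like(x)
        while np.linalg.norm(Fx + A @ s) > eta * np.linalg.norm(Fx):
            r = A @ s + Fx
            g = A.T @ r                    # gradient of 0.5 ||A s + Fx||^2
            step = (g @ g) / (g @ (A.T @ (A @ g)))   # exact line search
            s -= step * g
        x = x + s
    return x

# Toy usage on a smooth 2x2 system with solution (1, 2).
F = lambda v: np.array([v[0] ** 2 + v[1] - 3.0, v[0] + v[1] ** 2 - 5.0])
J = lambda v: np.array([[2.0 * v[0], 1.0], [1.0, 2.0 * v[1]]])
x = inexact_newton(F, J, np.array([1.0, 1.0]))
```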
A Family of Variable Metric Proximal Methods
1993
"... We consider conceptual optimization methods combining two ideas: the MoreauYosida regularization in convex analysis, and quasiNewton approximations of smooth functions. We outline several approaches based on this combination, and establish their global convergence. Then we study theoretically the ..."
Cited by 26 (2 self)
Abstract: We consider conceptual optimization methods combining two ideas: the Moreau-Yosida regularization in convex analysis, and quasi-Newton approximations of smooth functions. We outline several approaches based on this combination, and establish their global convergence. Then we study theoretically the local convergence properties of one of these approaches, which uses quasi-Newton updates of the objective function itself. Also, we obtain a globally and superlinearly convergent BFGS proximal method. At each step of our study, we single out the assumptions that are useful to derive the result concerned.
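For readers new to the first ingredient, the sketch below illustrates the Moreau-Yosida regularization F_lam(x) = min_y f(y) + ||y - x||^2 / (2 lam) for the assumed toy case f = ||.||_1, whose proximal map is the closed-form soft threshold. The loop is my illustration of the identity "proximal step = gradient step on the envelope", not the paper's variable metric scheme.

```python
# Moreau-Yosida envelope sketch for f(x) = ||x||_1.
import numpy as np

def prox_l1(x, lam):
    # Proximal map of the l1 norm: componentwise soft threshold.
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def moreau_grad(x, lam):
    # Gradient of the (smooth) envelope: (x - prox(x)) / lam.
    return (x - prox_l1(x, lam)) / lam

x, lam = np.array([2.0, -0.3, 0.0]), 0.5
for _ in range(20):
    x = x - lam * moreau_grad(x, lam)      # algebraically equals prox_l1(x, lam)
```

The variable metric methods of the paper replace the fixed 1/lam scaling by a quasi-Newton approximation of the envelope's curvature.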
Superlinear Convergence And Implicit Filtering
1999
"... . In this note we show how the implicit filtering algorithm can be coupled with the BFGS quasiNewton update to obtain a superlinearly convergent iteration if the noise in the objective function decays sufficiently rapidly as the optimal point is approached. We show how known theory for the noisefr ..."
Cited by 25 (3 self)
Abstract: In this note we show how the implicit filtering algorithm can be coupled with the BFGS quasi-Newton update to obtain a superlinearly convergent iteration if the noise in the objective function decays sufficiently rapidly as the optimal point is approached. We show how known theory for the noise-free case can be extended and thereby provide a partial explanation for the good performance of quasi-Newton methods when coupled with implicit filtering.
Key words: noisy optimization, implicit filtering, BFGS algorithm, superlinear convergence
AMS subject classifications: 65K05, 65K10, 90C30
1. Introduction. In this paper we examine the local and global convergence behavior of the combination of the BFGS [4], [20], [17], [23] quasi-Newton method with the implicit filtering algorithm. The resulting method is intended to minimize smooth functions that are perturbed with low-amplitude noise. Our results, which extend those of [5], [15], and [6], show that if the amplitude of the noise decays ...
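As a rough picture of what implicit filtering does (an illustration under assumptions, not the authors' implementation, and without the BFGS model the paper analyzes): take difference gradients on a stencil of size h, line-search along them, and shrink h when the search fails.

```python
# Implicit-filtering-style sketch: central-difference gradient with stencil h,
# backtracking line search, and stencil reduction on failure.
import numpy as np

def fd_grad(f, x, h):
    n, g = len(x), np.zeros(len(x))
    for i in range(n):
        e = np.zeros(n); e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2.0 * h)
    return g

def implicit_filtering(f, x, h=0.5, h_min=1e-6, max_iter=200):
    for _ in range(max_iter):
        if h < h_min:
            break
        g = fd_grad(f, x, h)
        t, ok = 1.0, False
        while t > 1e-3:                    # simple Armijo backtracking
            if f(x - t * g) < f(x) - 1e-4 * t * (g @ g):
                x, ok = x - t * g, True
                break
            t /= 2.0
        if not ok:
            h /= 2.0                       # stencil failure: shrink h
    return x

# Toy usage: quadratic plus relative noise that decays at the minimizer.
rng = np.random.default_rng(1)
f = lambda v: (v @ v) + 1e-6 * (v @ v) * rng.standard_normal()
x = implicit_filtering(f, np.array([2.0, -1.0]))
```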
Derivative Convergence for Iterative Equation Solvers
1993
"... this paper, we consider two approaches to computing the desired implicitly defined derivative x ..."
Cited by 24 (16 self)
Abstract: In this paper, we consider two approaches to computing the desired implicitly defined derivative x ...
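The truncated abstract concerns derivatives of solutions returned by iterative equation solvers, so a minimal sketch of one standard approach may help: differentiate the fixed-point iteration itself and let the derivative iterates converge alongside the solution iterates. The scalar map and parameter here are assumptions for illustration, not the paper's example.

```python
# "Derivative convergence" sketch: for x_{k+1} = g(x_k, p), iterate the
# derivative recurrence dx_{k+1} = g_x dx_k + g_p, so dx_k -> dx*/dp.
import numpy as np

def g(x, p):
    return np.cos(p * x)                   # contraction for small p

p, x, dx = 0.5, 0.0, 0.0
for _ in range(100):
    gx = -p * np.sin(p * x)                # partial derivative dg/dx at x_k
    gp = -x * np.sin(p * x)                # partial derivative dg/dp at x_k
    x, dx = g(x, p), gx * dx + gp          # solver and derivative iterates

# Check against implicit differentiation at the fixed point x* = g(x*, p):
# dx*/dp = g_p / (1 - g_x).
gx, gp = -p * np.sin(p * x), -x * np.sin(p * x)
print(dx, gp / (1.0 - gx))
```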
Fast Secant Methods for the Iterative Solution of Large Nonsymmetric Linear Systems
Impact of Computing in Science and Engineering, 1990
"... A family of secant methods based on general rank1 updates has been revisited in view of the construction of iterative solvers for large nonHermitian linear systems. As it turns out, both Broyden's "good" and "bad" update techniques play a special role — but should be assoc ..."
Cited by 23 (4 self)
Abstract: A family of secant methods based on general rank-1 updates has been revisited in view of the construction of iterative solvers for large non-Hermitian linear systems. As it turns out, both Broyden's "good" and "bad" update techniques play a special role, but should be associated with two different line search principles. For Broyden's "bad" update technique, a minimum residual principle is natural, thus making it theoretically comparable with a series of well-known algorithms like GMRES. Broyden's "good" update technique, however, is shown to be naturally linked with a minimum "next correction" principle, which asymptotically mimics a minimum error principle. The two minimization principles differ significantly for sufficiently large system dimension. Numerical experiments on discretized PDEs of convection-diffusion type in 2D with internal layers give a first impression of the possible power of the derived "good" Broyden variant.
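The two updates the abstract contrasts are each one line. The sketch below shows them on an assumed toy linear system with full steps; the paper's point is precisely that each update should be paired with its own line search principle (minimum residual for "bad", minimum "next correction" for "good"), which this illustration omits.

```python
# Broyden rank-1 secant updates for F(x) = A x - b. "Good" Broyden updates a
# Jacobian approximation M; "bad" Broyden updates an inverse approximation M.
import numpy as np

def broyden(F, x, good=True, tol=1e-10, max_iter=50):
    M = np.eye(len(x))
    for _ in range(max_iter):
        Fx = F(x)
        if np.linalg.norm(Fx) < tol:
            break
        s = -np.linalg.solve(M, Fx) if good else -(M @ Fx)
        y = F(x + s) - Fx                  # secant pair (s, y)
        if good:
            M += np.outer(y - M @ s, s) / (s @ s)   # "good": enforce M s = y
        else:
            M += np.outer(s - M @ y, y) / (y @ y)   # "bad": enforce M y = s
        x = x + s
    return x

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5)) + 5.0 * np.eye(5)   # well-conditioned toy
b = rng.standard_normal(5)
F = lambda v: A @ v - b
x_good = broyden(F, np.zeros(5), good=True)
x_bad = broyden(F, np.zeros(5), good=False)
```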
Convergence theorems for least change secant update methods
SIAM Journal on Numerical Analysis, 1981
"... Abstract. The purpose of this paper is to present a convergence analysis of least change secant methods in which part of the derivative matrix being approximated is computed by other means. The theorems and proofs given here can be viewed as generalizations of those given by BroydenDennisMor6 [J. ..."
Cited by 20 (1 self)
Abstract: The purpose of this paper is to present a convergence analysis of least change secant methods in which part of the derivative matrix being approximated is computed by other means. The theorems and proofs given here can be viewed as generalizations of those given by Broyden-Dennis-Moré [J. Inst. Math. Appl., 12 (1973), pp. 223-246] and by Dennis-Moré [Math. Comp., 28 (1974), pp. 549-560]. The analysis is done in the orthogonal projection setting of Dennis-Schnabel [SIAM Rev., 21 (1979), pp. 443-459], and many readers may find it easier to understand. The theorems here readily imply local and q-superlinear convergence of all the standard methods, in addition to proving these results for the first time for the sparse symmetric method of Marwil and Toint and the nonlinear least-squares method of Dennis-Gay-Welsch.
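A hedged sketch of the structured setting the abstract describes (part of the derivative computed exactly, the rest corrected by a least-change secant update): the splitting, toy problem, and plain Broyden-style correction below are my assumptions, chosen only to show where the "computed by other means" part enters.

```python
# Structured least-change secant sketch for F(x) = A1 x + phi(x): the
# Jacobian's known part A1 is used exactly, and only B ~ phi'(x) is
# updated, via the least-change correction enforcing (A1 + B) s = y.
import numpy as np

A1 = np.array([[4.0, 1.0], [0.0, 3.0]])           # known linear part
phi = lambda v: np.array([np.sin(v[1]), v[0] ** 2]) / 4.0
F = lambda v: A1 @ v + phi(v)

x, B = np.array([1.0, 1.0]), np.zeros((2, 2))     # B approximates phi'
for _ in range(30):
    Fx = F(x)
    if np.linalg.norm(Fx) < 1e-12:
        break
    s = -np.linalg.solve(A1 + B, Fx)              # structured quasi-Newton step
    y_rest = F(x + s) - Fx - A1 @ s               # secant data for phi only
    B += np.outer(y_rest - B @ s, s) / (s @ s)    # least-change update of B
    x = x + s
```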