Results 1–10 of 122
On the limited memory BFGS method for large scale optimization
 MATHEMATICAL PROGRAMMING
, 1989
"... ..."
A scaled conjugate gradient algorithm for fast supervised learning
 NEURAL NETWORKS
, 1993
"... A supervised learning algorithm (Scaled Conjugate Gradient, SCG) with superlinear convergence rate is introduced. The algorithm is based upon a class of optimization techniques well known in numerical analysis as the Conjugate Gradient Methods. SCG uses second order information from the neural netwo ..."
Abstract

Cited by 340 (0 self)
A supervised learning algorithm (Scaled Conjugate Gradient, SCG) with superlinear convergence rate is introduced. The algorithm is based upon a class of optimization techniques well known in numerical analysis as the Conjugate Gradient Methods. SCG uses second-order information from the neural network but requires only O(N) memory, where N is the number of weights in the network. The performance of SCG is benchmarked against the standard backpropagation algorithm (BP) [13], conjugate gradient backpropagation (CGB) [6], and the one-step Broyden–Fletcher–Goldfarb–Shanno memoryless quasi-Newton algorithm (BFGS) [1]. SCG yields a speedup of at least an order of magnitude relative to BP. The speedup depends on the convergence criterion: the greater the demand for error reduction, the greater the speedup. SCG is fully automated, has no user-dependent parameters, and avoids the time-consuming line search that CGB and BFGS use in each iteration to determine an appropriate step size.
Incorporating problem-dependent structural information in the architecture of a neural network often lowers the overall complexity. The smaller the complexity of the network relative to the problem domain, the greater the likelihood that the weight space contains long ravines characterized by sharp curvature. While BP is inefficient in such ravines, it is shown that SCG handles them effectively.
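The conjugate-gradient family the abstract builds on can be sketched in a few lines. Below is a minimal Fletcher–Reeves-style iteration on a toy quadratic; this is not Møller's SCG itself (SCG additionally scales the step using second-order information instead of a fixed step), and the objective and step size here are assumptions for illustration only. Note the O(N) memory profile: only a few vectors the size of the parameter list are ever stored.

```python
def f(x):
    # Toy objective: a separable quadratic with minimizer at (1, ..., 1)
    return sum((xi - 1.0) ** 2 for xi in x)

def grad(x):
    return [2.0 * (xi - 1.0) for xi in x]

def cg_minimize(x, iters=50, alpha=0.1):
    # Fletcher-Reeves nonlinear CG: only a handful of O(N) vectors are
    # kept, which is the memory profile the abstract emphasizes.
    g = grad(x)
    d = [-gi for gi in g]
    for _ in range(iters):
        # Fixed step in place of a line search (SCG instead derives the
        # step from curvature information; simplified here).
        x = [xi + alpha * di for xi, di in zip(x, d)]
        g_new = grad(x)
        # Fletcher-Reeves beta: ratio of successive squared gradient norms
        beta = sum(gi * gi for gi in g_new) / max(sum(gi * gi for gi in g), 1e-30)
        d = [-gn + beta * di for gn, di in zip(g_new, d)]
        g = g_new
    return x

x = cg_minimize([5.0, -3.0, 0.0])
```

On this quadratic the iterates contract steadily toward the minimizer at (1, 1, 1); the beta formula is the Fletcher–Reeves choice, one of several standard options for the nonlinear CG update.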
LARGESCALE LINEARLY CONSTRAINED OPTIMIZATION
, 1978
"... An algorithm for solving largescale nonlinear ' programs with linear constraints is presented. The method combines efficient sparsematrix techniques as in the revised simplex method with stable quasiNewton methods for handling the nonlinearities. A generalpurpose production code (MINOS) is ..."
Abstract

Cited by 93 (15 self)
An algorithm for solving large-scale nonlinear programs with linear constraints is presented. The method combines efficient sparse-matrix techniques, as in the revised simplex method, with stable quasi-Newton methods for handling the nonlinearities. A general-purpose production code (MINOS) is described, along with computational experience on a wide variety of problems.
Theory of Algorithms for Unconstrained Optimization
, 1992
"... this article I will attempt to review the most recent advances in the theory of unconstrained optimization, and will also describe some important open questions. Before doing so, I should point out that the value of the theory of optimization is not limited to its capacity for explaining the behavio ..."
Abstract

Cited by 92 (1 self)
In this article I will attempt to review the most recent advances in the theory of unconstrained optimization, and will also describe some important open questions. Before doing so, I should point out that the value of the theory of optimization is not limited to its capacity for explaining the behavior of the most widely used techniques. ...
A new conjugate gradient method with guaranteed descent and an efficient line search
 SIAM J. Optim
, 2005
"... Abstract. A new nonlinear conjugate gradient method and an associated implementation, based on an inexact line search, are proposed and analyzed. With exact line search, our method reduces to a nonlinear version of the Hestenes–Stiefel conjugate gradient scheme. For any (inexact) line search, our sc ..."
Abstract

Cited by 35 (6 self)
Abstract. A new nonlinear conjugate gradient method and an associated implementation, based on an inexact line search, are proposed and analyzed. With exact line search, our method reduces to a nonlinear version of the Hestenes–Stiefel conjugate gradient scheme. For any (inexact) line search, our scheme satisfies the descent condition g_k^T d_k ≤ −(7/8)‖g_k‖². Moreover, a global convergence result is established when the line search fulfills the Wolfe conditions. A new line search scheme is developed that is efficient and highly accurate. Efficiency is achieved by exploiting properties of linear interpolants in a neighborhood of a local minimizer. High accuracy is achieved by using a convergence criterion, which we call the "approximate Wolfe" conditions, obtained by replacing the sufficient decrease criterion in the Wolfe conditions with an approximation that can be evaluated with greater precision in a neighborhood of a local minimum than the usual sufficient decrease criterion. Numerical comparisons are given with both L-BFGS and conjugate gradient methods using the unconstrained optimization problems in the CUTE library.
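The remarkable property of the Hager–Zhang direction is that the descent bound g_k^T d_k ≤ −(7/8)‖g_k‖² holds no matter how the step is chosen, provided d_k^T y_k ≠ 0. The sketch below transcribes the published update formula into code (the transcription and the random test setup are my own) and checks the bound empirically:

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def hz_direction(g_new, g_old, d_old):
    # Hager-Zhang update: d_new = -g_new + beta * d_old, where
    # beta = (y - 2*d*||y||^2/(d.y)) . g_new / (d.y)  and  y = g_new - g_old
    y = [gn - go for gn, go in zip(g_new, g_old)]
    dy = dot(d_old, y)
    beta = (dot(y, g_new) - 2.0 * dot(y, y) * dot(d_old, g_new) / dy) / dy
    return [-gn + beta * di for gn, di in zip(g_new, d_old)]

random.seed(0)
for _ in range(1000):
    # Random vectors stand in for actual gradients along an optimization run
    g_old = [random.gauss(0.0, 1.0) for _ in range(5)]
    d_old = [-gi for gi in g_old]           # steepest-descent start
    g_new = [random.gauss(0.0, 1.0) for _ in range(5)]
    d_new = hz_direction(g_new, g_old, d_old)
    # Descent bound from the abstract, independent of the line search:
    assert dot(g_new, d_new) <= -0.875 * dot(g_new, g_new) + 1e-6
```

The bound follows from an AM-GM argument: expanding g^T d_new gives a quadratic in (d^T g)/(d^T y) whose maximum over that scalar is exactly −(7/8)‖g‖² by Cauchy–Schwarz, which is why the loop above never trips the assertion.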
Markov Monitoring with Unknown States
 IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS
, 1993
"... Pattern recognition methods and hidden Markov models can be effective tools for online health monitoring of communications systems. Previous work has assumed that the states in the system model are exhaustive. This can be a significant drawback in realworld fault monitoring applications where it is ..."
Abstract

Cited by 30 (1 self)
Pattern recognition methods and hidden Markov models can be effective tools for online health monitoring of communications systems. Previous work has assumed that the states in the system model are exhaustive. This can be a significant drawback in real-world fault monitoring applications, where it is difficult, if not impossible, to model all the possible fault states of the system in advance. In this paper a method is described for extending the Markov monitoring approach to allow for unknown or novel states that cannot be accounted for when the model is being designed. The method is described and evaluated on data from one of the Jet Propulsion Laboratory's Deep Space Network antennas. The experimental results indicate that the method is both practical and effective, allowing both discrimination between known states and detection of previously unknown fault conditions.
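The unknown-state idea can be sketched as a likelihood-threshold rejection rule: score each observation under every known state model, and report "unknown" when no model explains it well. The 1-D Gaussian observation models, the state names, and the threshold below are hypothetical placeholders for illustration, not the paper's actual models:

```python
import math

# Hypothetical 1-D Gaussian observation models for the known states
KNOWN_STATES = {"nominal": (0.0, 1.0), "fault_A": (5.0, 1.0)}  # (mean, std)
NOVELTY_FLOOR = 1e-4  # assumed density threshold for declaring a novel state

def density(x, mean, std):
    # Gaussian probability density at observation x
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def classify(x):
    # Pick the most likely known state, but reject the observation as
    # "unknown" when no known model assigns it appreciable density.
    scores = {s: density(x, m, sd) for s, (m, sd) in KNOWN_STATES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] >= NOVELTY_FLOOR else "unknown"
```

An observation near 0 is classified "nominal", one near 5 as "fault_A", and one far from every known model (say, 12) is flagged "unknown" rather than forced into the closest known state.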
Theory and implementation of numerical methods based on Runge–Kutta integration for solving optimal control problems
, 1996
"... ..."
A survey of nonlinear conjugate gradient methods
 Pacific Journal of Optimization
, 2006
"... ..."
(Show Context)
Conjugate-Gradient Methods for Large-Scale Minimization
 in Meteorology, Monthly Weather Review
"... Abstract. During the last few years, conjugategradient methods have been found to be the best available tool for largescale minimization of nonlinear functions occurring in geophysical applications. While vectorization techniques have been applied to linear conjugategradient methods designed to s ..."
Abstract

Cited by 28 (3 self)
Abstract. During the last few years, conjugate-gradient methods have been found to be the best available tool for large-scale minimization of nonlinear functions occurring in geophysical applications. While vectorization techniques have been applied to linear conjugate-gradient methods designed to solve symmetric linear systems of algebraic equations (arising mainly from the discretization of elliptic partial differential equations), owing to their suitability for vector or parallel processing, no such effort was undertaken for the nonlinear conjugate-gradient method for large-scale unconstrained minimization. Computational results are presented here using a robust memoryless quasi-Newton-like conjugate-gradient algorithm by Shanno and Phua applied to a set of large-scale meteorological problems. These results point to vectorization of the conjugate-gradient code inducing a significant speedup in the function and gradient evaluation for the nonlinear conjugate-gradient method, resulting in a sizable reduction ...
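The "memoryless quasi-Newton-like" construction mentioned above keeps no matrix at all: the BFGS inverse-Hessian update of the identity, formed from only the latest step s and gradient change y, is applied directly to −g. A sketch of that direction computation follows; the expansion into two scalar coefficients is my own transcription of the standard memoryless BFGS formula, not code from the paper:

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def memoryless_bfgs_direction(g, s, y):
    # d = -H g, where H = (I - rho*s*y^T)(I - rho*y*s^T) + rho*s*s^T is the
    # BFGS update of the identity and rho = 1/(s.y).  Expanding -H g gives
    # -g plus scalar multiples of s and y, so only O(N) storage is needed.
    rho = 1.0 / dot(s, y)
    sg, yg, yy = dot(s, g), dot(y, g), dot(y, y)
    coef_s = rho * yg - rho * sg * (1.0 + rho * yy)
    coef_y = rho * sg
    return [-gi + coef_s * si + coef_y * yi for gi, si, yi in zip(g, s, y)]
```

By construction H satisfies the secant condition H y = s, so feeding g = y returns exactly −s, which is a convenient correctness check.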
Algorithm 851: CG DESCENT, a conjugate gradient method with guaranteed descent
 ACM Trans. Math. Softw
, 2006
"... Recently, a new nonlinear conjugate gradient scheme was developed which satisfies the descent condition gT kdk ≤ − 7 8 ‖gk‖2 and which is globally convergent whenever the line search fulfills the Wolfe conditions. This article studies the convergence behavior of the algorithm; extensive numerical t ..."
Abstract

Cited by 18 (3 self)
Recently, a new nonlinear conjugate gradient scheme was developed which satisfies the descent condition g_k^T d_k ≤ −(7/8)‖g_k‖² and which is globally convergent whenever the line search fulfills the Wolfe conditions. This article studies the convergence behavior of the algorithm; extensive numerical tests and comparisons with other methods for large-scale unconstrained optimization are given.