On the limited memory BFGS method for large scale optimization
- Mathematical Programming
, 1989
"... this paper has appeared in ..."
Theory of Algorithms for Unconstrained Optimization
, 1992
Cited by 67 (1 self)
this article I will attempt to review the most recent advances in the theory of unconstrained optimization, and will also describe some important open questions. Before doing so, I should point out that the value of the theory of optimization is not limited to its capacity for explaining the behavior of the most widely used techniques. The question
NeuroRule: a connectionist approach to data mining
- In Proceedings of the International Conference on Very Large Databases (VLDB95)
, 1995
Cited by 34 (6 self)
Classification, which involves finding rules that partition a given data set into disjoint groups, is one class of data mining problems. Approaches proposed so far for mining classification rules for large databases are mainly decision tree based symbolic learning methods. The connectionist approach based on neural networks has been thought not well suited for data mining. One of the major reasons cited is that knowledge generated by neural networks is not explicitly represented in the form of rules suitable for verification or interpretation by humans. This paper examines this issue. With our newly developed algorithms, rules which are similar to, or more concise than those generated by the symbolic methods can be extracted from the neural networks. The data mining process using neural networks with the emphasis on rule extraction is described. Experimental results and comparison with previously published works are presented.
On the Barzilai-Borwein method
, 2001
Cited by 17 (1 self)
A review is given of the underlying theory and recent developments in regard to the Barzilai-Borwein steepest descent method for large scale unconstrained optimization. One aim is to assess why the method seems to be comparable in practical efficiency to conjugate gradient methods. The importance of using a non-monotone line search is stressed, although some suggestions are made as to why the modification proposed by Raydan [22] often does not perform well for an ill-conditioned problem. Extensions for box constraints are discussed. A number of interesting open questions are put forward. Keywords: Barzilai-Borwein method, steepest descent, elliptic systems, unconstrained optimization.
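As an illustration of the kind of iteration this abstract refers to, the following is a minimal sketch of a Barzilai-Borwein gradient method, assuming the commonly used step length alpha = (s^T s) / (s^T y) with s the change in the iterate and y the change in the gradient. The function name `barzilai_borwein`, its parameters, and the fallback of keeping the previous step when s^T y is not positive are assumptions made for this example and are not taken from the paper. The sketch deliberately omits the non-monotone line search that the abstract stresses, so it is only suitable for well-behaved problems such as strictly convex quadratics.

```python
import numpy as np


def barzilai_borwein(grad, x0, alpha0=1e-3, tol=1e-8, max_iter=1000):
    """Gradient iteration with a Barzilai-Borwein step length (no line search).

    grad   : callable returning the gradient at a point (hypothetical interface)
    x0     : starting point
    alpha0 : step used for the first iteration only
    Returns the final iterate and the number of iterations performed.
    """
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    alpha = alpha0
    for k in range(max_iter):
        if np.linalg.norm(g) <= tol:
            return x, k
        x_new = x - alpha * g
        g_new = grad(x_new)
        s = x_new - x          # change in the iterate
        y = g_new - g          # change in the gradient
        sy = s @ y
        # BB step length; keep the previous step if the curvature s^T y
        # is not positive (an assumption of this sketch, not of the paper).
        if sy > 0:
            alpha = (s @ s) / sy
        x, g = x_new, g_new
    return x, max_iter


# Usage on a small strictly convex quadratic f(x) = 0.5 x^T A x - b^T x.
A = np.diag([1.0, 10.0, 100.0])
b = np.ones(3)
x_star, n_iter = barzilai_borwein(lambda x: A @ x - b, np.zeros(3))
```

On a general nonlinear objective the iteration above can diverge, which is one reason the review emphasises combining the step with a non-monotone line search.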
A Penalty-Function Approach for Pruning Feedforward Neural Networks
- Neural Computation
, 1994
Cited by 13 (6 self)
This paper proposes the use of a penalty function for pruning feedforward neural networks by weight elimination. The penalty function proposed consists of two terms; the first term is to discourage the use of unnecessary connections and the second term is to prevent the weights of the connections from taking excessively large values. Simple criteria for eliminating weights from the network are also given. The effectiveness of this penalty function is tested on three well known problems. These test problems are the contiguity problem, the parity problems, and the monks problems. The resulting pruned networks obtained for many of these problems have fewer connections than previously reported in the literature.

1 Introduction

We are concerned in this paper with finding a minimal feedforward backpropagation neural network for solving the problem of distinguishing patterns from two or more sets in n-dimensional space. Backpropagation feedforward neural networks have been gaining ac...
An overview of unconstrained optimization
- [Online]. Available: citeseer.ist.psu.edu/fletcher93overview.html
, 1993
"... bundle filter method for nonsmooth nonlinear ..."
A Maximum-entropy Solution to the Frame-dependency Problem in Speech Recognition
, 2001
Cited by 5 (0 self)
The HMM assumption of conditional independence of observations causes a variety of problems for speech-recognition applications. Previous attempts to construct acoustic models that remove this assumption have suffered from a significant increase in the number of parameters to train. Another weakness of current acoustic models is that they do not account for the origin of derived features (estimated derivatives). We show how to both remove the independence assumption and properly account for derived features, with little or no increase in the number of parameters to train, by applying the principle of maximum entropy. We also show that ignoring the origins of derived features in training HMM acoustic models can lead to severe distortions of the effective language model. Evaluation of our maxent model on a simple problem cuts an already-low error rate in half compared to an equivalent HMM with the same number of parameters.
Training Feed-Forward Neural Networks Using Conjugate Gradients
- In SPIE
, 1992
Cited by 3 (0 self)
this paper show very significant improvements in the method of training feed forward networks are readily obtainable. Benefits extend to perceptron applications beyond character recognition such as speech synthesis and control. Backpropagation (BP) has been used for some years
Notes on Limited Memory BFGS Updating in a Trust-Region Framework
, 1996
Cited by 2 (0 self)
The limited memory BFGS method pioneered by Jorge Nocedal is usually implemented as a line search method where the search direction is computed from a BFGS approximation to the inverse of the Hessian. The advantage of inverse updating is that the search directions are obtained by a matrix-vector multiplication. In this paper it is observed that limited memory updates to the Hessian approximations can also be applied in the context of a trust-region algorithm with only a modest increase in the linear algebra costs. At each iteration, a limited memory BFGS step is computed. If it is rejected, then we compute the solution of the trust-region subproblem with a trust-region radius smaller than the length of the L-BFGS step. Numerical results on a few of the MINPACK-2 test problems show that the initial limited memory BFGS step is accepted in most cases. In terms of the number of function and gradient evaluations, the trust-region approach is comparable to a standard line search implementation.
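To make the "inverse updating" remark concrete, here is a minimal sketch of the standard two-loop recursion that applies a limited memory BFGS inverse Hessian approximation to a gradient without ever forming a matrix. This is the usual line search form described in the first sentences of the abstract, not the trust-region variant the paper proposes. The function name `lbfgs_direction`, the list-based history, and the initial scaling gamma = (s^T y) / (y^T y) are assumptions chosen for this example.

```python
import numpy as np


def lbfgs_direction(g, s_list, y_list):
    """Return -H g, where H is the L-BFGS inverse Hessian approximation.

    g      : current gradient
    s_list : stored iterate differences s_i = x_{i+1} - x_i, oldest first
    y_list : stored gradient differences y_i = g_{i+1} - g_i, oldest first
    Each pair is assumed to satisfy y_i^T s_i > 0.
    """
    q = np.array(g, dtype=float)
    rhos = [1.0 / (y @ s) for s, y in zip(s_list, y_list)]

    # First loop: newest pair to oldest pair.
    alphas = []
    for s, y, rho in zip(reversed(s_list), reversed(y_list), reversed(rhos)):
        a = rho * (s @ q)
        alphas.append(a)
        q -= a * y

    # Initial inverse Hessian approximation: a scaled identity.
    if s_list:
        gamma = (s_list[-1] @ y_list[-1]) / (y_list[-1] @ y_list[-1])
    else:
        gamma = 1.0
    r = gamma * q

    # Second loop: oldest pair to newest pair.
    for s, y, rho, a in zip(s_list, y_list, rhos, reversed(alphas)):
        b = rho * (y @ r)
        r += (a - b) * s

    return -r
```

With an empty history the function returns the steepest descent direction `-g`. With stored pairs, the approximation satisfies the secant condition for the most recent pair, so calling it with `g = y_list[-1]` returns `-s_list[-1]`.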
Some Numerical Methods For The Study Of The Convexity Notions Arising In The Calculus Of Variations
- M2AN, Mathematical Modelling and Numerical Analysis
, 1998
Cited by 1 (0 self)
We propose numerical schemes to determine whether a given function is convex, polyconvex, quasiconvex, and rank one convex. These notions are of fundamental importance in the vectorial problems of the calculus of variations.

1. Introduction

One of the most important problems in the calculus of variations deals with the integral

$$I(u) = \int_\Omega f(\nabla u(x))\,dx \tag{1}$$

where
1. $\Omega \subset \mathbb{R}^n$ is a bounded open set,
2. $u : \Omega \subset \mathbb{R}^n \to \mathbb{R}^m$ belongs to a Sobolev space,
3. $f : \mathbb{R}^{m \times n} \to \mathbb{R}$ is a continuous function.

Usually one wants to minimize (1) subject to some constraints, e.g. certain boundary conditions, isoperimetric constraints, etc. The only general method to deal with these problems consists in proving the sequential weak lower semicontinuity of $I(u)$. When $m = 1$ or $n = 1$, this property is equivalent to the convexity of $f$. However, when $m, n > 1$, it is equivalent to the so called quasiconvexity of $f$, a notion introduced by Morrey [23]. Definition 1.1. ...

