Results 11 - 20 of 36
Efficient Training of Feed-Forward Neural Networks
, 1997
Cited by 9 (0 self)
A.2 Introduction
A.2.1 Motivation
A.3 Optimization strategy
A.4 The Backpropagation algorithm
A.5 Conjugate direction methods
A.5.1 Conjugate gradients
A.5.2 The CGL algorithm
A.5.3 The BFGS algorithm
A.6 The SCG algorithm
A.7 Test results
A.7.1 Comparison metric
...
On-line Step Size Adaptation
- INESC. 9 Rua Alves Redol, 1000
, 1997
"... Sub-category: online learning algorithms ..."
Online Independent Component Analysis with Local Learning Rate Adaptation
- Neural Information Processing Systems
, 2000
Cited by 7 (2 self)
Stochastic meta-descent (SMD) is a new technique for online adaptation of local learning rates in arbitrary twice-differentiable systems.
Gradient Descent: Second-Order Momentum and Saturating Error
- In (Moody et al
, 1992
Cited by 6 (2 self)
Batch gradient descent, $\Delta w(t) = -\eta\, dE/dw(t)$, converges to a minimum of quadratic form with a time constant no better than $\frac{1}{4}\lambda_{\max}/\lambda_{\min}$, where $\lambda_{\min}$ and $\lambda_{\max}$ are the minimum and maximum eigenvalues of the Hessian matrix of $E$ with respect to $w$. It was recently shown that adding a momentum term, $\Delta w(t) = -\eta\, dE/dw(t) + \alpha\, \Delta w(t-1)$, improves this to $\frac{1}{4}\sqrt{\lambda_{\max}/\lambda_{\min}}$, although only in the batch case. Here we show that second-order momentum, $\Delta w(t) = -\eta\, dE/dw(t) + \alpha\, \Delta w(t-1) + \beta\, \Delta w(t-2)$, can lower this no further. We then regard gradient descent with momentum as a dynamic system and explore a nonquadratic error surface, showing that saturation of the error accounts for a variety of effects observed in simulations and justifies some popular heuristics.
1 INTRODUCTION
Gradient descent is the bread-and-butter optimization technique in neural networks. Some people build special purpose hardware to accelerate gradient descent optimization...
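The update rules quoted in this abstract can be tried on a toy quadratic. The sketch below is illustrative only and is not taken from the paper: the function name `steps_to_converge`, the two-eigenvalue error surface, and the step-size and momentum settings (standard textbook choices for a quadratic) are all assumptions. It counts how many batch steps each rule needs to shrink the weights below a tolerance.

```python
import math

def steps_to_converge(eigs, eta, alpha=0.0, beta=0.0, tol=1e-6, max_steps=100_000):
    """Count batch steps of dw(t) = -eta*dE/dw + alpha*dw(t-1) + beta*dw(t-2)
    on E(w) = 0.5 * sum(lam_i * w_i**2) until the weight norm drops below tol."""
    w = [1.0] * len(eigs)
    d1 = [0.0] * len(eigs)  # dw(t-1)
    d2 = [0.0] * len(eigs)  # dw(t-2)
    for step in range(1, max_steps + 1):
        # For this diagonal quadratic, dE/dw_i = lam_i * w_i.
        d0 = [-eta * lam * wi + alpha * a + beta * b
              for lam, wi, a, b in zip(eigs, w, d1, d2)]
        w = [wi + di for wi, di in zip(w, d0)]
        d1, d2 = d0, d1
        if math.sqrt(sum(wi * wi for wi in w)) < tol:
            return step
    return max_steps

# Hypothetical Hessian spectrum with condition number 100.
lam_min, lam_max = 1.0, 100.0
eigs = [lam_min, lam_max]

# Plain gradient descent with the usual step size 2 / (lam_min + lam_max).
plain_steps = steps_to_converge(eigs, eta=2.0 / (lam_min + lam_max))

# First-order momentum with the usual heavy-ball settings for a quadratic.
root_min, root_max = math.sqrt(lam_min), math.sqrt(lam_max)
eta_m = 4.0 / (root_max + root_min) ** 2
alpha_m = ((root_max - root_min) / (root_max + root_min)) ** 2
momentum_steps = steps_to_converge(eigs, eta=eta_m, alpha=alpha_m)

print(plain_steps, momentum_steps)
```

Passing a nonzero `beta` enables the second-order term, so the same function can be used to experiment with the three-term rule.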
Hybrid Decision Tree
, 2002
Cited by 5 (2 self)
In this paper, a hybrid learning approach named HDT is proposed. HDT simulates human reasoning by using symbolic learning to do qualitative analysis and using neural learning to do subsequent quantitative analysis. It generates the trunk of a binary hybrid decision tree according to the binary information gain ratio criterion in an instance space defined by only original unordered attributes. If unordered attributes cannot further distinguish training examples falling into a leaf node whose diversity is beyond the diversity-threshold, then the node is marked as a dummy node. After all those dummy nodes are marked, a specific feedforward neural network named FANNC that is trained in an instance space defined by only original ordered attributes is exploited to accomplish the learning task. Moreover, this paper distinguishes three kinds of incremental learning tasks. Two incremental learning procedures designed for example-incremental learning with different storage requirements are provided, which enable HDT to deal gracefully with data sets where new data are frequently appended. Also a hypothesis-driven constructive induction mechanism is provided, which enables HDT to generate compact concept descriptions.
Classification-Based Objective Functions
- Machine Learning. In
, 2007
Cited by 5 (2 self)
Backpropagation, similar to most learning algorithms that can form complex decision surfaces, is prone to overfitting. This work presents classification-based objective functions, an intuitive approach to training artificial neural networks on classification problems. Classification-based learning attempts to guide the network directly to correct pattern classification rather than using an implicit search of common error minimization heuristics, such as sum-squared-error (SSE) and cross-entropy (CE). CB1 is presented here as a novel objective function for learning classification problems. It seeks to directly minimize classification error by backpropagating error only on misclassified patterns from culprit output nodes. CB1 discourages weight saturation and overfitting and achieves higher accuracy on classification problems than optimizing SSE or CE. Experiments on a large OCR data set have shown CB1 to significantly increase generalization accuracy over SSE or CE optimization, from 97.86% and 98.10%, respectively, to 99.11%. Comparable results are achieved over several data sets from the UC Irvine Machine Learning Database Repository, with an average increase in accuracy from 90.7% and 91.3% using optimized SSE and CE networks, respectively, to 92.1% for CB1. Analysis indicates that CB1 performs a fundamentally different search of the feature space than optimizing SSE or CE and produces significantly different solutions.
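The phrase "backpropagating error only on misclassified patterns from culprit output nodes" can be read in more than one way, and the abstract does not give the exact error terms. The sketch below shows one possible reading, offered as an assumption rather than as the paper's definition of CB1: a pattern that is already classified correctly yields no error signal, and otherwise only the target node and the outputs that tie or beat it receive one. The function name `classification_based_error` is hypothetical.

```python
def classification_based_error(outputs, target):
    """Return a per-output error signal for one pattern.

    Assumed reading, not the paper's formula: no error when the target
    output is strictly the largest; otherwise the target node is pushed
    up toward the strongest rival, and every rival that ties or beats
    the target is pushed down toward the target's value.
    """
    target_out = outputs[target]
    top_rival = max(o for i, o in enumerate(outputs) if i != target)
    if target_out > top_rival:
        # Correctly classified: nothing is backpropagated.
        return [0.0] * len(outputs)
    errors = []
    for i, o in enumerate(outputs):
        if i == target:
            errors.append(top_rival - o)    # raise the target output
        elif o >= target_out:
            errors.append(target_out - o)   # lower a culprit output
        else:
            errors.append(0.0)              # non-culprit nodes are left alone
    return errors

correct = classification_based_error([0.9, 0.1, 0.2], target=0)
wrong = classification_based_error([0.25, 0.5, 0.125], target=0)
print(correct, wrong)
```

Unlike SSE or CE, a signal of this shape never pushes outputs toward 0 or 1 once the pattern is classified correctly, which is consistent with the abstract's remark that CB1 discourages weight saturation.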
Speeding Up Fuzzy Clustering with Neural Network Techniques
- Fuzzy Systems
Cited by 5 (2 self)
We explore how techniques that were developed to improve the training process of artificial neural networks can be used to speed up fuzzy clustering. The basic idea of our approach is to regard the difference between two consecutive steps of the alternating optimization scheme of fuzzy clustering as providing a gradient, which may be modified in the same way as the gradient of neural network backpropagation is modified in order to improve training. Our experimental results show that some methods actually lead to a considerable acceleration of the clustering process.
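The basic idea can be sketched for the simplest case: one-dimensional fuzzy c-means with a momentum term added to the center update. Everything specific here is an assumption made for illustration (the function names `fcm_step` and `cluster`, the fuzzifier of 2, the data, and the momentum value); the abstract does not say which modifications the authors tested. The sketch only shows the mechanics and makes no claim about speed, since whether a given modification helps depends on the data.

```python
def fcm_step(centers, data, eps=1e-12):
    """One alternating-optimization step of 1-D fuzzy c-means (fuzzifier m = 2):
    compute memberships from the current centers, then recompute the centers."""
    num = [0.0] * len(centers)
    den = [0.0] * len(centers)
    for x in data:
        inv = [1.0 / ((x - c) ** 2 + eps) for c in centers]
        total = sum(inv)
        for i, v in enumerate(inv):
            u2 = (v / total) ** 2  # membership raised to the power m = 2
            num[i] += u2 * x
            den[i] += u2
    return [n / d for n, d in zip(num, den)]

def cluster(data, centers, beta=0.0, tol=1e-10, max_iter=1000):
    """Treat (new centers - old centers) as a gradient-like step and add
    a momentum term beta * previous change, as in neural network training."""
    centers = list(centers)
    prev = [0.0] * len(centers)
    for it in range(1, max_iter + 1):
        target = fcm_step(centers, data)
        delta = [t - c for t, c in zip(target, centers)]
        change = [d + beta * p for d, p in zip(delta, prev)]
        centers = [c + ch for c, ch in zip(centers, change)]
        prev = change
        # Stop only when both the raw step and the applied change are tiny.
        if max(abs(v) for v in delta + change) < tol:
            return centers, it
    return centers, max_iter

data = [0.0, 0.2, 0.4, 4.0, 4.2, 4.4]  # two well-separated groups
plain_centers, plain_iters = cluster(data, [1.0, 3.0])
momentum_centers, momentum_iters = cluster(data, [1.0, 3.0], beta=0.2)
print(plain_centers, plain_iters)
print(momentum_centers, momentum_iters)
```

With `beta=0.0` the loop reduces to the ordinary alternating optimization, so the modified and unmodified schemes share the same fixed points.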
JETNET 3.0 - A Versatile Artificial Neural Network Package
, 1993
Cited by 4 (2 self)
this paper quantities written in sans-serif denote matrices and quantities written in boldface denote vectors
Robot Learning - Three case studies in Robotics and Machine Learning
, 1994
Cited by 4 (4 self)
This paper describes methodologies applied and results achieved in the framework of the ESPRIT Basic Research Action B-Learn II (project no. 7274). B-Learn II is one of the first projects working towards an application of Machine Learning techniques in fields of industrial relevance, which are much more complex than the domains usually treated in ML research. In particular, B-Learn II aims at easing the programming of robots and enhancing their ability to cooperate with humans. The paper gives a short introduction to learning in robotics and to the three applications under consideration in B-Learn II. Afterwards, learning methodologies used in each of the applications, the experimental setups, and the results obtained are described. In general, it can be found that providing good examples and a good interface between the learning and the performance components is crucial for success, so the extension of the "Programming by Demonstration" paradigm to robotics has become one of the key a...
FANRE: A Fast Adaptive Neural Regression Estimator
- Lecture Notes in Artificial Intelligence
, 1999
Cited by 2 (0 self)
In this paper, a fast adaptive neural regression estimator named FANRE is proposed. FANRE exploits the advantages of both Adaptive Resonance Theory and Field Theory while addressing the characteristics of regression problems. It achieves not only impressive approximation results but also fast learning. FANRE also supports incremental learning.

