A Direct Adaptive Method for Faster Backpropagation Learning: The RPROP Algorithm
- IEEE International Conference on Neural Networks, 1993
Cited by 505 (32 self)
A new learning algorithm for multilayer feedforward networks, RPROP, is proposed. To overcome the inherent disadvantages of pure gradient descent, RPROP performs a local adaptation of the weight updates according to the behaviour of the error function. In contrast to other adaptive techniques, the effect of the RPROP adaptation process is not blurred by the unforeseeable influence of the size of the derivative, but depends only on the temporal behaviour of its sign. This leads to an efficient and transparent adaptation process. The promising capabilities of RPROP are shown in comparison to other well-known adaptive techniques.
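The sign-based adaptation described in this abstract can be illustrated with a minimal sketch. It is a simplified variant, not necessarily the authors' exact procedure: it skips the update after a sign change instead of reverting the previous step. The function name `rprop_minimize`, the helper `sign`, and all default values (`eta_plus`, `eta_minus`, `delta_init`, `delta_min`, `delta_max`) are assumptions chosen for illustration and are not taken from the abstract.

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def rprop_minimize(grad_fn, weights, steps=100, eta_plus=1.2, eta_minus=0.5,
                   delta_init=0.1, delta_min=1e-6, delta_max=50.0):
    """Simplified sign-based update (illustrative, hypothetical defaults).

    grad_fn maps a list of weights to a list of partial derivatives.
    Each weight keeps its own step size; only the sign of the derivative
    and how that sign changes over time influence the update.
    """
    w = list(weights)
    delta = [delta_init] * len(w)   # per-weight step sizes
    prev_grad = [0.0] * len(w)      # derivatives from the previous iteration
    for _ in range(steps):
        grad = grad_fn(w)
        for i, g in enumerate(grad):
            agreement = prev_grad[i] * g
            if agreement > 0:
                # Same sign as last time: grow the step size.
                delta[i] = min(delta[i] * eta_plus, delta_max)
            elif agreement < 0:
                # Sign flipped, so the last step was too large: shrink the
                # step size and skip the update for this iteration.
                delta[i] = max(delta[i] * eta_minus, delta_min)
                g = 0.0
            # Move against the derivative's sign by the current step size.
            w[i] -= sign(g) * delta[i]
            prev_grad[i] = g
    return w
```

For example, `rprop_minimize(lambda w: [2 * (w[0] - 3), 20 * (w[1] + 1)], [0.0, 0.0], steps=200)` approaches the minimum at (3, -1). The second derivative is ten times steeper than the first, but the step sizes ignore this because only signs are used.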
Growing Cell Structures - A Self-organizing Network for Unsupervised and Supervised Learning
- Neural Networks, 1993
Cited by 228 (11 self)
We present a new self-organizing neural network model having two variants. The first variant performs unsupervised learning and can be used for data visualization, clustering, and vector quantization. The main advantage over existing approaches, e.g., the Kohonen feature map, is the ability of the model to automatically find a suitable network structure and size. This is achieved through a controlled growth process which also includes occasional removal of units. The second variant of the model is a supervised learning method which results from the combination of the above-mentioned self-organizing network with the radial basis function (RBF) approach. In this model it is possible, in contrast to earlier approaches, to perform the positioning of the RBF units and the supervised training of the weights in parallel. Therefore, the current classification error can be used to determine where to insert new RBF units. This leads to small networks which generalize very well. Results on the t...
PROBEN1 - a set of neural network benchmark problems and benchmarking rules
1994
Cited by 156 (0 self)
Proben1 is a collection of problems for neural network learning in the realm of pattern classification and function approximation plus a set of rules and conventions for carrying out benchmark tests with these or similar problems. Proben1 contains 15 data sets from 12 different domains. All datasets represent realistic problems which could be called diagnosis tasks and all but one consist of real world data. The datasets are all presented in the same simple format, using an attribute representation that can directly be used for neural network training. Along with the datasets, Proben1 defines a set of rules for how to conduct and how to document neural network benchmarking. The purpose of the problem and rule collection is to give researchers easy access to data for the evaluation of their algorithms and networks and to make direct comparison of the published results feasible. This report describes the datasets and the benchmarking rules. It also gives some basic performance measures indicating the difficulty of the various problems. These measures can be used as baselines for comparison.
The Sample Complexity of Pattern Classification With Neural Networks: The Size of the Weights is More Important Than the Size of the Network
1997
Cited by 156 (14 self)
Sample complexity results from computational learning theory, when applied to neural network learning for pattern classification problems, suggest that for good generalization performance the number of training examples should grow at least linearly with the number of adjustable parameters in the network. Results in this paper show that if a large neural network is used for a pattern classification problem and the learning algorithm finds a network with small weights that has small squared error on the training patterns, then the generalization performance depends on the size of the weights rather than the number of weights. For example, consider a two-layer feedforward network of sigmoid units, in which the sum of the magnitudes of the weights associated with each unit is bounded by $A$ and the input dimension is $n$. We show that the misclassification probability is no more than a certain error estimate (that is related to squared error on the training set) plus $A^3 \sqrt{(\log n)/m}$ (ignori...
Cooperative Coevolution: An Architecture for Evolving Coadapted Subcomponents
- Evolutionary Computation, 2000
Cited by 153 (4 self)
To successfully apply evolutionary algorithms to the solution of increasingly complex problems, we must develop effective techniques for evolving solutions in the form of interacting coadapted subcomponents. One of the major difficulties is finding computational extensions to our current evolutionary paradigms that will enable such subcomponents to “emerge” rather than being hand designed. In this paper, we describe an architecture for evolving such subcomponents as a collection of cooperating species. Given a simple string-matching task, we show that evolutionary pressure to increase the overall fitness of the ecosystem can provide the needed stimulus for the emergence of an appropriate number of interdependent subcomponents that cover multiple niches, evolve to an appropriate level of generality, and adapt as the number and roles of their fellow subcomponents change over time. We then explore these issues within the context of a more complicated domain through a case study involving the evolution of artificial neural networks.
Extracting Comprehensible Models from Trained Neural Networks
1996
Cited by 65 (4 self)
To Mom, Dad, and Susan, for their support and encouragement.
Bootstrapping with Noise: An Effective Regularization Technique
- Connection Science, 1996
Cited by 53 (14 self)
Bootstrap samples with noise are shown to be an effective smoothness and capacity control technique for training feed-forward networks and for other statistical methods such as generalized additive models. It is shown that noisy bootstrap performs best in conjunction with weight decay regularization and ensemble averaging. The two-spiral problem, a highly non-linear, noise-free data set, is used to demonstrate these findings. The combination of noisy bootstrap and ensemble averaging is also shown to be useful for generalized additive modeling, and is demonstrated on the well-known Cleveland Heart Data [7].
Keywords: Noise Injection, Combining Estimators, Pattern Classification, Two Spiral Problem, Clinical Data Analysis.
1 Introduction
The bootstrap technique has become one of the major tools for producing empirical confidence intervals of estimated parameters or predictors [8]. One way to view bootstrap is as a method to simulate noise inherent in the data, and thus, increase effectively t...
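A noisy bootstrap sample combined with ensemble averaging can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's procedure: the abstract does not specify the noise model, so zero-mean Gaussian noise added to the inputs only is assumed here. The names `noisy_bootstrap`, `ensemble_average`, and `noise_std` are hypothetical.

```python
import random


def noisy_bootstrap(inputs, targets, noise_std, rng):
    """Draw one bootstrap sample and perturb its inputs.

    Resamples len(inputs) examples with replacement, then adds zero-mean
    Gaussian noise (an assumed noise model) to every input feature.
    Targets are resampled alongside the inputs but left unperturbed.
    """
    n = len(inputs)
    indices = [rng.randrange(n) for _ in range(n)]
    noisy_inputs = [
        [x + rng.gauss(0.0, noise_std) for x in inputs[i]] for i in indices
    ]
    resampled_targets = [targets[i] for i in indices]
    return noisy_inputs, resampled_targets


def ensemble_average(predictors, x):
    """Average the outputs of several predictors on a single input x."""
    return sum(p(x) for p in predictors) / len(predictors)
```

In use, one predictor would be trained on each call to `noisy_bootstrap(inputs, targets, noise_std, random.Random(seed))`, and the trained predictors would then be combined with `ensemble_average`. The training step itself, including any weight decay, is left out of this sketch.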
Neural-Network Feature Selector
- IEEE Transactions on Neural Networks, 1997
Cited by 48 (3 self)
Feature selection is an integral part of most learning algorithms. Because data often contain irrelevant and redundant attributes, selecting only the relevant attributes can be expected to yield higher predictive accuracy from a machine learning method. In this paper, we propose the use of a three-layer feedforward neural network to select those input attributes that are most useful for discriminating classes in a given set of input patterns. A network pruning algorithm is the foundation of the proposed algorithm. By adding a penalty term to the error function of the network, redundant network connections can be distinguished from relevant ones by their small weights when the network training process has been completed. A simple criterion to remove an attribute based on the accuracy rate of the network is developed. The network is retrained after removal of an attribute, and the selection process is repeated until no attribute meets the criterion for removal. Our ...
Selective Sampling For Nearest Neighbor Classifiers
- Machine Learning, 2004
Cited by 44 (3 self)
Most existing inductive learning algorithms work under the assumption that their training examples are already tagged. There are domains, however, where the tagging procedure requires significant computational resources or manual labor. In such cases, it may be beneficial for the learner to be active, intelligently selecting the examples for labeling with the goal of reducing the labeling cost. In this paper we present LSS, a lookahead algorithm for selective sampling of examples for nearest neighbor classifiers. The algorithm looks for the example with the highest utility, taking its effect on the resulting classifier into account. Computing the expected utility of an example requires estimating the probability of its possible labels. We propose to use the random field model for this estimation. The LSS algorithm was evaluated empirically on seven real and artificial data sets, and its performance was compared to other selective sampling algorithms. The experiments show that the proposed algorithm outperforms other methods in terms of average error rate and stability.

