Results 1–10 of 25
Growing Cell Structures – A Self-Organizing Network for Unsupervised and Supervised Learning
Neural Networks, 1993
Cited by 258 (11 self)
Abstract:
We present a new self-organizing neural network model having two variants. The first variant performs unsupervised learning and can be used for data visualization, clustering, and vector quantization. The main advantage over existing approaches, e.g., the Kohonen feature map, is the ability of the model to automatically find a suitable network structure and size. This is achieved through a controlled growth process which also includes occasional removal of units. The second variant of the model is a supervised learning method which results from the combination of the above-mentioned self-organizing network with the radial basis function (RBF) approach. In this model it is possible, in contrast to earlier approaches, to perform the positioning of the RBF units and the supervised training of the weights in parallel. Therefore, the current classification error can be used to determine where to insert new RBF units. This leads to small networks which generalize very well. Results on the t...
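The growth step described in the abstract can be illustrated with a small sketch. This is not the paper's full algorithm: the `Unit` class, the midpoint placement, and the error-halving redistribution below are a simplified illustration of the idea that units accumulate a local error counter and a new unit is inserted where that error is largest.

```python
# Illustrative sketch of a controlled growth step: each unit carries an
# accumulated local error counter, and a new unit is inserted halfway
# between the worst unit and its worst topological neighbour.
# All names and the error-redistribution rule are illustrative.

class Unit:
    def __init__(self, position):
        self.position = list(position)
        self.error = 0.0          # accumulated local error counter
        self.neighbours = []      # topological neighbours

def insert_unit(units):
    """Insert a new unit between the unit with the largest accumulated
    error and that unit's neighbour with the largest accumulated error."""
    worst = max(units, key=lambda u: u.error)
    worst_nb = max(worst.neighbours, key=lambda u: u.error)
    midpoint = [(a + b) / 2 for a, b in zip(worst.position, worst_nb.position)]
    new = Unit(midpoint)
    # rewire: the new unit becomes a neighbour of both parents
    worst.neighbours.remove(worst_nb)
    worst_nb.neighbours.remove(worst)
    for u in (worst, worst_nb):
        u.neighbours.append(new)
        new.neighbours.append(u)
    # redistribute error so future insertions happen elsewhere
    worst.error /= 2
    worst_nb.error /= 2
    new.error = (worst.error + worst_nb.error) / 2
    units.append(new)
    return new

a, b = Unit([0.0, 0.0]), Unit([1.0, 0.0])
a.neighbours.append(b); b.neighbours.append(a)
a.error, b.error = 4.0, 2.0
n = insert_unit([a, b])
print(n.position)  # → [0.5, 0.0]
```

Occasional removal, mentioned in the abstract, would be the inverse operation: delete the unit whose local error (or utility) is lowest and reconnect its neighbours.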
Fast Training Algorithms for Multi-Layer Neural Nets
1993
Cited by 31 (0 self)
Abstract:
Training a multi-layer neural net by backpropagation is slow and requires arbitrary choices regarding the number of hidden units and layers. This paper describes an algorithm which is much faster than backpropagation and for which it is not necessary to specify the number of hidden units in advance. The relationship with other fast pattern recognition algorithms, such as algorithms based on k-d trees, is mentioned. The algorithm has been implemented and tested on artificial problems such as the parity problem and on real problems arising in speech recognition. Experimental results, including training times and recognition accuracy, are given. Generally, the algorithm achieves accuracy as good as or better than nets trained using backpropagation, and the training process is much faster than backpropagation. Accuracy is comparable to that of the "nearest neighbour" algorithm, which is slower and requires more storage space.
Comments: Only the abstract is given here. The full paper ap...
Multi-class AdaBoost
Statistics and Its Interface, 2009
Cited by 24 (1 self)
Abstract:
Boosting has been a very successful technique for solving the two-class classification problem. In going from two-class to multi-class classification, most algorithms have been restricted to reducing the multi-class classification problem to multiple two-class problems. In this paper, we develop a new algorithm that directly extends the AdaBoost algorithm to the multi-class case without reducing it to multiple two-class problems. We show that the proposed multi-class AdaBoost algorithm is equivalent to a forward stagewise additive modeling algorithm that minimizes a novel exponential loss for multi-class classification. Furthermore, we show that the exponential loss is a member of a class of Fisher-consistent loss functions for multi-class classification. As shown in the paper, the new algorithm is extremely easy to implement and is highly competitive in terms of misclassification error rate.
A k-nearest neighbor classification rule based on Dempster-Shafer theory
IEEE Trans. on Systems, Man and Cybernetics, 1995
Square Unit Augmented, Radially Extended, Multilayer Perceptrons
Neural Networks: Tricks of the Trade
Cited by 17 (1 self)
Abstract:
Consider a multilayer perceptron (MLP) with d inputs, a single hidden sigmoidal layer and a linear output. By adding an additional d inputs to the network with values set to the square of the first d inputs, properties reminiscent of higher-order neural networks and radial basis function networks (RBFN) are added to the architecture with little added expense in terms of weight requirements. Of particular interest, this architecture has the ability to form localized features in a d-dimensional space with a single hidden node but can also span large volumes of the input space; thus, the architecture has the localized properties of an RBFN but does not suffer as badly from the curse of dimensionality. I refer to a network of this type as a SQuare Unit Augmented, Radially Extended, Multi-Layer Perceptron (SQUARE-MLP or SMLP).
1 Introduction and Motivation
When faced with a new and challenging problem, the most crucial decision that a neural network researcher must make is in...
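The augmentation above can be sketched in a few lines: the network itself is unchanged, only the input vector is extended with its squared components. With a negative weight on a squared input, a single sigmoid unit's pre-activation becomes a downward parabola, so the unit responds only in a localized region, much like an RBF unit. All weight values below are illustrative.

```python
from math import exp

def sigmoid(a):
    return 1.0 / (1.0 + exp(-a))

def augment(x):
    """Extend a d-dimensional input with its d squared components."""
    return x + [xi * xi for xi in x]

def hidden_unit(x, w, b):
    """One sigmoidal hidden unit acting on the augmented input."""
    z = augment(x)
    return sigmoid(sum(wi * zi for wi, zi in zip(w, z)) + b)

# d = 1: weights are [w_linear, w_square]. The negative squared weight
# makes the pre-activation -10 x^2 + 40 x - 35, a parabola peaking at
# x = 2, so the unit is "on" near x = 2 and "off" elsewhere -- a
# localized feature from a single sigmoid node.
w, b = [40.0, -10.0], -35.0
for x in [0.0, 2.0, 4.0]:
    print(x, round(hidden_unit([x], w, b), 3))
```

With a zero weight on the squared input the same unit degenerates to an ordinary sigmoid ridge spanning the input space, which is the dual behaviour the abstract highlights.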
Automatic model selection in a hybrid perceptron/radial network
To appear: Information Fusion, special issue on multiple experts, 2002
A hybrid projection-based and radial basis function architecture: Initial values and global optimization
2001
Cited by 13 (6 self)
Abstract:
We introduce a mechanism for constructing and training a hybrid architecture of projection-based units and radial basis functions. In particular, we introduce an optimization scheme which includes several steps and assures convergence to a useful solution. During network architecture construction and training, it is determined whether a unit should be removed or replaced. The resulting architecture often has a smaller number of units than competing architectures. A specific overfitting resulting from shrinkage of the RBF radii is addressed by introducing a penalty on small radii. Classification and regression results are demonstrated on various benchmark data sets and compared with several variants of RBF networks [?, ?]. A striking performance improvement is achieved on the vowel data set [?].
Keywords: Projection units, RBF units, hybrid network architecture, SMLP, clustering, regularization.
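The abstract does not state the form of the small-radius penalty, so the inverse-square term below is purely an illustrative assumption; it only shows the mechanism: as a Gaussian RBF radius shrinks toward a spike around one training point, a penalty that grows with 1/r² makes the overall objective worse, discouraging that kind of overfitting.

```python
from math import exp

def rbf(x, center, radius):
    """Gaussian radial basis unit."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, center))
    return exp(-d2 / (2 * radius ** 2))

def penalized_loss(data, centers, radii, weights, lam=0.01):
    """Squared error plus an illustrative penalty on small radii.
    The 1/r^2 form is an assumption, not the paper's penalty."""
    sse = 0.0
    for x, y in data:
        pred = sum(w * rbf(x, c, r)
                   for w, c, r in zip(weights, centers, radii))
        sse += (pred - y) ** 2
    penalty = lam * sum(1.0 / r ** 2 for r in radii)
    return sse + penalty

# A radius shrunk to 0.01 fits this single point exactly, but the
# penalty makes the shrunken network strictly worse overall.
data = [([0.0], 1.0)]
wide = penalized_loss(data, [[0.0]], [1.0], [1.0])
narrow = penalized_loss(data, [[0.0]], [0.01], [1.0])
print(wide < narrow)  # → True
```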
Estimating A-Posteriori Probabilities Using Stochastic Network Models
In Proceedings of the Summer School on Neural Networks, 1994
Cited by 8 (1 self)
Abstract:
In this paper we present a systematic approach to constructing neural network classifiers based on stochastic model theory. A two-step process is described where the first problem is to model the stochastic relationship between sample patterns and their classes using a stochastic neural network. Then we convert the stochastic network to a deterministic one, which calculates the a-posteriori probabilities of the stochastic counterpart. That is, the outputs of the final network estimate a-posteriori probabilities by construction. The well-known method of normalizing network outputs by applying the softmax function in order to allow a probabilistic interpretation is shown to be more than a heuristic, since it is well-founded in the context of stochastic networks. Simulation results show that our networks outperform standard multi-layer networks in the case of few training samples and a large number of classes.
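The softmax normalization the abstract refers to is standard; a minimal, numerically stable version (subtracting the maximum logit before exponentiating, which leaves the result unchanged but avoids overflow) looks like this:

```python
from math import exp

def softmax(logits):
    """Map raw network outputs to a probability distribution."""
    m = max(logits)                      # shift for numerical stability
    exps = [exp(a - m) for a in logits]
    s = sum(exps)
    return [e / s for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print(probs)  # non-negative, ordered like the logits, sums to 1
```

The paper's contribution is the justification: outputs normalized this way are not merely convenient, they coincide with the posterior class probabilities of the underlying stochastic network.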
Discovering Efficient Learning Rules for Feedforward Neural Networks Using Genetic Programming
2002
Cited by 6 (0 self)
Abstract:
The Standard Back-Propagation (SBP) algorithm is the most widely known and used learning method for training neural networks. Unfortunately, SBP suffers from several problems such as sensitivity to the initial conditions and very slow convergence. Here we describe how we used Genetic Programming, a search algorithm inspired by Darwinian evolution, to discover new supervised learning algorithms for neural networks which can overcome some of these problems. Comparing our new algorithms with SBP on different problems, we show that they are faster, more stable, and have greater feature-extraction capabilities.
Centering Neural Network Gradient Factors
Neural Networks: Tricks of the Trade, volume 1524 of Lecture Notes in Computer Science, 1997
Cited by 5 (2 self)
Abstract:
It has long been known that neural networks can learn faster when their input and hidden unit activities are centered about zero; recently we have extended this approach to also encompass the centering of error signals [2]. Here we generalize this notion to all factors involved in the network's gradient, leading us to propose centering the slope of hidden unit activation functions as well. Slope centering removes the linear component of backpropagated error; this improves credit assignment in networks with shortcut connections. Benchmark results show that this can speed up learning significantly without adversely affecting the trained network's generalization ability.
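The input-centering part of the idea above can be sketched directly: subtract the per-feature mean so each signal averages to zero. The same principle is what the paper extends to error signals and to the slopes of hidden-unit activation functions.

```python
def center(rows):
    """Subtract the per-feature mean so every column averages to zero."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

X = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
Xc = center(X)
print(Xc)  # → [[-1.0, -10.0], [0.0, 0.0], [1.0, 10.0]]
```

Centered signals remove a constant (and, for slope centering, a linear) component from each gradient factor, which is why learning speeds up without changing what the network can represent.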