Results 1-10 of 36
Hierarchical mixtures of experts and the EM algorithm
Neural Computation, 1994
Cited by 764 (20 self)
We present a tree-structured architecture for supervised learning. The statistical model underlying the architecture is a hierarchical mixture model in which both the mixture coefficients and the mixture components are generalized linear models (GLIM's). Learning is treated as a maximum likelihood problem; in particular, we present an Expectation-Maximization (EM) algorithm for adjusting the parameters of the architecture. We also develop an on-line learning algorithm in which the parameters are updated incrementally. Comparative simulation results are presented in the robot dynamics domain.
The cascade-correlation learning architecture
Advances in Neural Information Processing Systems 2, 1990
Cited by 711 (5 self)
Cascade-Correlation is a new architecture and supervised learning algorithm for artificial neural networks. Instead of just adjusting the weights in a network of fixed topology, Cascade-Correlation begins with a minimal network, then automatically trains and adds new hidden units one by one, creating a multi-layer structure. Once a new hidden unit has been added to the network, its input-side weights are frozen. This unit then becomes a permanent feature detector in the network, available for producing outputs or for creating other, more complex feature detectors. The Cascade-Correlation architecture has several advantages over existing algorithms: it learns very quickly, the network determines its own size and topology, it retains the structures it has built even if the training set changes, and it requires no back-propagation of error signals through the connections of the network.
ANFIS: adaptive-network-based fuzzy inference
IEEE Transactions on Systems, Man, and Cybernetics, 1993
A Theory of Networks for Approximation and Learning
Laboratory, Massachusetts Institute of Technology, 1989
Cited by 208 (24 self)
Learning an input-output mapping from a set of examples, of the type that many neural networks have been constructed to perform, can be regarded as synthesizing an approximation of a multidimensional function, that is, solving the problem of hypersurface reconstruction. From this point of view, this form of learning is closely related to classical approximation techniques, such as generalized splines and regularization theory. This paper considers the problems of an exact representation and, in more detail, of the approximation of linear and nonlinear mappings in terms of simpler functions of fewer variables. Kolmogorov's theorem concerning the representation of functions of several variables in terms of functions of one variable turns out to be almost irrelevant in the context of networks for learning. We develop a theoretical framework for approximation based on regularization techniques that leads to a class of three-layer networks that we call Generalized Radial Basis Functions (GRBF), since they are mathematically related to the well-known Radial Basis Functions, mainly used for strict interpolation tasks. GRBF networks are not only equivalent to generalized splines, but are also closely related to pattern recognition methods such as Parzen windows and potential functions and to several neural network algorithms, such as Kanerva's associative memory, backpropagation and Kohonen's topology preserving map. They also have an interesting interpretation in terms of prototypes that are synthesized and optimally combined during the learning stage. The paper introduces several extensions and applications of the technique and discusses intriguing analogies with neurobiological data.
A resource-allocating network for function interpolation
Neural Computation, 1991
Cited by 178 (2 self)
We have created a network that allocates a new computational unit whenever an unusual pattern is presented to the network. This network forms compact representations, yet learns easily and rapidly. The network can be used at any time in the learning process and the learning patterns do not have to be repeated. The units in this network respond to only a local region of the space of input values. The network learns by allocating new units and adjusting the parameters of existing units. If the network performs poorly on a presented pattern, then a new unit is allocated which corrects the response to the presented pattern. If the network performs well on a presented pattern, then the network parameters are updated using standard LMS gradient descent. We have obtained good results with our resource-allocating network (RAN). For predicting the Mackey-Glass chaotic time series, our network learns much faster than do those using back-propagation and uses a comparable number of synapses.
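The allocate-or-adapt rule described in this abstract can be illustrated with a small sketch. This is not the authors' implementation: the class name, the Gaussian unit shape, the two thresholds, the width rule, and the choice to adapt only the output heights and bias (leaving centers fixed) are all assumptions made to keep the example short.

```python
import math


class ResourceAllocatingNetwork:
    """Toy RAN-style learner for scalar targets (illustrative sketch only)."""

    def __init__(self, error_threshold=0.05, distance_threshold=0.3,
                 overlap=0.9, learning_rate=0.05):
        # All four hyperparameters are hypothetical values, not from the paper.
        self.error_threshold = error_threshold
        self.distance_threshold = distance_threshold
        self.overlap = overlap
        self.learning_rate = learning_rate
        self.centers = []   # one center vector per allocated unit
        self.widths = []    # one width per unit
        self.heights = []   # output weight of each unit
        self.bias = 0.0

    def _activations(self, x):
        # Each unit responds only to a local region around its center.
        return [
            math.exp(-sum((xi - ci) ** 2 for xi, ci in zip(x, c)) / (w * w))
            for c, w in zip(self.centers, self.widths)
        ]

    def predict(self, x):
        acts = self._activations(x)
        return self.bias + sum(h * a for h, a in zip(self.heights, acts))

    def observe(self, x, target):
        """Present one pattern; return True if a new unit was allocated."""
        error = target - self.predict(x)
        nearest = min((math.dist(x, c) for c in self.centers),
                      default=math.inf)
        if abs(error) > self.error_threshold and nearest > self.distance_threshold:
            # Poor performance on an unusual pattern: allocate a unit centered
            # on x whose height equals the error, so the output at x is fixed.
            width = (self.overlap * nearest if math.isfinite(nearest)
                     else self.distance_threshold)
            self.centers.append(list(x))
            self.widths.append(width)
            self.heights.append(error)
            return True
        # Otherwise take one LMS gradient step on the heights and the bias.
        for i, a in enumerate(self._activations(x)):
            self.heights[i] += self.learning_rate * error * a
        self.bias += self.learning_rate * error
        return False
```

Under these assumptions, presenting `([0.0], 1.0)` to an empty network allocates a unit and makes the prediction at that input match the target, while a nearby pattern such as `([0.1], 1.0)` triggers only an LMS step.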
Neuro-fuzzy modeling and control
IEEE Proceedings, 1995
Cited by 176 (1 self)
Abstract. Fundamental and advanced developments in neuro-fuzzy synergisms for modeling and control are reviewed. The essential part of neuro-fuzzy synergisms comes from a common framework called adaptive networks, which unifies both neural networks and fuzzy models. The fuzzy model under the framework of adaptive networks is called ANFIS (Adaptive-Network-based Fuzzy Inference System), which possesses certain advantages over neural networks. We introduce the design methods for ANFIS in both modeling and control applications. Current problems and future directions for neuro-fuzzy approaches are also addressed. Keywords: Fuzzy logic, neural networks, fuzzy modeling, neuro-fuzzy modeling, neuro-fuzzy control, ANFIS.
Improving Regression Estimation: Averaging Methods for Variance Reduction with Extensions to General Convex Measure Optimization
1993
Prediction risk and architecture selection for neural networks
1994
Cited by 77 (2 self)
Abstract. We describe two important sets of tools for neural network modeling: prediction risk estimation and network architecture selection. Prediction risk is defined as the expected performance of an estimator in predicting new observations. Estimated prediction risk can be used both for estimating the quality of model predictions and for model selection. Prediction risk estimation and model selection are especially important for problems with limited data. Techniques for estimating prediction risk include data resampling algorithms such as nonlinear cross-validation (NCV) and algebraic formulae such as the predicted squared error (PSE) and generalized prediction error (GPE). We show that exhaustive search over the space of network architectures is computationally infeasible even for networks of modest size. This motivates the use of heuristic strategies that dramatically reduce the search complexity. These strategies employ directed search algorithms, such as selecting the number of nodes via sequential network construction (SNC) and pruning inputs and weights via sensitivity-based pruning (SBP) and optimal brain damage (OBD), respectively.
Learning Controllers for Industrial Robots
1996
Cited by 29 (14 self)
One of the most significant cost factors in robotics applications is the design and development of real-time robot control software. Control theory helps when linear controllers have to be developed, but it doesn't sufficiently support the generation of nonlinear controllers, although in many cases (such as in compliance control), nonlinear control is essential for achieving high performance. This paper discusses how Machine Learning has been applied to the design of (non)linear controllers. Several alternative function approximators, including Multilayer Perceptrons (MLP), Radial Basis Function Networks (RBFNs), and Fuzzy Controllers are analyzed and compared, leading to the definition of two major families: Open Field Function Approximators and Locally Receptive Field Function Approximators. It is shown that RBFNs and Fuzzy Controllers bear strong similarities, and that both have a symbolic interpretation. This characteristic allows for applying both symbolic and statis...