Results 1 - 10
of
23
Hierarchical mixtures of experts and the EM algorithm
- Neural Computation
, 1994
"... We present a tree-structured architecture for supervised learning. The statistical model underlying the architecture is a hi-erarchical mixture model in which both the mixture coefficients and the mixture components are generalized linear models (GLIM’s). Learning is treated as a max-imum likelihood ..."
Abstract
-
Cited by 634 (19 self)
- Add to MetaCart
We present a tree-structured architecture for supervised learning. The statistical model underlying the architecture is a hi-erarchical mixture model in which both the mixture coefficients and the mixture components are generalized linear models (GLIM’s). Learning is treated as a max-imum likelihood problem; in particular, we present an Expectation-Maximization (EM) algorithm for adjusting the parame-ters of the architecture. We also develop an on-line learning algorithm in which the pa-rameters are updated incrementally. Com-parative simulation results are presented in the robot dynamics domain. 1
ANFIS: Adaptive-Network-Based Fuzzy Inference System
, 1993
"... This paper presents the architecture and learning procedure underlying ANFIS (AdaptiveNetwork -based Fuzzy Inference System), a fuzzy inference system implemented in the framework of adaptive networks. By using a hybrid learning procedure, the proposed ANFIS can construct an input-output mapping bas ..."
Abstract
-
Cited by 324 (5 self)
- Add to MetaCart
This paper presents the architecture and learning procedure underlying ANFIS (AdaptiveNetwork -based Fuzzy Inference System), a fuzzy inference system implemented in the framework of adaptive networks. By using a hybrid learning procedure, the proposed ANFIS can construct an input-output mapping based on both human knowledge (in the form of fuzzy if-then rules) and stipulated input-output data pairs. In our simulation, we employ the ANFIS architecture to model nonlinear functions, identify nonlinear components on-linely in a control system, and predict a chaotic time series, all yielding remarkable results. Comparisons with artificail neural networks and earlier work on fuzzy modeling are listed and discussed. Other extensions of the proposed ANFIS and promising applications to automatic control and signal processing are also suggested. 1 Introduction System modeling based on conventional mathematical tools (e.g., differential equations) is not well suited for dealing with ill-define...
A Theory of Networks for Approximation and Learning
- Laboratory, Massachusetts Institute of Technology
, 1989
"... Learning an input-output mapping from a set of examples, of the type that many neural networks have been constructed to perform, can be regarded as synthesizing an approximation of a multi-dimensional function, that is solving the problem of hypersurface reconstruction. From this point of view, t ..."
Abstract
-
Cited by 170 (25 self)
- Add to MetaCart
Learning an input-output mapping from a set of examples, of the type that many neural networks have been constructed to perform, can be regarded as synthesizing an approximation of a multi-dimensional function, that is solving the problem of hypersurface reconstruction. From this point of view, this form of learning is closely related to classical approximation techniques, such as generalized splines and regularization theory. This paper considers the problems of an exact representation and, in more detail, of the approximation of linear and nonlinear mappings in terms of simpler functions of fewer variables. Kolmogorov's theorem concerning the representation of functions of several variables in terms of functions of one variable turns out to be almost irrelevant in the context of networks for learning. Wedevelop a theoretical framework for approximation based on regularization techniques that leads to a class of three-layer networks that we call Generalized Radial Basis Functions (GRBF), since they are mathematically related to the well-known Radial Basis Functions, mainly used for strict interpolation tasks. GRBF networks are not only equivalent to generalized splines, but are also closely related to pattern recognition methods suchasParzen windows and potential functions and to several neural network algorithms, suchas Kanerva's associative memory,backpropagation and Kohonen's topology preserving map. They also haveaninteresting interpretation in terms of prototypes that are synthesized and optimally combined during the learning stage. The paper introduces several extensions and applications of the technique and discusses intriguing analogies with neurobiological data.
Neuro-Fuzzy Modeling and Control
- PROCEEDINGS OF THE IEEE
, 1995
"... Fundamental and advanced developments in neuro-fuzzy synergisms for modeling and control are reviewed. The essential part of neuro-fuzzy synergisms comes from a common framework called adaptive networks, which unifies both neural networks and fuzzy models. The fuzzy models under the framework of ada ..."
Abstract
-
Cited by 110 (1 self)
- Add to MetaCart
Fundamental and advanced developments in neuro-fuzzy synergisms for modeling and control are reviewed. The essential part of neuro-fuzzy synergisms comes from a common framework called adaptive networks, which unifies both neural networks and fuzzy models. The fuzzy models under the framework of adaptive networks is called ANFIS (Adaptive-Network-based Fuzzy Inference System), which possess certain advantages over neural networks. We introduce the design methods for ANFIS in both modeling and control applications. Current problems and future directions for neuro-fuzzy approaches are also addressed.
Improving Regression Estimation: Averaging Methods for Variance Reduction with Extensions to General Convex Measure Optimization
, 1993
"... ..."
Prediction risk and architecture selection for neural networks
, 1994
"... Abstract. We describe two important sets of tools for neural network modeling: prediction risk estimation and network architecture selection. Prediction risk is defined as the expected performance of an estimator in predicting new observations. Estimated prediction risk can be used both for estimati ..."
Abstract
-
Cited by 69 (2 self)
- Add to MetaCart
Abstract. We describe two important sets of tools for neural network modeling: prediction risk estimation and network architecture selection. Prediction risk is defined as the expected performance of an estimator in predicting new observations. Estimated prediction risk can be used both for estimating the quality of model predictions and for model selection. Prediction risk estimation and model selection are especially important for problems with limited data. Techniques for estimating prediction risk include data resampling algorithms such as nonlinear cross–validation (NCV) and algebraic formulae such as the predicted squared error (PSE) and generalized prediction error (GPE). We show that exhaustive search over the space of network architectures is computationally infeasible even for networks of modest size. This motivates the use of heuristic strategies that dramatically reduce the search complexity. These strategies employ directed search algorithms, such as selecting the number of nodes via sequential network construction (SNC) and pruning inputs and weights via sensitivity based pruning (SBP) and optimal brain damage (OBD) respectively.
Flat Minima
, 1997
"... this paper (available on the World-Wide Web; see our home pages) contains pseudo-code of an efficient implementation. It is based on fast multiplication of the Hessian and a vector due to Pearlmutter (1994) and Mller (1993). Acknowledgments ..."
Abstract
-
Cited by 32 (13 self)
- Add to MetaCart
this paper (available on the World-Wide Web; see our home pages) contains pseudo-code of an efficient implementation. It is based on fast multiplication of the Hessian and a vector due to Pearlmutter (1994) and Mller (1993). Acknowledgments
Learning Controllers for Industrial Robots
, 1996
"... . One of the most significant cost factors in robotics applications is the design and development of real-time robot control software. Control theory helps when linear controllers have to be developed, but it doesn't sufficiently support the generation of non-linear controllers, although in many cas ..."
Abstract
-
Cited by 26 (14 self)
- Add to MetaCart
. One of the most significant cost factors in robotics applications is the design and development of real-time robot control software. Control theory helps when linear controllers have to be developed, but it doesn't sufficiently support the generation of non-linear controllers, although in many cases (such as in compliance control), nonlinear control is essential for achieving high performance. This paper discusses how Machine Learning has been applied to the design of (non-)linear controllers. Several alternative function approximators, including Multilayer Perceptrons (MLP), Radial Basis Function Networks (RBFNs), and Fuzzy Controllers are analyzed and compared, leading to the definition of two major families: Open Field Function Function Approximators and Locally Receptive Field Function Approximators. It is shown that RBFNs and Fuzzy Controllers bear strong similarities, and that both have a symbolic interpretation. This characteristics allows for applying both symbolic and statis...
The Cascade-Correlation Learning Architecture
- Advances in Neural Information Processing Systems 2
, 1990
"... Cascade-Correlation is a new architecture and supervised learning algorithm for artificial neural networks. Instead of just adjusting the weights in a network of fixed topology, Cascade-Correlation begins with a minimal network, then automatically trains and adds new hidden units one by one, creatin ..."
Abstract
-
Cited by 15 (0 self)
- Add to MetaCart
Cascade-Correlation is a new architecture and supervised learning algorithm for artificial neural networks. Instead of just adjusting the weights in a network of fixed topology, Cascade-Correlation begins with a minimal network, then automatically trains and adds new hidden units one by one, creating a multi-layer structure. Once a new hidden unit has been added to the network, its input-side weights are frozen. This unit then becomes a permanent feature-detector in the network, available for producing outputs or for creating other, more complex feature detectors. The Cascade-Correlation architecture has several advantages over existing algorithms: it learns very quickly, the network determines its own size and topology, it retains the structures it has built even if the training set changes, and it requires no back-propagation of error signals through the connections of the network. This research was sponsored in part by the National Science Foundation under Contract Number EET-87163...
Reinforcement Learning for Planning and Control
, 1993
"... Introduction In this chapter, we consider a form of learning in which the system, referred to as the controller, in the course of interacting with its environment constructs a control law, referred to as a policy, that determines for each state enncountered by the controller an action to perform in ..."
Abstract
-
Cited by 14 (0 self)
- Add to MetaCart
Introduction In this chapter, we consider a form of learning in which the system, referred to as the controller, in the course of interacting with its environment constructs a control law, referred to as a policy, that determines for each state enncountered by the controller an action to perform in that state. The objective is to construct an optimal policy, one that maximizes a given measure of performance. The system does not, however, learn by being given examples of states and the optimal actions to take in those states. Rather, the system performs actions in states and is given feedback in the form of rewards. This type of learning is called reinforcement learning. Reinforcement learning is complicated by the fact that the reinforcement in the form of rewards and punishments is often intermittent and delayed. The controller may perform a long sequence of actions before receiving any reward. This makes it di

