Results 1–10 of 321
Multilayer Feedforward Networks With a Nonpolynomial Activation Function Can Approximate Any Function
, 1993
"... Several researchers characterized the activation fimction under which multilayer feedforward networks can act as universal approximators. We show that most of all the characterizations that were reported thus far in the literature are special cases of the following general result: A standard multila ..."
Abstract

Cited by 156 (3 self)
Several researchers have characterized the activation functions under which multilayer feedforward networks can act as universal approximators. We show that most of the characterizations reported thus far in the literature are special cases of the following general result: a standard multilayer feedforward network with a locally bounded piecewise continuous activation function can approximate any continuous function to any degree of accuracy if and only if the network's activation function is not a polynomial. We also emphasize the important role of the threshold, asserting that without it the preceding theorem does not hold.
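The main theorem summarized in this abstract can be stated compactly. The LaTeX sketch below is an editorial restatement under the abstract's own assumptions; the symbols σ, M(σ), and K are introduced here purely for illustration and are not taken from the paper.

```latex
% Restatement of the result summarized above (Leshno et al., 1993).
% Let \sigma : \mathbb{R} \to \mathbb{R} be locally bounded and piecewise continuous,
% and let M(\sigma) be the set of single-hidden-layer networks with threshold terms:
%   x \mapsto \textstyle\sum_{i=1}^{n} c_i\, \sigma(w_i \cdot x + b_i), \qquad n \in \mathbb{N}.
\[
  \overline{M(\sigma)} = C(K) \ \text{for every compact } K \subset \mathbb{R}^d
  \quad\Longleftrightarrow\quad
  \sigma \ \text{is not a polynomial.}
\]
```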
Neural networks for classification: a survey
 IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews
, 2000
"... Abstract—Classification is one of the most active research and application areas of neural networks. The literature is vast and growing. This paper summarizes the some of the most important developments in neural network classification research. Specifically, the issues of posterior probability esti ..."
Abstract

Cited by 132 (0 self)
Abstract—Classification is one of the most active research and application areas of neural networks. The literature is vast and growing. This paper summarizes some of the most important developments in neural network classification research. Specifically, the issues of posterior probability estimation, the link between neural and conventional classifiers, the learning and generalization tradeoff in classification, feature variable selection, and the effect of misclassification costs are examined. Our purpose is to provide a synthesis of the published research in this area and to stimulate further research interest and effort in the identified topics. Index Terms—Bayesian classifier, classification, ensemble methods, feature variable selection, learning and generalization, misclassification costs, neural networks.
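On the first topic in that list, posterior probability estimation, the standard fact such surveys build on is that a sufficiently flexible network trained to minimize cross-entropy (or squared error) on 1-of-K targets approximates the Bayes posterior. The LaTeX note below is an editorial illustration; the symbols y_k, t_k, and C_k are introduced here only to state it.

```latex
% Minimizer of the expected loss for 1-of-K targets t_k given input x:
\[
  y_k^{*}(\mathbf{x}) \;=\; \mathbb{E}\!\left[\,t_k \mid \mathbf{x}\,\right]
  \;=\; P\!\left(C_k \mid \mathbf{x}\right),
\]
% so the outputs of a well-trained, sufficiently flexible network can be
% read as estimates of the posterior class probabilities P(C_k | x).
```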
Extreme learning machine: Theory and applications
, 2006
"... It is clear that the learning speed of feedforward neural networks is in general far slower than required and it has been a major bottleneck in their applications for past decades. Two key reasons behind may be: (1) the slow gradientbased learning algorithms are extensively used to train neural net ..."
Abstract

Cited by 114 (9 self)
It is clear that the learning speed of feedforward neural networks is in general far slower than required, and this has been a major bottleneck in their applications for past decades. Two key reasons may be: (1) slow gradient-based learning algorithms are extensively used to train neural networks, and (2) all the parameters of the networks are tuned iteratively by such learning algorithms. Unlike these conventional implementations, this paper proposes a new learning algorithm called extreme learning machine (ELM) for single-hidden-layer feedforward neural networks (SLFNs), which randomly chooses hidden nodes and analytically determines the output weights of SLFNs. In theory, this algorithm tends to provide good generalization performance at extremely fast learning speed. The experimental results based on a few artificial and real benchmark function approximation and classification problems, including very large complex applications, show that the new algorithm can produce good generalization performance in most cases and can learn thousands of times faster than conventional popular learning algorithms for feedforward neural networks.
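The training recipe sketched in this abstract — random hidden-layer parameters plus an analytical least-squares solve for the output weights — is compact enough to illustrate directly. The Python sketch below is an editorial reconstruction, not the authors' code; the function names, the tanh activation, and the use of NumPy's pseudoinverse are assumptions.

```python
import numpy as np

def elm_train(X, T, n_hidden, rng=np.random.default_rng(0)):
    """Train a single-hidden-layer network the ELM way: hidden-layer
    parameters are drawn at random and only the output weights are
    solved for analytically (least squares via the pseudoinverse)."""
    n_features = X.shape[1]
    W = rng.normal(size=(n_features, n_hidden))   # random input weights
    b = rng.normal(size=n_hidden)                 # random hidden biases
    H = np.tanh(X @ W + b)                        # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ T                  # output weights: H @ beta ~= T
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# Tiny usage example on a toy regression problem.
X = np.random.default_rng(1).uniform(-1, 1, size=(200, 1))
T = np.sin(3 * X)
W, b, beta = elm_train(X, T, n_hidden=50)
print(np.mean((elm_predict(X, W, b, beta) - T) ** 2))
```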
Extreme learning machine: A new learning scheme of feedforward neural networks
 In Proc. Int. Joint Conf. Neural Networks
, 2006
"... It is clear that the learning speed of feedforward neural networks is in general far slower than required and it has been a major bottleneck in their applications for past decades. Two key reasons behind may be: 1) the slow gradientbased learning algorithms are extensively used to train neural netwo ..."
Abstract

Cited by 112 (17 self)
It is clear that the learning speed of feedforward neural networks is in general far slower than required, and this has been a major bottleneck in their applications for past decades. Two key reasons may be: 1) slow gradient-based learning algorithms are extensively used to train neural networks, and 2) all the parameters of the networks are tuned iteratively by such learning algorithms. Unlike these traditional implementations, this paper proposes a new learning algorithm called extreme learning machine (ELM) for single-hidden-layer feedforward neural networks (SLFNs), which randomly chooses the input weights and analytically determines the output weights of SLFNs. In theory, this algorithm tends to provide the best generalization performance at extremely fast learning speed. The experimental results based on real-world benchmark function approximation and classification problems, including large complex applications, show that the new algorithm can produce the best generalization performance in some cases and can learn much faster than traditional popular learning algorithms for feedforward neural networks.
A Bayesian/information theoretic model of learning to learn via multiple task sampling
 Machine Learning
, 1997
"... Abstract. A Bayesian model of learning to learn by sampling from multiple tasks is presented. The multiple tasks are themselves generated by sampling from a distribution over an environment of related tasks. Such an environment is shown to be naturally modelled within a Bayesian context by the conce ..."
Abstract

Cited by 93 (2 self)
Abstract. A Bayesian model of learning to learn by sampling from multiple tasks is presented. The multiple tasks are themselves generated by sampling from a distribution over an environment of related tasks. Such an environment is shown to be naturally modelled within a Bayesian context by the concept of an objective prior distribution. It is argued that for many common machine learning problems, although in general we do not know the true (objective) prior for the problem, we do have some idea of a set of possible priors to which the true prior belongs. It is shown that under these circumstances a learner can use Bayesian inference to learn the true prior by learning sufficiently many tasks from the environment. In addition, bounds are given on the amount of information required to learn a task when it is simultaneously learnt with several other tasks. The bounds show that if the learner has little knowledge of the true prior, but the dimensionality of the true prior is small, then sampling multiple tasks is highly advantageous. The theory is applied to the problem of learning a common feature set, or equivalently a low-dimensional representation (LDR), for an environment of related tasks.
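The two-level sampling scheme this abstract describes is the usual hierarchical Bayes picture. The LaTeX sketch below is an editorial illustration of that structure, assuming the environment is indexed by a hyperparameter θ and each task i by parameters φ_i with data D_i; these symbols are not taken from the paper itself.

```latex
% Environment-level prior (the "objective prior"): \theta \sim p(\theta).
% Each task draws parameters \phi_i \sim p(\phi \mid \theta) and data D_i \sim p(D \mid \phi_i).
% Observing n tasks lets the learner infer the environment, i.e. the true prior:
\[
  p(\theta \mid D_1,\dots,D_n) \;\propto\;
  p(\theta)\prod_{i=1}^{n}\int p(D_i \mid \phi_i)\, p(\phi_i \mid \theta)\, d\phi_i .
\]
```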
Consistent Specification Testing With Nuisance Parameters Present Only Under The Alternative
, 1995
"... . The nonparametric and the nuisance parameter approaches to consistently testing statistical models are both attempts to estimate topological measures of distance between a parametric and a nonparametric fit, and neither dominates in experiments. This topological unification allows us to greatly ex ..."
Abstract

Cited by 92 (13 self)
The nonparametric and the nuisance parameter approaches to consistently testing statistical models are both attempts to estimate topological measures of distance between a parametric and a nonparametric fit, and neither dominates in experiments. This topological unification allows us to greatly extend the nuisance parameter approach. How and why the nuisance parameter approach works, and how it can be extended, bears closely on recent developments in artificial neural networks. Statistical content is provided by viewing specification tests with nuisance parameters as tests of hypotheses about Banach-valued random elements and applying the Banach Central Limit Theorem and Law of the Iterated Logarithm, leading to simple procedures that can be used as a guide to when computationally more elaborate procedures may be warranted.
Universal approximation using incremental constructive feedforward networks with . . .
 IEEE Transactions on Neural Networks
, 2005
"... According to conventional neural network theories, singlehiddenlayer feedforward networks (SLFNs) with additive or radial basis function (RBF) hidden nodes are universal approximators when all the parameters of the networks are allowed adjustable. However, as observed in most neural network implem ..."
Abstract

Cited by 89 (15 self)
According to conventional neural network theories, single-hidden-layer feedforward networks (SLFNs) with additive or radial basis function (RBF) hidden nodes are universal approximators when all the parameters of the networks are allowed to be adjustable. However, as observed in most neural network implementations, tuning all the parameters of the networks may make learning complicated and inefficient, and it may be difficult to train networks with nondifferentiable activation functions such as threshold networks. Unlike conventional neural network theories, this paper proves by an incremental constructive method that, in order to let SLFNs work as universal approximators, one may simply randomly choose hidden nodes and then only needs to adjust the output weights linking the hidden layer and the output layer. In such SLFN implementations, the activation functions for additive nodes can be any bounded nonconstant piecewise continuous functions g: R → R, and the activation functions for RBF nodes can be any integrable piecewise continuous functions g: R → R with ∫ g(x) dx ≠ 0. The proposed incremental method is efficient not only for SLFNs with continuous (including nondifferentiable) activation functions but also for SLFNs with piecewise continuous (such as threshold) activation functions. Compared to other popular methods, such a new network is fully automatic and users need not intervene in the learning process by manually tuning control parameters.
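The incremental construction described here — add one randomly parameterized hidden node at a time and set only that node's output weight analytically from the current residual — can be sketched briefly. The Python below is an editorial illustration for a scalar regression target, assuming tanh additive nodes and a residual-projection formula for each new output weight; the names and stopping rule are assumptions, not the authors' code.

```python
import numpy as np

def ielm_fit(X, t, max_nodes=200, tol=1e-3, rng=np.random.default_rng(0)):
    """Incremental constructive sketch: hidden nodes are added one at a
    time with random parameters, and only the new node's output weight
    is computed analytically from the current residual error."""
    nodes, residual = [], t.astype(float).copy()
    for _ in range(max_nodes):
        w = rng.normal(size=X.shape[1])     # random input weights of the new node
        b = rng.normal()                    # random bias of the new node
        h = np.tanh(X @ w + b)              # new node's activations on the data
        beta = (residual @ h) / (h @ h)     # output weight that best fits the residual
        residual = residual - beta * h      # update the residual error
        nodes.append((w, b, beta))
        if np.linalg.norm(residual) < tol:
            break
    return nodes

def ielm_predict(X, nodes):
    return sum(beta * np.tanh(X @ w + b) for w, b, beta in nodes)

# Tiny usage example.
X = np.linspace(-1, 1, 200).reshape(-1, 1)
t = np.sin(3 * X[:, 0])
nodes = ielm_fit(X, t)
print(np.mean((ielm_predict(X, nodes) - t) ** 2))
```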
Constructive Algorithms for Structure Learning in Feedforward Neural Networks for Regression Problems
 IEEE Transactions on Neural Networks
, 1997
"... In this survey paper, we review the constructive algorithms for structure learning in feedforward neural networks for regression problems. The basic idea is to start with a small network, then add hidden units and weights incrementally until a satisfactory solution is found. By formulating the whole ..."
Abstract

Cited by 87 (2 self)
In this survey paper, we review constructive algorithms for structure learning in feedforward neural networks for regression problems. The basic idea is to start with a small network, then add hidden units and weights incrementally until a satisfactory solution is found. By formulating the whole problem as a state-space search, we first describe the general issues in constructive algorithms, with special emphasis on the search strategy. A taxonomy, based on the differences in the state transition mapping, the training algorithm, and the network architecture, is then presented. Keywords: constructive algorithm, structure learning, state-space search, dynamic node creation, projection pursuit regression, cascade-correlation, resource-allocating network, group method of data handling.
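The "start small, grow until satisfactory" loop summarized in this abstract can be illustrated with a minimal sketch. The Python below is an editorial example, not any of the surveyed algorithms: unlike cascade-correlation it retrains from scratch at each size, and the use of scikit-learn's MLPRegressor, the doubling schedule, and the validation-error stopping rule are assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def grow_network(X_train, y_train, X_val, y_val, max_hidden=64):
    """Generic constructive loop: start with a tiny hidden layer and
    grow it until the validation error stops improving."""
    best_model, best_err, n_hidden = None, np.inf, 1
    while n_hidden <= max_hidden:
        model = MLPRegressor(hidden_layer_sizes=(n_hidden,), max_iter=2000,
                             random_state=0).fit(X_train, y_train)
        err = np.mean((model.predict(X_val) - y_val) ** 2)
        if err >= best_err:          # no improvement: stop growing
            break
        best_model, best_err = model, err
        n_hidden *= 2                # grow the hidden layer
    return best_model, best_err

# Toy usage: fit a 1-D nonlinear function with a held-out validation split.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(300, 1))
y = np.sin(4 * X[:, 0]) + 0.05 * rng.normal(size=300)
model, err = grow_network(X[:200], y[:200], X[200:], y[200:])
print(err)
```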
Improving Regression Estimation: Averaging Methods for Variance Reduction with Extensions to General Convex Measure Optimization
, 1993
"... ..."
Feedforward nets for interpolation and classification
 J. Comp. Syst. Sci
, 1992
"... This paper deals with singlehiddenlayer feedforward nets, studying various aspects of classification power and interpolation capability. In particular, a worstcase analysis shows that direct input to output connections in threshold nets double the recognition but not the interpolation power, whil ..."
Abstract

Cited by 73 (19 self)
This paper deals with single-hidden-layer feedforward nets, studying various aspects of classification power and interpolation capability. In particular, a worst-case analysis shows that direct input-to-output connections in threshold nets double the recognition but not the interpolation power, while using sigmoids rather than thresholds allows doubling both. For other measures of classification, including the Vapnik-Chervonenkis dimension, the effect of direct connections or sigmoidal activations is studied in the special case of two-dimensional inputs.