Results 1  10
of
259
Multilayer Feedforward Networks With a Nonpolynomial Activation Function Can Approximate Any Function
, 1993
"... Several researchers characterized the activation fimction under which multilayer feedforward networks can act as universal approximators. We show that most of all the characterizations that were reported thus far in the literature are special cases of the following general result: A standard multila ..."
Abstract

Cited by 128 (2 self)
 Add to MetaCart
Several researchers characterized the activation fimction under which multilayer feedforward networks can act as universal approximators. We show that most of all the characterizations that were reported thus far in the literature are special cases of the following general result: A standard multilayer feedforward network with a locally bounded piecewise continuous activation fimction can approximate an3, continuous function to any degree of accuracy if and only if the network's activation function is not a polynomial. We also emphasize the important role of the threshold, asserting that without it the last theorem does not hold.
Improving Regression Estimation: Averaging Methods for Variance Reduction with Extensions to General Convex Measure Optimization
, 1993
"... ..."
A Bayesian/information theoretic model of learning to learn via multiple task sampling
 Machine Learning
, 1997
"... Abstract. A Bayesian model of learning to learn by sampling from multiple tasks is presented. The multiple tasks are themselves generated by sampling from a distribution over an environment of related tasks. Such an environment is shown to be naturally modelled within a Bayesian context by the conce ..."
Abstract

Cited by 78 (2 self)
 Add to MetaCart
Abstract. A Bayesian model of learning to learn by sampling from multiple tasks is presented. The multiple tasks are themselves generated by sampling from a distribution over an environment of related tasks. Such an environment is shown to be naturally modelled within a Bayesian context by the concept of an objective prior distribution. It is argued that for many common machine learning problems, although in general we do not know the true (objective) prior for the problem, we do have some idea of a set of possible priors to which the true prior belongs. It is shown that under these circumstances a learner can use Bayesian inference to learn the true prior by learning sufficiently many tasks from the environment. In addition, bounds are given on the amount of information required to learn a task when it is simultaneously learnt with several other tasks. The bounds show that if the learner has little knowledge of the true prior, but the dimensionality of the true prior is small, then sampling multiple tasks is highly advantageous. The theory is applied to the problem of learning a common feature set or equivalently a lowdimensionalrepresentation (LDR) for an environment of related tasks.
Constructive Algorithms for Structure Learning in Feedforward Neural Networks for Regression Problems
 IEEE Transactions on Neural Networks
, 1997
"... In this survey paper, we review the constructive algorithms for structure learning in feedforward neural networks for regression problems. The basic idea is to start with a small network, then add hidden units and weights incrementally until a satisfactory solution is found. By formulating the whole ..."
Abstract

Cited by 74 (2 self)
 Add to MetaCart
In this survey paper, we review the constructive algorithms for structure learning in feedforward neural networks for regression problems. The basic idea is to start with a small network, then add hidden units and weights incrementally until a satisfactory solution is found. By formulating the whole problem as a state space search, we first describe the general issues in constructive algorithms, with special emphasis on the search strategy. A taxonomy, based on the differences in the state transition mapping, the training algorithm and the network architecture, is then presented. Keywords Constructive algorithm, structure learning, state space search, dynamic node creation, projection pursuit regression, cascadecorrelation, resourceallocating network, group method of data handling. I. Introduction A. Problems with Fixed Size Networks I N recent years, many neural network models have been proposed for pattern classification, function approximation and regression problems. Among...
Consistent Specification Testing With Nuisance Parameters Present Only Under The Alternative
, 1995
"... . The nonparametric and the nuisance parameter approaches to consistently testing statistical models are both attempts to estimate topological measures of distance between a parametric and a nonparametric fit, and neither dominates in experiments. This topological unification allows us to greatly ex ..."
Abstract

Cited by 66 (11 self)
 Add to MetaCart
. The nonparametric and the nuisance parameter approaches to consistently testing statistical models are both attempts to estimate topological measures of distance between a parametric and a nonparametric fit, and neither dominates in experiments. This topological unification allows us to greatly extend the nuisance parameter approach. How and why the nuisance parameter approach works and how it can be extended bears closely on recent developments in artificial neural networks. Statistical content is provided by viewing specification tests with nuisance parameters as tests of hypotheses about Banachvalued random elements and applying the Banach Central Limit Theorem and Law of Iterated Logarithm, leading to simple procedures that can be used as a guide to when computationally more elaborate procedures may be warranted. 1. Introduction In testing whether or not a parametric statistical model is correctly specified, there are a number of apparently distinct approaches one might take. T...
Feedforward nets for interpolation and classification
 J. Comp. Syst. Sci
, 1992
"... This paper deals with singlehiddenlayer feedforward nets, studying various aspects of classification power and interpolation capability. In particular, a worstcase analysis shows that direct input to output connections in threshold nets double the recognition but not the interpolation power, whil ..."
Abstract

Cited by 65 (16 self)
 Add to MetaCart
(Show Context)
This paper deals with singlehiddenlayer feedforward nets, studying various aspects of classification power and interpolation capability. In particular, a worstcase analysis shows that direct input to output connections in threshold nets double the recognition but not the interpolation power, while using sigmoids rather than thresholds allows doubling both. For other measures of classification, including the VapnikChervonenkis dimension, the effect of direct connections or sigmoidal activations is studied in the special case of twodimensional inputs. 1
Neural networks for classification: a survey
 and Cybernetics  Part C: Applications and Reviews
, 2000
"... Abstract—Classification is one of the most active research and application areas of neural networks. The literature is vast and growing. This paper summarizes the some of the most important developments in neural network classification research. Specifically, the issues of posterior probability esti ..."
Abstract

Cited by 59 (0 self)
 Add to MetaCart
(Show Context)
Abstract—Classification is one of the most active research and application areas of neural networks. The literature is vast and growing. This paper summarizes the some of the most important developments in neural network classification research. Specifically, the issues of posterior probability estimation, the link between neural and conventional classifiers, learning and generalization tradeoff in classification, the feature variable selection, as well as the effect of misclassification costs are examined. Our purpose is to provide a synthesis of the published research in this area and stimulate further research interests and efforts in the identified topics. Index Terms—Bayesian classifier, classification, ensemble methods, feature variable selection, learning and generalization, misclassification costs, neural networks. I.
Approximation theory of the MLP model in neural networks
 ACTA NUMERICA
, 1999
"... In this survey we discuss various approximationtheoretic problems that arise in the multilayer feedforward perceptron (MLP) model in neural networks. Mathematically it is one of the simpler models. Nonetheless the mathematics of this model is not well understood, and many of these problems are appr ..."
Abstract

Cited by 47 (3 self)
 Add to MetaCart
In this survey we discuss various approximationtheoretic problems that arise in the multilayer feedforward perceptron (MLP) model in neural networks. Mathematically it is one of the simpler models. Nonetheless the mathematics of this model is not well understood, and many of these problems are approximationtheoretic in character. Most of the research we will discuss is of very recent vintage. We will report on what has been done and on various unanswered questions. We will not be presenting practical (algorithmic) methods. We will, however, be exploring the capabilities and limitations of this model. In the first
Predicting protein disorder for N, C, and internal regions
 Genome Informatics
, 1999
"... Logistic regression (LR), discriminant analysis (DA), and neural networks (NN) were used to predict ordered and disordered regions in proteins. Training data were from a set of nonredundant Xray crystal structures, with the data being partitioned into Nterminal, Cterminal and internal (I) region ..."
Abstract

Cited by 45 (6 self)
 Add to MetaCart
(Show Context)
Logistic regression (LR), discriminant analysis (DA), and neural networks (NN) were used to predict ordered and disordered regions in proteins. Training data were from a set of nonredundant Xray crystal structures, with the data being partitioned into Nterminal, Cterminal and internal (I) regions. The DA and LR methods gave almost identical 5cross validation accuracies that averaged to the following values: 75.9 ± 3.1 % (Nregions), 70.7 ± 1.5 % (Iregions), and 74.6 ± 4.4 % (Cregions). NN predictions gave slightly higher scores: 78.8 ± 1.2 % (Nregions), 72.5 ± 1.2 % (Iregions), and 75.3 ± 3.3 % (Cregions). Predictions improved with length of the disordered regions. Averaged over the three methods, values ranged from 52 % to 78 % for length = 914 to ≥ 21, respectively, for Iregions, from 72 % to 81 % for length = 5 to 1215, respectively, for Nregions, and from 70 % to 80 % for length = 5 to 1215, respectively, for Cregions. These data support the hypothesis that disorder is encoded by the amino acid sequence. 1
Extreme learning machine: A new learning scheme of feedforward neural networks
 IN PROC. INT. JOINT CONF. NEURAL NETW
, 2006
"... It is clear that the learning speed of feedforward neural networks is in general far slower than required and it has been a major bottleneck in their applications for past decades. Two key reasons behind may be: 1) the slow gradientbased learning algorithms are extensively used to train neural netwo ..."
Abstract

Cited by 41 (13 self)
 Add to MetaCart
(Show Context)
It is clear that the learning speed of feedforward neural networks is in general far slower than required and it has been a major bottleneck in their applications for past decades. Two key reasons behind may be: 1) the slow gradientbased learning algorithms are extensively used to train neural networks, and 2) all the parameters of the networks are tuned iteratively by using such learning algorithms. Unlike these traditional implementations, this paper proposes a new learning algorithm called extreme learning machine (ELM) for singlehidden layer feedforward neural networks (SLFNs) which randomly chooses the input weights and analytically determines the output weights of SLFNs. In theory, this algorithm tends to provide the best generalization performance at extremely fast learning speed. The experimental results based on realworld benchmarking function approximation and classification problems including large complex applications show that the new algorithm can produce best generalization performance in some cases and can learn much faster than traditional popular learning algorithms for feedforward neural networks.