Ensemble Methods in Machine Learning
- Multiple Classifier Systems, LNCS 1857, 2000
Cited by 339 (2 self)
Ensemble methods are learning algorithms that construct a set of classifiers and then classify new data points by taking a (weighted) vote of their predictions. The original ensemble method is Bayesian averaging, but more recent algorithms include error-correcting output coding, Bagging, and boosting. This paper reviews these methods and explains why ensembles can often perform better than any single classifier. Some previous studies comparing ensemble methods are reviewed, and some new experiments are presented to uncover the reasons that Adaboost does not overfit rapidly.
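The voting step described in this abstract can be illustrated with a minimal sketch. It assumes each classifier has already produced a label for the data point; the function name `weighted_vote` and the tie-breaking rule are illustrative choices, not taken from the paper.

```python
from collections import defaultdict


def weighted_vote(predictions, weights=None):
    """Return the label with the largest total weight.

    predictions: one predicted label per classifier in the ensemble.
    weights: optional per-classifier weights; defaults to an unweighted vote.
    """
    if weights is None:
        weights = [1.0] * len(predictions)
    if len(weights) != len(predictions):
        raise ValueError("need exactly one weight per prediction")

    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight

    # Assumption: ties go to the label that was seen first.
    return max(totals, key=totals.get)
```

With equal weights this reduces to a plain majority vote; unequal weights let a single accurate classifier outvote several weaker ones.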
A nonparametric approach to pricing and hedging derivative securities via learning networks
- Journal of Finance, 1994
Cited by 84 (4 self)
Improving Regression Estimation: Averaging Methods for Variance Reduction with Extensions to General Convex Measure Optimization
- 1993
Best practices for convolutional neural networks applied to visual document analysis
- In: Int’l Conference on Document Analysis and Recognition, 2003
Cited by 65 (7 self)
Neural networks are a powerful technology for classification of visual inputs arising from documents. However, there is a confusing plethora of different neural network methods that are used in the literature and in industry. This paper describes a set of concrete best practices that document analysis researchers can use to get good results with neural networks. The most important practice is getting a training set as large as possible: we expand the training set by adding a new form of distorted data. The next most important practice is that convolutional neural networks are better suited for visual document tasks than fully connected networks. We propose that a simple “do-it-yourself” implementation of convolution with a flexible architecture is suitable for many visual document problems. This simple convolutional neural network does not require complex methods, such as momentum, weight decay, structure-dependent learning rates, averaging layers, tangent prop, or even finely tuning the architecture. The end result is a very simple yet general architecture which can yield state-of-the-art performance for document analysis. We illustrate our claims on the MNIST set of English digit images.
Efficient Agnostic Learning of Neural Networks with Bounded Fan-in
- 1996
Cited by 57 (18 self)
We show that the class of two layer neural networks with bounded fan-in is efficiently learnable in a realistic extension to the Probably Approximately Correct (PAC) learning model. In this model, a joint probability distribution is assumed to exist on the observations and the learner is required to approximate the neural network which minimizes the expected quadratic error. As special cases, the model allows learning real-valued functions with bounded noise, learning probabilistic concepts and learning the best approximation to a target function that cannot be well approximated by the neural network. The networks we consider have real-valued inputs and outputs, an unlimited number of threshold hidden units with bounded fan-in, and a bound on the sum of the absolute values of the output weights. The number of computation...
This work was supported by the Australian Research Council and the Australian Telecommunications and Electronics Research Board. The material in this paper was pres...
On Learning the Derivatives of an Unknown Mapping with Multilayer Feedforward Networks
- 1989
Cited by 49 (5 self)
Recently, multiple input, single output, single hidden layer, feedforward neural networks have been shown to be capable of approximating a nonlinear map and its partial derivatives. Specifically, neural nets have been shown to be dense in various Sobolev spaces (Hornik, Stinchcombe and White, 1989). Building upon this result, we show that a net can be trained so that the map and its derivatives are learned. Specifically, we use a result of Gallant (1987b) to show that least squares and similar estimates are strongly consistent in Sobolev norm provided the number of hidden units and the size of the training set increase together. We illustrate these results by an application to the inverse problem of chaotic dynamics: recovery of a nonlinear map from a time series of iterates. These results extend automatically to nets that embed the single hidden layer, feedforward network as a special case.
The Design and Evolution of Modular Neural Network Architectures
- Neural Networks, 1994
Cited by 44 (0 self)
To investigate the relations between structure and function in both artificial and natural neural networks, we present a series of simulations and analyses with modular neural networks. We suggest a number of design principles in the form of explicit ways in which neural modules can cooperate in recognition tasks. These results may supplement recent accounts of the relation between structure and function in the brain. The networks used consist of several modules, standard subnetworks that serve as higher-order units with a distinct structure and function. The simulations rely on a particular network module called CALM (Murre, Phaf, and Wolters, 1989, 1992). This module, developed mainly for unsupervised categorization and learning, is able to adjust its local learning dynamics. The way in which modules are interconnected is an important determinant of the learning and categorization behaviour of the network as a whole. Based on arguments derived from neuroscience, psychology, compu...
Approximation theory of the MLP model in neural networks
- Acta Numerica, 1999
Cited by 30 (3 self)
In this survey we discuss various approximation-theoretic problems that arise in the multilayer feedforward perceptron (MLP) model in neural networks. Mathematically it is one of the simpler models. Nonetheless the mathematics of this model is not well understood, and many of these problems are approximation-theoretic in character. Most of the research we will discuss is of very recent vintage. We will report on what has been done and on various unanswered questions. We will not be presenting practical (algorithmic) methods. We will, however, be exploring the capabilities and limitations of this model. In the first
Approximating a Function and its Derivatives Using MSE-Optimal Linear Combinations of Trained Feedforward Neural Networks
- In Proceedings of the Joint Conference on Neural Networks, 1993
Cited by 20 (2 self)
In this paper, we show that using MSE-optimal linear combinations of a set of trained feedforward networks may significantly improve the accuracy of approximating a function and its first and second order derivatives. Our results are compared to the accuracies achieved by the single best network and by the simple averaging of the outputs of the trained networks.
1 Introduction
Feedforward neural networks (FNN) are widely used for function approximation. They are considered universal approximators capable of approximating an unknown mapping and its derivatives arbitrarily well (Hornik et al. 1990). Approximating the derivatives, that is the derivatives of the output with respect to the inputs, is of significant importance in many applications. For example in process optimization, the first and second order derivatives obtained from a neural network, which was trained on the process response, may be used in approximating the gradient vector and the Hessian matrix of the process response...
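The comparison between an MSE-optimal linear combination and simple averaging can be sketched as an ordinary least-squares fit over the outputs of the trained models. This is a hedged illustration, not the paper's procedure: it assumes an unconstrained combination with no constant term, and the helper names (`solve_linear_system`, `mse_optimal_weights`, `combine`, `mse`) are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("model outputs are (nearly) collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def mse_optimal_weights(outputs, targets):
    """Least-squares combination weights.

    outputs[j][i] is the output of model j on sample i.
    Solves the normal equations (Y Y^T) w = Y t.
    """
    k = len(outputs)
    gram = [[sum(p * q for p, q in zip(outputs[r], outputs[c]))
             for c in range(k)] for r in range(k)]
    rhs = [sum(p * t for p, t in zip(outputs[r], targets)) for r in range(k)]
    return solve_linear_system(gram, rhs)


def combine(outputs, weights):
    """Weighted sum of the model outputs, sample by sample."""
    n = len(outputs[0])
    return [sum(w * out[i] for w, out in zip(weights, outputs))
            for i in range(n)]


def mse(predictions, targets):
    """Mean squared error between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)
```

Because simple averaging and selecting a single model are both particular weight vectors, the fitted combination cannot do worse than either on the data used to fit the weights; performance on held-out data is a separate question.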
Nonparametric Estimation of Conditional Quantiles Using Neural Networks
- 1990
Cited by 20 (2 self)
We establish the consistency of nonparametric conditional quantile estimators based on artificial neural networks. The results follow from general results on sieve estimation for dependent processes. We also show that conditional quantiles can be learned to any pre-specified accuracy using approximate rather than exact network optimization.
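As a hedged illustration of what it means to learn a quantile, the sketch below uses the standard check (pinball) loss, whose minimizer over a constant prediction is a sample quantile. The abstract excerpt does not state which objective the paper uses, so treat the loss and the names `pinball_loss` and `best_constant_quantile` as assumptions; a network-based estimator would minimize the same kind of loss over network outputs instead of a constant.

```python
def pinball_loss(y, q, tau):
    """Check loss for one observation y and a predicted tau-quantile q."""
    u = y - q
    return u * tau if u >= 0 else u * (tau - 1.0)


def best_constant_quantile(data, tau):
    """Pick the data value that minimizes the total pinball loss.

    A minimizer of the total loss always exists among the data values,
    so searching over them is enough for this small illustration.
    """
    def total_loss(q):
        return sum(pinball_loss(y, q, tau) for y in data)

    return min(data, key=total_loss)
```

With `tau = 0.5` the loss is half the absolute error, so the minimizer is a median and is insensitive to a single large outlier.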

