Results 11–20 of 38
Nonparametric Regression with Correlated Errors
 Statistical Science
, 2000
Abstract

Cited by 28 (8 self)
Nonparametric regression techniques are often sensitive to the presence of correlation in the errors. The practical consequences of this sensitivity are explained, including the breakdown of several popular data-driven smoothing parameter selection methods. We review the existing literature in kernel regression, smoothing splines and wavelet regression under correlation, both for short-range and long-range dependence. Extensions to random design, higher-dimensional models and adaptive estimation are discussed.
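As an illustrative sketch (not code from the paper), the kind of data-driven bandwidth selector whose breakdown the abstract describes can be written in a few lines. The Nadaraya-Watson estimator and leave-one-out cross-validation score below are standard; the independence assumption lives in the squared-error sum.

```python
import math

def nadaraya_watson(xs, ys, x, h):
    """Gaussian-kernel (Nadaraya-Watson) regression estimate at x with bandwidth h."""
    w = [math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in xs]
    s = sum(w)
    return sum(wi * yi for wi, yi in zip(w, ys)) / s

def loocv(xs, ys, h):
    """Leave-one-out cross-validation score for bandwidth h.
    Treats the errors as independent -- exactly the assumption that fails
    under correlation, which is why CV then favours far-too-small bandwidths."""
    err = 0.0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        err += (ys[i] - nadaraya_watson(rest_x, rest_y, xs[i], h)) ** 2
    return err / len(xs)
```

Under positively correlated errors, neighbouring responses "vote" for each other in the leave-one-out sum, so `loocv` rewards bandwidths small enough to track the noise rather than the mean function.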
Computing Second Derivatives in Feed-Forward Networks: A Review
 IEEE Transactions on Neural Networks
, 1994
Abstract

Cited by 27 (4 self)
The calculation of second derivatives is required by recent training and analysis techniques for connectionist networks, such as the elimination of superfluous weights and the estimation of confidence intervals both for weights and network outputs. We here review and develop exact and approximate algorithms for calculating second derivatives. For networks with |w| weights, simply writing the full matrix of second derivatives requires O(|w|^2) operations. For networks of radial basis units or sigmoid units, exact calculation of the necessary intermediate terms requires on the order of 2h + 2 backward/forward-propagation passes, where h is the number of hidden units in the network. We also review and compare three approximations (ignoring some components of the second derivative, numerical differentiation, and scoring). Our algorithms apply to arbitrary activation functions, networks, and error functions (for instance, with connections that skip layers, or radial basis functions, or ...
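One of the three approximations the abstract compares, numerical differentiation, is easy to sketch. The finite-difference Hessian below is a generic illustration (not the paper's algorithm) and also makes the O(|w|^2) cost of writing out the full matrix concrete:

```python
def hessian_fd(f, w, eps=1e-4):
    """All |w|^2 entries of the Hessian of f at weight vector w, by central
    finite differences -- the nested loop is the O(|w|^2) cost made visible."""
    n = len(w)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            wpp, wpm, wmp, wmm = (list(w) for _ in range(4))
            wpp[i] += eps; wpp[j] += eps
            wpm[i] += eps; wpm[j] -= eps
            wmp[i] -= eps; wmp[j] += eps
            wmm[i] -= eps; wmm[j] -= eps
            H[i][j] = (f(wpp) - f(wpm) - f(wmp) + f(wmm)) / (4.0 * eps * eps)
    return H
```

Here `f` stands for the network's error as a function of the weight vector; each entry costs four evaluations, which is why numerical differentiation is usually only a check on the exact propagation-based algorithms.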
Constructive Feedforward Neural Networks for Regression Problems: A Survey
, 1995
Abstract

Cited by 21 (0 self)
In this paper, we review the procedures for constructing feedforward neural networks in regression problems. While standard backpropagation performs gradient descent only in the weight space of a network with fixed topology, constructive procedures start with a small network and then grow additional hidden units and weights until a satisfactory solution is found. The constructive procedures are categorized according to the resultant network architecture and the learning algorithm for the network weights.
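The growth loop common to these constructive procedures can be sketched with deliberately simple "hidden units" (threshold units fitted to the current residuals; a toy stand-in for the units the surveyed algorithms actually grow):

```python
def fit_constructive(xs, ys, tol=1e-6, max_units=20):
    """Grow threshold units one at a time, each fitted to the current
    residuals, until the training error is satisfactory."""
    units, res = [], list(ys)
    for _ in range(max_units):
        if sum(r * r for r in res) / len(res) <= tol:
            break  # satisfactory solution found; stop growing
        best = None
        for t in xs:  # candidate split points
            left = [r for x, r in zip(xs, res) if x <= t]
            right = [r for x, r in zip(xs, res) if x > t]
            c1 = sum(left) / len(left) if left else 0.0
            c2 = sum(right) / len(right) if right else 0.0
            sse = (sum((r - c1) ** 2 for r in left)
                   + sum((r - c2) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, t, c1, c2)
        _, t, c1, c2 = best
        units.append((t, c1, c2))
        res = [r - (c1 if x <= t else c2) for x, r in zip(xs, res)]
    return units

def predict_units(units, x):
    """Sum the outputs of the grown units."""
    return sum(c1 if x <= t else c2 for t, c1, c2 in units)
```

The contrast with fixed-topology backpropagation is in the outer loop: the network's size is itself a training variable, grown greedily rather than fixed in advance.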
Combining Exploratory Projection Pursuit And Projection Pursuit Regression With Application To Neural Networks
 Neural Computation
, 1992
Abstract

Cited by 17 (9 self)
We present a novel classification and regression method that combines exploratory projection pursuit (unsupervised training) with projection pursuit regression (supervised training) to yield a new family of cost/complexity penalty terms. Some improved generalization properties are demonstrated on real-world problems. Parameter estimation becomes difficult in high-dimensional spaces due to the increasing sparseness of the data. Therefore, when a low-dimensional representation is embedded in the data, dimensionality-reduction methods become useful. One such method, projection pursuit regression (PPR) (Friedman and Stuetzle, 1981), is capable of performing dimensionality reduction by composition; namely, it constructs an approximation to the desired response function using a composition of lower-dimensional smooth functions. These functions depend on low-dimensional projections of the data. When the dimensionality of the problem is in the thousands, even projection ...
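The PPR model itself, an additive composition of one-dimensional ridge functions of linear projections, is compact to state in code (an illustrative sketch; fitting the directions and the smooths is the substantial part the paper builds on):

```python
import math

def ppr_predict(x, directions, ridge_funcs):
    """Projection pursuit regression model: a sum of smooth one-dimensional
    ridge functions g_m, each applied to a linear projection a_m . x."""
    return sum(g(sum(a_k * x_k for a_k, x_k in zip(a, x)))
               for a, g in zip(directions, ridge_funcs))

# e.g. two ridge functions: a squared projection plus a tanh projection
model = lambda x: ppr_predict(x, [[1.0, 0.0], [0.0, 1.0]],
                              [lambda t: t * t, math.tanh])
```

Because each ridge function sees only a one-dimensional projection, estimation stays tractable even when the ambient dimension is large, which is the property the paper exploits.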
A Survey of Literature on Function Decomposition - Version IV
, 1995
Abstract

Cited by 10 (0 self)
This report surveys the literature on decomposition of binary, multiple-valued, and fuzzy functions. It also gives references to relevant basic logic synthesis papers that concern topics important for decomposition, such as the representation of Boolean functions or the symmetry of Boolean functions. From an analysis of the most successful decomposition programs for Ashenhurst-Curtis decomposition, several conclusions are derived that should make it possible to create a new program able to outperform all existing approaches to decomposition. Creating such a superior program is necessary to make it practically useful for applications of interest to the Pattern Theory group at the Avionics Labs of Wright Laboratories. In addition, the program will also be able to solve problems that have never been formulated before. It will be a testbed for developing and comparing several known and new partial ideas related to decomposition. Our emphasis is on the following topics:
1. representation of data and efficient algorithms for data manipulation;
2. variable-ordering methods for variable partitioning to create bound and free sets of input variables, with heuristic approaches and their comparison;
3. the column compatibility problem;
4. the subfunction encoding problem;
5. the use of partial and total symmetries in the data to decrease the decomposition search space;
6. methods for dealing with strongly unspecified functions, which are typical of machine-learning applications;
7. special types of decomposition that can be handled efficiently (cascades, trees without variable repetition).
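The Ashenhurst-Curtis test at the heart of such programs can be sketched via the decomposition chart: for a chosen bound/free partition of the inputs, count the distinct chart columns (a standard formulation; the code below is illustrative, not from the report):

```python
from itertools import product

def column_multiplicity(f, n, bound):
    """Build the decomposition chart of an n-input Boolean function f:
    one column per assignment to the bound variables, one row per
    assignment to the free variables. Return the number of distinct columns."""
    free = [i for i in range(n) if i not in bound]
    cols = set()
    for b in product([0, 1], repeat=len(bound)):
        col = []
        for fr in product([0, 1], repeat=len(free)):
            x = [0] * n
            for i, v in zip(bound, b):
                x[i] = v
            for i, v in zip(free, fr):
                x[i] = v
            col.append(f(x))
        cols.add(tuple(col))
    return len(cols)
```

A column multiplicity of at most 2 means f decomposes as f = h(g(bound), free) with a single-output g. For instance, (x0 AND x1) XOR x2 passes the test for bound = {x0, x1} (g = x0 AND x1), while the 3-input majority function fails it for the same partition.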
Connectionist Speaker Normalization with Generalized Resource Allocating Networks
 Advances in Neural Information Processing Systems 7
, 1995
Abstract

Cited by 7 (2 self)
The paper presents a rapid speaker-normalization technique based on neural network spectral mapping. The neural network is used as a front-end of a continuous speech recognition system (speaker-dependent, HMM-based) to normalize the input acoustic data from a new speaker. The spectral difference between speakers can be reduced using a limited amount of new acoustic data (40 phonetically rich sentences). The recognition error on phone units from the acoustic-phonetic continuous speech corpus APASCI is decreased with an adaptability ratio of 25%. We used local basis networks of elliptical Gaussian kernels, with recursive allocation of units and on-line optimization of parameters (the GRAN model). For this application, the model included a linear term. The results compare favorably with multivariate linear mapping based on constrained orthonormal transformations. Speaker normalization methods are designed to minimize inter-speaker variations, one of the principal error sources in a ...
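The model family described, local Gaussian basis functions plus a linear term, can be sketched as follows (the function and parameter names here are illustrative assumptions, not the paper's notation, and the kernels are axis-aligned for simplicity):

```python
import math

def gran_predict(x, centers, widths, coefs, linear_w, bias):
    """Output = bias + linear term + weighted sum of elliptical Gaussian
    kernels, one per allocated unit; the GRAN model additionally allocates
    units and adapts these parameters on-line."""
    y = bias + sum(wd * xd for wd, xd in zip(linear_w, x))
    for mu, s, c in zip(centers, widths, coefs):
        d2 = sum(((xd - md) / sd) ** 2 for xd, md, sd in zip(x, mu, s))
        y += c * math.exp(-0.5 * d2)
    return y
```

The linear term gives the mapping a sensible global behaviour from the start, while the local kernels correct the spectral map only where the new speaker's data actually falls.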
Adaptive Estimation in Pattern Recognition by Combining Different Procedures
 Statistica Sinica
Abstract

Cited by 6 (3 self)
We study a problem of adaptive estimation of a conditional probability function in a pattern recognition setting. In many applications, for more flexibility, one may want to consider various estimation procedures targeted at different scenarios and/or under different assumptions. For example, when the feature dimension is high, to overcome the familiar curse of dimensionality, one may seek a good parsimonious model among a number of candidates such as CART, neural nets, additive models, and others. For such a situation, one wishes to have an automated final procedure that always performs as well as the best candidate. In this work, we propose a method to combine a countable collection of procedures for estimating the conditional probability. We show that the combined procedure has the property that its statistical risk is bounded above by that of any of the procedures being considered plus a small penalty. Thus, in an asymptotic sense, the strengths of the different estimation procedures ...
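A standard way to realize such a combination, shown here as an illustrative sketch rather than the paper's actual scheme, is exponential weighting of the candidate procedures by their cumulative losses:

```python
import math

def combine_predictions(preds, cum_losses, eta=1.0):
    """Mix candidate procedures with weights proportional to
    exp(-eta * cumulative loss); the mixture's risk tracks the best
    candidate's up to a small additive penalty."""
    ws = [math.exp(-eta * L) for L in cum_losses]
    z = sum(ws)
    ws = [wk / z for wk in ws]
    mixed = sum(wk * p for wk, p in zip(ws, preds))
    return mixed, ws
```

As data accumulate, the weights concentrate on whichever candidate has been losing least, which is the mechanism behind "as well as the best candidate plus a small penalty" guarantees.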
Neural networks and logistic regression - Part I
, 1996
Abstract

Cited by 5 (1 self)
Feedforward neural networks have recently been applied in situations where an analysis based on the logistic regression model would have been the standard statistical approach; direct comparisons of results, however, are seldom attempted. We therefore present a comparative investigation of both logistic regression models and feedforward neural networks, including some extensions. The theoretical features and properties are reviewed and illustrated in two examples, with a discussion of practical problems arising in their application. In Part II of the paper, some further important aspects of approximation, overfitting and model selection are investigated in more detail, both analytically and by means of simulation studies.
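The structural link such comparisons rest on is that logistic regression is a feed-forward network with no hidden layer; a minimal sketch (illustrative, not the paper's setup):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Single sigmoid output unit, cross-entropy loss, stochastic gradient
    descent: logistic regression dressed as the simplest feed-forward net."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            g = p - yi  # gradient of cross-entropy w.r.t. the logit
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b

def predict_logit(w, b, xi):
    return sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
```

Adding a hidden layer of sigmoid units between input and output turns this into the feedforward networks the paper compares against; the loss and output unit stay the same.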
Discrimination and Classification
, 1995
Abstract

Cited by 3 (3 self)
The aim of this report is to present methods from statistics, neural networks, nonparametric regression and pattern recognition for performing discrimination and classification. The methods are compared on theoretical and empirical grounds to highlight strengths and weaknesses. A common platform for classification is also outlined. The emphasis is on multiple (more than two) classes. Some keywords: Supervised Classification; Discriminant Analysis; Multiple Classes; Multilayer Perceptrons (MLP). Contents: 1 Introduction; 2 Classification; 2.1 Decision Theoretic Framework; 2.2 Allocation Principles; 2.3 Discriminant Functions; 2.4 Constructing Classifiers; 2.5 Evaluation Principles ...
Robust Interpretation of Neural-Network Models
, 1997
Abstract

Cited by 3 (1 self)
Artificial neural networks seem very promising for regression and classification, especially for large covariate spaces. These methods represent a nonlinear function as a composition of low-dimensional ridge functions and therefore appear to be less sensitive to the dimensionality of the covariate space. However, due to the non-uniqueness of a global minimum and the existence of (possibly) many local minima, the model revealed by the network is not stable. We introduce a method for interpreting neural network results which uses novel robustification techniques. This results in a robust interpretation of the model employed by the network. Simulated data from known models are used to demonstrate the interpretability results and the effects of different regularization methods on the robustness of the model. Graphical methods are introduced to present the interpretation results. We further demonstrate how interactions between covariates can be revealed. From this study we conclude ...