Results 1-10 of 13
Computing Second Derivatives in Feed-Forward Networks: a Review
IEEE Transactions on Neural Networks, 1994
Cited by 36 (4 self)
The calculation of second derivatives is required by recent training and analyses techniques of connectionist networks, such as the elimination of superfluous weights, and the estimation of confidence intervals both for weights and network outputs. We here review and develop exact and approximate algorithms for calculating second derivatives. For networks with |w| weights, simply writing the full matrix of second derivatives requires O(|w|^2) operations. For networks of radial basis units or sigmoid units, exact calculation of the necessary intermediate terms requires of the order of 2h + 2 backward/forward-propagation passes, where h is the number of hidden units in the network. We also review and compare three approximations (ignoring some components of the second derivative, numerical differentiation, and scoring). Our algorithms apply to arbitrary activation functions, networks, and error functions (for instance, with connections that skip layers, or radial basis functions, or ...
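As a rough illustration of the numerical-differentiation approximation named in this abstract, the sketch below estimates the full second-derivative matrix of a loss by central differences. It is not the paper's algorithm: the function `numerical_hessian`, the toy network, its data, and the step size `eps` are all assumptions made for the example. The double loop makes the O(|w|^2) cost of simply filling the matrix visible.

```python
import numpy as np

def numerical_hessian(loss, w, eps=1e-3):
    """Central-difference estimate of the Hessian of `loss` at `w`."""
    n = w.size
    hess = np.zeros((n, n))
    for i in range(n):
        for j in range(i, n):  # fill the upper triangle and mirror it
            e_i = np.zeros(n)
            e_i[i] = eps
            e_j = np.zeros(n)
            e_j[j] = eps
            value = (loss(w + e_i + e_j) - loss(w + e_i - e_j)
                     - loss(w - e_i + e_j) + loss(w - e_i - e_j)) / (4 * eps ** 2)
            hess[i, j] = hess[j, i] = value
    return hess

# Hypothetical toy network: 1 input, 2 tanh hidden units, 1 linear output.
X = np.array([[-1.0], [0.0], [1.0]])
y = np.array([-0.5, 0.0, 0.5])

def network_loss(w):
    hidden = np.tanh(X * w[0:2] + w[2:4])   # shape (3, 2)
    output = hidden @ w[4:6] + w[6]         # shape (3,)
    return 0.5 * np.sum((output - y) ** 2)  # sum-of-squares error

w0 = np.array([0.3, -0.2, 0.1, 0.0, 0.5, -0.4, 0.05])
H = numerical_hessian(network_loss, w0)
print(H.shape)  # one entry per pair of weights
```

Each entry costs four loss evaluations here, so this approach is only practical for very small networks; it is mainly useful as a check on an exact implementation.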
A partitioned neural network approach for vowel classification using smoothed time/frequency features
IEEE Trans. on Speech and Audio Processing, 1999
Cited by 22 (10 self)
A novel pattern classification technique and a new feature extraction method are described and tested for vowel classification. The pattern classification technique partitions an N-way classification task into N*(N-1)/2 two-way classification tasks. Each two-way classification task is performed using a neural network classifier that is trained to discriminate the two members of one pair of categories. Multiple two-way classification decisions are then combined to form an N-way decision. Some of the advantages of the new classification approach include the partitioning of the task allowing independent feature and classifier optimization for each pair of categories, lowered sensitivity of classification performance on network parameters, a reduction in the amount of training data required, and potential for superior performance relative to a single large network. The features described in this paper, closely related to the cepstral coefficients and delta cepstra commonly used in speech analysis, are developed using a unified mathematical framework which allows arbitrary nonlinear frequency, amplitude, and time scales to compactly represent the spectral/temporal characteristics of speech. This classification approach, combined with a feature-ranking algorithm which selected the 35 most discriminative spectral/temporal features for each vowel pair, resulted in 71.5% accuracy for classification of 16 vowels extracted from the TIMIT database. These results, significantly higher than other published results for the same task, illustrate the potential for the methods presented in this paper. EDICS: SA1.6.3, SA1.6.1
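The partitioning described in this abstract can be sketched in a few lines. This is a minimal sketch under stated assumptions: the abstract does not say how the two-way decisions are combined, so majority voting is assumed here, and a nearest-class-mean rule stands in for each trained two-way neural network. The names `train_pairwise` and `predict` and the toy data are hypothetical.

```python
from itertools import combinations
import numpy as np

def train_pairwise(X, labels):
    """Build one stand-in two-way classifier per pair of classes."""
    classes = sorted(set(labels))
    labels = np.asarray(labels)
    models = {}
    for a, b in combinations(classes, 2):
        # Stand-in for a trained two-way network: the two class means.
        models[(a, b)] = (X[labels == a].mean(axis=0),
                          X[labels == b].mean(axis=0))
    return classes, models

def predict(x, classes, models):
    """Combine all two-way decisions by majority vote (an assumption)."""
    votes = {c: 0 for c in classes}
    for (a, b), (mean_a, mean_b) in models.items():
        closer_to_a = np.linalg.norm(x - mean_a) <= np.linalg.norm(x - mean_b)
        votes[a if closer_to_a else b] += 1
    return max(classes, key=lambda c: votes[c])

# Invented toy data: four classes, two points each.
X = np.array([[0, 0], [0, 1], [5, 5], [5, 6],
              [10, 0], [10, 1], [0, 10], [1, 10]], dtype=float)
labels = ["a", "a", "b", "b", "c", "c", "d", "d"]

classes, models = train_pairwise(X, labels)
print(len(models))  # N*(N-1)/2 two-way classifiers for N = 4
print(predict(np.array([5.2, 5.4]), classes, models))
```

Because each pair has its own model, features and classifier settings could in principle be chosen independently per pair, which is the advantage the abstract highlights.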
Cost functions to estimate a posteriori probabilities in multiclass problems
IEEE Trans. Neural Networks, 1999
Cited by 4 (2 self)
The problem of designing cost functions to estimate a posteriori probabilities in multiclass problems is addressed in this paper. We establish necessary and sufficient conditions that these costs must satisfy in one-class one-output networks whose outputs are consistent with probability laws. We focus our attention on a particular subset of the corresponding cost functions; those which verify two usually interesting properties: symmetry and separability (well-known cost functions, such as the quadratic cost or the cross entropy are particular cases in this subset). Finally, we present a universal stochastic gradient learning rule for single-layer networks, in the sense of minimizing a general version of these cost functions for a wide family of nonlinear activation functions. Index Terms: Neural networks, pattern classification, probability estimation.
Improving state-of-the-art continuous speech recognition systems using the N-best paradigm with neural networks
In Proceedings of the DARPA Workshop on Speech and Natural Language, 1992
Cited by 4 (0 self)
In an effort to advance the state of the art in continuous speech recognition employing hidden Markov models (HMM), Segmental Neural Nets (SNN) were introduced recently to ameliorate the well-known limitations of HMMs, namely, the conditional-independence limitation and the relative difficulty with which HMMs can handle segmental features. We describe a hybrid SNN/HMM system that combines the speed and performance of our HMM system with the segmental modeling capabilities of SNNs. The integration of the two acoustic modeling techniques is achieved successfully via the N-best rescoring paradigm. The N-best lists are used not only for recognition, but also during training. This discriminative training using N-best is demonstrated to improve performance. When tested on the DARPA Resource Management speaker-independent corpus, the hybrid SNN/HMM system decreases the error by about 20% compared to the state-of-the-art HMM system.
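The N-best rescoring idea in this abstract can be illustrated with a small sketch. The abstract does not give the combination rule, so a weighted sum of log-scores is assumed here; the `Hypothesis` class, the `rescore` function, the weights, and all scores are invented for the example.

```python
from dataclasses import dataclass

@dataclass
class Hypothesis:
    words: str
    hmm_score: float  # log-score from the HMM system (invented values)
    snn_score: float  # log-score from the segmental neural net (invented values)

def rescore(nbest, hmm_weight=1.0, snn_weight=1.0):
    """Reorder an N-best list by a weighted sum of the two scores (assumed rule)."""
    def combined(h):
        return hmm_weight * h.hmm_score + snn_weight * h.snn_score
    return sorted(nbest, key=combined, reverse=True)

nbest = [
    Hypothesis("show tall ships", hmm_score=-9.5, snn_score=-9.0),
    Hypothesis("show all ships", hmm_score=-10.0, snn_score=-7.0),
    Hypothesis("so all ships", hmm_score=-12.0, snn_score=-6.5),
]

best = rescore(nbest)[0]
print(best.words)
```

In this toy list the HMM alone would rank "show tall ships" first, while the combined score prefers "show all ships"; the HMM only has to produce the candidate list, and the second model is evaluated on those few candidates.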
COMPLEXITY REDUCTION IN NEURAL NETWORKS APPLIED TO TRAFFIC SIGN RECOGNITION TASKS
Cited by 2 (0 self)
This paper deals with the application of Neural Networks (NNs) to the problem of Traffic Sign Recognition (TSR). The NN chosen to implement the TSR system is the Multilayer Perceptron (MLP). Two ways to reduce the computational cost in order to facilitate the real time implementation are proposed. The first one reduces the number of MLP inputs by preprocessing the traffic sign image (blob). Important information is kept during this operation and only the redundancy contained in the blob is removed. The second one looks for neural networks with reduced complexity by selecting a suitable error criterion for training. Two error criteria are studied: the Least Square error (LS) and the Kullback-Leibler error criteria. The best results are obtained using the Kullback-Leibler error criterion.
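To make the two error criteria named in this abstract concrete, the sketch below computes both for a single training pattern. The exact definitions used in the paper are not given in the excerpt, so the standard textbook forms are assumed: a sum of squared differences for LS, and the Kullback-Leibler divergence between the target and output distributions. The function names and the example values are hypothetical.

```python
import numpy as np

def least_squares_error(targets, outputs):
    """Sum of squared differences between targets and network outputs."""
    return float(np.sum((targets - outputs) ** 2))

def kullback_leibler_error(targets, outputs):
    """KL divergence of outputs from targets; zero-target terms contribute 0."""
    mask = targets > 0
    return float(np.sum(targets[mask] * np.log(targets[mask] / outputs[mask])))

# Invented example: one-hot target for class 1 and a softmax-like output.
targets = np.array([0.0, 1.0, 0.0])
outputs = np.array([0.1, 0.8, 0.1])

print(least_squares_error(targets, outputs))
print(kullback_leibler_error(targets, outputs))
```

With a one-hot target, the Kullback-Leibler form reduces to the negative log of the output for the correct class, so it penalizes a confident wrong answer far more heavily than the squared error does.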
Differential Learning Leads to Efficient Neural Network Classifiers
In IEEE Proceedings of the 1993 International Conference on Acoustics, Speech, and Signal Processing, 1992
Cited by 1 (0 self)
We outline a differential theory of learning for statistical pattern classification. When applied to neural networks, the theory leads to an efficient differential learning strategy based on classification figure-of-merit (CFM) objective functions [5]. Differential learning guarantees the highest probability of generalization for a classifier with limited functional complexity, trained with a limited number of examples. The theory is significant for this and two other reasons:
• It proves that the current probabilistic learning strategy for neural network classifiers (employing error measure objective functions such as mean-squared error and the Kullback-Leibler distance measure) is inefficient, and therefore suboptimal.
• It explains why current theoretical estimates of the training sample size and classifier functional complexity needed for generalization are often orders of magnitude higher than the true information/computational resources needed for the classification task.
W...
parameters such as cr and A, that is Pr(lw , or) = Pr(). Posterior probabilities of network weights are as follows. For regression with Gaussian error and unknown a,
This paper has covered Bayesian theory relevant to the problem of training feedforward connectionist networks. We now sketch out how this might be put together in practice, assuming a standard gradient descent algorithm as used during search
DETECTION IN BREAST CANCER DIAGNOSIS
Neural networks (NNs) are customarily used as classifiers aimed at minimizing classification error rates. However, it is known that the NN architectures that compute soft decisions can be used to estimate posterior class probabilities; sometimes, it could be useful to implement general decision rules other than the maximum a posteriori
Wave Solder Process Control Modeling Using A Neural Network Approach
In Intelligent Engineering Systems Through Artificial Neural Networks, 1994
We discuss the formulation and results of a simple backpropagation approach to the control of wave soldering of printed circuit cards. Small lot sizes and a large number of different circuit card designs have complicated selection of the tunable process settings at the large manufacturer we worked with. Use of a neural network predictive model results in improved precision relative to the currently used multivariate linear model.

INTRODUCTION
The wave solder process involves (1) preheating, (2) fluxing, (3) soldering using a wave of solder, (4) cleaning, and (5) quality control. The process must be adapted according to the design (mass, size, component density, component type, etc.) of the circuit card to optimize quality, i.e. minimize solder connection defects. Process parameters which are controllable are the preheat temperatures and the line speed. Circuit card manufacturers produce products of great diversity in small lot sizes, compounding the selection of good process...