Results 1-10 of 28
Neural networks for classification: a survey
 IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews
, 2000
Abstract

Cited by 50 (0 self)
Abstract—Classification is one of the most active research and application areas of neural networks. The literature is vast and growing. This paper summarizes some of the most important developments in neural network classification research. Specifically, the issues of posterior probability estimation, the link between neural and conventional classifiers, the learning and generalization tradeoff in classification, feature variable selection, and the effect of misclassification costs are examined. Our purpose is to provide a synthesis of the published research in this area and to stimulate further research interest and effort in the identified topics. Index Terms—Bayesian classifier, classification, ensemble methods, feature variable selection, learning and generalization, misclassification costs, neural networks.
Neural-Network Based Measures of Confidence for Word Recognition
 in Proc. ICASSP
, 1997
Abstract

Cited by 48 (5 self)
This paper proposes a probabilistic framework to define and evaluate confidence measures for word recognition. We describe a novel method to combine different knowledge sources and estimate the confidence in a word hypothesis via a neural network. We also propose a measure of the joint performance of the recognition and confidence systems. The definitions and algorithms are illustrated with results on the Switchboard Corpus. 1. INTRODUCTION In the last few years, a lot of research has been devoted to the development of confidence scores associated with the outputs of automatic speech recognition (ASR) systems. These scores were used mostly to help spot keywords in spontaneous or read texts, and to provide a basis for the rejection of out-of-vocabulary words (e.g. [4-11]). Many other ASR applications could also benefit from knowing the level of confidence in correct recognition. For example, text-dependent speaker recognition systems could put more emphasis on words recognized with h...
A Global Optimization Technique for Statistical Classifier Design
 IEEE Transactions on Signal Processing
Abstract

Cited by 25 (9 self)
A global optimization method is introduced for the design of statistical classifiers that minimize the rate of misclassification. We first derive the theoretical basis for the method, based on which we develop a novel design algorithm and demonstrate its effectiveness and superior performance in the design of practical classifiers for some of the most popular structures currently in use. The method, grounded in ideas from statistical physics and information theory, extends the deterministic annealing approach for optimization, both to incorporate structural constraints on data assignments to classes and to minimize the probability of error as the cost objective. During the design, data are assigned to classes in probability, so as to minimize the expected classification error given a specified level of randomness, as measured by Shannon's entropy. The constrained optimization is equivalent to a free energy minimization, motivating a deterministic annealing approach in which the entropy...
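The assignment step described above — class memberships chosen in probability so as to minimize expected cost at a fixed entropy level — reduces to a Gibbs distribution at a given temperature. A minimal numerical sketch follows; the function name and the toy cost matrix are illustrative, not taken from the paper:

```python
import numpy as np

def gibbs_assignments(costs, T):
    """Probabilistic class assignments at temperature T.

    costs: (n_samples, n_classes) array of per-class costs d(x, c).
    Returns p(c|x) proportional to exp(-d(x, c) / T), the distribution
    that minimizes expected cost minus T times Shannon entropy
    (the free energy) for each sample.
    """
    logits = -costs / T
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

# At high temperature the assignments are nearly uniform (maximum entropy);
# as T -> 0 they harden into the minimum-cost (nearest-class) assignment.
costs = np.array([[0.2, 1.0, 3.0]])
print(gibbs_assignments(costs, T=100.0))  # close to uniform [1/3, 1/3, 1/3]
print(gibbs_assignments(costs, T=0.01))   # close to the hard assignment [1, 0, 0]
```

Annealing then consists of tracking these assignments while the temperature is gradually lowered.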
Modular Neural Networks for MAP Classification of Time Series and the Partition Algorithm
, 1996
Abstract

Cited by 13 (7 self)
We apply the Partition Algorithm to the problem of time series classification. We assume that the source that generates the time series belongs to a finite set of candidate sources. Classification is based on the computation of posterior probabilities. Prediction error is used to adaptively update the posterior probability of each source. The algorithm is implemented by a hierarchical, modular, recurrent network. The bottom (partition) level of the network consists of neural modules, each one trained to predict the output of one candidate source. The top (decision) level consists of a decision module, which computes posterior probabilities and classifies the time series to the source of maximum posterior probability. The classifier network is formed from the composition of the partition and decision levels. This method applies to deterministic as well as probabilistic time series. Source switching can also be accommodated. We give some examples of application to problems of signal detection, phoneme and enzyme classification. In conclusion, the algorithm presented here gives a systematic method for the design of modular classification networks. The method can be extended by various choices of the partition and decision components.
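The adaptive posterior update driven by prediction error can be sketched as a recursive Bayes rule: each candidate source's posterior is reweighted by the likelihood of its predictor's error. The Gaussian error model and the toy error sequences below are assumptions for illustration, not taken from the paper:

```python
import numpy as np

def update_posteriors(priors, errors, sigma=1.0):
    """One recursive Bayes step: reweight each candidate source's
    posterior by the Gaussian likelihood of its predictor's error,
    then renormalize."""
    likelihood = np.exp(-np.asarray(errors) ** 2 / (2.0 * sigma ** 2))
    post = np.asarray(priors) * likelihood
    return post / post.sum()

# Two candidate source models; source 0 predicts the series well
# (small errors), so its posterior climbs toward 1 over time.
p = np.array([0.5, 0.5])
for e0, e1 in [(0.1, 1.5), (0.2, 1.2), (0.05, 1.8)]:
    p = update_posteriors(p, [e0, e1])
print(p)   # posterior mass concentrates on source 0
```

The decision module would then classify the series to `argmax` of these posteriors.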
Efficient Training of Feed-Forward Neural Networks
, 1997
Abstract

Cited by 12 (0 self)
A.2 Introduction
A.2.1 Motivation
A.3 Optimization strategy
A.4 The Backpropagation algorithm
A.5 Conjugate direction methods
A.5.1 Conjugate gradients
A.5.2 The CGL algorithm
A.5.3 The BFGS algorithm
A.6 The SCG algorithm
A.7 Test results
A.7.1 Comparison metric...
Application of Dempster-Shafer theory in condition monitoring applications: A case study
 Pattern Recognition Letters
, 2001
Abstract

Cited by 5 (1 self)
This paper is concerned with the use of Dempster-Shafer theory in 'fusion' classifiers. We argue that the use of predictive accuracy for basic probability assignments can improve the overall system performance when compared to 'traditional' mass assignment techniques. We demonstrate the effectiveness of this approach in a case study involving the detection of static thermostatic valve faults in a diesel engine cooling system.
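Dempster's rule of combination, the core operation in such 'fusion' classifiers, multiplies masses over intersecting hypotheses and renormalizes away the conflicting mass. A minimal sketch follows; the fault/no-fault frame and the mass values are hypothetical, not from the case study:

```python
def dempster_combine(m1, m2):
    """Dempster's rule of combination for two mass functions over
    subsets of a frame of discernment (subsets given as frozensets)."""
    combined = {}
    conflict = 0.0
    for a, ma in m1.items():
        for b, mb in m2.items():
            inter = a & b
            if inter:   # intersecting hypotheses reinforce each other
                combined[inter] = combined.get(inter, 0.0) + ma * mb
            else:       # disjoint hypotheses contribute conflict mass
                conflict += ma * mb
    if conflict >= 1.0:
        raise ValueError("total conflict: masses cannot be combined")
    norm = 1.0 - conflict
    return {s: v / norm for s, v in combined.items()}

# Two classifiers assign mass over a hypothetical fault / no-fault frame.
FAULT, OK = frozenset({"fault"}), frozenset({"ok"})
THETA = FAULT | OK                        # total ignorance
m1 = {FAULT: 0.7, THETA: 0.3}
m2 = {FAULT: 0.6, OK: 0.1, THETA: 0.3}
fused = dempster_combine(m1, m2)
print(fused)   # fused belief in FAULT exceeds either source alone
```

The paper's proposal amounts to choosing the input masses (e.g. from each classifier's predictive accuracy) rather than changing this combination step.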
Text-Independent Speaker Verification over a Telephone Network by Radial Basis Function Networks
 National Tsing Hua University
, 1996
Abstract

Cited by 4 (3 self)
This paper presents several text-independent speaker verification experiments based on Radial Basis Function (RBF) and Elliptical Basis Function (EBF) networks. The experiments involve 76 speakers from dialect region 2 of the TIMIT and NTIMIT databases. Each speaker was modelled by a 12-input, 2-output network in which one output represents the speaker class while the other represents the anti-speaker class. The results show that both RBF and EBF networks are very robust in detecting impostors for clean speech, with the EBF networks being significantly better than the RBF networks in this respect. For clean speech, a false acceptance rate of 0.06% and a false rejection rate of 0.19% have been achieved. However, for telephone speech, the false acceptance rate and the false rejection rate are increased to 11.7% and 8.71%, respectively. It is concluded that better preprocessing techniques are required to reduce the effects of noise and channel variations. 1. INTRODUCTION In recent years...
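The false acceptance and false rejection rates quoted above are both functions of the decision threshold on the verification score. A minimal sketch of how the two rates are computed; the scores and threshold here are hypothetical, not from the paper:

```python
def far_frr(impostor_scores, genuine_scores, threshold):
    """False acceptance rate (impostors scoring at or above threshold)
    and false rejection rate (genuine speakers scoring below it)."""
    fa = sum(s >= threshold for s in impostor_scores)
    fr = sum(s < threshold for s in genuine_scores)
    return fa / len(impostor_scores), fr / len(genuine_scores)

# Hypothetical verification scores in [0, 1]; moving the threshold
# trades one error rate against the other, which is why noisy
# telephone speech degrades both rates at any fixed operating point.
impostors = [0.1, 0.2, 0.35, 0.6, 0.15]
genuine = [0.8, 0.9, 0.55, 0.7]
print(far_frr(impostors, genuine, threshold=0.5))  # (0.2, 0.0)
```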
Speech Processing with Linear and Neural Network Models
, 1996
Abstract

Cited by 4 (0 self)
ion, for imposing continuity between models of adjacent speech segments, and learning rate adaptation, for improving backpropagation training, are discussed. For synthesising real speech utterances, an audio tape demonstrates that ARX models produce the highest quality synthetic speech and that the quality is maintained when pitch modifications are applied. The second part of the dissertation studies the operation of recurrent neural networks in classifying patterns of correlated feature vectors. Such patterns are typical of speech classification tasks. The operation of a hidden node with a recurrent connection is explained in terms of a decision boundary which changes position in feature space. The feedback is shown to delay switching from one class to another and to smooth output decisions for sequences of feature vectors from the same class. For networks trained with constant class targets, a sequence of feature vectors from the same class tends to drive the operation of hidden nod
Cost functions to estimate a posteriori probabilities in multiclass problems
 IEEE Trans. Neural Networks
, 1999
Abstract

Cited by 4 (2 self)
Abstract—The problem of designing cost functions to estimate a posteriori probabilities in multiclass problems is addressed in this paper. We establish necessary and sufficient conditions that these costs must satisfy in one-class one-output networks whose outputs are consistent with probability laws. We focus our attention on a particular subset of the corresponding cost functions; those which verify two usually interesting properties: symmetry and separability (well-known cost functions, such as the quadratic cost or the cross-entropy, are particular cases in this subset). Finally, we present a universal stochastic gradient learning rule for single-layer networks, in the sense of minimizing a general version of these cost functions for a wide family of nonlinear activation functions. Index Terms — Neural networks, pattern classification, probability estimation.
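A minimal sketch of the kind of stochastic gradient rule discussed: a single-layer softmax network trained with cross-entropy (one of the symmetric, separable costs named above), whose outputs approximate posterior probabilities. The toy Gaussian data, the hyperparameters, and the function name are assumptions for illustration, not the paper's rule:

```python
import numpy as np

def sgd_softmax_step(W, x, t, lr=0.1):
    """One stochastic-gradient step for a single-layer softmax network
    under the cross-entropy cost; the gradient reduces to the familiar
    (y - t) x^T delta rule."""
    z = W @ x
    y = np.exp(z - z.max())
    y /= y.sum()                  # softmax outputs, sum to 1
    W -= lr * np.outer(y - t, x)  # cross-entropy gradient step
    return W, y

rng = np.random.default_rng(0)
W = np.zeros((2, 3))
# Toy two-class problem with Gaussian class-conditional features:
# trained with a cost from this family, the network outputs
# approximate P(class | x).
for _ in range(500):
    c = rng.integers(2)
    x = rng.normal(loc=(-1.0 if c == 0 else 1.0), size=3)
    t = np.eye(2)[c]
    W, _ = sgd_softmax_step(W, x, t)
_, y = sgd_softmax_step(W, np.array([1.0, 1.0, 1.0]), np.eye(2)[1], lr=0.0)
print(y)   # for a clearly class-1 input, output is close to [0, 1]
```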
A Comparison of Rule-Based, K-Nearest Neighbor, and Neural Net Classifiers for Automated Industrial Inspection
 In Proceedings of the IEEE/ACM International Conference on Developing and Managing Expert System Programs
, 1991
Abstract

Cited by 3 (1 self)
Over the last few years the authors have been involved in research aimed at developing a machine vision system for locating and identifying surface defects on materials. The particular problem being studied involves locating surface defects on hardwood lumber in a species-independent manner. Obviously, the accurate location and identification of defects is of paramount importance in this system. In the machine vision system that has been developed, initial hypotheses generated by bottom-up processing for defect labeling are verified using top-down processing. Thus, the label verification greatly affects the accuracy of the system. For this label verification, a rule-based approach, a k-nearest neighbor approach, and a neural network approach have been implemented. An experimental comparison of these approaches, together with other considerations, has made the neural net approach the preferred choice for doing the label verification in this vision system.
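The k-nearest-neighbor verifier compared above can be sketched in a few lines: vote among the k training examples closest to the query feature vector. The 2-D features and defect labels below are hypothetical, not from the lumber inspection system:

```python
from collections import Counter

def knn_classify(train, query, k=3):
    """Plain k-nearest-neighbor label verification: majority vote among
    the k training examples closest (Euclidean) to the query vector.

    train: list of (feature_vector, label) pairs.
    """
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    neighbors = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical 2-D features for defect labels on a lumber surface.
train = [((0.1, 0.2), "knot"), ((0.15, 0.25), "knot"),
         ((0.8, 0.9), "clear"), ((0.85, 0.8), "clear"),
         ((0.2, 0.1), "knot")]
print(knn_classify(train, (0.12, 0.18)))  # -> knot
```

Unlike the neural net the authors favored, this verifier needs no training phase, but it must keep and search the whole training set at query time.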