The use of the area under the ROC curve in the evaluation of machine learning algorithms
- Pattern Recognition
, 1997
Cited by 325 (0 self)
In this paper we investigate the use of the area under the receiver operating characteristic (ROC) curve (AUC) as a performance measure for machine learning algorithms. As a case study we evaluate six machine learning algorithms (C4.5, Multiscale Classifier, Perceptron, Multi-layer Perceptron, k-Nearest Neighbours, and a Quadratic Discriminant Function) on six "real world" medical diagnostics data sets. We compare AUC with the more conventional overall accuracy and find that AUC exhibits a number of desirable properties: increased sensitivity in Analysis of Variance (ANOVA) tests; a standard error that decreased as both AUC and the number of test samples increased; independence from the decision threshold; and invariance to a priori class probabilities. The paper concludes with the recommendation that AUC be used in preference to overall accuracy for "single number" evaluation of machine ...
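The contrast between AUC and overall accuracy can be illustrated with a minimal sketch. It is not the paper's evaluation code: the function names `auc` and `accuracy` and the toy scores are assumptions made for illustration. AUC is computed here through its rank interpretation (the fraction of positive/negative pairs in which the positive example receives the higher score), so no decision threshold is needed, whereas accuracy changes with the threshold.

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(scores, labels, threshold):
    """Overall accuracy when scores >= threshold are predicted positive."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Hypothetical classifier scores and true labels (illustrative only).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]

auc_value = auc(scores, labels)              # no threshold involved
acc_mid = accuracy(scores, labels, 0.5)      # accuracy at one threshold
acc_high = accuracy(scores, labels, 0.85)    # accuracy at another threshold
```

Because only the ordering of the scores matters, any strictly increasing rescaling of the scores leaves `auc` unchanged, while `accuracy` depends on where the threshold is placed.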
Support vector machines for speech recognition
- Proceedings of the International Conference on Spoken Language Processing
, 1998
Cited by 47 (2 self)
Statistical techniques based on hidden Markov Models (HMMs) with Gaussian emission densities have dominated signal processing and pattern recognition literature for the past 20 years. However, HMMs trained using maximum likelihood techniques suffer from an inability to learn discriminative information and are prone to overfitting and over-parameterization. Recent work in machine learning has focused on models, such as the support vector machine (SVM), that automatically control generalization and parameterization as part of the overall optimization process. In this paper, we show that SVMs provide a significant improvement in performance on a static pattern classification task based on the Deterding vowel data. We also describe an application of SVMs to large vocabulary speech recognition, and demonstrate an improvement in error rate on a continuous alphadigit task (OGI Alphadigits) and a large vocabulary conversational speech task (Switchboard). Issues related to the development and optimization of an SVM/HMM hybrid system are discussed.
The Neural Network Pushdown Automaton: Model, Stack and Learning Simulations
, 1993
Cited by 16 (2 self)
In order for neural networks to learn complex languages or grammars, they must have sufficient computational power or resources to recognize or generate such languages. Though many approaches to effectively utilizing the computational power of neural networks have been discussed, an obvious one is to couple a recurrent neural network with an external stack memory, in effect creating a neural network pushdown automaton (NNPDA). This NNPDA generalizes the concept of a recurrent network so that the network becomes a more complex computing structure. This paper discusses in detail an NNPDA: its construction, how it can be trained and how useful symbolic information can be extracted from the trained network. To effectively couple the external stack to the neural network, an optimization method is developed which uses an error function that connects the learning of the state automaton of the neural network to the learning of the operation of the external stack: push, pop, and no-operation. To minimize the error function using gradient descent learning, an analog stack is designed such that the action and storage of information in the stack are continuous. One interpretation of a continuous stack is the probabilistic storage of and action on data. After training on sample strings of an unknown source grammar, a quantization procedure extracts from the analog stack and neural network a discrete pushdown automaton (PDA). Simulations show that in learning deterministic context-free grammars (the balanced parenthesis language, 1^n 0^n, and the deterministic Palindrome) the extracted PDA is correct in the sense that it can correctly recognize unseen strings of arbitrary length. In addition, the extracted PDAs can be shown to be identical or equivalent to the PDAs of the source grammars which were used to generate the training strings.
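One way to picture a stack whose actions and contents are continuous is sketched below. This is an illustrative assumption, not the construction from the paper: the class name `ContinuousStack`, the convention that a positive action value pushes a symbol with that "length" and a negative value pops that much length, and the fixed reading depth are all hypothetical choices.

```python
class ContinuousStack:
    """Toy analog stack: each entry is a symbol with a real-valued length."""

    def __init__(self):
        self.items = []  # list of [symbol, length]; the top is the last entry

    def act(self, action, symbol=None):
        """action > 0 pushes `symbol` with length `action`;
        action < 0 pops a total length of |action| from the top;
        action == 0 is a no-operation."""
        if action > 0:
            self.items.append([symbol, action])
        elif action < 0:
            remaining = -action
            while remaining > 0 and self.items:
                length = self.items[-1][1]
                if length <= remaining:
                    # The whole top entry is consumed.
                    self.items.pop()
                    remaining -= length
                else:
                    # Only part of the top entry is removed.
                    self.items[-1][1] = length - remaining
                    remaining = 0.0

    def read(self, depth=1.0):
        """Return how much of each symbol lies within `depth` of the top."""
        reading = {}
        remaining = depth
        for symbol, length in reversed(self.items):
            if remaining <= 0:
                break
            take = min(length, remaining)
            reading[symbol] = reading.get(symbol, 0.0) + take
            remaining -= take
        return reading


stack = ContinuousStack()
stack.act(0.75, "(")   # push "(" with length 0.75
stack.act(0.5, ")")    # push ")" with length 0.5
top_mixture = stack.read()  # blend of symbols within depth 1.0 of the top
stack.act(-0.75)       # pop a total length of 0.75
```

Reading a fixed depth from the top returns a weighted mixture of symbols, which is one way to interpret the "probabilistic storage" mentioned in the abstract; because the action is a real number, the stack contents vary smoothly with it.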
Artificial Neural Networks for Document Analysis and Recognition
- IEEE TPAMI
, 2003
Cited by 15 (5 self)
Artificial neural networks have been extensively applied to document analysis and recognition. Most efforts have been devoted to the recognition of isolated handwritten and printed characters, with widely recognized successful results. However, many other document processing tasks such as pre-processing, layout analysis, character segmentation, word recognition, and signature verification have also been addressed effectively, with very promising results. This paper surveys the most significant problems in the area of off-line document image processing where connectionist-based approaches have been applied. Similarities and differences between approaches belonging to different categories are discussed. Particular emphasis is placed on the crucial role of prior knowledge in the conception of both appropriate architectures and learning algorithms. Finally, the paper provides a critical analysis of the reviewed approaches and outlines the most promising research directions in the field. In particular, a second generation of connectionist-based models is foreseen, based on appropriate graphical representations of the learning environment.
Toward Global Optimization Of Neural Networks: A Comparison Of The Genetic Algorithm And Backpropagation
, 1998
Cited by 11 (0 self)
The recent surge in neural network research in business is not surprising, since the underlying functions controlling business data are generally unknown and the neural network offers a tool that can approximate the unknown function to any desired degree of accuracy. The vast majority of these studies rely on a gradient algorithm, typically a variation of backpropagation, to obtain the parameters (weights) of the model. The well-known limitations of gradient search techniques applied to complex nonlinear optimization problems such as artificial neural networks have often resulted in inconsistent and unpredictable performance. Many researchers have attempted to address the problems associated with the training algorithm by imposing constraints on the search space or by restructuring the architecture of the neural network. In this paper we demonstrate that such constraints and restructuring are unnecessary if a sufficiently complex initial architecture and an appropriate glob...
Artificial Neural Networks Optimization by means of Evolutionary Algorithms
, 1997
Cited by 6 (3 self)
In this paper Evolutionary Algorithms are investigated in the field of Artificial Neural Networks. In particular, the Breeder Genetic Algorithms are compared against Genetic Algorithms in simultaneously optimizing (i) the design of a neural network architecture and (ii) the choice of the best learning method for nonlinear system identification. The performance of the Breeder Genetic Algorithms is further improved by a fuzzy recombination operator. The experimental results for the two mentioned evolutionary optimization methods are presented and discussed.
1. Introduction
Evolutionary Algorithms have been applied successfully to a wide variety of optimization problems. Recently a novel technique, the Breeder Genetic Algorithms (BGAs) [1, 2, 3], which can be seen as a combination of Evolution Strategies (ESs) [4] and Genetic Algorithms (GAs) [5, 6], has been introduced. BGAs use truncation selection, which is very similar to the (μ, λ)-strategy in ESs, and the search p...
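Truncation selection, as the term is generally used, keeps only the best-ranked fraction of the population as parents for the next generation. The sketch below illustrates that general idea only; it is not the authors' implementation, and the function name `truncation_select`, the `fraction` parameter, and the toy fitness function are assumptions.

```python
def truncation_select(population, fitness, fraction):
    """Return the best `fraction` of `population`, ranked by `fitness`.

    Higher fitness is treated as better; at least one individual is kept.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must lie in (0, 1]")
    ranked = sorted(population, key=fitness, reverse=True)
    keep = max(1, int(len(ranked) * fraction))
    return ranked[:keep]


# Hypothetical one-dimensional individuals; fitness is highest at x = 0.
population = [3, -4, 1, 7, -2, 0, 5, -6]
parents = truncation_select(population, lambda x: -x * x, 0.25)
```

Only the selected parents would then be recombined and mutated to produce offspring, which is what makes the scheme resemble keeping the best μ of λ candidates in an evolution strategy.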
An Artificial Neural Networks Primer with Financial Applications: Examples in Financial Distress Predictions and Foreign Exchange Hybrid Trading System
, 1997
Cited by 4 (0 self)
Table of Contents
1. Introduction to Artificial Intelligence and Artificial Neural Networks
1.1 Introduction
1.2 Artificial Intelligence
1.3 Artificial Intelligence in Finance
1.3.1 Expert System
1.3.2 Artificial Neural Networks in Finance ...
Applying the Connectionist Inductive Learning and Logic Programming System to Power System Diagnosis
- In: Proceedings of the IEEE International Conference on Neural Networks (ICNN-97)
, 1997
Cited by 3 (3 self)
The Connectionist Inductive Learning and Logic Programming System, C-IL²P, integrates the symbolic and connectionist paradigms of Artificial Intelligence through neural networks that perform massively parallel Logic Programming and inductive learning from examples and background knowledge. This work presents an extension of C-IL²P that allows the implementation of Extended Logic Programs in Neural Networks. This extension makes C-IL²P applicable to problems where the background knowledge is represented in a Default Logic. As a case example, we have applied the system to fault diagnosis of a simplified power system generation plant, obtaining good preliminary results.
Recurrent neural networks and pitch representations for music tasks
- In Proc. of the Florida Artificial Intelligence Research Symposium (FLAIRS)
, 2004
Cited by 3 (1 self)
We present results from experiments in using several pitch representations for jazz-oriented musical tasks performed by a recurrent neural network. We have run experiments with several kinds of recurrent networks for this purpose, and have found that Long Short-term Memory networks provide the best results. We show that a new pitch representation called Circles of Thirds works as well as two other published representations for these tasks, yet it is more succinct and enables faster learning.
Recurrent Neural Networks and Music
Many researchers are familiar with feedforward neural networks consisting of 2 or more layers of processing units, each with weighted connections to the next layer. Each unit passes the sum of its weighted inputs through a nonlinear sigmoid function. Each layer's outputs are fed forward through the network to the next layer, until the output layer is reached. Weights are initialized to small random values. Via the back-propagation algorithm (Rumelhart et al. 1986), outputs are compared to targets, and the errors are propagated back through the connection weights. Weights are updated by gradient descent. Through an iterative training procedure, examples (inputs) and targets are presented repeatedly; the network learns a nonlinear function of the inputs. It can then generalize and produce outputs for new examples. These networks have been explored by the computer music community for classifying chords (Laden and Keefe 1991) and other ...
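The feedforward computation described in this excerpt can be sketched in a few lines. This is a minimal illustration under stated assumptions, not code from the paper: the function names, the explicit bias terms, and the squared-error gradient step are hypothetical choices, and the update shown covers only a single sigmoid output unit (the base case of back-propagation), not the propagation of errors through hidden layers.

```python
import math


def sigmoid(x):
    """Logistic sigmoid nonlinearity."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs, weights, biases):
    """Each unit passes the weighted sum of its inputs through a sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def network_forward(inputs, layers):
    """Feed activations forward through a list of (weights, biases) layers."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations


def output_unit_step(inputs, weights, bias, target, lr):
    """One gradient-descent step on squared error for one sigmoid unit."""
    out = sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
    # Derivative of 0.5 * (out - target)**2 with respect to the unit's net input.
    delta = (out - target) * out * (1.0 - out)
    new_weights = [w - lr * delta * x for w, x in zip(weights, inputs)]
    new_bias = bias - lr * delta
    return new_weights, new_bias


# Hypothetical 2-2-1 network with all-zero weights (illustrative only).
layers = [([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0]), ([[0.0, 0.0]], [0.0])]
outputs = network_forward([1.0, 2.0], layers)
```

Repeating `output_unit_step` over many (input, target) pairs corresponds to the iterative training procedure in the text; a full back-propagation implementation would additionally pass each `delta` back through the weights to compute updates for the hidden layers.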
A Low Level Feature Based Neural Network Segmenter for Fully Cursive Handwritten Words
Cited by 2 (0 self)
We describe a neural network for segmentation of handwritten words. Typical approaches to segmentation rely on "over segmentation" of the word using simple features. Vocabulary context can then be used to recover the correct segmentation points. However, in large vocabulary applications it is much more important to have high quality segmentation points in order to reduce the number of alternative candidate words to be considered. The system is constructed using supervised training, and gives improved accuracy for segmentation. Results compare segmentation accuracy with the number of possible segmentation points that are considered, and show that only a small number of "excess" segmentation points are necessary.

