Results 1–10 of 25
The use of the area under the ROC curve in the evaluation of machine learning algorithms
 Pattern Recognition
, 1997
"... AbstractIn this paper we investigate the use of the area under the receiver operating characteristic (ROC) curve (AUC) as a performance measure for machine learning algorithms. As a case study we evaluate six machine learning algorithms (C4.5, Multiscale Classifier, Perceptron, Multilayer Percept ..."
Abstract

Cited by 436 (0 self)
 Add to MetaCart
In this paper we investigate the use of the area under the receiver operating characteristic (ROC) curve (AUC) as a performance measure for machine learning algorithms. As a case study we evaluate six machine learning algorithms (C4.5, Multiscale Classifier, Perceptron, Multilayer Perceptron, k-Nearest Neighbours, and a Quadratic Discriminant Function) on six "real world" medical diagnostics data sets. We compare and discuss the use of AUC relative to the more conventional overall accuracy and find that AUC exhibits a number of desirable properties: increased sensitivity in Analysis of Variance (ANOVA) tests; a standard error that decreases as both AUC and the number of test samples increase; independence from the decision threshold; and invariance to a priori class probabilities. The paper concludes with the recommendation that AUC be used in preference to overall accuracy for "single number" evaluation of machine learning algorithms.
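The threshold-independence the abstract highlights follows from the rank-based definition of AUC: it equals the probability that a randomly chosen positive example is scored above a randomly chosen negative one, with no decision threshold involved. A minimal sketch of that computation (function name and data are my own, not from the paper):

```python
def auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) formulation.

    Equals the probability that a randomly chosen positive example receives
    a higher score than a randomly chosen negative one; ties count half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # tied scores count as half a win
    return wins / (len(pos) * len(neg))

# A classifier that ranks every positive above every negative gets AUC 1.0.
print(auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0]))  # 1.0
```

Note that rescaling the scores or shifting the decision threshold leaves the ranking, and hence the AUC, unchanged, which is exactly the property the authors exploit.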
Support vector machines for speech recognition
 Proceedings of the International Conference on Spoken Language Processing
, 1998
"... Statistical techniques based on hidden Markov Models (HMMs) with Gaussian emission densities have dominated signal processing and pattern recognition literature for the past 20 years. However, HMMs trained using maximum likelihood techniques suffer from an inability to learn discriminative informati ..."
Abstract

Cited by 74 (2 self)
 Add to MetaCart
Statistical techniques based on hidden Markov models (HMMs) with Gaussian emission densities have dominated the signal processing and pattern recognition literature for the past 20 years. However, HMMs trained using maximum likelihood techniques suffer from an inability to learn discriminative information and are prone to overfitting and overparameterization. Recent work in machine learning has focused on models, such as the support vector machine (SVM), that automatically control generalization and parameterization as part of the overall optimization process. In this paper, we show that SVMs provide a significant improvement in performance on a static pattern classification task based on the Deterding vowel data. We also describe an application of SVMs to large vocabulary speech recognition, and demonstrate an improvement in error rate on a continuous alphadigit task (OGI Alphadigits) and a large vocabulary conversational speech task (Switchboard). Issues related to the development and optimization of an SVM/HMM hybrid system are discussed.
Artificial Neural Networks for Document Analysis and Recognition
 IEEE TPAMI
, 2003
"... Artificial neural networks have been extensively applied to document analysis and recognition. Most efforts have been devoted to the recognition of isolated handwritten and printed characters with widely recognized successful results. However, many other document processing tasks like preprocessi ..."
Abstract

Cited by 21 (5 self)
 Add to MetaCart
Artificial neural networks have been extensively applied to document analysis and recognition. Most efforts have been devoted to the recognition of isolated handwritten and printed characters, with widely recognized successful results. However, many other document processing tasks, like preprocessing, layout analysis, character segmentation, word recognition, and signature verification, have also been effectively addressed with very promising results. This paper surveys the most significant problems in the area of offline document image processing where connectionist-based approaches have been applied. Similarities and differences between approaches belonging to different categories are discussed. Particular emphasis is given to the crucial role of prior knowledge in the conception of both appropriate architectures and learning algorithms. Finally, the paper provides a critical analysis of the reviewed approaches and outlines the most promising research directions in the field. In particular, a second generation of connectionist-based models is foreseen, based on appropriate graphical representations of the learning environment.
The Neural Network Pushdown Automaton: Model, Stack and Learning Simulations
, 1993
"... In order for neural networks to learn complex languages or grammars, they must have sufficient computational power or resources to recognize or generate such languages. Though many approaches to effectively utilizing the computational power of neural networks have been discussed, an obvious one is t ..."
Abstract

Cited by 17 (2 self)
 Add to MetaCart
In order for neural networks to learn complex languages or grammars, they must have sufficient computational power or resources to recognize or generate such languages. Though many approaches to effectively utilizing the computational power of neural networks have been discussed, an obvious one is to couple a recurrent neural network with an external stack memory, in effect creating a neural network pushdown automaton (NNPDA). The NNPDA generalizes the concept of a recurrent network so that the network becomes a more complex computing structure. This paper discusses the NNPDA in detail: its construction, how it can be trained, and how useful symbolic information can be extracted from the trained network. To effectively couple the external stack to the neural network, an optimization method is developed which uses an error function that connects the learning of the state automaton of the neural network to the learning of the operation of the external stack: push, pop, and no-operation. To minimize the error function using gradient descent learning, an analog stack is designed such that the action and storage of information in the stack are continuous. One interpretation of a continuous stack is the probabilistic storage of and action on data. After training on sample strings of an unknown source grammar, a quantization procedure extracts from the analog stack and neural network a discrete pushdown automaton (PDA). Simulations show that in learning deterministic context-free grammars (the balanced parenthesis language, 1^n0^n, and the deterministic palindrome language) the extracted PDA is correct in the sense that it can correctly recognize unseen strings of arbitrary length. In addition, the extracted PDAs can be shown to be identical or equivalent to the PDAs of the source grammars which were used to generate the training strings.
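The continuous stack described above can be pictured as a stack of symbols that each carry an analog "thickness", so that push, pop, and no-operation become continuously weighted actions the network can differentiate through. The sketch below is my own simplified illustration of that idea, not the paper's implementation:

```python
class AnalogStack:
    """Toy continuous stack: each entry is [symbol, thickness in (0, 1]].

    Push deposits a symbol with a continuous strength; pop removes a
    continuous amount of thickness from the top, possibly spanning several
    entries. A no-operation leaves the stack untouched.
    """
    def __init__(self):
        self.entries = []  # list of [symbol, thickness], top at the end

    def push(self, symbol, strength):
        if strength > 0:
            self.entries.append([symbol, strength])

    def pop(self, amount):
        """Remove `amount` of thickness from the top; return what came off."""
        removed = []
        while amount > 1e-12 and self.entries:
            sym, thick = self.entries[-1]
            take = min(thick, amount)
            removed.append((sym, take))
            amount -= take
            if take >= thick - 1e-12:
                self.entries.pop()          # entry fully consumed
            else:
                self.entries[-1][1] = thick - take  # entry partially consumed
        return removed

    def top_reading(self, depth=1.0):
        """Blend of symbols within `depth` of the top -- what the network 'sees'."""
        reading, remaining = {}, depth
        for sym, thick in reversed(self.entries):
            take = min(thick, remaining)
            reading[sym] = reading.get(sym, 0.0) + take
            remaining -= take
            if remaining <= 1e-12:
                break
        return reading

stack = AnalogStack()
stack.push("a", 0.8)        # push 'a' with strength 0.8
stack.push("b", 0.6)        # push 'b' with strength 0.6
print(stack.top_reading())  # blend near the top: 0.6 of 'b', 0.4 of 'a'
```

Because both the deposited thickness and the popped amount are continuous, the stack reading varies smoothly with the network's outputs, which is what makes gradient descent on the combined error function possible.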
Toward Global Optimization Of Neural Networks: A Comparison Of The Genetic Algorithm And Backpropagation
, 1998
"... The recent surge in activity of Neural Network research in Business is not surprising since the underlying functions controlling business data are generally unknown and the neural network offers a tool that can approximate the unknown function to any degree of desired accuracy. The vast majority of ..."
Abstract

Cited by 15 (0 self)
 Add to MetaCart
The recent surge in neural network research in business is not surprising, since the underlying functions controlling business data are generally unknown and the neural network offers a tool that can approximate the unknown function to any desired degree of accuracy. The vast majority of these studies rely on a gradient algorithm, typically a variation of backpropagation, to obtain the parameters (weights) of the model. The well-known limitations of gradient search techniques applied to complex nonlinear optimization problems such as artificial neural networks have often resulted in inconsistent and unpredictable performance. Many researchers have attempted to address the problems associated with the training algorithm by imposing constraints on the search space or by restructuring the architecture of the neural network. In this paper we demonstrate that such constraints and restructuring are unnecessary if a sufficiently complex initial architecture and an appropriate glob...
Artificial Neural Networks Optimization by means of Evolutionary Algorithms
, 1997
"... In this paper Evolutionary Algorithms are investigated in the field of Artificial Neural Networks. In particular, the Breeder Genetic Algorithms are compared against Genetic Algorithms in facing contemporaneously the optimization of (i) the design of a neural network architecture and (ii) the choice ..."
Abstract

Cited by 6 (3 self)
 Add to MetaCart
In this paper Evolutionary Algorithms are investigated in the field of Artificial Neural Networks. In particular, Breeder Genetic Algorithms are compared against Genetic Algorithms in simultaneously optimizing (i) the design of a neural network architecture and (ii) the choice of the best learning method for nonlinear system identification. The performance of the Breeder Genetic Algorithms is further improved by a fuzzy recombination operator. The experimental results for the two evolutionary optimization methods are presented and discussed.
1. Introduction
Evolutionary Algorithms have been applied successfully to a wide variety of optimization problems. Recently a novel technique, the Breeder Genetic Algorithm (BGA) [1, 2, 3], which can be seen as a combination of Evolution Strategies (ESs) [4] and Genetic Algorithms (GAs) [5, 6], has been introduced. BGAs use truncation selection, which is very similar to the (μ, λ)-strategy in ESs, and the search p...
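Truncation selection, mentioned in the abstract above, simply keeps the best fraction of the population as parents. The sketch below is my own illustration with a toy fitness function; note that BGA fuzzy recombination proper draws child genes from triangular distributions centered on the parents, which I simplify here to a uniform interval around the parents' genes:

```python
import random

def truncation_select(population, fitness, truncation_rate=0.5):
    """Keep the best `truncation_rate` fraction of the population as parents."""
    ranked = sorted(population, key=fitness, reverse=True)
    cutoff = max(1, int(len(ranked) * truncation_rate))
    return ranked[:cutoff]

def recombine(x, y, d=0.25, rng=random):
    """Simplified fuzzy-style recombination: each child gene is drawn from an
    interval around its parents' genes, extended by a fraction d on each side."""
    child = []
    for a, b in zip(x, y):
        lo, hi = min(a, b), max(a, b)
        span = hi - lo
        child.append(rng.uniform(lo - d * span, hi + d * span))
    return child

# Toy usage: maximize -(sum of squares) over 2-D real vectors.
rng = random.Random(0)
pop = [[rng.uniform(-5, 5) for _ in range(2)] for _ in range(10)]
parents = truncation_select(pop, fitness=lambda v: -sum(g * g for g in v))
print(len(parents))  # 5
```

In a full BGA loop, the selected parents would be recombined and mutated to form the next generation, and the cycle repeated until the fitness stops improving.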
An Artificial Neural Networks Primer with Financial Applications: Examples in Financial Distress Predictions and Foreign Exchange Hybrid Trading System
, 1997
"... Contents i Table of Contents 1. INTRODUCTION TO ARTIFICIAL INTELLIGENCE AND ARTIFICIAL NEURAL NETWORKS .......................................................................................................................................... 2 1.1 INTRODUCTION .................................. ..."
Abstract

Cited by 4 (0 self)
 Add to MetaCart
Table of Contents:
1. Introduction to Artificial Intelligence and Artificial Neural Networks
1.1 Introduction
1.2 Artificial Intelligence
1.3 Artificial Intelligence in Finance
1.3.1 Expert System
1.3.2 Artificial Neural Networks in Finance ...
Applying the Connectionist Inductive Learning and Logic Programming System to Power System Diagnosis
 In: Proceedings of the IEEE International Conference on Neural Networks (ICNN'97)
, 1997
"... The Connectionist Inductive Learning and Logic Programming System, CIL 2 P, integrates the symbolic and connectionist paradigms of Artificial Intelligence through neural networks that perform massively parallel Logic Programming and inductive learning from examples and background knowledge. This ..."
Abstract

Cited by 4 (3 self)
 Add to MetaCart
The Connectionist Inductive Learning and Logic Programming System, CIL²P, integrates the symbolic and connectionist paradigms of Artificial Intelligence through neural networks that perform massively parallel logic programming and inductive learning from examples and background knowledge. This work presents an extension of CIL²P that allows the implementation of extended logic programs in neural networks. This extension makes CIL²P applicable to problems where the background knowledge is represented in a default logic. As a case example, we have applied the system to fault diagnosis of a simplified power system generation plant, obtaining good preliminary results.
Using Neural Network for Springback Minimization in a Channel Forming Process
 Developments in Sheet Metal Stamping, SAE Paper 98M154, SP-1322
, 1998
"... Springback, the geometric difference between the loaded and unloaded configurations, is affected by many factors, such as material properties, sheet thickness, lubrication conditions, tooling geometry and processing parameters. It is extremely difficult to develop an analytical model for springback ..."
Abstract

Cited by 4 (1 self)
 Add to MetaCart
Springback, the geometric difference between the loaded and unloaded configurations, is affected by many factors, such as material properties, sheet thickness, lubrication conditions, tooling geometry, and processing parameters. It is extremely difficult to develop an analytical model for springback control including all these factors. The proposed neural network model is an attempt to deal with such a complicated nonlinear system in a predictive way. For demonstration, an aluminum channel forming process is considered in this work. Our previous research [1] has shown that a variable binder force history during the forming operation can reduce the springback amount significantly while maintaining a relatively low maximum strain, if an initially low binder force is used followed by a higher binder force. However, when and by how much the force should increase depends on the forming conditions of the current process. Here, several numerical simulations using the Finite Element Method (FEM) were performed to obtain the training data required for training the neural network by means of the backpropagation algorithm. In the predictive mode, process inputs different from the ones used in the previous stage were considered. For each case, the displacement at which the binder force increases and the level of the high binder force were predicted by the trained neural network and were numerically tested. A consistently low springback angle (< 0.5°) and moderate stretching amount (< 16%) were obtained even in the cases where the process parameters were varied by as much as ±25% on friction coefficient and sheet thickness, or ±10% on the material's mechanical properties. The neural network can be easily implemented in experiments and/or in real production to resolve the uncertainty in springback amount due to variations in material properties and friction conditions.
Recurrent neural networks and pitch representations for music tasks
 In Proc. of the Florida Artificial Intelligence Research Symposium (FLAIRS)
, 2004
"... We present results from experiments in using several pitch representations for jazzoriented musical tasks performed by a recurrent neural network. We have run experiments with several kinds of recurrent networks for this purpose, and have found that Long Shortterm Memory networks provide the best ..."
Abstract

Cited by 3 (1 self)
 Add to MetaCart
We present results from experiments using several pitch representations for jazz-oriented musical tasks performed by a recurrent neural network. We have run experiments with several kinds of recurrent networks for this purpose, and have found that Long Short-Term Memory networks provide the best results. We show that a new pitch representation called Circles of Thirds works as well as two other published representations for these tasks, yet it is more succinct and enables faster learning.
Recurrent Neural Networks and Music
Many researchers are familiar with feedforward neural networks consisting of two or more layers of processing units, each with weighted connections to the next layer. Each unit passes the sum of its weighted inputs through a nonlinear sigmoid function. Each layer's outputs are fed forward through the network to the next layer, until the output layer is reached. Weights are initialized to small random values. Via the backpropagation algorithm (Rumelhart et al. 1986), outputs are compared to targets, and the errors are propagated back through the connection weights. Weights are updated by gradient descent. Through an iterative training procedure, examples (inputs) and targets are presented repeatedly; the network learns a nonlinear function of the inputs. It can then generalize and produce outputs for new examples. These networks have been explored by the computer music community for classifying chords (Laden and Keefe 1991) and other ...
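The training loop summarized above (forward pass through sigmoid units, errors propagated back, weights updated by gradient descent) can be sketched for a single hidden layer. This is the textbook algorithm trained on XOR, not any of the cited systems; all names are my own:

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_xor(epochs=20000, lr=0.5, seed=1):
    """Train a 2-2-1 sigmoid network on XOR by plain backpropagation.

    Returns a predict(x) function closing over the trained weights.
    """
    rng = random.Random(seed)
    w1 = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(3)]  # 2 inputs + bias -> 2 hidden
    w2 = [rng.uniform(-1, 1) for _ in range(3)]                      # 2 hidden + bias -> 1 output

    def forward(x):
        xi = list(x) + [1.0]                                         # bias input
        h = [sigmoid(sum(xi[i] * w1[i][j] for i in range(3))) for j in range(2)]
        hi = h + [1.0]                                               # bias unit
        y = sigmoid(sum(hi[j] * w2[j] for j in range(3)))
        return xi, h, hi, y

    data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]
    for _ in range(epochs):
        for x, t in data:
            xi, h, hi, y = forward(x)
            # backward pass: delta rules using the sigmoid derivative y*(1-y)
            dy = (y - t) * y * (1 - y)
            dh = [dy * w2[j] * h[j] * (1 - h[j]) for j in range(2)]
            # gradient-descent weight updates
            for j in range(3):
                w2[j] -= lr * dy * hi[j]
            for i in range(3):
                for j in range(2):
                    w1[i][j] -= lr * dh[j] * xi[i]

    return lambda x: forward(x)[3]

predict = train_xor()
# Predictions are values in (0, 1) that move toward the 0/1 targets as training proceeds.
```

The delta rule for each hidden unit multiplies the output-layer error by the connecting weight and the unit's own sigmoid derivative, which is the "errors propagated back through the connection weights" step the abstract describes.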