Gradient-based learning applied to document recognition
- Proceedings of the IEEE, 1998
Abstract - Cited by 487 (38 self)
Multilayer neural networks trained with the back-propagation algorithm constitute the best example of a successful gradient-based learning technique. Given an appropriate network architecture, gradient-based learning algorithms can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters, with minimal preprocessing. This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task. Convolutional neural networks, which are specifically designed to deal with the variability of two-dimensional (2-D) shapes, are shown to outperform all other techniques. Real-life document recognition systems are composed of multiple modules including field extraction, segmentation, recognition, and language modeling. A new learning paradigm, called graph transformer networks (GTNs), allows such multimodule systems to be trained globally using gradient-based methods so as to minimize an overall performance measure. Two systems for online handwriting recognition are described. Experiments demonstrate the advantage of global training and the flexibility of graph transformer networks. A graph transformer network for reading a bank check is also described. It uses convolutional neural network character recognizers combined with global training techniques to provide record accuracy on business and personal checks. It is deployed commercially and reads several million checks per day.
Discriminative Feature And Model Design For Automatic Speech Recognition
- In Proc. of Eurospeech, 1997
Abstract - Cited by 3 (1 self)
A system for discriminative feature and model design is presented for automatic speech recognition. Training based on minimum classification error with a single objective function is applied for designing a set of parallel networks performing feature transformation and a set of hidden Markov models performing speech recognition. This paper compares the use of linear and non-linear functional transformations when applied to conventional recognition features, such as spectrum or cepstrum. It also provides a framework for integrated feature and model training when using class-specific transformations. Experimental results on telephone-based connected digit recognition are presented.
1. INTRODUCTION
Improving the performance of hidden Markov model (HMM) based automatic speech recognition (ASR) systems has been a central issue that has dominated the entire field of speech recognition during the past two decades. One effort to improve HMMs has been to extend the training paradigm beyo...
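The abstract above names minimum classification error (MCE) training but does not spell out the objective. As a rough illustration only, the sketch below follows a commonly used smoothed-error formulation (in the style usually attributed to Juang and Katagiri); it is an assumption that the paper uses something of this shape. The function name `mce_loss` and the smoothing parameters `eta` and `gamma` are hypothetical, chosen for this example.

```python
import math

def mce_loss(scores, correct, eta=1.0, gamma=1.0):
    """Smoothed classification error for one sample (illustrative sketch).

    scores  -- list of per-class discriminant scores (at least two classes)
    correct -- index of the true class
    eta     -- assumed sharpness of the soft maximum over competing classes
    gamma   -- assumed slope of the sigmoid that smooths the 0/1 error
    """
    competitors = [s for i, s in enumerate(scores) if i != correct]
    # Soft maximum over the competing classes, computed stably by
    # factoring out the largest competitor score.
    m = max(competitors)
    mean_exp = sum(math.exp(eta * (s - m)) for s in competitors) / len(competitors)
    soft_max = m + math.log(mean_exp) / eta
    # Misclassification measure: positive when competitors beat the true class.
    d = soft_max - scores[correct]
    # Sigmoid turns the measure into a differentiable stand-in for the 0/1 error.
    return 1.0 / (1.0 + math.exp(-gamma * d))
```

Because the loss is a smooth function of the scores, it can in principle be differentiated with respect to whatever produces those scores, which is what makes a single objective for both feature transformation and models plausible.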
Relative Density Nets: A New Way to Combine Backpropagation with HMM's
- 2001
Abstract - Cited by 1 (1 self)
Logistic units in the first hidden layer of a feedforward neural network compute the relative probability of a data point under two Gaussians. This leads us to consider substituting other density models. We present an architecture for performing discriminative learning of Hidden Markov Models using a network of many small HMMs. Experiments on speech data show it to be superior to the standard method of discriminatively training HMMs.
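The opening claim of this abstract can be checked numerically in one dimension. The sketch below assumes two Gaussians with a shared variance and equal prior weight (both are assumptions made for this illustration, not details taken from the paper); under those assumptions the relative probability of a point equals a logistic function of a linear input. All function names are hypothetical.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def relative_probability(x, mean_a, mean_b, var):
    """Probability of x under Gaussian A relative to A and B together."""
    pa = gaussian_pdf(x, mean_a, var)
    pb = gaussian_pdf(x, mean_b, var)
    return pa / (pa + pb)

def logistic_unit(x, mean_a, mean_b, var):
    """Logistic unit whose weight and bias are derived from the two Gaussians.

    With a shared variance, log p_A(x) - log p_B(x) is linear in x:
    weight = (mean_a - mean_b) / var, bias = (mean_b^2 - mean_a^2) / (2 var).
    """
    weight = (mean_a - mean_b) / var
    bias = (mean_b ** 2 - mean_a ** 2) / (2.0 * var)
    return 1.0 / (1.0 + math.exp(-(weight * x + bias)))
```

For any input, `relative_probability(x, a, b, v)` and `logistic_unit(x, a, b, v)` should agree up to floating-point error, which is the sense in which a logistic unit compares two Gaussian densities.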

