Gradient-based learning applied to document recognition
- Proceedings of the IEEE
, 1998
Cited by 487 (38 self)
Multilayer neural networks trained with the back-propagation algorithm constitute the best example of a successful gradient-based learning technique. Given an appropriate network architecture, gradient-based learning algorithms can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters, with minimal preprocessing. This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task. Convolutional neural networks, which are specifically designed to deal with the variability of two-dimensional (2-D) shapes, are shown to outperform all other techniques. Real-life document recognition systems are composed of multiple modules including field extraction, segmentation, recognition, and language modeling. A new learning paradigm, called graph transformer networks (GTNs), allows such multimodule systems to be trained globally using gradient-based methods so as to minimize an overall performance measure. Two systems for online handwriting recognition are described. Experiments demonstrate the advantage of global training, and the flexibility of graph transformer networks. A graph transformer network for reading a bank check is also described. It uses convolutional neural network character recognizers combined with global training techniques to provide record accuracy on business and personal checks. It is deployed commercially and reads several million checks per day.
Connectionist Learning Procedures
- ARTIFICIAL INTELLIGENCE
, 1989
Cited by 290 (6 self)
A major goal of research on networks of neuron-like processing units is to discover efficient learning procedures that allow these networks to construct complex internal representations of their environment. The learning procedures must be capable of modifying the connection strengths in such a way that internal units which are not part of the input or output come to represent important features of the task domain. Several interesting gradient-descent procedures have recently been discovered. Each connection computes the derivative, with respect to the connection strength, of a global measure of the error in the performance of the network. The strength is then adjusted in the direction that decreases the error. These relatively simple, gradient-descent learning procedures work well for small tasks and the new challenge is to find ways of improving their convergence rate and their generalization abilities so that they can be applied to larger, more realistic tasks.
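The update rule described in this abstract (compute the derivative of a global error with respect to each connection strength, then adjust the strength in the direction that decreases the error) can be sketched for the simplest possible case. This is a minimal illustration, not the paper's procedure: the single linear unit, the mean squared error, and the names `train_linear_unit`, `mean_squared_error`, `lr`, and `epochs` are all assumptions chosen for the example.

```python
def mean_squared_error(samples, w, b):
    """Global error measure: mean squared difference between output and target."""
    return sum((w * x + b - y) ** 2 for x, y in samples) / len(samples)


def train_linear_unit(samples, lr=0.05, epochs=2000):
    """Fit y ~ w*x + b by plain gradient descent (hypothetical toy setup)."""
    w, b = 0.0, 0.0  # connection strength and bias, both start at zero
    n = len(samples)
    for _ in range(epochs):
        # Derivative of the global error with respect to each strength.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in samples) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in samples) / n
        # Adjust each strength in the direction that decreases the error.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


if __name__ == "__main__":
    data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]  # points on y = 2x + 1
    print(train_linear_unit(data))
```

The step size `lr` has to be small enough for the iteration to converge, which is one face of the convergence-rate problem the abstract mentions.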
Handwritten Digit Recognition with a Back-Propagation Network
- Advances in Neural Information Processing Systems
, 1990
Cited by 163 (14 self)
We present an application of back-propagation networks to handwritten digit recognition. Minimal preprocessing of the data was required, but architecture of the network was highly constrained and specifically designed for the task. The input of the network consists of normalized images of isolated digits. The method has a 1% error rate and about a 9% reject rate on zipcode digits provided by the U.S. Postal Service.
1 INTRODUCTION
The main point of this paper is to show that large back-propagation (BP) networks can be applied to real image-recognition problems without a large, complex preprocessing stage requiring detailed engineering. Unlike most previous work on the subject (Denker et al., 1989), the learning network is directly fed with images, rather than feature vectors, thus demonstrating the ability of BP networks to deal with large amounts of low level information. Previous work performed on simple digit images (Le Cun, 1989) showed that the architecture of the network strongly...
Shape quantization and recognition with randomized trees
- Neural Computation
, 1997
Cited by 126 (15 self)
We explore a new approach to shape recognition based on a virtually infinite family of binary features ("queries") of the image data, designed to accommodate prior information about shape invariance and regularity. Each query corresponds to a spatial arrangement of several local topographic codes ("tags") which are in themselves too primitive and common to be informative about shape. All the discriminating power derives from relative angles and distances among the tags. The important attributes of the queries are (i) a natural partial ordering corresponding to increasing structure and complexity; (ii) semi-invariance, meaning that most shapes of a given class will answer the same way to two queries which are successive in the ordering; and (iii) stability, since the queries are not based on distinguished points and substructures. No classifier based on the full feature set can be evaluated and it is impossible to determine a priori which arrangements are informative. Our approach is to select informative features and build tree classifiers at the same time by inductive learning. In effect, each tree provides an approximation to the full posterior where the features
Unsupervised learning of invariant feature hierarchies with application to object recognition
- CVPR
, 2007
Cited by 65 (11 self)
We present an unsupervised method for learning a hierarchy of sparse feature detectors that are invariant to small shifts and distortions. The resulting feature extractor consists of multiple convolution filters, followed by a pointwise sigmoid non-linearity, and a feature-pooling layer that computes the max of each filter output within adjacent windows. A second level of larger and more invariant features is obtained by training the same algorithm on patches of features from the first level. Training a supervised classifier on these features yields 0.64% error on MNIST, and 54% average recognition rate on Caltech 101.
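The forward pass of one level of the feature extractor described in this abstract (convolution filters, a pointwise sigmoid, then max pooling over adjacent windows) can be sketched in plain Python. This is only an illustration under stated assumptions: the filters here are supplied by hand rather than learned by the paper's unsupervised method, the pooling windows are assumed to be non-overlapping, and the function names `conv2d_valid`, `sigmoid_map`, `max_pool`, and `extract_features` are hypothetical.

```python
import math


def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding.

    As in most convolutional networks, the kernel is not flipped
    (strictly speaking this is cross-correlation).
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


def sigmoid_map(fmap):
    """Apply the logistic sigmoid to every entry of a feature map."""
    return [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in fmap]


def max_pool(fmap, size):
    """Take the max over adjacent, non-overlapping size x size windows.

    Trailing rows or columns that do not fill a whole window are dropped.
    """
    out_h = len(fmap) // size
    out_w = len(fmap[0]) // size
    return [[max(fmap[i * size + a][j * size + b]
                 for a in range(size) for b in range(size))
             for j in range(out_w)]
            for i in range(out_h)]


def extract_features(image, kernels, pool_size=2):
    """One level: filter, squash, pool. Returns one pooled map per kernel."""
    return [max_pool(sigmoid_map(conv2d_valid(image, k)), pool_size)
            for k in kernels]
```

Because the pooling step keeps only the maximum within each window, moving an input feature by a pixel inside the same window leaves the pooled output unchanged, which is the small-shift invariance the abstract refers to.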
Biological constraints on connectionist modelling
- Connectionism in Perspective
, 1989
Cited by 56 (5 self)
Many researchers interested in connectionist models accept that such models are "neurally inspired" but do not worry too much about whether their models are biologically realistic. While such a position may be perfectly justifiable, the present paper attempts to illustrate how biological information can be used to constrain connectionist models. Two particular areas are discussed. The first section deals with visual information processing in the primate and human visual system. It is argued that the speed with which visual information is processed imposes major constraints on the architecture and operation of the visual system. In particular, it seems that a great deal of processing must depend on a single bottom-up pass. The second section deals with biological aspects of learning algorithms. It is argued that although there is good evidence for certain coactivation-related synaptic modification schemes, other learning mechanisms, including back-propagation, are not currently supported by experimental data.
Computational Modeling of Spatial Attention
, 1996
Cited by 38 (1 self)
This book chapter examines the role of spatial attention from a computational perspective. It is intended as an overview for cognitive scientists interested in computational modeling of attentional phenomena. Because the function of attention can be understood only in its relation to visual information processing, we model not only the attentional system itself, but also the process of object recognition. We begin by presenting a basic model of object recognition, showing that interference prevents the system from reliably processing multiple, complex stimuli, and then we show how a simple mechanism of attentional selection can reduce this interference. Our first goal is to present a model that is computationally adequate, that is, a model that has the computational power to perform the sort of visual information processing tasks that people do. We then turn to simulations showing that the model can account for diverse experimental data, including: the benefit of attentional precuing, the time course of attention shifts, the effect of spatial uncertainty, the effect of irrelevant stimuli, the relation of object-based and location-based selection, and visual search. We conclude with a discussion of basic questions about computational modeling, including: Why build computational models? What makes a model compelling? When is a model right or wrong? Should one opt for depth or breadth in model coverage?
A Coarse-to-Fine Strategy for Multi-Class Shape Detection
, 2004
Cited by 30 (8 self)
Multi-class shape detection, in the sense of recognizing and localizing instances from multiple shape classes, is formulated as a two-step process in which local indexing primes global interpretation. During indexing a list of instantiations (shape identities and poses) is compiled constrained only by no missed detections at the expense of false positives. Global information, such as expected relationships among poses, is incorporated afterward to remove ambiguities. This division is motivated by computational efficiency. In addition, indexing itself is organized as a coarse-to-fine search simultaneously in class and pose. This search can be interpreted as successive approximations to likelihood ratio tests arising from a simple (“naive Bayes”) statistical model for the edge maps extracted from the original images. The key to constructing efficient “hypothesis tests” for multiple classes and poses is local OR’ing; in particular, spread edges provide imprecise but common and locally invariant features. Natural tradeoffs then emerge between discrimination and the pattern of spreading. These are analyzed mathematically within the model-based framework and the whole procedure is illustrated by experiments in reading license plates.
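The "local OR'ing" that produces spread edges can be pictured with a small sketch: each output pixel is set if any edge is present in a neighbourhood around it. The square neighbourhood, the `radius` parameter, and the name `spread_edges` are assumptions made for this illustration; the abstract does not specify the shape or pattern of spreading used in the paper.

```python
def spread_edges(edge_map, radius=1):
    """Return a binary map where each pixel is the OR of its neighbourhood.

    edge_map is a list of rows of 0/1 values. The neighbourhood is a
    (2*radius + 1) square, clipped at the image border (an assumption).
    """
    height, width = len(edge_map), len(edge_map[0])
    return [[int(any(edge_map[a][b]
                     for a in range(max(0, i - radius), min(height, i + radius + 1))
                     for b in range(max(0, j - radius), min(width, j + radius + 1))))
             for j in range(width)]
            for i in range(height)]
```

A larger `radius` makes the feature fire over a wider range of poses but also makes it less discriminating, which is the kind of tradeoff between discrimination and spreading that the abstract describes.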
An Analog Neural Network Processor with Programmable Topology
- IEEE JOURNAL OF SOLID-STATE CIRCUITS
, 1991
Cited by 18 (7 self)
The architecture, implementation, and applications of a special-purpose neural network processor are described. The chip performs over 2000 multiplications and additions simultaneously. Its datapath is particularly suitable for the convolutional topologies that are typical in classification networks, but can also be configured for fully connected or feedback topologies. Resources can be multiplexed to permit implementation of networks with several hundreds of thousands of connections on a single chip. Computations are performed with 6 bits of accuracy for the weights and 3 bits for the neuron states. Analog processing is used internally for reduced power dissipation and higher density, but all input/output is digital to simplify system integration. The practicality of the chip is demonstrated with an implementation of a neural network for optical character recognition. This network contains over 130,000 connections and is evaluated in 1 ms.
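To give a feel for what 6-bit weights and 3-bit neuron states mean numerically, the sketch below simulates a weighted sum at those resolutions in software. It assumes uniform quantization, weights in [-1, 1], and states in [0, 1]; none of these choices, nor the names `quantize` and `quantized_dot`, come from the abstract, and the chip's actual number encoding may differ.

```python
def quantize(value, bits, lo=-1.0, hi=1.0):
    """Round value to the nearest of 2**bits evenly spaced levels in [lo, hi]."""
    steps = (1 << bits) - 1          # number of intervals between levels
    step = (hi - lo) / steps
    clipped = min(max(value, lo), hi)
    return lo + round((clipped - lo) / step) * step


def quantized_dot(weights, states, weight_bits=6, state_bits=3):
    """Weighted sum with weights and states reduced to the given resolutions."""
    return sum(quantize(w, weight_bits) * quantize(s, state_bits, 0.0, 1.0)
               for w, s in zip(weights, states))
```

Under these assumptions a 3-bit state can take only 8 distinct values and a 6-bit weight 64, so each weight is off by at most half a quantization step.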

