Results 1 – 6 of 6
Gradient-based learning applied to document recognition
Proceedings of the IEEE, 1998
"... Multilayer neural networks trained with the backpropagation algorithm constitute the best example of a successful gradientbased learning technique. Given an appropriate network architecture, gradientbased learning algorithms can be used to synthesize a complex decision surface that can classify hi ..."
Abstract

Cited by 1457 (84 self)
Multilayer neural networks trained with the backpropagation algorithm constitute the best example of a successful gradient-based learning technique. Given an appropriate network architecture, gradient-based learning algorithms can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters, with minimal preprocessing. This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task. Convolutional neural networks, which are specifically designed to deal with the variability of two-dimensional (2D) shapes, are shown to outperform all other techniques. Real-life document recognition systems are composed of multiple modules including field extraction, segmentation, recognition, and language modeling. A new learning paradigm, called graph transformer networks (GTNs), allows such multi-module systems to be trained globally using gradient-based methods so as to minimize an overall performance measure. Two systems for online handwriting recognition are described. Experiments demonstrate the advantage of global training, and the flexibility of graph transformer networks. A graph transformer network for reading a bank check is also described. It uses convolutional neural network character recognizers combined with global training techniques to provide record accuracy on business and personal checks. It is deployed commercially and reads several million checks per day.
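As a rough illustration of the gradient-based learning this abstract describes, here is a minimal backpropagation loop for a one-hidden-layer network on a synthetic two-class task. This is a toy sketch under stated assumptions (random toy data, a single hidden layer), not the paper's convolutional LeNet architecture or its digit benchmark:

```python
import numpy as np

# Toy sketch of gradient-based learning with backpropagation:
# a one-hidden-layer network on synthetic, linearly separable data.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))                       # toy "patterns"
y = (X[:, 0] + X[:, 1] > 0).astype(float)[:, None]  # toy labels

W1 = rng.normal(scale=0.5, size=(4, 8))  # input -> hidden weights
W2 = rng.normal(scale=0.5, size=(8, 1))  # hidden -> output weights

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for _ in range(500):
    # forward pass: synthesize the decision surface
    h = np.tanh(X @ W1)
    p = sigmoid(h @ W2)
    # backward pass: gradients of the mean cross-entropy loss
    dlogits = (p - y) / len(X)
    dW2 = h.T @ dlogits
    dh = (dlogits @ W2.T) * (1.0 - h ** 2)
    dW1 = X.T @ dh
    # gradient descent step
    W2 -= lr * dW2
    W1 -= lr * dW1

acc = ((sigmoid(np.tanh(X @ W1) @ W2) > 0.5) == y).mean()
```

The same forward/backward pattern extends to the convolutional and globally trained multi-module systems the abstract mentions; only the architecture and the overall performance measure change.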
A tutorial on energy-based learning
Predicting Structured Data, 2006
"... EnergyBased Models (EBMs) capture dependencies between variables by associating a scalar energy to each configuration of the variables. Inference consists in clamping the value of observed variables and finding configurations of the remaining variables that minimize the energy. Learning consists in ..."
Abstract

Cited by 55 (6 self)
Energy-Based Models (EBMs) capture dependencies between variables by associating a scalar energy to each configuration of the variables. Inference consists in clamping the value of observed variables and finding configurations of the remaining variables that minimize the energy. Learning consists in finding an energy function in which observed configurations of the variables are given lower energies than unobserved ones. The EBM approach provides a common theoretical framework for many learning models, including traditional discriminative and generative approaches, as well as graph transformer networks, conditional random fields, maximum margin Markov networks, and several manifold learning methods. Probabilistic models must be properly normalized, which sometimes requires evaluating intractable integrals over the space of all possible variable configurations. Since EBMs have no requirement for proper normalization, this problem is naturally circumvented. EBMs can be viewed as a form of non-probabilistic factor graphs, and they provide considerably more flexibility in the design of architectures and training criteria than probabilistic approaches.
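The inference procedure this abstract describes (clamp the observed variables, then search the remaining variables for the minimum-energy configuration) can be sketched for a toy discrete model. The energy function and variable names below are purely illustrative, not taken from the tutorial:

```python
import itertools

# Toy EBM: one observed variable x, two discrete variables y1, y2
# to be inferred. The hand-picked energy is low when y1 matches the
# sign of x and y2 agrees with y1.
def energy(x, y1, y2):
    target = 1 if x > 0 else -1
    return (y1 - target) ** 2 + (y1 - y2) ** 2

def infer(x, domain=(-1, 1)):
    # Inference: clamp x, then exhaustively search the remaining
    # variables for the configuration with minimum energy. No
    # normalization over all configurations is ever needed.
    return min(itertools.product(domain, domain),
               key=lambda ys: energy(x, *ys))

best = infer(2.5)  # x = 2.5 is observed/clamped
```

Note that `infer` only compares energies; unlike a probabilistic model, nothing here has to sum to one, which is exactly the normalization requirement the abstract says EBMs circumvent.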
Loss Functions for Discriminative Training of Energy-Based Models
In Proc. of the 10th International Workshop on Artificial Intelligence and Statistics (AIStats'05), 2005
"... ..."
Factor graphs for relational regression, 2007
"... would not have been possible without them. My mother is one person who has sacrificed the most during the past 5 years, being alone in India. She has always selflessly supported me and encouraged me in whatever I intended to endeavor. She always managed to provide me with the best possible resources ..."
Abstract

Cited by 2 (1 self)
would not have been possible without them. My mother is one person who has sacrificed the most during the past 5 years, being alone in India. She has always selflessly supported me and encouraged me in whatever I intended to endeavor. She always managed to provide me with the best possible resources, even if at times it involved budgeting herself. My father, my best friend, my mentor, and my inspiration. He was always there to guide me and encourage me at every step of this journey. Whatever I am today is only because of him. He would have been the happiest person on this earth to see the first page of this dissertation. I miss you dad. It is hard to express in words how thankful I am to my advisor Prof. Yann LeCun, whose guidance and support has made this journey truly magnificent. He has not only been my advisor but also a friend who has taught me so many things beyond research. I will never forget the discussions we had while at Disney Land in Los Angeles, and over countless lunch/dinner meetings. I would also like to especially thank Prof. Foster Provost whose diligent comments and suggestions
Loss Functions for Discriminative Training of Energy-Based Models, 2005
"... Probabilistic graphical models associate a probability to each configuration of the relevant variables. ..."
Abstract
Probabilistic graphical models associate a probability to each configuration of the relevant variables.
Loss Functions for Discriminative Training of Energy-Based Models
"... Probabilistic graphical models associate a probability to each configuration of the relevant variables. Energybased models (EBM) associate an energy to those configurations, eliminating the need for proper normalization of probability distributions. Making a decision (an inference) with an EBM cons ..."
Abstract
Probabilistic graphical models associate a probability to each configuration of the relevant variables. Energy-based models (EBMs) associate an energy to those configurations, eliminating the need for proper normalization of probability distributions. Making a decision (an inference) with an EBM consists in comparing the energies associated with various configurations of the variable to be predicted, and choosing the one with the smallest energy. Such systems must be trained discriminatively to associate low energies to the desired configurations and higher energies to undesired configurations. A wide variety of loss functions can be used for this purpose. We give sufficient conditions that a loss function should satisfy so that its minimization will cause the system to approach the desired behavior. We give many specific examples of suitable loss functions, and show an application to object recognition in images.
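The requirement this abstract states — low energy for desired configurations, higher energy for undesired ones — is often enforced with a margin-style loss. Below is a generic hinge-style sketch of that idea, not a specific loss from the paper; the function name and margin value are illustrative assumptions:

```python
# Hinge-style loss for discriminative EBM training (illustrative).
# e_correct: energy of the desired configuration.
# e_wrong: energies of the undesired (incorrect) configurations.
def hinge_loss(e_correct, e_wrong, margin=1.0):
    # Zero only when the desired configuration's energy sits below
    # the most offending incorrect one's by at least the margin;
    # otherwise, minimizing the loss pushes the energies apart.
    return max(0.0, margin + e_correct - min(e_wrong))

loss_good = hinge_loss(0.2, [1.5, 2.0])  # margin satisfied
loss_bad = hinge_loss(1.4, [1.5, 2.0])   # margin violated
```

A loss of this shape meets the kind of sufficient condition the abstract alludes to: it can only reach its minimum when the correct answer's energy is strictly below the incorrect answers' energies.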