Results 1-10 of 106
Gradient-based learning applied to document recognition
- Proceedings of the IEEE
, 1998
Cited by 487 (38 self)
Multilayer neural networks trained with the back-propagation algorithm constitute the best example of a successful gradient-based learning technique. Given an appropriate network architecture, gradient-based learning algorithms can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters, with minimal preprocessing. This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task. Convolutional neural networks, which are specifically designed to deal with the variability of two-dimensional (2-D) shapes, are shown to outperform all other techniques. Real-life document recognition systems are composed of multiple modules including field extraction, segmentation, recognition, and language modeling. A new learning paradigm, called graph transformer networks (GTNs), allows such multimodule systems to be trained globally using gradient-based methods so as to minimize an overall performance measure. Two systems for online handwriting recognition are described. Experiments demonstrate the advantage of global training, and the flexibility of graph transformer networks. A graph transformer network for reading a bank check is also described. It uses convolutional neural network character recognizers combined with global training techniques to provide record accuracy on business and personal checks. It is deployed commercially and reads several million checks per day.
A survey of methods and strategies in character segmentation
- IEEE TRANSACTION ON PAMI
, 1996
Cited by 101 (1 self)
Character segmentation has long been a critical area of the OCR process. The higher recognition rates for isolated characters vs. those obtained for words and connected character strings well illustrate this fact. A good part of recent progress in reading unconstrained printed and written text may be ascribed to more insightful handling of segmentation. This paper provides a review of these advances. The aim is to provide an appreciation for the range of techniques that have been developed, rather than to simply list sources. Segmentation methods are listed under four main headings. What may be termed the "classical" approach consists of methods that partition the input image into subimages, which are then classified. The operation of attempting to decompose the image into classifiable units is called "dissection". The second class of methods avoids dissection, and segments the image either explicitly, by classification of prespecified windows, or implicitly by classification of subsets of spatial features collected from the image as a whole. The third strategy is a hybrid of the first two, employing dissection together with recombination rules to define potential segments, but using classification to select from the range of admissible segmentation possibilities offered by these subimages. Finally, holistic approaches that avoid segmentation by recognizing entire character strings as units are described.
On the Learnability and Usage of Acyclic Probabilistic Finite Automata
- JOURNAL OF COMPUTER AND SYSTEM SCIENCES
, 1995
Cited by 59 (3 self)
We propose and analyze a distribution learning algorithm for a subclass of Acyclic Probabilistic Finite Automata (APFA). This subclass is characterized by a certain distinguishability property of the automata's states. Though hardness results are known for learning distributions generated by general APFAs, we prove that our algorithm can efficiently learn distributions generated by the subclass of APFAs we consider. In particular, we show that the KL-divergence between the distribution generated by the target source and the distribution generated by our hypothesis can be made arbitrarily small with high confidence in polynomial time. We present two applications of our algorithm. In the first, we show how to model cursively written letters. The resulting models are part of a complete cursive handwriting recognition system. In the second application we demonstrate how APFAs can be used to build multiple-pronunciation models for spoken words. We evaluate the APFA-based pronunciation models...
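The KL-divergence mentioned in this abstract measures how far a hypothesis distribution is from a target distribution. A minimal sketch of that quantity for discrete distributions follows; the function name `kl_divergence` and the toy distributions over strings are illustrative assumptions, not taken from the paper.

```python
import math


def kl_divergence(p, q):
    """Return D_KL(p || q) in nats for two discrete distributions.

    Both arguments are dicts mapping outcomes to probabilities.
    (Hypothetical helper for illustration only.)
    """
    total = 0.0
    for outcome, p_x in p.items():
        if p_x == 0.0:
            continue  # terms with p(x) = 0 contribute nothing
        q_x = q.get(outcome, 0.0)
        if q_x == 0.0:
            return math.inf  # q gives zero mass to an outcome p can produce
        total += p_x * math.log(p_x / q_x)
    return total


# Toy distributions over three strings (made-up numbers).
target = {"ab": 0.5, "ac": 0.3, "bc": 0.2}
hypothesis = {"ab": 0.45, "ac": 0.35, "bc": 0.2}

print(kl_divergence(target, hypothesis))
```

The divergence is zero only when the two distributions agree on every outcome, which is why driving it toward zero is a natural learning criterion.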
On-Line Cursive Script Recognition using Time Delay Neural Networks and Hidden Markov Models
Cited by 48 (2 self)
We present a writer-independent system for on-line handwriting recognition which can handle a variety of writing styles including cursive script and hand-print. The input to our system contains the pen trajectory information, encoded as a time-ordered sequence of feature vectors. A Time Delay Neural Network is used to estimate a posteriori probabilities for characters in a word. A Hidden Markov Model segments the word in a way which optimizes the global word score, taking a dictionary into account. A geometrical normalization scheme and a fast but efficient dictionary search are also presented. Trained on 20k words from 59 writers, using a 25k-word dictionary, we reached an 89% character and 80% word recognition rate on test data from a disjoint set of writers. Keywords: Handwriting Recognition, Neural Networks, Cursive Script, Hidden Markov Models, Dictionary Search. 1 Introduction Pen interfaces should replace advantageously both mouse and keyboard in a variety of situations. Users w...
Gestures without Libraries, Toolkits or Training: A $1 Recognizer for User Interface Prototypes
Cited by 43 (3 self)
Although mobile, tablet, large display, and tabletop computers increasingly present opportunities for using pen, finger, and wand gestures in user interfaces, implementing gesture recognition largely has been the privilege of pattern matching experts, not user interface prototypers. Although some user interface libraries and toolkits offer gesture recognizers, such infrastructure is often unavailable in design-oriented environments like Flash, scripting environments like JavaScript, or brand new off-desktop prototyping environments. To enable novice programmers to incorporate gestures into their UI prototypes, we present a "$1 recognizer" that is easy, cheap, and usable almost anywhere in about 100 lines of code. In a study comparing our $1 recognizer, Dynamic Time Warping, and the Rubine classifier on user-supplied gestures, we found that $1 obtains over 97% accuracy with only 1 loaded template and 99% accuracy with 3+ loaded templates. These results were nearly identical to DTW and superior to Rubine. In addition, we found that medium-speed gestures, in which users balanced speed and accuracy, were recognized better than slow or fast gestures for all three recognizers. We also discuss the effect that the number of templates or training examples has on recognition, the score falloff along recognizers' N-best lists, and results for individual gestures. We include detailed pseudocode of the $1 recognizer to aid development, inspection, extension, and testing. ACM Categories & Subject Descriptors: H5.2. [Information interfaces and presentation]: User interfaces – Input devices and strategies. I5.2. [Pattern recognition]: Design methodology – Classifier design and evaluation. I5.5. [Pattern recognition]: Implementation – Interactive systems.
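Dynamic Time Warping is one of the baselines named in this abstract. The sketch below is the textbook dynamic-programming formulation applied to 2-D stroke points, not the implementation used in the study; the function name `dtw_distance` and the sample strokes are assumptions made for illustration.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW cost between two point sequences.

    a and b are lists of (x, y) tuples. The local cost is the Euclidean
    distance between aligned points. (Hypothetical helper, illustration only.)
    """
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats a point
                cost[i][j - 1],      # b advances, a repeats a point
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Made-up strokes: the second repeats its first point, as a slower pen might.
stroke = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
slow_stroke = [(0.0, 0.0), (0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
other_stroke = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]

print(dtw_distance(stroke, slow_stroke))
print(dtw_distance(stroke, other_stroke))
```

Because the alignment may repeat points, two strokes that trace the same path at different speeds can still match closely, which is the property that makes DTW a common gesture-matching baseline.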
Mathematical Expression Recognition: A Survey
, 2000
Cited by 39 (2 self)
Automatic recognition of mathematical expressions is one of the key vehicles in the drive towards transcribing documents in scientific and engineering disciplines into electronic form. This problem typically consists of two major stages, namely, symbol recognition and structural analysis. In this survey paper, we will review most of the existing work with respect to each of the two major stages of the recognition process. In particular, we try to put emphasis on the similarities and differences between systems. Moreover, some important issues in mathematical expression recognition will be addressed in depth. All these together serve to provide a clear overall picture of how this research area has been developed to date. Key words: error detection and correction, mathematical expression recognition, performance evaluation, structural analysis, symbol recognition
On-Line Cursive Handwriting Recognition Using Speech Recognition Methods
, 1994
Cited by 35 (5 self)
A hidden Markov model (HMM) based continuous speech recognition system is applied to on-line cursive handwriting recognition. The base system is unmodified except for using handwriting feature vectors instead of speech. Due to inherent properties of HMMs, segmentation of the handwritten script sentences is unnecessary. A 1.1% word error rate is achieved for a 3050 word lexicon, 52 character, writer-dependent task and 3%-5% word error rates are obtained for six different writers in a 25,595 word lexicon, 86 character, writer-dependent task. Similarities and differences between the continuous speech and on-line cursive handwriting recognition tasks are explored; the handwriting database collected over the past year is described; and specific implementation details of the handwriting system are discussed. 1. INTRODUCTION Traditionally, the first step in handwriting recognition is the segmentation of words into component characters [1]. However, in modern continuous speech recognition ef...
Toward Interface Design for Human Language Technology: Modality and Structure as Determinants of Linguistic Complexity
, 1995
Cited by 34 (13 self)
Before next-generation human language technology can be designed to function successfully in actual field settings, interface techniques will be needed that can guide users' language to coincide with current system capabilities. The present study examines how input modality and presentation structure influence the linguistic complexity observed in people's spoken and written input to an interactive system. Using a semi-automatic simulation technique, language was collected during speech-only, writing-only, and combined pen/voice exchanges, and using presentation formats that either were structured or unconstrained. Results indicate that both modality and presentation format substantially influence linguistic complexity, although the specific nature of their impact differs. A comprehensive analysis is provided of how both factors affect people's observed language in terms of total words, disfluencies, utterance length, lexical variability, perplexity, syntactic ambiguity, and semanti...
SHARK2: A large vocabulary shorthand writing system for pen-based computers
- Proc. UIST 2004
, 2004
Cited by 34 (9 self)
Zhai and Kristensson (2003) presented a method of speedwriting for pen-based computing which utilizes gesturing on a stylus keyboard for familiar words and tapping for others. In SHARK 2, we eliminated the necessity to alternate between the two modes of writing, allowing any word in a large vocabulary (e.g. 10,000-20,000 words) to be entered as a shorthand gesture. This new paradigm supports a gradual and seamless transition from visually guided tracing to recall-based gesturing. Based on the use characteristics and human performance observations, we designed and implemented the architecture, algorithms and interfaces of a high-capacity multi-channel pen-gesture recognition system. The system’s key components and performance are also reported.
A Theory of Multiple Classifier Systems And Its Application to Visual Word Recognition
, 1992
Cited by 31 (8 self)
Despite the success of many pattern recognition systems in constrained domains, problems that involve noisy input and many classes remain difficult. A promising direction is to use several classifiers simultaneously, such that they can complement each other in correctness. This thesis is concerned with decision combination in a multiple classifier system that is critical to its success. A multiple classifier system consists of a set of classifiers and a decision combination function. It is a preferred solution to a complex recognition problem because it allows simultaneous use of feature descriptors of many types, corresponding measures of similarity, and many classification procedures. It also allows dynamic selection, so that classifiers adapted to inputs of a particular type may be applied only when those inputs are encountered. Decisions by the classifiers are represented as rankings of the class set that are derivable from the results of feature matching. Rank scores contain more ...
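The abstract describes classifier decisions as rankings of the class set that a combination function then merges. One well-known way to merge rankings is a Borda count, sketched below; this is an assumed illustration of the general idea, not necessarily the combination function studied in the thesis, and the name `borda_combine` and the sample rankings are hypothetical.

```python
def borda_combine(rankings):
    """Merge several rankings of the same class set with a Borda count.

    Each ranking is a list of class labels, best first. A label earns
    (n - 1 - position) points per ranking; ties break alphabetically.
    (Hypothetical helper for illustration only.)
    """
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for position, label in enumerate(ranking):
            scores[label] = scores.get(label, 0) + (n - 1 - position)
    return sorted(scores, key=lambda label: (-scores[label], label))


# Three made-up classifiers ranking three candidate words.
rankings = [
    ["cat", "cot", "cut"],
    ["cot", "cat", "cut"],
    ["cat", "cut", "cot"],
]

print(borda_combine(rankings))
```

Working with ranks rather than raw scores sidesteps the problem that different classifiers produce similarity measures on incomparable scales.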

