Degraded Text Recognition Using Visual And Linguistic Context
, 1995
Cited by 20 (2 self)
Recognition of degraded text is a challenging problem. To improve the performance of an OCR system on degraded images of text, postprocessing techniques are critical. The objective of postprocessing is to correct errors or to resolve ambiguities in OCR results by using contextual information. Depending on the extent of context used, there are different levels of postprocessing. In current commercial OCR systems, word-level postprocessing methods, such as dictionary-lookup, have been applied successfully. However, many OCR errors cannot be corrected by word-level postprocessing. To overcome this limitation, passage-level postprocessing, in which global contextual information is utilized, is necessary. In most current studies on passage-level postprocessing, linguistic context is the major resource to be exploited. This thesis addresses problems in degraded text recognition and discusses potential solutions through passage-level postprocessing. The objective is to develop a postprocessin...
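As a point of reference for the word-level postprocessing mentioned in this abstract, the following is a minimal sketch of dictionary lookup, not the thesis's own method. The lexicon, the function name `correct_word`, and the similarity cutoff are all hypothetical choices made for illustration. It also shows the limitation the abstract points to: a misrecognized word that happens to be in the lexicon passes through unchanged, which is where passage-level context would be needed.

```python
from difflib import get_close_matches

# Hypothetical toy lexicon; a real system would load a full dictionary.
LEXICON = ["recognition", "degraded", "context", "passage", "text"]


def correct_word(ocr_word, lexicon=LEXICON, cutoff=0.75):
    """Word-level postprocessing by dictionary lookup (illustrative only).

    Returns the word itself if it is in the lexicon, otherwise the most
    similar lexicon entry above `cutoff`, otherwise the input unchanged.
    """
    word = ocr_word.lower()
    if word in lexicon:
        return word
    matches = get_close_matches(word, lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else ocr_word


corrected = correct_word("recogniti0n")  # OCR confused "o" with "0"
unknown = correct_word("zzzz")           # nothing similar: left as-is
```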
Low Entropy Coding with Unsupervised Neural Networks
Cited by 17 (0 self)
...ed on visual and speech data. The ability of the network to automatically generate wavelet codes from natural images is demonstrated. These bear a close resemblance to 2-D Gabor functions, which have previously been used to describe physiological receptive fields, and as a means of producing compact image representations.
Keywords: neural networks, unsupervised learning, self-organisation, feature extraction, information theory, redundancy reduction, sparse coding, imaging models, occlusion, image coding, speech coding.
Declaration: This dissertation is the result of my own original work, except where reference is made to the work of others. No part of it has been submitted for any other university degree or diploma. Its length, including captions, footnotes, appendix and bibliography, is approximately 58000 words.
Acknowledgements: I would like first and foremost to thank Richard Prager, my supervisor, fo...
Category-Based Statistical Language Models
, 1997
Cited by 11 (2 self)
this document. The first section, in chapter 3, develops a model for syntactic dependencies based on word-category n-grams. The second section, in chapter 4, extends this model by allowing short-range word relations to be captured through the incorporation of selected word n-grams.
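For orientation, a word-category bigram is commonly factored as P(w_i | w_{i-1}) ≈ P(c_i | c_{i-1}) · P(w_i | c_i), where c_i is the category of word w_i. The sketch below estimates that standard factorization from counts. It assumes each word has a single known category, uses a hypothetical toy corpus treated as one sequence, and applies no smoothing; the thesis's actual model may differ on each of these points.

```python
from collections import Counter

# Hypothetical toy corpus of (word, category) pairs, treated as one sequence.
tagged = [("the", "DET"), ("cat", "NOUN"), ("sat", "VERB"),
          ("the", "DET"), ("dog", "NOUN"), ("ran", "VERB")]

cat_bigrams = Counter()  # counts of (previous category, category)
cat_counts = Counter()   # counts of each category
word_cat = Counter()     # counts of (word, category)

prev = None
for word, cat in tagged:
    cat_counts[cat] += 1
    word_cat[(word, cat)] += 1
    if prev is not None:
        cat_bigrams[(prev, cat)] += 1
    prev = cat


def prob(word, cat, prev_cat):
    """Unsmoothed estimate of P(cat | prev_cat) * P(word | cat)."""
    history = sum(n for (p, _), n in cat_bigrams.items() if p == prev_cat)
    if history == 0 or cat_counts[cat] == 0:
        return 0.0
    p_cat = cat_bigrams[(prev_cat, cat)] / history
    p_word = word_cat[(word, cat)] / cat_counts[cat]
    return p_cat * p_word
```

Because the history is a category and not a word, the model can score "dog" after "the" even if that exact word pair is rare, which is the usual motivation for category-based models.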
Forward-Backward Retraining of Recurrent Neural Networks
- in Advances in Neural Information Processing Systems 8
, 1996
Cited by 10 (0 self)
This paper describes the training of a recurrent neural network as the letter posterior probability estimator for a hidden Markov model, off-line handwriting recognition system. The network estimates posterior distributions for each of a series of frames representing sections of a handwritten word. The supervised training algorithm, backpropagation through time, requires target outputs to be provided for each frame. Three methods for deriving these targets are presented. A novel method based upon the forward-backward algorithm is found to result in the recognizer with the lowest error rate.
1 Introduction
In the field of off-line handwriting recognition, the goal is to read a handwritten document and produce a machine transcription. Such a system could be used for a variety of purposes, from cheque processing and postal sorting to personal correspondence reading for the blind or historical document reading. In a previous publication (Senior 1994) we have described a system based on a ...
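To illustrate how the forward-backward algorithm can yield soft per-frame targets, here is a minimal sketch under stated assumptions, not the paper's implementation. It assumes a left-to-right model with one state per letter of the known word, a single hypothetical self-loop probability `p_stay`, and per-frame likelihoods supplied in `obs_lik`. No scaling is applied, so it suits short sequences only.

```python
def forward_backward(obs_lik, p_stay=0.5):
    """Per-frame state posteriors for a left-to-right letter model.

    obs_lik[t][s] is the likelihood of frame t under state s, where the
    states are the letters of the word in order. The path must start in
    the first state and end in the last one.
    """
    T, S = len(obs_lik), len(obs_lik[0])
    p_next = 1.0 - p_stay
    alpha = [[0.0] * S for _ in range(T)]
    beta = [[0.0] * S for _ in range(T)]

    # Forward pass: probability of reaching state s at frame t.
    alpha[0][0] = obs_lik[0][0]
    for t in range(1, T):
        for s in range(S):
            a = alpha[t - 1][s] * p_stay
            if s > 0:
                a += alpha[t - 1][s - 1] * p_next
            alpha[t][s] = a * obs_lik[t][s]

    # Backward pass: probability of finishing in the last state.
    beta[T - 1][S - 1] = 1.0
    for t in range(T - 2, -1, -1):
        for s in range(S):
            b = beta[t + 1][s] * obs_lik[t + 1][s] * p_stay
            if s + 1 < S:
                b += beta[t + 1][s + 1] * obs_lik[t + 1][s + 1] * p_next
            beta[t][s] = b

    # Normalise alpha * beta per frame to obtain posteriors.
    gamma = []
    for t in range(T):
        row = [alpha[t][s] * beta[t][s] for s in range(S)]
        z = sum(row)
        gamma.append([x / z for x in row])
    return gamma


# Hypothetical likelihoods: four frames of a two-letter word.
targets = forward_backward([[0.9, 0.1], [0.6, 0.4], [0.3, 0.7], [0.1, 0.9]])
```

Each row of `targets` is a distribution over letters for one frame, so frames near a letter boundary receive mixed targets in place of a hard 0/1 label.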
A Survey on Off-Line Cursive Script Recognition
, 2002
Cited by 7 (0 self)
This paper presents a survey on off-line Cursive Word Recognition. The approaches to the problem are described in detail. Each step of the process leading from raw data to the final result is analyzed. This survey is divided into two parts, the first one dealing with the general aspects of Cursive Word Recognition, the second one focusing on the applications presented in the literature. © 2002 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved. Keywords: Survey; Off-line cursive word recognition; Handwriting recognition
Recent Achievements In Off-Line Handwriting Recognition Systems
- In Proceedings of the International Conference on Computational Intelligence and Multimedia Applications
, 1998
Cited by 4 (0 self)
This paper reviews the current state of the art in handwriting recognition research. The paper deals with issues such as hand-printed character and cursive handwritten word recognition. It describes recent achievements, difficulties, successes and challenges in all aspects of handwriting recognition. It also presents a new approach which dramatically improves current handwriting recognition systems. Some experimental results are included.
Offline Cursive Handwriting: From Word to Text Recognition
, 2003
Cited by 2 (0 self)
Contents
1 Introduction
2 State of the art
2.2 State of the Art
2.3 Structure of a CWR System
2.3.1 Normalization
2.3.2 The segmentation
2.3.3 Feature extraction
2.3.4 Lexicon reduction
2.3.5 The data
2.3.6 The recognition
2.3.7 Human Reading Inspired Systems
2.3.8 Holistic approaches
...
Turkish handwritten text recognition: A case of agglutinative languages
- Proc. SPIE
, 2003
Cited by 1 (1 self)
We describe a system for recognizing unconstrained Turkish handwritten text. Turkish has agglutinative morphology and, theoretically, an infinite number of words that can be generated by adding more suffixes to a word. This makes lexicon-based recognition approaches, where the most likely word is selected among all the alternatives in a lexicon, unsuitable for Turkish. We describe our approach to the problem using a Turkish prefix recognizer. First results of the system demonstrate the promise of this approach, with a top-10 word recognition rate of about 40% on a small test set of mixed handprint and cursive writing. The lexicon-based approach with a 17,000-word lexicon (with test words added) achieves a 56% top-10 word recognition rate.
Bestimmung von Datums- und Signumsbereichen auf der Basis eines CP-Relaxations-Modells [Determination of Date and Signature Regions Based on a CP Relaxation Model]
, 1996
... of a signature with respect to a reference sample. A good overview of the field of automatic signature verification can be found in, among others, [1] and [2]. In general, systems for identifying handwriting can be divided into those that receive their data directly from input devices (on-line) and those that perform the identification on data available in image format (off-line). Although the applications and methods of the two kinds of system differ, the general taxonomy for identifying signatures is similar. The investigations in this article concentrate on an off-line separation of date and signature, i.e. determining the date and signature regions of a given document. Character recognition, the obvious approach, turns out to be extremely difficult, since individual characters in the signature are practically undetectable. On the one hand ...
A Hybrid Large Vocabulary Handwritten Word Recognition System using
- In proceedings of IWFHR’2002
, 2002
In this paper we present a hybrid recognition system that integrates hidden Markov models (HMM) with neural networks (NN) in a probabilistic framework. The input data is processed first by a lexicon-driven word recognizer based on HMMs to generate a list of the candidate N-best scoring word hypotheses as well as the segmentation of such word hypotheses into characters. An NN classifier is used to generate a score for each segmented character and in the end, the scores from the HMM and the NN classifiers are combined to optimize performance. Experimental results show that for an 80,000-word vocabulary, the hybrid HMM/NN system improves by about 10% the word recognition rate over the HMM system alone.
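The abstract does not state how the HMM and NN scores are combined. As one plausible illustration only, the sketch below reranks a candidate list with a weighted sum of the HMM log score and the mean log probability the NN assigns to each segmented character. The function name `rerank`, the `weight` parameter, and the per-character averaging are assumptions, not the paper's rule.

```python
import math


def rerank(hypotheses, weight=0.5):
    """Rerank word hypotheses by a weighted HMM/NN log-score combination.

    hypotheses: list of (word, hmm_log_score, nn_char_probs), where
    nn_char_probs holds one NN probability per segmented character.
    `weight` is a hypothetical mixing parameter between 0 and 1.
    """
    scored = []
    for word, hmm_log, char_probs in hypotheses:
        # Average over characters so long words are not penalised.
        nn_log = sum(math.log(p) for p in char_probs) / len(char_probs)
        scored.append((weight * hmm_log + (1.0 - weight) * nn_log, word))
    scored.sort(reverse=True)
    return [word for _, word in scored]


# Hypothetical candidate list from a lexicon-driven HMM recognizer.
candidates = [
    ("clear", -4.0, [0.9, 0.2, 0.3, 0.4, 0.5]),
    ("dear", -4.2, [0.9, 0.9, 0.8, 0.9]),
]
best_first = rerank(candidates)
```

In this toy case the HMM alone prefers "clear", while the character-level NN evidence moves "dear" to the top once the two scores are mixed.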

