Offline Cursive Script Word Recognition -- a Survey
, 1999
Cited by 40 (3 self)
We review the field of offline cursive word recognition. We mainly deal with the various methods that were proposed to realize the core of recognition in a word recognition system. These methods are discussed in view of the two most important properties of such a system: the size and nature of the lexicon involved, and whether or not a segmentation stage is present. We classify the field into three categories: segmentation-free methods, which compare a sequence of observations derived from a word image with similar references of words in the lexicon; segmentation-based methods, that look for the best match between consecutive sequences of primitive segments and letters of a possible word; and the perception-oriented approach, that relates to methods that perform a human-like reading technique, in which anchor features found all over the word are used to bootstrap a few candidates for a final evaluation phase.
Coding and Comparison of DAGs as a Novel Neural Structure with Applications to On-Line Handwriting Recognition
Cited by 18 (4 self)
This paper applies Directed Acyclic Graphs (DAGs) to a large class of (temporal) pattern recognition problems and other recognition problems where the data has a linear ordering. The data streams are coded (DAG-coded) into DAGs for robust segmentation. The similarity of two streams can be manifested as the path matching score of the two corresponding DAGs. This paper also presents an efficient and robust dynamic programming algorithm for their comparisons (DAG-Compare). Since the DAG-Coding methodology directly provides a robust segmentation process, it can be applied recursively to create a novel system architecture. The DAG structure also allows adaptive restructuring, leading to a novel approach to neural information processing. By using these elementary operations on DAGs, we can recognize on average 94.0% (writer-dependent) of the isolated handwritten cursive characters. DAG-Coding may also be applied to speech recognition or any other continuous streams where a robust multi-path se...
A Robust, Language-Independent OCR System
Cited by 10 (0 self)
We present a language-independent optical character recognition (OCR) system that is capable, in principle, of recognizing printed text from most of the world's languages. For each new language or script the system requires sample training data along with ground truth at the text-line level; there is no need to specify the location of either the lines or the words and characters. The system uses hidden Markov modeling (HMM) technology to model each character. In addition to language independence, the technology enhances performance for degraded data, such as fax, by using unsupervised adaptation techniques. Thus far, we have demonstrated the language-independence of this approach for Arabic, English, and Chinese. Recognition results are presented in this paper, including results on faxed data.
Keywords: character recognition, OCR, speech recognition, Hidden Markov Models, Arabic OCR, Chinese OCR
1. INTRODUCTION The use of Hidden Markov Models (HMM) in developing continuous speech rec...
Length Estimation of Digit Strings Using Neural Networks with Structure Based Features
- SPIE/IS&T Journal of Electronic Imaging
, 1998
Cited by 4 (1 self)
Accurate length estimation is very helpful for the successful segmentation and recognition of connected digit strings, in particular, for an off-line recognition system. However, little work has been done in this area due to the difficulties involved. In this paper, a length estimation approach is presented as a part of an automatic off-line digit recognition system. The kernel of our approach is a neural network estimator with a set of structure-based features as the inputs. The system outputs are a set of fuzzy membership grades reflecting the degrees of an input digit string for having different lengths. Experimental results on NIST Special Database 3 and other derived digit strings show that our approach can achieve about 99.4% correct estimation if the best two estimations are considered.
Keywords: Length Estimation of Digit Strings, Connected Character Recognition, Structure-based Features, Neural Network Applications.
1 Introduction Many algorithms have been developed for off-...
HMM Based High Accuracy Off-Line Cursive Handwriting Recognition By A Baseline Detection Error Tolerant Feature Extraction Approach
- In Proc. Int. Workshop on Frontiers in Handwriting Recognition
, 2000
"... this paper we present an HMM based ..."
Automatic Recognition of Handwritten Dates on Brazilian Bank Cheques
, 2003
Cited by 1 (0 self)
In this thesis, an HMM-MLP hybrid system for segmenting and recognizing unconstrained handwritten dates written on Brazilian bank cheques is presented. The system evolves by dealing with many sources of variability, such as heterogeneous data types and styles, variations present in the date field, and difficult cases of segmentation that make the recognizer's task particularly hard. The system takes an HMM-based...
Off-Line Handwritten Word Recognition Using Hidden Markov Models
, 1999
Introduction Today, handwriting recognition is one of the most challenging tasks and exciting areas of research in computer science. Indeed, despite the growing interest in this field, no satisfactory solution is available. The difficulties encountered are numerous and include the huge variability of handwriting such as inter-writer and intra-writer variabilities, writing environment (pen, sheet, support, etc.), the overlap between characters, and the ambiguity that makes many characters unidentifiable without referring to context. Owing to these difficulties, many researchers have integrated the lexicon as a constraint to build lexicon-driven strategies to decrease the problem complexity. For small lexicons, as in bank-check processing, most approaches are global and consider a word as an indivisible entity [1] - [5]. If the lexicon is large, as in postal applications (city name or street name recognition) [6] - [10], one cannot consider a word as one entity, because of the huge num
Lexicon and hidden Markov model-based optimisation
, 2005
The Brahmi-descended Sinhala script is used by 75% of the 18 million population in Sri Lanka. To the best of our knowledge, none of the Brahmi-descended scripts used by hundreds of millions of people in South Asia has a commercial OCR product. In the process of implementing an OCR system for the printed Sinhala script which is easily adaptable to similar scripts [Premaratne, L., Assabie, Y., Bigun, J., 2004. Recognition of modification-based scripts using direction tensors. In: 4th Indian Conf. on Computer Vision, Graphics and Image Processing (ICVGIP2004), pp. 587-592], a segmentation-free recognition method using orientation features has been proposed in [Premaratne, H.L., Bigun, J., 2004. A segmentation-free approach to recognise printed Sinhala script using linear symmetry. Pattern Recognition 37, 2081-2089]. Due to the limitations in image analysis techniques, the character-level accuracy of the results directly produced by the proposed character recognition algorithm saturates at 94%. The false rejections from the recognition algorithm are initially identified only as 'missing character positions' or 'blank characters'. It is necessary to identify suitable substitutes for such 'missing character positions' and optimise the accuracy of words to an acceptable level. This paper proposes a novel method that explores the lexicon in association with hidden Markov models to improve the accuracy of the recognised script. The proposed method could easily be extended with minor changes to other modification-based scripts consisting of confusing characters. The word-level accuracy, which was 81.5%, is improved to 88.5% by the proposed optimisation algorithm.
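As a rough illustration of the lexicon side of this idea, the sketch below lists the lexicon words that are consistent with a recognized word containing blank positions. It is a minimal Python sketch, not the paper's algorithm: the function name `lexicon_candidates`, the `_` blank marker, and the toy English lexicon are all assumptions made for illustration, and the HMM-based ranking of candidates mentioned in the abstract is omitted.

```python
def lexicon_candidates(pattern, lexicon, blank="_"):
    """Return lexicon words that agree with `pattern` at every non-blank position.

    `pattern` is the recognizer output, with `blank` (a hypothetical marker)
    standing in for each 'missing character position'.
    """
    return [
        word
        for word in lexicon
        if len(word) == len(pattern)
        and all(p == blank or p == c for p, c in zip(pattern, word))
    ]


# Toy lexicon (assumption); a real system would load a Sinhala word list.
toy_lexicon = ["cat", "cot", "dog", "cart"]
candidates = lexicon_candidates("c_t", toy_lexicon)
```

When several candidates survive, as here, some further score is needed to pick one; that is the role the abstract assigns to the hidden Markov models.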
Improved Offline Connected Script Recognition Based on Hybrid Strategy
In the domain of analytic cursive word recognition, there are two main approaches: explicit segmentation based and implicit segmentation based. However, both approaches have their own shortcomings. To overcome individual weaknesses, this paper presents a hybrid strategy for recognition of strings of characters (words or numerals). In a two-stage, dynamic-programming-based, lexicon-driven approach, first an explicit segmentation is applied to segment either cursive handwritten words or numeric strings. However, at this stage, segmentation points are not finalized. In the second verification stage, statistical features are extracted from each segmented area to recognize characters using a trained neural network. To enhance segmentation and recognition accuracy, the lexicon is consulted using existing dynamic programming matching techniques. Accordingly, segmentation points are altered to decide true character boundaries by using lexicon feedback. A rigorous experimental protocol shows high performance of the proposed method for cursive handwritten words and numeral strings.
Keywords: explicit segmentation, implicit segmentation, hybrid strategy, dynamic programming, cursive character recognition.
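The abstract does not say which dynamic programming matching technique is used to consult the lexicon. One common choice, shown here purely as an assumed stand-in, is Levenshtein edit distance between the recognized character string and each lexicon entry. The function names and the toy lexicon are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b, computed row by row."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, 1):
        curr = [i]  # deleting the first i characters of a
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def best_lexicon_match(recognized, lexicon):
    """Return the lexicon word closest to the recognized string (first on ties)."""
    return min(lexicon, key=lambda word: edit_distance(recognized, word))


# Toy example (assumption): the recognizer misreads one character of "house".
match = best_lexicon_match("hause", ["house", "mouse", "horse"])
```

In a hybrid system of the kind described, the winning alignment could also indicate which tentative segmentation points to keep, but that feedback step is not sketched here.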
Extraction and Optimization of B-Spline PBD Templates for Recognition of Connected Handwritten Digit Strings
, 2002
Recognition of connected handwritten digit strings is a challenging task due mainly to two problems: poor character segmentation and unreliable isolated character recognition. In this paper, we first present a rational B-spline representation of digit templates based on Pixel-to-Boundary Distance (PBD) maps. We then present a neural network approach to extract B-spline PBD templates and an evolutionary algorithm to optimize these templates. In total, 1,000 templates (100 templates for each of 10 classes) were extracted from and optimized on 10,426 training samples from the NIST Special Database 3. By using these templates, a nearest neighbor classifier can successfully reject 90.7 percent of nondigit patterns while achieving a 96.4 percent correct classification of isolated test digits. When our classifier is applied to the recognition of 4,958 connected handwritten digit strings (4,555 2-digit, 355 3-digit, and 48 4-digit strings) from the NIST Special Database 3 with a dynamic programming approach, it has a correct classification rate of 82.4 percent with a rejection rate as low as 0.85 percent. Our classifier compares favorably in terms of correct classification rate and robustness with the other classifiers tested.
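A nearest neighbor classifier with a reject option can be sketched in a few lines. The version below is an assumption-laden illustration: it uses plain Euclidean distance on short feature vectors rather than the B-spline PBD templates of the paper, and the function name, the `reject_distance` threshold, and the two toy templates are all hypothetical.

```python
import math


def classify_with_reject(sample, templates, reject_distance):
    """Label `sample` by its nearest template, or return None to reject it.

    `templates` is a list of (label, feature_vector) pairs. A sample whose
    nearest template lies farther than `reject_distance` is treated as a
    non-digit and rejected.
    """
    best_label, best_dist = None, math.inf
    for label, vector in templates:
        dist = math.dist(sample, vector)  # Euclidean distance (Python 3.8+)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label if best_dist <= reject_distance else None


# Toy templates (assumption): one 2-D feature vector per class.
toy_templates = [("0", (0.0, 0.0)), ("1", (1.0, 1.0))]
accepted = classify_with_reject((0.1, 0.0), toy_templates, 0.5)
rejected = classify_with_reject((5.0, 5.0), toy_templates, 0.5)
```

The threshold trades rejection of non-digit patterns against wrongly rejecting genuine digits, which is the balance the reported rejection and classification rates describe.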

