Results 1 - 10 of 39
A Self-Correcting 100-Font Classifier
1994
Cited by 59 (35 self)
We have developed a practical scheme to take advantage of local typeface homogeneity to improve the accuracy of a character classifier. Given a polyfont classifier which is capable of recognizing any of 100 typefaces moderately well, our method allows it to specialize itself automatically to the single --- but otherwise unknown --- typeface it is reading. Essentially, the classifier retrains itself after examining some of the images, guided at first by the preset classification boundaries of the given classifier, and later by the behavior of the retrained classifier. Experimental trials on 6.4M pseudo-randomly distorted images show that the method improves on 95 of the 100 typefaces. It reduces the error rate by a factor of 2.5, averaged over 100 typefaces, when applied to an alphabet of 80 ASCII characters printed at ten point and digitized at 300 pixels/inch. This self-correcting method complements, and does not hinder, other methods for improving OCR accuracy, such as linguistic con...
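The retraining loop described in this abstract can be illustrated with a minimal sketch. It assumes a nearest-centroid classifier over numeric feature vectors, which is a stand-in chosen for brevity and not the classifier used in the paper; the names `classify` and `self_correct` and the `rounds` parameter are hypothetical.

```python
import math

def classify(centroids, x):
    """Return the index of the centroid closest to feature vector x."""
    return min(range(len(centroids)), key=lambda k: math.dist(centroids[k], x))

def self_correct(centroids, samples, rounds=2):
    """Re-estimate class centroids from the classifier's own decisions.

    Round 1 is guided by the preset (polyfont) centroids; later rounds
    are guided by the retrained centroids, as the abstract describes.
    """
    centroids = [list(c) for c in centroids]  # do not mutate the caller's data
    for _ in range(rounds):
        labels = [classify(centroids, x) for x in samples]
        for k in range(len(centroids)):
            members = [x for x, lab in zip(samples, labels) if lab == k]
            if members:  # keep the previous centroid if no sample was assigned
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

# Hypothetical data: preset centroids for two classes, and samples from one
# typeface whose shapes are shifted relative to the polyfont average.
preset = [[0.0, 0.0], [4.0, 0.0]]
samples = [[1.0, 0.2], [1.0, -0.2], [5.0, 0.2], [5.0, -0.2]]
adapted = self_correct(preset, samples)
```

In this toy data a borderline sample such as `[2.4, 0.0]` is assigned to class 1 by the preset centroids and to class 0 after adaptation, which is the kind of boundary shift that specialization to one typeface is meant to produce.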
Anatomy Of A Versatile Page Reader
Proceedings of the IEEE, 1992
Cited by 38 (5 self)
An experimental printed-page reader that is easy to adapt to various languages is described. Changing the target language may involve simultaneous changes in symbol sets, typefaces, sizes of text, page layouts, linguistic contexts, and imaging defects. Our strategy has been to isolate the effects of these sources of variation within separate, independent engineering subsystems. In this way, we have been able to construct, with a minimum of manual effort, classifiers for arbitrary combinations of symbols, typefaces, sizes, and imaging defects. We have tried to rid the algorithms of all language-specific rules, relying instead on automatic learning from examples and generalized table-driven methods. For some tasks we have been able to avoid language dependency altogether: for example, for geometric page layout analysis we have found a global-to-local strategy that requires no prior knowledge of the symbol set. We can exploit linguistic context, such as provided by dictionaries, through da...
A Theory of Multiple Classifier Systems And Its Application to Visual Word Recognition
1992
Cited by 31 (8 self)
Despite the success of many pattern recognition systems in constrained domains, problems that involve noisy input and many classes remain difficult. A promising direction is to use several classifiers simultaneously, such that they can complement each other in correctness. This thesis is concerned with decision combination in a multiple classifier system that is critical to its success. A multiple classifier system consists of a set of classifiers and a decision combination function. It is a preferred solution to a complex recognition problem because it allows simultaneous use of feature descriptors of many types, corresponding measures of similarity, and many classification procedures. It also allows dynamic selection, so that classifiers adapted to inputs of a particular type may be applied only when those inputs are encountered. Decisions by the classifiers are represented as rankings of the class set that are derivable from the results of feature matching. Rank scores contain more ...
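As an illustration of combining decisions expressed as rankings of the class set, the sketch below uses the Borda count. This is one well-known rank-combination rule, picked here as an assumption; the truncated abstract does not say which combination function the thesis adopts. The function name `borda_combine` is hypothetical.

```python
def borda_combine(rankings):
    """Combine several rankings of the same class set into one ranking.

    Each ranking lists classes best-first. A class earns (n - 1 - position)
    points from each classifier; ties are broken alphabetically.
    """
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for position, cls in enumerate(ranking):
            scores[cls] = scores.get(cls, 0) + (n - 1 - position)
    return sorted(scores, key=lambda c: (-scores[c], c))

# Three hypothetical classifiers ranking the word classes "a", "b", "c".
rankings = [["a", "b", "c"], ["b", "a", "c"], ["b", "c", "a"]]
combined = borda_combine(rankings)
```

Here "b" is ranked first overall even though one classifier preferred "a", because the combination uses every classifier's full ranking instead of only its top choice.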
A shape analysis model with applications to a character recognition system
IEEE Trans. Pattern Analysis and Machine Intelligence, 1994
Cited by 29 (1 self)
A method for the recognition of multifont printed characters is proposed, giving emphasis to the identification of structural descriptions of character shapes using prototypes. Noise and shape variations are modeled as series of transformations from groups of features in the data to features in each prototype. Thus, the method manages systematically the relative distortion between a candidate shape and its prototype, accomplishing robustness to noise with less than two prototypes per class, on average. Our method uses a flexible matching between components and a flexible grouping of the individual components to be matched. A number of shape transformations are defined, including filling of gaps, so that the method handles broken characters. Also, a measure of the amount of distortion that these transformations cause is given. Classification of character shapes is defined as a minimization problem among the possible transformations that map an input shape into prototypical shapes. Some tests with hand-printed numerals confirmed the method’s high robustness level. Index Terms: Shape distance, graph matching, relative neighborhood graph, broken character recognition, subgraph homeomorphism.
Optical Character Recognition and Parsing of Typeset Mathematics
Journal of Visual Communication and Image Representation, 1996
Cited by 26 (4 self)
There is a wealth of mathematical knowledge that could be potentially very useful in many computational applications, but is not available in electronic form. This knowledge comes in the form of mechanically typeset books and journals going back more than one hundred years. Besides these older sources, there are a great many current publications, filled with useful mathematical information, which are difficult if not impossible to obtain in electronic form. Our work intends to encode, for use by computer algebra systems, integral tables and other documents currently available in hardcopy only. Our strategy is to extract character information from these documents, which is then passed to higher-level parsing routines for further extraction of mathematical content (or any other useful two-dimensional semantic content). This information can then be output as, for example, a Lisp or TeX expression. We have also developed routines for rapid access to this information, specifically for fin...
Reading Cursive Handwriting by Alignment of Letter Prototypes
1990
Cited by 20 (0 self)
We describe a new approach to the visual recognition of cursive handwriting. An effort is made to attain humanlike performance by using a method based on pictorial alignment and on a model of the process of handwriting. The alignment approach permits recognition of character instances that appear embedded in connected strings. A system embodying this approach has been implemented and tested on five different word sets. The performance was stable both across words and across writers. The system exhibited a substantial ability to interpret cursive connected strings without recourse to lexical knowledge.
Degraded Text Recognition Using Visual And Linguistic Context
1995
Cited by 20 (2 self)
Recognition of degraded text is a challenging problem. To improve the performance of an OCR system on degraded images of text, postprocessing techniques are critical. The objective of postprocessing is to correct errors or to resolve ambiguities in OCR results by using contextual information. Depending on the extent of context used, there are different levels of postprocessing. In current commercial OCR systems, word-level postprocessing methods, such as dictionary-lookup, have been applied successfully. However, many OCR errors cannot be corrected by word-level postprocessing. To overcome this limitation, passage-level postprocessing, in which global contextual information is utilized, is necessary. In most current studies on passage-level postprocessing, linguistic context is the major resource to be exploited. This thesis addresses problems in degraded text recognition and discusses potential solutions through passage-level postprocessing. The objective is to develop a postprocessin...
Letter Spirit: An Emergent Model of the Perception and Creation of Alphabetic Style
- Center for
, 1993
Cited by 9 (2 self)
The Letter Spirit project is an attempt to model central aspects of human high-level perception and creativity on a computer, focusing on the creative act of artistic letter-design. The aim is to model the process of rendering the 26 lowercase letters of the roman alphabet in many different, internally coherent styles. Two important and orthogonal aspects of letterforms are basic to the project: the categorical sameness possessed by instances of a single letter in various styles (e.g., the letter 'a' in Baskerville, Palatino, and Helvetica) and the stylistic sameness possessed by instances of various letters in a single style (e.g., the letters 'a', 'b', and 'c' in Baskerville). Starting with one or more seed letters representing the beginnings of a style, the program will attempt to create the rest of the alphabet in such a way that all 26 letters share the same style, or spirit. Letters in the domain are formed exclusively from straight segments on a grid in order to make decisions ...
Integrating Knowledge Sources in Devanagari Text Recognition
1999
Cited by 9 (0 self)
The reading process has been widely studied, and there is general agreement among researchers that knowledge in different forms and at different levels plays a vital role. The same philosophy underlies the Devanagari document recognition system described in this work. We have identified various relevant knowledge sources, which have been integrated using a blackboard model. Some of the knowledge sources are acquired a priori by an automated training process. The efficacy of each of these knowledge sources depends on the coverage of the sample space, the training algorithm, and the nature of the knowledge source itself. Other knowledge sources are built from knowledge extracted from the text as it is processed. These knowledge sources are transient in nature and are meaningful only in the domain of the text under consideration. The initial segmentation of a text zone into text lines is based on the image profile. However, the initial segmentation leaves overlapping text lines unsegmented. The height information of the text lines obtained after initial segmentation is statistically analyzed. The most frequent line height becomes the threshold line height for the text zone under consideration. The threshold line height is used for detecting overlapping text lines. This knowledge also provides a clue to the possible segmentation points for these lines. The structural properties of Devanagari script, namely the header line and the three horizontal strips of a word that result from the two-dimensional composition of the script, are exploited by the segmentation process at the word level as well as at the character level.
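The line-height analysis in this abstract can be sketched as follows. The sketch assumes line heights are given in pixels as integers; the `tolerance` factor and the estimate of how many lines are merged are illustrative assumptions, not details taken from the paper, and the function name `flag_overlapping_lines` is hypothetical.

```python
from collections import Counter

def flag_overlapping_lines(heights, tolerance=1.5):
    """Flag text lines that are much taller than the most frequent line height.

    Returns the threshold height (the mode of `heights`) and a list of
    (line_index, estimated_number_of_merged_lines) pairs.
    """
    threshold = Counter(heights).most_common(1)[0][0]
    flagged = []
    for index, height in enumerate(heights):
        if height > tolerance * threshold:
            # Assumption: a merged block is roughly a whole multiple of the
            # threshold height, which hints at where to split it.
            flagged.append((index, round(height / threshold)))
    return threshold, flagged

# Hypothetical heights (in pixels) from an initial profile-based segmentation.
heights = [30, 31, 30, 62, 30, 29, 91]
threshold, flagged = flag_overlapping_lines(heights)
```

With these heights the threshold is 30, and the lines at indices 3 and 6 are flagged as probable merges of two and three text lines respectively.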
G.: Font adaptive word indexing of modern printed documents
IEEE Transactions on PAMI, 2006
Cited by 7 (6 self)
We propose an approach for the word-level indexing of modern printed documents which are difficult to recognize using current OCR engines. By means of word-level indexing, it is possible to retrieve the position of words in a document, enabling queries involving proximity of terms. Web search engines implement this kind of indexing, allowing users to retrieve Web pages on the basis of their textual content. Nowadays, digital libraries hold collections of digitized documents that can be retrieved either by browsing the document images or relying on appropriate metadata assembled by domain experts. Word indexing tools would therefore increase the access to these collections. The proposed system is designed to index homogeneous document collections by automatically adapting to different languages and font styles without relying on OCR engines for character recognition. The approach is based on three main ideas: the use of Self Organizing Maps (SOM) to perform unsupervised character clustering, the definition of one suitable vector-based word representation whose size depends on the word aspect-ratio, and the run-time alignment of the query word with indexed words to deal with broken and touching characters. The most appropriate applications are for processing modern printed documents (17th to 19th centuries) where current OCR engines are less accurate. Our experimental analysis addresses six data sets containing documents ranging from books of the 17th century to contemporary journals. Index Terms: Clustering, digital libraries, document image retrieval, heuristic oversegmentation, holistic word representation, modern documents, self organizing map.

