Geometric layout analysis techniques for document image understanding: a review
, 1998
Cited by 37 (0 self)
Document Image Understanding (DIU) is an interesting research area with a large variety of challenging applications. Researchers have worked on this topic for decades, as witnessed by the scientific literature. The main purpose of the present report is to describe the current status of DIU, with particular attention to two subprocesses: document skew angle estimation and page decomposition. Several algorithms proposed in the literature are briefly described and organized in a novel classification scheme. Some methods proposed for the evaluation of page decomposition algorithms are described. The current status of the field and its open problems are discussed critically. Some considerations about logical layout analysis are also reported.
Online Handwriting Recognition Using Multiple Pattern Class Models
, 2000
Cited by 10 (1 self)
The field of personal computing has begun to make a transition from the desktop to handheld devices, which requires input paradigms better suited to single-hand entry than a keyboard. Recent developments in online handwriting recognition allow for such input modalities. Data entry using a pen forms a natural, convenient interface. The large number of writing styles and the variability between them make writer-independent unconstrained handwriting recognition a very challenging pattern recognition problem. The state of the art in online handwriting recognition is such that it has found practical success only in very constrained problems. In this thesis, a method of identifying different writing styles, referred to as lexemes, is described. Approaches for constructing both non-parametric and parametric classifiers are described that take advantage of the identified lexemes to f...
Projection profile based skew estimation algorithm for JBIG compressed images
, 1998
Cited by 9 (0 self)
A new projection profile based skew estimation algorithm is presented. It extracts fiducial points corresponding to objects on a page by decoding a JBIG compressed image. These points are projected along parallel lines into an accumulator array. The angle of projection within a search interval that maximizes alignment of the fiducial points is the skew angle. This algorithm and three other algorithms were tested. Results showed that the new algorithm performed comparably to the other algorithms. The JBIG progressive coding scheme reduces the effects of noise and graphics, and the accuracy of the new algorithm on 75 dpi unfiltered images and 300 dpi filtered images was similar.
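The projection step described in this abstract can be illustrated with a small sketch. It assumes the fiducial points are already available as (x, y) pairs (the JBIG decoding is not shown), uses a y-up coordinate system, and scores alignment by the sum of squared accumulator counts. That scoring criterion, the function name `estimate_skew`, and all parameters are assumptions made for illustration, not details taken from the paper.

```python
import math

def estimate_skew(points, search_deg=5.0, step_deg=0.1, bin_size=1.0):
    """Return the angle in degrees, within [-search_deg, +search_deg],
    whose projection profile is most sharply peaked.

    points: iterable of (x, y) fiducial points (assumed given).
    The sum-of-squares score is an assumed alignment criterion.
    """
    points = list(points)
    best_angle, best_score = 0.0, -1.0
    steps = int(round(2 * search_deg / step_deg))
    for i in range(steps + 1):
        angle = -search_deg + i * step_deg
        theta = math.radians(angle)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        profile = {}  # accumulator array, stored sparsely
        for x, y in points:
            # Offset of the point perpendicular to the candidate line direction.
            r = y * cos_t - x * sin_t
            b = math.floor(r / bin_size)
            profile[b] = profile.get(b, 0) + 1
        # Well-aligned points pile into few bins, which raises this score.
        score = sum(c * c for c in profile.values())
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle

# Synthetic page: five text lines, each skewed by 2 degrees.
slope = math.tan(math.radians(2.0))
pts = [(x, 20.5 + 40 * k + x * slope)
       for k in range(5) for x in range(0, 501, 10)]
print(round(estimate_skew(pts), 1))
```

An exhaustive search like this costs one pass over the points per candidate angle, so the number of fiducial points and the angular step together determine the running time.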
Recognition Of Cursive Writing On Personal Checks
, 1996
Cited by 9 (4 self)
This paper applies Hidden Markov technology both to the task of recognizing the cursive legal amount on personal checks and to the isolated (numeric) courtesy amount. Throughout the paper, our primary goal is to present methods that will allow the engineer to gain maximum leverage from a limited amount of training data.
Lossless Document Image Compression
, 1999
Cited by 8 (1 self)
Document image compression reduces the storage requirements for digitised books or documents by using characters as the fundamental unit of compression. Compression gains can be achieved by identifying regions that contain text, isolating unique characters, and storing them in a codebook. This thesis investigates several fundamental areas of the compression process. Algorithms for each area are tested on a corpus of images, and the improvements are tested for statistical significance. Methods for isolating characters from a bitmap are investigated along with techniques for determining reading order. We introduce the use of the docstrum to aid image compression and show that it improves upon previous methods. The Hough transform is shown to be an accurate method for determining page skew and gives robust results over a range of image resolutions. Compression is shown to improve when the skew of an image is determined automatically and used to determine reading order. If images can be segm...
Document Image Compression and Analysis
- PhD thesis, University of Maryland
, 1997
Cited by 8 (1 self)
Image compression usually considers the minimization of storage space as its main objective. It is desirable, however, to code images so that we have the ability to process the resulting representation directly. In this thesis we explore an approach to document image compression that is efficient in both space (storage requirement) and time (processing flexibility). A representation is presented in which component-level redundancy is removed by forming a prototype library and component location table. This representation forms a basis for compression and provides direct access to image components. To generate the prototype library, a new clustering approach is developed which is suitable for document image components. The distance metric is based on a character degradation model so that degraded versions of the same character will be grouped together. To achieve a lossless representation when required, the residuals are encoded efficiently using a structural distance ordering. OCR is...
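The prototype library and component location table mentioned in this abstract can be pictured with a toy sketch. Everything here is an assumption for illustration: components are equal-sized binary bitmaps, a plain Hamming distance stands in for the degradation-model metric, clustering is a greedy first-match pass, and the residuals needed for a lossless representation are not stored.

```python
def hamming(a, b):
    """Number of differing pixels between two equal-sized bitmaps."""
    return sum(p != q for row_a, row_b in zip(a, b) for p, q in zip(row_a, row_b))

def same_shape(a, b):
    return len(a) == len(b) and len(a[0]) == len(b[0])

def build_library(components, threshold=1):
    """Greedy clustering sketch (hypothetical helper, not the thesis method).

    components: list of (x, y, bitmap), bitmap being a tuple of row tuples.
    Returns (prototypes, locations), where locations holds one
    (prototype_index, x, y) entry per component.
    """
    prototypes = []
    locations = []
    for x, y, bitmap in components:
        match = None
        for idx, proto in enumerate(prototypes):
            if same_shape(proto, bitmap) and hamming(proto, bitmap) <= threshold:
                match = idx
                break
        if match is None:
            # No close prototype yet: this component starts a new cluster.
            prototypes.append(bitmap)
            match = len(prototypes) - 1
        locations.append((match, x, y))
    return prototypes, locations

glyph_a = ((1, 1), (1, 0))
glyph_a_noisy = ((1, 1), (1, 1))   # one pixel away from glyph_a
glyph_b = ((0, 0), (0, 1))
protos, table = build_library([(0, 0, glyph_a), (5, 0, glyph_b), (10, 0, glyph_a_noisy)])
print(len(protos), table)
```

Because the location table refers to components by prototype index, a consumer can look up every occurrence of a given shape without decoding the whole page, which is the kind of direct access the abstract describes.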
Document Skew Estimation Without Angle Range Restriction
, 1999
Cited by 7 (1 self)
Existing skew estimation techniques usually assume that the input image is of high resolution and that the detectable angle range is limited. We present a more generic solution for this task that overcomes these restrictions. Our method is based on determination of the first eigenvector of the data covariance matrix. The solution comprises image resolution reduction, connected component analysis, component classification using a fuzzy approach, and skew estimation. Experiments on a large set of various document images and a performance comparison with two Hough transform-based methods show good accuracy and robustness for our method.
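The eigenvector step alone can be sketched as follows. For a 2x2 covariance matrix, the orientation of the first eigenvector has a closed form, so no eigen-solver is needed. The sketch assumes the input is a single elongated set of (x, y) points, such as pixels or component centroids of one text line, in a y-up coordinate system. The resolution reduction, connected component analysis, and fuzzy classification stages of the method are not shown, and the function name is hypothetical.

```python
import math

def principal_axis_angle(points):
    """Orientation in degrees, in (-90, 90], of the first eigenvector
    of the 2x2 covariance matrix of the given (x, y) points."""
    points = list(points)
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / n
    syy = sum((y - mean_y) ** 2 for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    # Closed-form direction of the dominant eigenvector of [[sxx, sxy], [sxy, syy]].
    return math.degrees(0.5 * math.atan2(2.0 * sxy, sxx - syy))

# Points along a line tilted by 80 degrees, well outside a +/-45 degree range.
tilt = math.radians(80.0)
line = [(t * math.cos(tilt), t * math.sin(tilt)) for t in range(-50, 51)]
print(round(principal_axis_angle(line), 3))
```

Since the angle comes from `atan2` rather than from a bounded search interval, the estimate covers the full half-turn of possible orientations, which matches the abstract's claim of no angle range restriction.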
Using the Gamera Framework for Building a Lute Tablature Recognition System
- in Proceedings of the 6th International Conference on Music Information Retrieval (ISMIR’05)
, 2005
Cited by 7 (4 self)
In this article we describe an optical recognition system for historic lute tablature prints that we have built with the aid of the Gamera toolkit for document analysis and recognition. We give recognition rates for various historic sources and show that our system works quite well on printed tablature sources using movable types. For engraved and manuscript sources, we discuss some principal current limitations of our system and Gamera.
Keywords: Optical Music Recognition, Lute Tablature.
1 LUTE TABLATURE
A large body of lute tablature sources is extant from the 16th and early 17th centuries. As a major part of this music is derived from vocal models, it can be an ideal object of investigation for music information retrieval questions. Consequently there are efforts like the ECOLM project [1] to build a database of machine-readable tablature encodings of lute music sources. Usual optical music recognition (OMR) systems designed for common music notation (CMN) cannot be used for this purpose because lute music is written in tablature rather than CMN. Figure 1 shows the characteristics of lute tablature notation: rather than specifying the sound of the music, it specifies when and at which frets the strings of the instrument are stopped. The symbols used for fret and rhythm were not standardized; instead, almost every historic source used its own unique set of symbols. This is an important difference from CMN, which consists of a limited set of symbols that are consistent across different music scores. Consequently a system for optical tablature recognition (OTR) must not be designed to work with a single set of a priori known symbols, but must be adaptable to differing tablature symbols.
We shall see below that the conception of training in...
A Comparison of Hidden Markov Model Features for the Recognition of Cursive Handwriting
, 1996
Cited by 4 (1 self)
Due to the difficulty of character segmentation in cursive handwriting recognition, much recent research has turned to segmentation-free approaches to word recognition. While techniques of feature extraction for presegmented characters have been thoroughly explored in the literature, evaluations of features for use with segmentation-during-recognition techniques remain sparse. The main purpose of this thesis is to provide a comparison of a number of feature extraction techniques applied to the domain of legal amount recognition in bank checks. An experimental system using Hidden Markov Models and a horizontally sliding window is described. Results are presented for the recognition of the entire legal field using a variety of features. Of the experiments presented here, the best results were obtained by concatenating the feature vectors from the present, previous, and next window...
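The idea of a horizontally sliding window with context concatenation can be sketched in a few lines. The two per-window features used here (ink density and tallest ink column), the zero padding at the borders, and the previous/present/next ordering are assumptions chosen for illustration; the thesis compares its own feature sets, which are not reproduced here.

```python
def column_ink(image):
    """Ink count per column of a binary image given as a list of equal-length rows."""
    return [sum(col) for col in zip(*image)]

def sliding_windows(image, width=2, step=1):
    """One small feature vector per window position (hypothetical features)."""
    ink = column_ink(image)
    height = len(image)
    feats = []
    for start in range(0, len(ink) - width + 1, step):
        window = ink[start:start + width]
        density = sum(window) / (width * height)
        feats.append([density, float(max(window))])
    return feats

def add_context(feats):
    """Concatenate each vector with its previous and next neighbours.

    Windows at the borders are padded with zero vectors (an assumption).
    """
    pad = [0.0] * len(feats[0])
    out = []
    for i, current in enumerate(feats):
        previous = feats[i - 1] if i > 0 else pad
        following = feats[i + 1] if i + 1 < len(feats) else pad
        out.append(previous + current + following)
    return out

image = [[0, 1, 1, 0],
         [0, 1, 0, 0]]
for vector in add_context(sliding_windows(image)):
    print(vector)
```

The resulting sequence of vectors is the kind of observation sequence an HMM would consume, one vector per window position, with the context tripling the dimensionality of each observation.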
Automatic page analysis for the creation of a digital library from newspaper archives
- International Journal on Digital Libraries (IJODL)
, 2000
Cited by 4 (3 self)
Digital preservation of newspaper archives aims both at the rescue of endangered material (paper) and at the creation of digital library services that will allow full utilization of the archives by all interested parties. In this paper, we address a series of issues pertaining to the retro-conversion of newspapers, i.e., the conversion of newspaper pages into digital resources. An integrated approach is presented that provides solutions to problems related to newspaper page image enhancement, segmentation of pages into various items (titles, text, images, etc.), article identification and reconstruction, and, finally, recognition of the textual components. Emphasis is placed on the most difficult intermediate stages: page segmentation, and article identification and reconstruction. Detailed experimental results, obtained from a large testbed of old newspaper issues, are presented which clearly demonstrate the applicability of our methodology to the successful retro-conversion of newspaper material.

