Results 1 - 6 of 6
A Survey of Table Recognition: Models, Observations, Transformations, and Inferences
- International Journal of Document Analysis and Recognition, 2003
Cited by 32 (3 self)
Table characteristics vary widely. Consequently, a great variety of computational approaches have been applied to table recognition. In this survey, the table recognition literature is presented as an interaction of table models, observations, transformations and inferences. A table model defines the physical and logical structure of tables; the model is used to detect tables, and to analyze and decompose the detected tables. Observations perform feature measurements and data lookup, transformations alter or restructure data, and inferences generate and test hypotheses. This presentation clarifies the decisions that are made by a table recognizer, and the assumptions and inferencing techniques that underlie these decisions.
Statistical-based Approach to Word Segmentation
- In 15th International Conference on Pattern Recognition, ICPR2000
Cited by 9 (2 self)
This paper presents a text word extraction algorithm that takes a set of bounding boxes of glyphs and their associated text lines of a given document and partitions the glyphs into a set of text words, using only the geometric information of the input glyphs. The algorithm is probability based. An iterative, relaxation-like method is used to find the partitioning solution that maximizes the joint probability. To evaluate the performance of our text word extraction algorithm, we used a 3-fold validation method and developed a quantitative performance measure. The algorithm was evaluated on the UW-III database of some 1600 scanned document image pages. An area-overlap measure was used to find the correspondence between the detected entities and the ground-truth. For a total of -2670 ground truth words, the algorithm identified and segmented words correctly, an accuracy of .
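The area-overlap correspondence mentioned in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' measure: the abstract does not define the overlap formula or threshold, so the intersection-over-union ratio, the 0.9 threshold, and the names `overlap_ratio` and `match_words` are all assumptions.

```python
from typing import List, Tuple

# A bounding box as (x0, y0, x1, y1) with x0 <= x1 and y0 <= y1.
Box = Tuple[float, float, float, float]


def area(b: Box) -> float:
    """Area of a bounding box."""
    return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])


def overlap_ratio(a: Box, b: Box) -> float:
    """Intersection area divided by union area (an assumed overlap measure)."""
    ix = min(a[2], b[2]) - max(a[0], b[0])
    iy = min(a[3], b[3]) - max(a[1], b[1])
    if ix <= 0 or iy <= 0:
        return 0.0
    inter = ix * iy
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def match_words(detected: List[Box], truth: List[Box],
                threshold: float = 0.9) -> List[Tuple[int, int]]:
    """Greedily pair each detected word with its best unused ground-truth word.

    A pair is kept only if the overlap ratio reaches the (assumed) threshold.
    """
    pairs: List[Tuple[int, int]] = []
    used = set()
    for i, d in enumerate(detected):
        best_j, best = -1, 0.0
        for j, t in enumerate(truth):
            if j in used:
                continue
            r = overlap_ratio(d, t)
            if r > best:
                best_j, best = j, r
        if best_j >= 0 and best >= threshold:
            pairs.append((i, best_j))
            used.add(best_j)
    return pairs


detected = [(0, 0, 10, 10), (20, 0, 30, 10)]
truth = [(0, 0, 10, 10), (20, 0, 40, 10)]
pairs = match_words(detected, truth)
accuracy = len(pairs) / len(truth)  # fraction of ground-truth words matched
```

Here the second detected box covers only half of its ground-truth word, so it is not counted as a correct segmentation under the assumed threshold.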
Recent Work in the Document Image Decoding Group at Xerox PARC
- In Proceedings of the DOD-sponsored Symposium on Document Image Understanding Technology (SDIUT 2001), 2001
Cited by 7 (0 self)
this paper address both these problems. Multiparameter Case. We can extend the notion of a document segmentation scale space that we developed above to a case where thresholds and distances are evaluated differently in the x and y directions. In particular, for each pair of connected components, let us measure two distances, their horizontal distance and their vertical distance. We define the horizontal distance between two connected components to be infinity unless their vertical extents overlap. If their vertical extents overlap, their distance is simply the distance between their bounding boxes. We can make an analogous definition for the vertical distance of two connected components. A segmentation is now given by picking two thresholds, one on the horizontal distance and one on the vertical distance. We can apply the same arguments as above and see that there are at most N different choices for each threshold, so there are at most N^2 different two-parameter segmentations of the input. 5.2 Computing Document Scale Space. To compute and represent the document scale space, we use the following approach. First, the bounding boxes of the connected components are stored in a trie data structure, which allows us to determine quickly the nearest neighbors of each connected component in the horizontal and vertical directions. Using this neighborhood information, we construct a graph, in which the connected components are the nodes and the nearest neighbor relationships define the edges. We can now take the set of all horizontal edges and all vertical edges and sort them by their distance
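The two distances and the two-threshold segmentation described in the excerpt above can be sketched as follows. This is a hedged illustration only: it compares all pairs by brute force instead of using the trie and nearest-neighbor graph the excerpt mentions, it treats touching extents as overlapping, and the names `t_x`, `t_y`, and `segment` are hypothetical.

```python
import math
from typing import List, Tuple

# A bounding box as (x0, y0, x1, y1) with x0 <= x1 and y0 <= y1.
Box = Tuple[float, float, float, float]


def extents_overlap(a0: float, a1: float, b0: float, b1: float) -> bool:
    """True if the 1-D intervals [a0, a1] and [b0, b1] overlap (touching counts)."""
    return a0 <= b1 and b0 <= a1


def horizontal_distance(a: Box, b: Box) -> float:
    """Infinity unless vertical extents overlap, else the horizontal gap."""
    if not extents_overlap(a[1], a[3], b[1], b[3]):
        return math.inf
    return max(0.0, max(a[0], b[0]) - min(a[2], b[2]))


def vertical_distance(a: Box, b: Box) -> float:
    """Infinity unless horizontal extents overlap, else the vertical gap."""
    if not extents_overlap(a[0], a[2], b[0], b[2]):
        return math.inf
    return max(0.0, max(a[1], b[1]) - min(a[3], b[3]))


def segment(boxes: List[Box], t_x: float, t_y: float) -> List[List[int]]:
    """Group components linked by a horizontal gap <= t_x or a vertical gap <= t_y."""
    parent = list(range(len(boxes)))

    def find(i: int) -> int:
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if (horizontal_distance(boxes[i], boxes[j]) <= t_x
                    or vertical_distance(boxes[i], boxes[j]) <= t_y):
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(boxes)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Two components side by side on one line, and a third on a line below.
boxes = [(0, 0, 10, 10), (12, 0, 20, 10), (0, 30, 10, 40)]
fine = segment(boxes, 1, 1)
words = segment(boxes, 5, 5)
block = segment(boxes, 5, 25)
```

Raising the thresholds merges the components first into a line-level group and then into a single block, which is the scale-space behavior the excerpt describes.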
A Language for Specifying and Comparing Table Recognition Strategies
- 2004
Cited by 7 (3 self)
Table recognition algorithms may be described by models of table location and structure, and decisions made relative to these models. These algorithms are usually defined informally as a sequence of decisions with supporting data observations and transformations. In this investigation, we formalize these algorithms as strategies in an imitation game, where the goal of the game is to match table interpretations from a chosen procedure as closely as possible. The chosen procedure may be a person or persons producing ‘ground truth,’ or an algorithm. To describe table recognition strategies we have defined the Recognition Strategy Language (RSL). RSL is a simple functional language for describing strategies as sequences of abstract decision types whose results are determined by any suitable decision method. RSL defines and maintains interpretation trees, a simple data structure for describing recognition results. For each interpretation in an interpretation tree, we annotate hypothesis histories which capture the creation, revision, and rejection of individual hypotheses, such as the logical type and structure of regions. We present a proof-of-concept using two strategies from the literature. We demonstrate how RSL allows strategies to be specified at the level of decisions rather than algorithms, and we compare results of our strategy implementations using new techniques. In particular, we introduce historical recall and precision metrics. Conventional recall and precision characterize hypotheses accepted after a strategy has finished. Historical recall and precision provide additional information by describing all generated hypotheses, including any rejected in the final result.
Table Detection via Probability Optimization
- In Proceedings of Document Analysis Systems (DAS’02), 2002
This paper presents a table detection algorithm using an optimization method. We define the table detection problem within the whole-page segmentation framework. To reach a good table detection result, we emphasize optimizing the probabilities of the table region, its neighboring text block, and their separator. An iterative updating method is used to optimize the whole-page segmentation probability. The training and testing data set for the algorithm include document pages having in table entities and a total of cell entities. Compared with our previous work [12], it raised the accuracy rate to 461-3 from 9-361 and to 13-36 from 67-36.
White-Box Evaluation of Computer Vision Algorithms through Explicit Decision-Making
Traditionally, computer vision and pattern recognition algorithms are evaluated by measuring differences between final interpretations and ground truth. These black-box evaluations ignore intermediate results, making it difficult to use intermediate results in diagnosing errors and optimization. We propose “opening the box,” representing vision algorithms as sequences of decision points where recognition results are selected from a set of alternatives. For this purpose, we present a domain-specific language for pattern recognition tasks, the Recognition Strategy Language (RSL). At run-time, an RSL interpreter records a complete history of decisions made during recognition, as it applies them to a set of interpretations maintained for the algorithm. Decision histories provide a rich new source of information: recognition errors may be traced back to the specific decisions that caused them, and intermediate interpretations may be recovered and displayed. This additional information also permits new evaluation metrics that include false negatives (correct hypotheses that the algorithm generates and later rejects), such as the percentage of ground truth hypotheses generated (historical recall), and the percentage of generated hypotheses that are correct (historical precision). We illustrate the approach through an analysis of cell detection in two published table recognition algorithms.
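The historical metrics defined in the abstract above can be sketched next to their conventional counterparts. This is a minimal illustration under stated assumptions: hypotheses are modeled as plain string labels, the accepted set is taken to be a subset of the generated set, and the function names are hypothetical. It is not RSL and does not reproduce the paper's implementation.

```python
from typing import Dict, Set, Tuple


def recall_precision(hypotheses: Set[str], truth: Set[str]) -> Tuple[float, float]:
    """Recall and precision of a hypothesis set against ground truth."""
    correct = hypotheses & truth
    recall = len(correct) / len(truth) if truth else 0.0
    precision = len(correct) / len(hypotheses) if hypotheses else 0.0
    return recall, precision


def evaluate(generated: Set[str], accepted: Set[str],
             truth: Set[str]) -> Dict[str, Tuple[float, float]]:
    """Conventional metrics use accepted hypotheses; historical ones use all generated."""
    return {
        "conventional": recall_precision(accepted, truth),
        "historical": recall_precision(generated, truth),
    }


# Hypothetical cell hypotheses: "c*" are ground-truth cells, "x*" are spurious.
truth = {"c1", "c2", "c3", "c4"}
generated = {"c1", "c2", "c3", "x1", "x2"}  # everything proposed at any point
accepted = {"c1", "c2", "x1"}               # what survives in the final result
scores = evaluate(generated, accepted, truth)
```

In this example "c3" is a correct hypothesis that was generated and later rejected, so historical recall (3 of 4) exceeds conventional recall (2 of 4).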

