Joint Induction of Shape Features and Tree Classifiers
- IEEE Trans. PAMI, 1997
Cited by 66 (6 self)
We introduce a very large family of binary features for two-dimensional shapes. The salient ones for separating particular shapes are determined by inductive learning during the construction of classification trees. There is a feature for every possible geometric arrangement of local topographic codes. The arrangements express coarse constraints on relative angles and distances among the code locations and are nearly invariant to substantial affine and non-linear deformations. They are also partially ordered, which makes it possible to narrow the search for informative ones at each node of the tree. Different trees correspond to different aspects of shape. They are statistically weakly dependent due to randomization and are aggregated in a simple way. Adapting the algorithm to a shape family is then fully automatic once training samples are provided. As an illustration, we classify handwritten digits from the NIST database; the error rate is 0.7%.
Automatic page analysis for the creation of a digital library from newspaper archives
- International Journal on Digital Libraries (IJODL), 2000
Cited by 4 (3 self)
Digital preservation of newspaper archives aims both at the salvation of endangered material (paper) and at the creation of digital library services that will allow full utilization of the archives by all interested parties. In this paper, we address a series of issues pertaining to the retro-conversion of newspapers, i.e., the conversion of newspaper pages into digital resources. An integrated approach is presented that provides solutions to problems related to newspaper page image enhancement, segmentation of pages into various items (titles, text, images, etc.), article identification and reconstruction, and, finally, recognition of the textual components. Emphasis is placed on the most difficult intermediate stages of page segmentation and article identification and reconstruction. Detailed experimental results, obtained from a large testbed of old newspaper issues, are presented which clearly demonstrate the applicability of our methodology to the successful retro-conversion of newspaper material.

