Localizing and Segmenting Text in Images and Videos, 2002
Cited by 60 (0 self)
Many images, especially those used for page design on web pages, as well as videos contain visible text. If these text occurrences could be detected, segmented, and recognized automatically, they would be a valuable source of high-level semantics for indexing and retrieval. In this paper, we propose a novel method for localizing and segmenting text in complex images and videos. Text lines are identified by using a complex-valued multilayer feed-forward network trained to detect text at a fixed scale and position. The network's output at all scales and positions is integrated into a single text-saliency map, serving as a starting point for candidate text lines. In the case of video, these candidate text lines are refined by exploiting the temporal redundancy of text in video. Localized text lines are then scaled to a fixed height of 100 pixels and segmented into a binary image with black characters on white background. For videos, temporal redundancy is exploited to improve segmentation performance. Input images and videos can be of any size due to a true multiresolution approach. Moreover, the system is not only able to locate and segment text occurrences into large binary images, but is also able to track each text line with sub-pixel accuracy over the entire occurrence in a video, so that one text bitmap is created for all instances of that text line. Therefore, our text segmentation results can also be used for object-based video encoding such as that enabled by MPEG-4.
Flexible web document analysis for delivery to narrow-bandwidth devices
- In: Proceedings of the 6th International Conference on Document Analysis and Recognition (ICDAR'01), 2001
Cited by 12 (2 self)
We propose a set of baseline heuristics for identifying genuinely tabular information and news links in HTML documents. A prototype implementation of these heuristics is described for delivering content from news providers' home pages to a narrow-bandwidth device such as a portable digital assistant or cellular phone display. Its evaluation on 75 web-sites is provided, along with a discussion of topics for future research.
Locating And Recognizing Text in . . .
- Information Retrieval 2, 177-206 (2000)
Cited by 8 (0 self)
The explosive growth of the World Wide Web has resulted in a distributed database consisting of hundreds of millions of documents. While existing search engines index a page based on the text that is readily extracted from its HTML encoding, an increasing amount of the information on the Web is embedded in images. This situation presents a new and exciting challenge for the fields of document analysis and information retrieval, as WWW image text is typically rendered in color and at very low spatial resolutions. In this paper, we survey the results of several years of our work in the area. For the problem of locating text in Web images, we describe a procedure based on clustering in color space followed by a connected-components analysis that seems promising. For character recognition, we discuss techniques using polynomial surface fitting and "fuzzy" n-tuple classifiers. Also presented are the results of several experiments that demonstrate where our methods perform well and where more work needs to be done. We conclude with a discussion of topics for further research.
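The text-location step named in this abstract (grouping pixels in color space, then running a connected-components analysis) can be illustrated with a minimal sketch. It is not the authors' procedure: uniform color quantization stands in for their clustering, 4-connectivity is assumed, and the names `quantize` and `connected_components` are hypothetical.

```python
from collections import deque


def quantize(pixel, levels=4):
    """Map an (r, g, b) pixel with 0-255 channels to a coarse color bin.

    Assumption: uniform quantization is a stand-in for real color clustering.
    """
    step = 256 // levels
    return tuple(channel // step for channel in pixel)


def connected_components(image, levels=4):
    """Group 4-connected pixels that share a color bin.

    `image` is a list of rows, each row a list of (r, g, b) tuples.
    Returns a list of (color_bin, [(row, col), ...]) in scan order.
    """
    height, width = len(image), len(image[0])
    labels = [[-1] * width for _ in range(height)]
    components = []
    for y in range(height):
        for x in range(width):
            if labels[y][x] != -1:
                continue  # already assigned to a component
            color = quantize(image[y][x], levels)
            index = len(components)
            labels[y][x] = index
            queue = deque([(y, x)])
            pixels = []
            while queue:  # breadth-first flood fill
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and labels[ny][nx] == -1
                            and quantize(image[ny][nx], levels) == color):
                        labels[ny][nx] = index
                        queue.append((ny, nx))
            components.append((color, pixels))
    return components
```

In a fuller system the components would then be filtered by size and shape to keep character-like candidates; that filtering is not shown here.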
What Fraction of Images on the Web Contain Text?
- In Proceedings of Web Document Analysis, 2001. Online: HTTP://WWW.CSC.LIV.AC.UK/ WDA2001/ PAPERS/27 KANUNGO WDA2001.PDF
Cited by 7 (1 self)
Web search engines index text represented in symbolic form. However, it is well known that a fraction of the text on the web is present in the form of images, and the textual content of these images is not indexed by the search engines. This fact immediately raises a few questions: i) What fraction of the images on the web contain text? ii) What fraction of the text content of these images does not appear in the web page in symbolic form? Answers to these questions will give web users an idea about the amount of information being missed by the search engines, and justify whether or not Optical Character Recognition should be a standard part of search engine indexing. To answer these questions we statistically sample the images referenced in the web pages retrieved by a search engine for specific queries and then find the fraction of sampled images that contain text.
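The sampling idea in this abstract can be sketched as a proportion estimate with an error bar. The abstract does not say how the sample was drawn or which interval was used, so simple random sampling and a normal-approximation 95% interval are assumptions here, and `has_text` is a hypothetical predicate standing in for manual inspection of each image.

```python
import math
import random


def estimate_text_fraction(images, has_text, sample_size, seed=0):
    """Estimate the fraction of `images` that contain text from a random sample.

    Returns (estimate, lower, upper), where lower/upper bound an approximate
    95% confidence interval (normal approximation, an assumption).
    """
    rng = random.Random(seed)
    sample = rng.sample(images, sample_size)  # sampling without replacement
    hits = sum(1 for image in sample if has_text(image))
    p = hits / sample_size
    half_width = 1.96 * math.sqrt(p * (1 - p) / sample_size)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)
```

A larger sample narrows the interval roughly in proportion to the square root of the sample size, which is the usual trade-off when each image has to be labeled by hand.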
Minimum Spanning Tree Based Clustering Algorithms
Cited by 5 (0 self)
The minimum spanning tree clustering algorithm is known to be capable of detecting clusters with irregular boundaries. In this paper, we propose two minimum spanning tree based clustering algorithms. The first algorithm produces a k-partition of a set of points for any given k. The algorithm constructs a minimum spanning tree of the point set and removes edges that satisfy a predefined criterion. The process is repeated until k clusters are produced. The second algorithm partitions a point set into a group of clusters by maximizing the overall standard deviation reduction, without a given k value. We present our experimental results comparing our proposed algorithms to k-means and EM. We also apply our algorithms to image color clustering and compare our algorithms to the standard minimum spanning tree clustering algorithm.
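For orientation, the standard minimum spanning tree clustering that the abstract uses as its baseline can be sketched as follows. This is not either of the proposed algorithms: the edge-removal criterion here is simply "drop the k-1 heaviest tree edges", Euclidean distance is assumed, and the function names are hypothetical.

```python
import math
from itertools import combinations


def _find(parent, i):
    # Union-find lookup with path halving.
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def minimum_spanning_tree(points):
    """Kruskal's algorithm on the complete Euclidean graph over `points`."""
    n = len(points)
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    parent = list(range(n))
    tree = []
    for weight, i, j in edges:
        root_i, root_j = _find(parent, i), _find(parent, j)
        if root_i != root_j:  # edge joins two different subtrees
            parent[root_i] = root_j
            tree.append((weight, i, j))
    return tree  # in ascending weight order


def mst_k_clusters(points, k):
    """Split `points` into k clusters by removing the k-1 heaviest MST edges."""
    n = len(points)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of points")
    kept = minimum_spanning_tree(points)[: n - k]
    parent = list(range(n))
    for _, i, j in kept:
        parent[_find(parent, i)] = _find(parent, j)
    labels, seen = [], {}
    for i in range(n):
        root = _find(parent, i)
        labels.append(seen.setdefault(root, len(seen)))  # number clusters by first appearance
    return labels
```

Building the complete graph costs quadratic time and memory in the number of points, which is acceptable for a sketch but not for large point sets.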
Text extraction from web images based on a split-and-merge segmentation method using colour perception
- In: Proc. of the 17th Int'l Conf. on Pattern Recognition (ICPR 2004). Cambridge: IEEE, 2004. 634-637. http://portal.acm.org/citation.cfm?id=1018428.1020791
Cited by 4 (0 self)
This paper describes a complete approach to the segmentation and extraction of text from Web images for subsequent recognition, to ultimately achieve both effective indexing and presentation by non-visual means (e.g., audio). The method described here (the first in the authors' systematic approach to exploit human colour perception) enables the extraction of text in complex situations such as in the presence of varying colour (characters and background). More precisely, in addition to using structural features, the segmentation follows a split-and-merge strategy based on the Hue-Lightness-Saturation (HLS) representation of colour as a first approximation of an anthropocentric expression of the differences in chromaticity and lightness. Character-like components are then extracted as forming text lines in a number of orientations and along curves.
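To make the HLS idea concrete, the following minimal sketch shows the kind of colour-similarity test a merge step might use. It is not the authors' criterion: the tolerances, the treatment of near-grey colours, and the names `to_hls` and `similar_colour` are all assumptions chosen for illustration. Only the standard-library `colorsys.rgb_to_hls` conversion is used.

```python
import colorsys


def to_hls(rgb):
    """Convert an (r, g, b) tuple with 0-255 channels to (hue, lightness, saturation) in [0, 1]."""
    r, g, b = (channel / 255.0 for channel in rgb)
    return colorsys.rgb_to_hls(r, g, b)


def similar_colour(rgb_a, rgb_b, hue_tol=0.05, light_tol=0.1, sat_floor=0.1):
    """Return True if two colours are close in lightness and hue.

    All thresholds are hypothetical. Hue is compared on a circle, and it is
    ignored when both colours are nearly grey, since hue is then unreliable.
    """
    hue_a, light_a, sat_a = to_hls(rgb_a)
    hue_b, light_b, sat_b = to_hls(rgb_b)
    if abs(light_a - light_b) > light_tol:
        return False
    if sat_a < sat_floor and sat_b < sat_floor:
        return True  # both near grey: lightness alone decides
    if sat_a < sat_floor or sat_b < sat_floor:
        return False  # one grey, one chromatic: treated as different (assumption)
    hue_diff = abs(hue_a - hue_b)
    return min(hue_diff, 1.0 - hue_diff) <= hue_tol
```

Comparing hue on a circle matters because reds sit at both ends of the hue range, so two visually similar reds can have hue values near 0 and near 1.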
Flexible Web Document Analysis for Delivery to Narrow-Bandwidth Devices, 2001
We propose a set of baseline heuristics for identifying genuinely tabular information and news links in HTML documents. A prototype implementation of these heuristics is described for delivering content from news providers' home pages to a narrow-bandwidth device such as a portable digital assistant or cellular phone display. Its evaluation on 75 web-sites is provided, along with a discussion of topics for future research.
To Search for Images on the Web,
- In Proceedings of the First International Workshop on Web Document Analysis (WDA2001), online at http://www.csc.liv.ac.uk/ wda2001, 2001
this paper, we want to argue that image pro- ...
This material is based upon work supported by the U.S. Department of Defense and by the National Science Foundation under Grant No. 9734102. Additional support was provided by Sun Microsystems.
Text Area Identification . . .
- G.A. Vouros and T. Panayiotopoulos (Eds.): SETN 2004, LNAI 3025, pp. 82-92, 2004
With the explosive growth of the World Wide Web, millions of documents are published and accessed on-line. Statistics show that a significant part of Web text information is encoded in Web images. Because Web images have special characteristics that sometimes distinguish them from other types of images, commercial OCR products often fail to recognize the text in them. This paper proposes a novel Web image processing algorithm that aims to locate text areas and prepare them for OCR so that recognition results improve. Our methodology for text area identification has been fully integrated with an OCR engine and with an Information Extraction system. We present quantitative results for the performance of the OCR engine as well as qualitative results concerning its effects on the Information Extraction system. Experimental

