Automatic detection and recognition of signs from natural scenes
- IEEE Trans. Image Process
, 2004
Cited by 63 (4 self)
Abstract—In this paper, we present an approach to automatic detection and recognition of signs from natural scenes, and its application to a sign translation task. The proposed approach embeds multiresolution and multiscale edge detection, adaptive searching, color analysis, and affine rectification in a hierarchical framework for sign detection, with different emphases at each phase to handle text of different sizes, orientations, color distributions and backgrounds. We use affine rectification to recover from the deformation of text regions caused by an inappropriate camera view angle. The procedure can significantly improve text detection rate and optical character recognition (OCR) accuracy. Instead of using binary information for OCR, we extract features from an intensity image directly. We propose a local intensity normalization method to effectively handle lighting variations, followed by a Gabor transform to obtain local features, and finally a linear discriminant analysis (LDA) method for feature selection. We have applied the approach in developing a Chinese sign translation system, which can automatically detect and recognize Chinese signs as input from a camera, and translate the recognized text into English. Index Terms—Affine rectification, optical character recognition (OCR), sign detection, sign recognition, text detection.
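The abstract above names a local intensity normalization step but does not give its formula. As a rough illustration only, the sketch below assumes one common form: subtract the mean of a small window around each pixel and divide by that window's standard deviation. The function name `local_normalize`, the window `radius`, and the `eps` guard are all hypothetical choices, not taken from the paper.

```python
import math


def local_normalize(image, radius=1, eps=1e-6):
    """Normalize each pixel by the mean and standard deviation of its window.

    `image` is a list of rows of numbers (a grayscale image). The window is a
    (2 * radius + 1) square clipped at the image border. This mean/std form is
    an assumed stand-in, not the method from the paper.
    """
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [
                image[wy][wx]
                for wy in range(max(0, y - radius), min(height, y + radius + 1))
                for wx in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            mean = sum(window) / len(window)
            var = sum((v - mean) ** 2 for v in window) / len(window)
            # eps keeps the division defined on perfectly flat windows.
            out[y][x] = (image[y][x] - mean) / (math.sqrt(var) + eps)
    return out


# A small synthetic patch and the same patch under a brightness/contrast change.
patch = [[(7 * x + 13 * y) % 50 for x in range(6)] for y in range(5)]
relit = [[2 * v + 10 for v in row] for row in patch]
normalized_patch = local_normalize(patch)
normalized_relit = local_normalize(relit)
```

Under this assumed form, a global gain and offset in lighting cancels out almost exactly, so `normalized_patch` and `normalized_relit` agree up to the small effect of `eps`. That is the kind of invariance a feature extractor such as a Gabor filter bank would benefit from downstream.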
Sign detection in natural images with conditional random fields
- In Proc. of IEEE International Workshop on Machine Learning for Signal Processing
, 2004
Cited by 17 (6 self)
Abstract. Traditional generative Markov random fields for segmenting images model the image data and corresponding labels jointly, which requires extensive independence assumptions for tractability. We present the conditional random field for an application in sign detection, using typical scale and orientation selective texture filters and a nonlinear texture operator based on the grating cell. The resulting model captures dependencies between neighboring image region labels in a data-dependent way that escapes the difficult problem of modeling image formation, instead focusing effort and computation on the labeling task. We compare the results of training the model with pseudo-likelihood against an approximation of the full likelihood with the iterative tree reparameterization algorithm and demonstrate improvement over previous methods.
Learning to detect scene text using a higher-order MRF with belief propagation
- In: Computer Vision and Pattern Recognition (CVPR)
, 2004
Cited by 13 (1 self)
Detecting text in natural 3D scenes is a challenging problem due to background clutter and photometric/geometric variations of scene text. Most prior systems adopt approaches based on deterministic rules, lacking a systematic and scalable framework. In this paper, we present a parts-based approach for 3D scene text detection using a higher-order MRF model. The higher-order structure is used to capture the spatial-feature relations among multiple parts in scene text. The use of higher-order structure and the feature-dependent potential function represent a significant departure from the conventional pairwise MRF, which has been successfully applied in several low-level applications. We further develop a variational approximation method, in the form of belief propagation, for inference in the higher-order model. Our experiments using the ICDAR’03 benchmark showed promising results in detecting scene text with significant geometric variations and background clutter, on planar surfaces or non-planar surfaces with limited angles.
Cell phone-based wayfinding for the visually impaired
- st Int. Workshop on Mobile Vision
, 2006
Cited by 11 (3 self)
Abstract. A major challenge faced by the blind and visually impaired population is that of wayfinding – the ability of a person to find his or her way to a given destination. We propose a new wayfinding aid based on a camera cell phone, which is held by the user to find and read aloud specially designed machine-readable signs in the environment (labeling locations such as offices and restrooms). Our main technical innovation is that we have designed these machine-readable signs to be detected and located in fractions of a second on the cell phone CPU, even at a distance of several meters. A linear barcode printed on the sign is read using novel decoding algorithms that are robust to noisy images. The information read from the barcode is then read aloud using pre-recorded or synthetic speech. We have implemented a prototype system on the Nokia 7610 cell phone, and preliminary experiments with blind subjects demonstrate the feasibility of using the system as a real-time wayfinding aid.
A Method for Text Localization and Recognition in Real-World Images
- LNCS
, 2011
An Automatic Sign Recognition and Translation System
- In Proceedings of the Workshop on Perceptive User Interfaces (PUI’01)
, 2001
Cited by 11 (1 self)
A sign is something that suggests the presence of a fact, condition, or quality. Signs are everywhere in our lives. They make our lives easier when we are familiar with them. But sometimes they pose problems. For example, a tourist might not be able to understand signs in a foreign country. This paper discusses problems of automatic sign recognition and translation. We present a system capable of capturing images, detecting and recognizing signs, and translating them into a target language. We describe methods for automatic sign extraction and translation. We use a user-centered approach in system development. The approach takes advantage of human intelligence when needed and leverages human capabilities. We are currently working on Chinese sign translation. We have developed a prototype system that can recognize Chinese signs captured by a video camera, a common gadget for a tourist, and translate the signs into English text or a voice stream. Sign translation, in conjunction with spoken language translation, can help international tourists overcome language barriers. The technology can also help a visually handicapped person increase environmental awareness.
Adaptive traffic road sign panels text extraction
- In: ISPRA’06: Proceedings of the 5th WSEAS International Conference on Signal Processing, Robotics and Automation, World Scientific and Engineering Academy and Society (WSEAS), Stevens Point
, 2006
Cited by 10 (0 self)
Abstract:- In this paper we present an approach to the detection and extraction of text in road sign panels. Extraction of text strings, indicators and signs is performed efficiently so that OCR algorithms can recognize the different characters that may be present on the traffic plane. In a first step, basic color segmentation and shape classification are done for the purpose of detecting possible rectangular planes. Every detected plane is extracted from the original image and then reoriented. Chrominance and luminance histogram analysis and adaptive segmentation are carried out, and connected components labeling and position clustering are finally done to arrange the different characters on the panel. Special emphasis has been placed on the adaptive segmentation. Experimental results have shown that the following steps depend strongly on correct separation between the background and foreground objects of the panel. Moreover, OCR systems are highly sensitive to noise, and we have paid special attention to it so that the OCR system can recognize characters properly. Key-Words:- Road-sign, detection, classification, image segmentation.
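The connected components labeling step mentioned in this abstract can be sketched roughly as follows. This is a generic breadth-first labeling with 4-connectivity on a binary foreground mask, not the authors' implementation. The function name `label_components`, the choice of 4-connectivity, and the left-to-right ordering by bounding box are assumptions made for illustration.

```python
from collections import deque


def label_components(mask):
    """Label 4-connected foreground regions of a binary mask.

    Returns (labels, boxes): `labels` has the same shape as `mask`, with 0 for
    background and 1..n for components; `boxes` lists one bounding box
    (x_min, y_min, x_max, y_max) per component, in order of discovery.
    """
    height, width = len(mask), len(mask[0])
    labels = [[0] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or labels[y][x]:
                continue
            label = len(boxes) + 1
            labels[y][x] = label
            queue = deque([(y, x)])
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cy, cx = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and mask[ny][nx] and not labels[ny][nx]:
                        labels[ny][nx] = label
                        queue.append((ny, nx))
            boxes.append((x0, y0, x1, y1))
    return labels, boxes


# Hypothetical segmented panel: 1 marks foreground (character) pixels.
panel = [
    [0, 0, 0, 0, 1, 1, 0],
    [1, 1, 0, 0, 1, 1, 0],
    [1, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 1],
]
panel_labels, panel_boxes = label_components(panel)
# A crude stand-in for position clustering: order components left to right.
reading_order = sorted(panel_boxes)
```

Sorting by the left edge of each bounding box is only a placeholder for the position clustering the abstract describes; a real panel with several text lines would need grouping by row before ordering within each row.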
Grouping Using Factor Graphs: an Approach for Finding Text with a Camera Phone
Cited by 9 (0 self)
Abstract. We introduce a new framework for feature grouping based on factor graphs, which are graphical models that encode interactions among arbitrary numbers of random variables. The ability of factor graphs to express interactions higher than pairwise order (the highest order encountered in most graphical models used in computer vision) is useful for modeling a variety of pattern recognition problems. In particular, we show how this property makes factor graphs a natural framework for performing grouping and segmentation, which we apply to the problem of finding text in natural scenes. We demonstrate an implementation of our factor graph-based algorithm for finding text on a Nokia camera phone, which is intended for eventual use in a camera phone system that finds and reads text (such as street signs) in natural environments for blind users.
Automatic detection and translation of text from natural scenes
- In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP ’02)
, 2002
Cited by 9 (3 self)
Large amounts of information are embedded in natural scenes. Signs are good examples of natural objects with high information content. In this paper, we discuss problems in automatic detection and translation of text from natural scenes. We describe the challenges of automatic text detection and propose methods to address these challenges. We extend example-based machine translation technology for sign translation and present a prototype system for Chinese sign translation. This system is capable of capturing images, automatically detecting and recognizing text, and translating the text into English. The translation can be displayed on a palm-size PDA, or synthesized as a voice output message over the earphones.
Robust character recognition in low-resolution images and videos
, 2005
Cited by 9 (3 self)
Although OCR techniques work very reliably for high-resolution documents, the recognition of superimposed text in low-resolution images or videos with a complex background is still a challenge. Three major parts characterize our system for recognition of superimposed text in images and videos: localization of text regions, segmentation (binarization) of characters, and recognition. We use standard approaches to locate text regions and focus in this paper on the last two steps. Many approaches (e.g., projection profiles, k-means clustering) do not work very well for separating characters with very small font sizes. We apply a shortest-path algorithm in the vertical direction to separate the characters in a text line. The recognition of characters is based on the curvature scale space (CSS) approach, which smooths the contour of a character with a Gaussian kernel and tracks its inflection points. A major drawback of the CSS method is its poor representation of convex segments: convex objects cannot be represented at all due to missing inflection points. We have extended the CSS approach to generate feature points for concave and convex segments of a contour. This generic approach is applicable not only to text characters but to arbitrary objects as well. In the experimental results, we compare our approach against a pattern matching algorithm, two classification algorithms based on contour analysis, and a commercial OCR system. The overall recognition results are good enough even for the indexing of low-resolution images and videos.
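The vertical shortest-path idea for separating characters can be illustrated with a small dynamic-programming sketch. The abstract does not specify the cost function or the allowed moves, so everything below is an assumption: the cost map is 1 on text pixels and 0 on background, and a path may step to the pixel directly below or diagonally below. The name `min_cost_vertical_path` is hypothetical.

```python
def min_cost_vertical_path(cost):
    """Find the cheapest top-to-bottom path through a 2D cost map.

    `cost` is a list of rows of non-negative numbers. From each pixel the path
    may move to the pixel below-left, directly below, or below-right.
    Returns (total_cost, columns), where columns[y] is the path's column in row y.
    """
    height, width = len(cost), len(cost[0])
    acc = [list(cost[0])]        # acc[y][x]: cheapest cost of reaching (y, x)
    back = [[0] * width]         # back[y][x]: column chosen in row y - 1
    for y in range(1, height):
        row_acc, row_back = [], []
        for x in range(width):
            candidates = [
                (acc[y - 1][px], px)
                for px in (x - 1, x, x + 1)
                if 0 <= px < width
            ]
            best_cost, best_px = min(candidates)
            row_acc.append(best_cost + cost[y][x])
            row_back.append(best_px)
        acc.append(row_acc)
        back.append(row_back)
    # Start from the cheapest end point and walk the back-pointers upward.
    x = min(range(width), key=lambda i: acc[-1][i])
    total = acc[-1][x]
    columns = [x]
    for y in range(height - 1, 0, -1):
        x = back[y][x]
        columns.append(x)
    columns.reverse()
    return total, columns


# Hypothetical text-line patch: 1 marks character pixels, 0 marks background.
text_line = [
    [1, 1, 0, 0, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 1, 0, 0, 1, 1],
    [1, 0, 0, 1, 1, 1],
]
cut_cost, cut_columns = min_cost_vertical_path(text_line)
```

Unlike a projection profile, which needs an entirely empty column to place a cut, a path of this kind can bend around slightly overlapping or slanted characters. That is one plausible reason to prefer it for very small font sizes.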