Object categorization by learned universal visual dictionary
In ICCV, 2005
Cited by 114 (8 self)
Figure 1: Exemplar snapshots of our interactive object categorization demo application. A user selects (sloppily) a region of interest and our algorithm associates an object class label with it. Despite large differences in pose, size, illumination and visual appearance, the correct class label (e.g. cow, building, car...) is automatically associated with each selected object instance. Some of these test images were downloaded from the web and none were part of the training set. A video of the interactive demo may be found at the above web site.
This paper presents a new algorithm for the automatic recognition of object classes from images (categorization). Compact and yet discriminative appearance-based object class models are automatically learned from a set of training images. The method is simple and extremely fast, making it suitable for many applications such as semantic image retrieval, web search, and interactive image editing. It classifies a region according to the proportions of different visual words (clusters in feature space). The specific visual words and the typical proportions in each object are learned from a segmented training set. The main contribution of this paper is twofold: i) an optimally compact visual dictionary is learned by pair-wise merging of visual words from an initially large dictionary. The final visual words are described by GMMs. ii) A novel statistical measure of discrimination is proposed which is optimized by each merge operation. High classification accuracy is demonstrated for nine object classes on photographs of real objects viewed under general lighting conditions, poses and viewpoints. The set of test images used for validation comprises: i) photographs acquired by us, ii) images from the web and iii) images from the recently released Pascal dataset. The proposed algorithm performs well on both texture-rich objects (e.g. grass, sky, trees) and structure-rich ones (e.g. cars, bikes, planes).
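The classification step in this abstract, labelling a region by its proportions of visual words, can be sketched roughly as follows. This is a minimal illustration only: the nearest-centre word assignment, the L1 histogram distance, the toy values, and all names (`nearest_word`, `word_histogram`, `classify`) are assumptions made here. The paper describes its visual words with GMMs and learns the dictionary by pair-wise merging, neither of which is reproduced.

```python
def nearest_word(feature, dictionary):
    """Index of the visual word (cluster centre) closest to a feature vector."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(dictionary)), key=lambda k: sq_dist(feature, dictionary[k]))


def word_histogram(features, dictionary):
    """Normalized proportions of visual words among a region's features."""
    counts = [0] * len(dictionary)
    for f in features:
        counts[nearest_word(f, dictionary)] += 1
    total = sum(counts) or 1  # avoid division by zero for an empty region
    return [c / total for c in counts]


def classify(features, dictionary, class_histograms):
    """Return the class whose typical word proportions are closest (L1 distance)."""
    h = word_histogram(features, dictionary)

    def l1(label):
        return sum(abs(a - b) for a, b in zip(h, class_histograms[label]))

    return min(class_histograms, key=l1)


# Toy example: two visual words in a 2-D feature space (hypothetical values).
dictionary = [(0.0, 0.0), (1.0, 1.0)]
class_histograms = {"grass": [0.9, 0.1], "car": [0.2, 0.8]}
region = [(0.1, 0.0), (0.9, 1.1), (1.0, 0.8), (1.2, 1.0)]
label = classify(region, dictionary, class_histograms)
```

In the toy data, three of the four region features fall nearest the second word, so the region's proportions sit closer to the hypothetical "car" histogram than to "grass".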
Region Filling and Object Removal by Exemplar-Based Image Inpainting
2004
Cited by 102 (1 self)
A new algorithm is proposed for removing large objects from digital images. The challenge is to fill in the hole that is left behind in a visually plausible way. In the past, this problem has been addressed by two classes of algorithms: 1) “texture synthesis” algorithms for generating large image regions from sample textures and 2) “inpainting” techniques for filling in small image gaps. The former has been demonstrated for “textures”, i.e. repeating two-dimensional patterns with some stochasticity; the latter focuses on linear “structures”, which can be thought of as one-dimensional patterns, such as lines and object contours. This paper presents a novel and efficient algorithm that combines the advantages of these two approaches. We first note that exemplar-based texture synthesis contains the essential process required to replicate both texture and structure; the success of structure propagation, however, is highly dependent on the order in which the filling proceeds. We propose a best-first algorithm in which the confidence in the synthesized pixel values is propagated in a manner similar to the propagation of information in inpainting. The actual color values are computed using exemplar-based synthesis. In this paper, the simultaneous propagation of texture and structure information is achieved by a single, efficient algorithm. Computational efficiency is achieved by a block-based sampling process. A number of examples on real and synthetic images demonstrate the effectiveness of our algorithm in removing large occluding objects, as well as thin scratches. Robustness with respect to the shape of the manually selected target region is also demonstrated. Our results compare favorably to those obtained by existing techniques.
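The best-first idea, filling first where the surrounding pixels are most trustworthy, can be illustrated with a small sketch. The abstract does not give the priority formula, so the confidence measure below (the fraction of already-known pixels in a small window around each boundary pixel) is an assumption for illustration, as are the function names. The exemplar search and the copying of colour values are omitted.

```python
def window_confidence(known, r, c, half=1):
    """Fraction of known pixels in the (2*half+1)^2 window centred on (r, c).

    Pixels outside the image count as unknown.
    """
    rows, cols = len(known), len(known[0])
    total = (2 * half + 1) ** 2
    hits = 0
    for i in range(r - half, r + half + 1):
        for j in range(c - half, c + half + 1):
            if 0 <= i < rows and 0 <= j < cols and known[i][j]:
                hits += 1
    return hits / total


def fill_front(known):
    """Unknown pixels that have at least one known 4-neighbour."""
    rows, cols = len(known), len(known[0])
    front = []
    for r in range(rows):
        for c in range(cols):
            if known[r][c]:
                continue
            neighbours = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            if any(0 <= i < rows and 0 <= j < cols and known[i][j]
                   for i, j in neighbours):
                front.append((r, c))
    return front


def next_pixel_to_fill(known):
    """Best-first choice: the front pixel with the highest window confidence."""
    front = fill_front(known)
    if not front:
        return None
    return max(front, key=lambda p: window_confidence(known, p[0], p[1]))


# 4x4 image with a 2x2 hole in the bottom-right corner (True = known pixel).
known = [
    [True, True, True, True],
    [True, True, True, True],
    [True, True, False, False],
    [True, True, False, False],
]
target = next_pixel_to_fill(known)
```

Here the corner of the hole that touches the most known pixels is selected first; a full implementation would fill a patch there, update the confidences, and repeat until the front is empty.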
Digital color imaging
IEEE Trans. Image Process., 1997
Cited by 66 (8 self)
in the area of digital color imaging. In order to establish the background and lay down terminology, fundamental concepts of color perception and measurement are first presented using vector-space notation and terminology. Present-day color recording and reproduction systems are reviewed along with the common mathematical models used for representing these devices. Algorithms for processing color images for display and communication are surveyed, and a forecast of research trends is attempted. An extensive bibliography is provided.
Principled hybrids of generative and discriminative models
In CVPR ’06: Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2006
Cited by 34 (1 self)
When labelled training data is plentiful, discriminative techniques are widely used since they give excellent generalization performance. However, for large-scale applications such as object recognition, hand labelling of data is expensive, and there is much interest in semi-supervised techniques based on generative models in which the majority of the training data is unlabelled. Although the generalization performance of generative models can often be improved by ‘training them discriminatively’, they can then no longer make use of unlabelled data. In an attempt to gain the benefit of both generative and discriminative approaches, heuristic procedures have been proposed [2, 3] which interpolate between these two extremes by taking a convex combination of the generative and discriminative objective functions. In this paper we adopt a new perspective which says that there is only one correct way to train a given model, and that a ‘discriminatively trained’ generative model is fundamentally a new model [7]. From this viewpoint, generative and discriminative models correspond to specific choices for the prior over parameters. As well as giving a principled interpretation of ‘discriminative training’, this approach opens the door to very general ways of interpolating between generative and discriminative extremes through alternative choices of prior. We illustrate this framework using both synthetic data and a practical example in the domain of multi-class object recognition. Our results show that, when the supply of labelled training data is limited, the optimum performance corresponds to a balance between the purely generative and the purely discriminative.
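The heuristic interpolation mentioned in this abstract, a convex combination of the two objective functions, can be written as follows. The symbols ($\alpha$, $\theta$, $\mathcal{L}_{\text{gen}}$, $\mathcal{L}_{\text{disc}}$) are notation assumed here for illustration and need not match the paper's.

```latex
\mathcal{L}_{\alpha}(\theta)
  = \alpha \, \mathcal{L}_{\text{gen}}(\theta)
  + (1 - \alpha) \, \mathcal{L}_{\text{disc}}(\theta),
  \qquad 0 \le \alpha \le 1
```

Under this notation, $\alpha = 1$ recovers the purely generative objective and $\alpha = 0$ the purely discriminative one.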
Detecting human faces in color images
1999
Cited by 30 (0 self)
A method for detecting human faces in color images is described that first separates skin regions from nonskin regions and then locates faces within skin regions. A chroma chart is prepared via a training process that contains the likelihoods of different colors representing the skin. Using the chroma chart, a color image is transformed into a gray-scale image in such a way that the gray value at a pixel shows the likelihood of the pixel representing the skin. The resulting gray-scale image is then segmented into skin and nonskin regions, and model faces representing front- and side-view faces are used in a template-matching process to detect faces within skin regions. The false-positive and false-negative errors of the proposed face-detection method on color images of size 300 × 220 and containing four or fewer faces are 0.04 or less.
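The chroma-chart step, turning a color image into a gray-scale skin-likelihood image by table lookup, might look roughly like the sketch below. The abstract does not say which chroma space or chart resolution is used, so the normalized red/green chroma, the 4 × 4 chart, and the function name are assumptions for illustration only.

```python
def skin_likelihood_image(image, chroma_chart, bins=4):
    """Map each RGB pixel to a gray value in [0, 255] giving its skin likelihood.

    chroma_chart[i][j] is a likelihood in [0, 1], indexed by quantized
    normalized-red and normalized-green chroma (a hypothetical layout).
    """
    out = []
    for row in image:
        out_row = []
        for r, g, b in row:
            s = r + g + b
            if s == 0:
                out_row.append(0)  # black pixel: chroma undefined, treat as nonskin
                continue
            i = min(int(r / s * bins), bins - 1)
            j = min(int(g / s * bins), bins - 1)
            out_row.append(round(255 * chroma_chart[i][j]))
        out.append(out_row)
    return out


# Hypothetical chart: only one chroma cell is considered likely to be skin.
chart = [[0.0] * 4 for _ in range(4)]
chart[1][1] = 0.8
image = [[(200, 150, 120), (0, 0, 255), (0, 0, 0)]]
gray = skin_likelihood_image(image, chart)
```

The resulting gray image could then be thresholded or otherwise segmented into skin and nonskin regions before template matching.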
Sequential Scalar Quantization Of Color Images
1994
Cited by 15 (3 self)
We propose an efficient algorithm for color image quantization based on a new VQ technique which we call sequential scalar quantization (SSQ). The scalar components of the 3-D color vector are individually quantized in a predetermined sequence. With this technique, the color palette is designed very efficiently, while pixel mapping is performed with no computation. In order to obtain an optimal allocation of quantization levels along each color coordinate, we appeal to asymptotic quantization theory, where the number of quantization levels is assumed to be very large. We modify this theory to suit our application, where the number of quantization levels is typically small. In order to utilize the properties of the human visual system (HVS), the quantization is performed in a luminance-chrominance color space. A luminance-chrominance weighting is introduced to account for the greater sensitivity of the HVS to luminance than to chrominance errors. A spatial activity measure is also incor...
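Quantizing the scalar components one at a time, in a fixed order, can be sketched as below. This is one plausible reading rather than the paper's design: it assumes that the thresholds for a later component may depend on the cell already chosen for the earlier ones, and the threshold values and the name `ssq_index` are hypothetical. The allocation of quantization levels, which the abstract derives from asymptotic quantization theory, is not modelled.

```python
from bisect import bisect_right


def ssq_index(color, t1, t2, t3):
    """Quantize a 3-component color one scalar at a time.

    t1       : sorted thresholds for the first component
    t2[i]    : sorted thresholds for the second component inside first-stage cell i
    t3[i][j] : sorted thresholds for the third component inside cell (i, j)
    Returns the cell index triple (i, j, k).
    """
    i = bisect_right(t1, color[0])
    j = bisect_right(t2[i], color[1])
    k = bisect_right(t3[i][j], color[2])
    return i, j, k


# Hypothetical thresholds: 2 first-stage cells, then 3 and 2 second-stage cells.
t1 = [128]
t2 = [[64, 192], [128]]
t3 = [[[128], [128], [128]], [[], [128]]]

cell_a = ssq_index((100, 200, 50), t1, t2, t3)
cell_b = ssq_index((200, 50, 255), t1, t2, t3)
```

Because each stage is a scalar comparison against a short sorted list, the mapping can also be precomputed into lookup tables, which is in the spirit of the abstract's remark about cheap pixel mapping.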
Efficient Luminance and Saturation Processing Techniques for Bypassing Color Coordinate Transformations
Journal of Visual Communication and Image Representation, 1995
Cited by 6 (2 self)
In many applications of color image processing, only modification of the luminance component is desired. However, the commonly used coordinate systems, such as HSI, LHS, and YIQ, are not perceptually orthogonal; that is, luminance modification can cause perceptual shifts in the hue and saturation. In this paper, we present a theoretical analysis of this phenomenon. Efficient techniques are developed for bypassing the costly coordinate transformations when only the luminance or only the saturation is to be modified. Experimental results using histogram equalization support the theoretical analysis.
I. Introduction
The RGB (red-green-blue) coordinate system is commonly used for representing digital color images. However, coordinate systems related to the human visual system's perceptual attributes (luminance, hue, and saturation) are often useful for processing color images. Much research has been done toward the development of color measurement techniques, but there is not any one col...
Midstream Content Access based on Colour Visual Pattern Coding
In Storage and Retrieval for Image and Video Databases VIII, volume 3972 of Proceedings of SPIE, 2000
Cited by 4 (4 self)
It has been argued that future image coding techniques should allow for "midstream access", i.e. allow image query, retrieval, and modification to proceed on the compressed representation [18]. In a recent work [1], we introduced the colour visual pattern image coding (CVPIC) technique for colour image compression. An image is divided into blocks and each block coded locally by mapping it to one of a predefined, universal set of visually significant image patterns consisting of representations for both edge and uniform (smooth) regions. The pattern and colour information is then stored, following a colour quantisation algorithm and an entropy encoding stage. Compression ratios between 40:1 and 60:1 were achieved whilst maintaining high image quality on a variety of natural colour images. It was also shown that CVPIC could achieve comparable performance to state-of-the-art techniques such as JPEG.
LCDs Versus CRTs - Color-Calibration and Gamut Considerations
2002
Cited by 4 (0 self)
This paper presents a comparative evaluation of liquid-crystal displays (LCDs) and cathode-ray tube (CRT) displays from a color-rendition and color-calibration perspective. Common display calibration models and assumptions are reviewed and their applicability to LCDs and CRTs is evaluated through an experimental study. The displays are compared with respect to the color-calibration accuracy, ease of calibration, and achievable color gamut. The offset, matrix, and tone-response correction model commonly employed for CRT color calibration is also suitable for color calibration of LCDs for most applications, though the calibration error for LCDs is higher. For the prototype LCDs used in the experimental study, large color variations significantly above the calibration accuracy are observed with changes in viewing angle. Under typical viewing conditions, LCDs provide a significantly larger color gamut than CRTs, primarily due to their higher luminances.
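A forward display model built from an offset, a matrix, and a tone response might be sketched as follows. The abstract names these three components but not their exact form, so the power-law (gamma) tone response, the 8-bit input range, the ordering of the steps, and the name `display_model` are all assumptions for illustration.

```python
def display_model(rgb, matrix, offset, gamma=2.2):
    """Predict a tristimulus triple from 8-bit RGB drive values.

    Steps (assumed): per-channel tone response -> 3x3 matrix -> additive offset.
    """
    # Tone response: assumed here to be a simple power law on normalized input.
    linear = [(v / 255.0) ** gamma for v in rgb]
    # Matrix mixes the linearized channels; the offset models the black level.
    return [
        sum(matrix[i][k] * linear[k] for k in range(3)) + offset[i]
        for i in range(3)
    ]


# Hypothetical parameters: identity primaries and a small black-level offset.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
black_level = [0.1, 0.2, 0.3]

black = display_model((0, 0, 0), identity, black_level)
white = display_model((255, 255, 255), identity, black_level)
```

With zero drive values the model returns the offset alone, which is why a non-zero black level matters when calibrating a display with such a model.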
The Role of Domain Knowledge in the Detection of Retinal Hard Exudates
In Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2001
Cited by 3 (0 self)
Diabetic retinopathy is a major cause of blindness in the world. Regular screening and timely intervention can halt or reverse the progression of this disease. Digital retinal imaging technologies have become an integral part of eye screening programs worldwide due to their greater accuracy and repeatability in staging diabetic retinopathy. These screening programs produce an enormous number of retinal images since diabetic patients typically have both their eyes examined at least once a year. Automated detection of retinal lesions can reduce the workload and increase the efficiency of doctors and other eye-care personnel reading the retinal images and facilitate the follow-up management of diabetic patients. Existing techniques to detect retinal lesions are neither adaptable nor sufficiently sensitive and specific for real-life screening applications. In this paper, we demonstrate the role of domain knowledge in improving the accuracy and robustness of detection of hard exudates in retinal images. Experiments on 543 consecutive retinal images of diabetic patients indicate that we are able to achieve 100% sensitivity and 74% specificity in the detection of hard exudates.
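For reference, the two reported figures follow the standard definitions below, where TP, FN, TN and FP denote counts of true positives, false negatives, true negatives and false positives. The abstract does not state the underlying counts, so none are given here.

```latex
\text{sensitivity} = \frac{TP}{TP + FN},
\qquad
\text{specificity} = \frac{TN}{TN + FP}
```

A sensitivity of 100% therefore means that no image containing hard exudates was missed ($FN = 0$), while a specificity of 74% means that roughly a quarter of the images without exudates were flagged.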

