Image classification for content-based indexing
- IEEE Transactions on Image Processing
, 2001
Cited by 118 (2 self)
Abstract: Grouping images into (semantically) meaningful categories using low-level visual features is a challenging and important problem in content-based image retrieval. Using binary Bayesian classifiers, we attempt to capture high-level concepts from low-level image features under the constraint that the test image does belong to one of the classes. Specifically, we consider the hierarchical classification of vacation images; at the highest level, images are classified as indoor or outdoor; outdoor images are further classified as city or landscape; finally, a subset of landscape images is classified into sunset, forest, and mountain classes. We demonstrate that a small vector quantizer (whose optimal size is selected using a modified MDL criterion) can be used to model the class-conditional densities of the features, required by the Bayesian methodology. The classifiers have been designed and evaluated on a database of 6931 vacation photographs. Our system achieved a classification accuracy of 90.5% for indoor/outdoor, 95.3% for city/landscape, 96.6% for sunset/forest & mountain, and 96% for forest/mountain classification problems. We further develop a learning method to incrementally train the classifiers as additional data become available. We also show preliminary results for feature reduction using clustering techniques. Our goal is to combine multiple two-class classifiers into a single hierarchical classifier. Index Terms: Bayesian methods, content-based retrieval, digital libraries, image content analysis, minimum description length, semantic
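The hierarchy of two-class Bayesian decisions described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the equal-weight Gaussian mixture centred on codebook vectors, the fixed `sigma`, the node names, and the toy codebooks are all assumptions made for the example, and the MDL-based codebook sizing is not shown.

```python
import math


def vq_density(x, codebook, sigma=1.0):
    """Approximate p(x | class) as an equal-weight mixture of isotropic
    Gaussians centred on the class's codebook vectors (assumed form)."""
    dim = len(x)
    norm = (2.0 * math.pi * sigma ** 2) ** (dim / 2.0)
    total = 0.0
    for centre in codebook:
        sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, centre))
        total += math.exp(-sq_dist / (2.0 * sigma ** 2)) / norm
    return total / len(codebook)


def bayes_binary(x, model):
    """Return the label maximising prior * class-conditional density.
    `model` maps label -> (prior, codebook)."""
    best_label, best_score = None, -1.0
    for label, (prior, codebook) in model.items():
        score = prior * vq_density(x, codebook)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def classify_hierarchy(features, tree, root):
    """Walk the tree of two-class classifiers until a leaf label is reached.
    `features[node]` is the feature vector used by the classifier at `node`."""
    node, path = root, []
    while node in tree:
        node = bayes_binary(features[node], tree[node])
        path.append(node)
    return path


# Hypothetical toy hierarchy: indoor/outdoor, then city/landscape.
toy_tree = {
    "image": {"indoor": (0.5, [(0.0, 0.0)]), "outdoor": (0.5, [(4.0, 4.0)])},
    "outdoor": {"city": (0.5, [(0.0,)]), "landscape": (0.5, [(3.0,)])},
}
toy_features = {"image": (3.5, 4.2), "outdoor": (2.8,)}
toy_path = classify_hierarchy(toy_features, toy_tree, "image")
```

Each node may use a different feature vector, which matches the abstract's point that each two-class problem is solved separately before the classifiers are combined.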
A Fully Automated Content-Based Video Search Engine Supporting Spatiotemporal Queries
- IEEE Transactions on Circuits and Systems for Video Technology
, 1998
Cited by 85 (4 self)
The rapidity with which digital information, particularly video, is being generated has necessitated the development of tools for efficient search of these media. Content-based visual queries have been primarily focused on still image retrieval. In this paper, we propose a novel, interactive system on the Web, based on the visual paradigm, with spatiotemporal attributes playing a key role in video retrieval. We have developed innovative algorithms for automated video object segmentation and tracking, and use real-time video editing techniques while responding to user queries. The resulting system, called VideoQ (demo available at http://www.ctr.columbia.edu/VideoQ/), is the first on-line video search engine supporting automatic object-based indexing and spatiotemporal queries. The system performs well, with the user being able to retrieve complex video clips such as those of skiers and baseball players with ease. Index Terms: Content based, information retrieval, object oriented, spat...
Indexing Animated Objects Using Spatiotemporal Access Methods
- IEEE Transactions on Knowledge and Data Engineering
, 2001
Cited by 45 (7 self)
Abstract: We present a new approach for indexing animated objects and efficiently answering queries about their position in time and space. In particular, we consider an animated movie as a spatiotemporal evolution. A movie is viewed as an ordered sequence of frames, where each frame is a 2D space occupied by the objects that appear in that frame. The queries of interest are range queries of the form, "find the objects that appear in area S between frames f_i and f_j" as well as nearest neighbor queries such as, "find the q nearest objects to a given position A between frames f_i and f_j." The straightforward approach to index such objects considers the frame sequence as another dimension and uses a 3D access method (such as, an R-Tree or its variants). This, however, assigns long "lifetime" intervals to objects that appear through many consecutive frames. Long intervals are difficult to cluster efficiently in a 3D index. Instead, we propose to reduce the problem to a partial-persistence problem. Namely, we use a 2D access method that is made partially persistent. We show that this approach leads to faster query performance while still using storage proportional to the total number of changes in the frame evolution. What differentiates this problem from traditional temporal indexing approaches is that objects are allowed to move and/or change their extent continuously between frames. We present novel methods to approximate such object evolutions. We formulate an optimization problem for which we provide an optimal solution for the case where objects move linearly. Finally, we present an extensive experimental study of the proposed methods. While we concentrate on animated movies, our approach is general and can be applied to other spatiotemporal applications as well. Index Terms: Access methods, spatiotemporal databases, animated objects, multimedia.
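To make the range query in this abstract concrete, the sketch below treats each object as a 3D box (a 2D rectangle plus a frame interval), which is the "straightforward approach" the abstract mentions. It is only an illustration of the query semantics: a linear scan stands in for the R-Tree, objects are assumed not to move during their lifetime, and the `Box3D` type and its field names are hypothetical. The partially persistent 2D index the paper proposes is not shown.

```python
from typing import List, NamedTuple


class Box3D(NamedTuple):
    """A rectangle [x_lo, x_hi] x [y_lo, y_hi] that exists in frames f_lo..f_hi."""
    name: str
    x_lo: float
    x_hi: float
    y_lo: float
    y_hi: float
    f_lo: int  # first frame in which the object appears
    f_hi: int  # last frame in which the object appears


def range_query(objects: List[Box3D],
                x_lo: float, x_hi: float,
                y_lo: float, y_hi: float,
                f_i: int, f_j: int) -> List[str]:
    """Return names of objects overlapping area S = [x_lo, x_hi] x [y_lo, y_hi]
    at some frame between f_i and f_j (inclusive). Linear scan, no index."""
    hits = []
    for o in objects:
        overlaps_space = (o.x_lo <= x_hi and x_lo <= o.x_hi and
                          o.y_lo <= y_hi and y_lo <= o.y_hi)
        overlaps_time = o.f_lo <= f_j and f_i <= o.f_hi
        if overlaps_space and overlaps_time:
            hits.append(o.name)
    return hits
```

An object visible for hundreds of frames becomes a long, thin box along the frame axis in this representation, which is the clustering difficulty the abstract attributes to the 3D approach.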
A Stochastic Framework For Optimal Key Frame . . . MPEG Video Databases
- Computer Vision and Image Understanding
, 1999
Cited by 39 (28 self)
A framework for video content representation is proposed in this paper for extracting limited, but meaningful, information from video data directly in the MPEG compressed domain. First, the traditional frame-based representation is transformed to a feature-based one. Then, all features are gathered together using a fuzzy formulation, and several key frames are extracted for each shot in a content-based rate sampling framework. In particular, our approach is based on minimization of a cross-correlation criterion among the video frames of a given shot, so as to locate a set of minimally correlated feature vectors. Experimental results indicating the good performance of the proposed scheme are also presented.
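The idea of picking key frames whose feature vectors are minimally correlated can be sketched as below. This is an assumption-laden illustration rather than the paper's method: the cost used here (sum of squared Pearson correlations over all selected pairs) is one plausible reading of the "cross-correlation criterion", and an exhaustive search over frame subsets replaces whatever search strategy the authors use, so it is only practical for short shots.

```python
import itertools
import math


def correlation(u, v):
    """Pearson correlation of two equal-length feature vectors.
    Returns 0.0 when either vector has zero variance."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return cov / (norm_u * norm_v)


def select_key_frames(features, k):
    """Return the indices of the k frames whose feature vectors have the
    smallest total squared pairwise correlation (brute-force search)."""
    if not 1 <= k <= len(features):
        raise ValueError("k must be between 1 and the number of frames")
    best, best_cost = None, float("inf")
    for subset in itertools.combinations(range(len(features)), k):
        cost = sum(correlation(features[i], features[j]) ** 2
                   for i, j in itertools.combinations(subset, 2))
        if cost < best_cost:
            best, best_cost = subset, cost
    return list(best)
```

The number of candidate subsets grows combinatorially with the shot length, which is one reason a practical system would need a cheaper search than this brute-force loop.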
Joining Ranked Inputs in Practice
, 2002
Cited by 37 (9 self)
Joining ranked inputs is an essential requirement for many database applications, such as ranking search results from multiple search engines and answering multi-feature queries for multimedia retrieval systems. We introduce a new practical pipelined query operator, termed NRA-RJ, that produces a global rank from input ranked streams based on a score function. The output of NRA-RJ can serve as a valid input to other NRA-RJ operators in the query pipeline. Hence, the NRA-RJ operator can support a hierarchy of join operations and can be easily integrated in query processing engines of commercial database systems.
On Image Classification: City Images vs. Landscapes
- Pattern Recognition
, 1998
Cited by 37 (1 self)
Grouping images into semantically meaningful categories using low-level visual features is a challenging and important problem in content-based image retrieval. Based on these groupings, effective indices can be built for an image database. In this paper, we show how a specific high-level classification problem (city images vs. landscapes) can be solved from relatively simple low-level features geared for the particular classes. We have developed a procedure to qualitatively measure the saliency of a feature towards a classification problem based on the plot of the intra-class and inter-class distance distributions. We use this approach to determine the discriminative power of the following features: color histogram, color coherence vector, DCT coefficient, edge direction histogram, and edge direction coherence vector. We determine that the edge direction-based features have the most discriminative power for the classification problem of interest here. A weighted k-NN classifier is use...
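A weighted k-NN classifier of the kind mentioned at the end of this abstract can be sketched as follows. The abstract is truncated before it gives any details, so everything specific here is an assumption: Euclidean distance, inverse-distance vote weights, the `eps` smoothing term, and the toy feature vectors are illustrative choices only.

```python
import math
from collections import defaultdict


def weighted_knn(train, query, k=3, eps=1e-9):
    """Classify `query` by a distance-weighted vote among its k nearest
    training samples. `train` is a list of (feature_vector, label) pairs.
    Each neighbour votes with weight 1 / (distance + eps) (assumed scheme)."""
    neighbours = sorted(
        ((math.dist(x, query), label) for x, label in train),
        key=lambda pair: pair[0],
    )[:k]
    votes = defaultdict(float)
    for distance, label in neighbours:
        votes[label] += 1.0 / (distance + eps)
    return max(votes, key=votes.get)
```

With edge-direction features in place of the toy vectors, `train` would hold one feature vector per labelled city or landscape image.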
Automatic detection of human nudes
- International Journal of Computer Vision
, 1999
Cited by 29 (3 self)
This paper demonstrates an automatic system for telling whether there are human nudes present in an image. The system marks skin-like pixels using combined color and texture properties. These skin regions are then fed to a specialized grouper, which attempts to group a human figure using geometric constraints on human structure. If the grouper finds a sufficiently complex structure, the system decides a human is present. The approach is shown to be effective for a wide range of shades and colors of skin and human configurations. This approach offers an alternate view of object recognition, where an object model is an organized collection of grouping hints obtained from a combination of constraints on color and texture and constraints on geometric properties such as the structure of individual parts and the relationships between parts. The system demonstrates excellent performance on a test set of 565 uncontrolled images of human nudes, mostly obtained from the internet, and 4289 assorted control images, drawn from a wide variety of sources.
A Fuzzy Video Content Representation For Video Summarization And Content-Based Retrieval
- Signal Processing
, 1997
Cited by 23 (19 self)
In this paper, a fuzzy representation of visual content is proposed, which is useful for emerging multimedia applications such as content-based image indexing and retrieval, video browsing, and summarization. In particular, a multidimensional fuzzy histogram is constructed for each video frame based on a collection of appropriate features, extracted using video sequence analysis techniques. This approach is then applied both for video summarization, in the context of a content-based sampling algorithm, and for content-based indexing and retrieval. In the first case, video summarization is accomplished by discarding shots or frames of similar visual content so that only a small but meaningful amount of information is retained (key-frames). In the second case, a content-based retrieval scheme is investigated, so that the most similar images to a query are extracted. Experimental results and comparison with other known methods are presented to indicate the good performance of the proposed scheme on real-life video recordings. © 2000 Elsevier Science B.V. All rights reserved.
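A fuzzy histogram differs from an ordinary one in that each sample contributes to several bins according to its degree of membership. The sketch below shows this in one dimension only, whereas the abstract describes a multidimensional histogram. The triangular membership functions, the clamping of out-of-range values, and the normalisation to unit sum are assumptions chosen for illustration, not details taken from the paper.

```python
def fuzzy_histogram(values, n_bins, lo=0.0, hi=1.0):
    """Build a normalised 1D fuzzy histogram.

    Bin centres are spaced evenly from `lo` to `hi`. Each value adds
    max(0, 1 - |value - centre| / step) to every bin, so it is shared
    between the (at most two) nearest bins instead of falling into one."""
    if n_bins < 2:
        raise ValueError("n_bins must be at least 2")
    step = (hi - lo) / (n_bins - 1)
    centres = [lo + i * step for i in range(n_bins)]
    hist = [0.0] * n_bins
    for v in values:
        v = min(max(v, lo), hi)  # clamp out-of-range samples (assumption)
        for i, centre in enumerate(centres):
            hist[i] += max(0.0, 1.0 - abs(v - centre) / step)
    total = sum(hist)
    return [h / total for h in hist] if total else hist
```

A value lying halfway between two bin centres is split evenly between them, so small changes in a feature shift the histogram gradually rather than moving a whole count from one bin to the next.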
ClassView: Hierarchical Video Shot Classification, Indexing, and Accessing
- IEEE Trans. on Multimedia
, 2004
Cited by 21 (4 self)
Recent advances in digital video compression and networks have made video more accessible than ever. However, existing content-based video retrieval systems still suffer from the following problems: 1) the semantics-sensitive video classification problem, caused by the semantic gap between low-level visual features and high-level semantic visual concepts; 2) the integrated video access problem, caused by the lack of efficient video database indexing, automatic video annotation, and concept-oriented summary organization techniques. In this paper, we propose a novel framework, called ClassView, to make some advances toward more efficient video database indexing and access. 1) A hierarchical semantics-sensitive video classifier is proposed to shorten the semantic gap. The hierarchical tree structure of the semantics-sensitive video classifier is derived from the domain-dependent concept hierarchy of video contents in a database. Relevance analysis is used for selecting the discriminating visual features with suitable importance. The Expectation-Maximization (EM) algorithm is also used to determine the classification rule for each visual concept node in the classifier. 2) A hierarchical video database indexing and summary presentation technique is proposed to support more effective video access over a large-scale database. The hierarchical tree structure of our video database indexing scheme is determined by the domain-dependent concept hierarchy, which is also used for video classification. The presentation of the visual summary is also integrated with the inherent hierarchical video database indexing tree structure. Integrating video access with an efficient database indexing tree structure provides a great opportunity for supporting more powerful video search engines.

