Results 1 - 10 of 19
Content-based image retrieval at the end of the early years
- IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
, 2000
Cited by 873 (16 self)
The paper presents a review of 200 references in content-based image retrieval. The paper starts with discussing the working conditions of content-based retrieval: patterns of use, types of pictures, the role of semantics, and the sensory gap. Subsequent sections discuss computational steps for image retrieval systems. Step one of the review is image processing for retrieval sorted by color, texture, and local geometry. Features for retrieval are discussed next, sorted by: accumulative and global features, salient points, object and shape features, signs, and structural combinations thereof. Similarity of pictures and objects in pictures is reviewed for each of the feature types, in close connection to the types and means of feedback the user of the systems is capable of giving by interaction. We briefly discuss aspects of system engineering: databases, system architecture, and evaluation. In the concluding section, we present our view on: the driving force of the field, the heritage from computer vision, the influence on computer vision, the role of similarity and of interaction, the need for databases, the problem of evaluation, and the role of the semantic gap.
Medianet: A Multimedia Information Network for Knowledge Representation
, 2000
Cited by 41 (12 self)
In this paper, we present MediaNet, which is a knowledge representation framework that uses multimedia content for representing semantic and perceptual information. The main components of MediaNet include conceptual entities, which correspond to real-world objects, and relationships among concepts. MediaNet allows the concepts and relationships to be defined or exemplified by multimedia content such as images, video, audio, graphics, and text. MediaNet models the traditional relationship types such as generalization and aggregation but adds functionality by modeling perceptual relationships based on feature similarity. For example, MediaNet allows a concept such as "car" to be defined as a type of "transportation vehicle", and then further defined and illustrated through example images, videos and sounds of cars. In constructing the MediaNet framework, we have built on the basic principles of semiotics and semantic networks in addition to utilizing the audio-visual content description framework being developed as part of the MPEG-7 multimedia content description standard.
Narrowing the Semantic Gap - Improved Text-Based Web Document Retrieval Using Visual Features
- IEEE Transactions on Multimedia
, 2002
Cited by 41 (2 self)
In this paper, we present the results of our work that seeks to negotiate the gap between low-level features and high-level concepts in the domain of web document retrieval. This work concerns a technique, Latent Semantic Indexing (LSI), which has been used for textual information retrieval for many years. In this environment, LSI is used to determine clusters of co-occurring keywords, sometimes called concepts, so that a query which uses a particular keyword can then retrieve documents perhaps not containing this keyword, but containing other keywords from the same cluster. In this paper, we examine the use of this technique for content-based web document retrieval, using both keywords and image features to represent the documents. Two different approaches to image feature representation, namely, color histograms and color anglograms, are adopted and evaluated. Experimental results show that LSI, together with both textual and visual features, is able to extract the underlying semantic structure of web documents, thus helping to improve the retrieval performance significantly.
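The retrieval behaviour described in this abstract (a keyword query matching documents that lack the keyword but share its cluster) can be illustrated with a minimal LSI sketch. Everything here is an assumption made for illustration: the toy feature names, the 0/1 feature-document matrix, the choice of rank 2, and the use of NumPy's SVD. The rows `hist_red` and `hist_green` stand in for quantized color-histogram features and are not taken from the paper.

```python
import numpy as np

# Hypothetical features: five keywords plus two quantized color-histogram bins.
features = ["car", "automobile", "engine", "flower", "petal",
            "hist_red", "hist_green"]

# Toy feature-by-document matrix (rows = features, columns = documents d0..d3).
A = np.array([
    [1, 0, 0, 0],  # car
    [0, 1, 0, 0],  # automobile
    [1, 1, 0, 0],  # engine
    [0, 0, 1, 1],  # flower
    [0, 0, 1, 1],  # petal
    [1, 1, 0, 0],  # hist_red
    [0, 0, 1, 1],  # hist_green
], dtype=float)


def cosine(u, v):
    """Cosine similarity, defined as 0 when either vector is all zeros."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0


def lsi(matrix, k):
    """Truncated SVD: returns U_k, the top-k singular values, and document coordinates."""
    U, s, Vt = np.linalg.svd(matrix, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :].T


def fold_in(query, U_k, s_k):
    """Project a query vector into the k-dimensional latent space."""
    return (query @ U_k) / s_k


U_k, s_k, docs_k = lsi(A, k=2)

# Query containing only the keyword "car".
q = np.zeros(len(features))
q[features.index("car")] = 1.0

raw_sims = [cosine(q, A[:, j]) for j in range(A.shape[1])]
q_k = fold_in(q, U_k, s_k)
lsi_sims = [cosine(q_k, docs_k[j]) for j in range(A.shape[1])]
```

In the raw feature space, document d1 (which mentions "automobile" but not "car") has zero similarity to the query. In the rank-2 latent space it scores as high as d0, because "car", "automobile", "engine" and `hist_red` collapse onto one latent dimension.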
Negotiating the Semantic Gap: From Feature Maps to Semantic Landscapes
Cited by 26 (1 self)
In this paper, we present the results of our work that seeks to negotiate the gap between low-level features and high-level concepts in the domain of web document retrieval. This work concerns a technique, latent semantic indexing (LSI), which has been used for textual information retrieval for many years. In this environment, LSI determines clusters of co-occurring keywords, sometimes called concepts, so that a query which uses a particular keyword can then retrieve documents perhaps not containing this keyword, but containing other keywords from the same cluster. In this paper, we examine the use of this technique for content-based web document retrieval, using both keywords and image features to represent the documents.
On image retrieval using salient regions with vector-spaces and latent semantics
- In Image and Video Retrieval: Third International Conference (CIVR
, 2005
Cited by 13 (6 self)
The vector-space retrieval model and Latent Semantic Indexing approaches to retrieval have been used heavily in the field of text information retrieval over the past years. The use of these approaches in image retrieval, however, has been somewhat limited. In this paper, we present methods for using these techniques in combination with an invariant image representation based on local descriptors of salient regions. The paper also presents an evaluation in which the two techniques are used to find images with similar semantic labels.
Bridging the Semantic Gap in Image Retrieval
- Distributed Multimedia Databases: Techniques and Applications
Cited by 9 (0 self)
the gap between them is still a huge barrier in front of researchers. Intuitive and heuristic approaches do not provide satisfactory performance. Therefore, there is an urgent need to find the latent correlation between low-level features and high-level concepts and to merge them from a different perspective. How to find this new perspective and bridge the gap between visual features and semantic features has been a major challenge in this research field. Our paper addresses these issues. Introduction: The emergence of multimedia technology and the rapidly expanding image and video collections on the Internet have attracted significant research efforts in providing tools for effective retrieval and management of visual data. Image retrieval is based on the availability of a representation scheme of image content. Image content descriptors may be visual features such as color, texture, shape, and spa
Region-based video content indexing and retrieval
- in CBMI 2005, Fourth International Workshop on Content-Based Multimedia Indexing
Cited by 8 (0 self)
In this paper we propose to compare two region-based approaches to content-based video indexing and retrieval. Namely, a comparison of a system using the Earth Mover’s Distance and a system using Latent Semantic Indexing is provided. Region-based methods make it possible to keep local information in a way that reflects human perception of the content. Thus, they are very attractive for designing efficient Content-Based Video Retrieval systems. We presented a region-based approach using Latent Semantic Indexing (LSI) in previous work. We now compare the performance of our system with a method using the Earth Mover’s Distance, which has the property of keeping the original features describing regions. This paper shows that LSA performs better on the task of object retrieval despite the quantization process implied.
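For readers unfamiliar with the Earth Mover's Distance, the following is a minimal sketch of its simplest special case: two one-dimensional histograms with equal total mass and a ground distance of 1 between adjacent bins. In that case the distance reduces to the sum of absolute cumulative differences. This is an illustration only; the system in the abstract compares sets of region features, which in general requires solving a transportation problem, and the function name `emd_1d` is hypothetical.

```python
from itertools import accumulate


def emd_1d(p, q):
    """Earth Mover's Distance between two 1-D histograms of equal total mass.

    Assumes a ground distance of 1 between adjacent bins, so the cost equals
    the total amount of mass that has to cross each bin boundary.
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    if abs(sum(p) - sum(q)) > 1e-9:
        raise ValueError("histograms must have the same total mass")
    differences = [a - b for a, b in zip(p, q)]
    return sum(abs(c) for c in accumulate(differences))


# Moving all mass from the first bin to the third crosses two boundaries.
far_apart = emd_1d([1, 0, 0], [0, 0, 1])
identical = emd_1d([1, 0, 0], [1, 0, 0])
shifted = emd_1d([0.5, 0.5, 0], [0, 0.5, 0.5])
```

Unlike a bin-by-bin comparison, this distance grows with how far mass has to move, which is why it can work on the original features without first mapping them to a fixed vocabulary.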
Latent Semantic Analysis for an Effective Region-Based Video Shot Retrieval System
- In Proceedings of the ACM International Workshop on Multimedia Information Retrieval
, 2004
Cited by 5 (4 self)
We present a complete and efficient framework for video shot indexing and retrieval. Video shots are described by their key-frames, themselves described by their regions. Region-based approaches suffer from the complexity of segmentation and comparison tasks. A compact region-based shot representation is usually obtained thanks to a vector-quantization method. We thus introduce LSA to reduce the noise inherent to the segmentation and the quantization processes. Then to better capture the content of video shots, we propose two original methods. The first takes advantage of a multi-scale segmentation of frames while the second uses multiple frames to represent a shot. Both approaches require more computation time during the pre-processing but not for indexing and comparison tasks. Indeed the extra information is included in the original signatures of shots. Finally we introduce a relevance feedback loop to optimize the search and propose a new method to optimize the effect of LSA. In the experimental section, we make an evaluation of latent semantic analysis and proposed approaches on two problems, namely object retrieval and semantic content estimation.
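The vector-quantization step mentioned in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the two-dimensional region descriptors, the three-entry codebook, and the function names `nearest_codeword` and `shot_signature` are all hypothetical. Each region is assigned to its nearest codeword, and the shot signature is the count of regions per codeword.

```python
import math
from collections import Counter


def nearest_codeword(descriptor, codebook):
    """Index of the codebook entry closest to the descriptor (Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda i: math.dist(descriptor, codebook[i]))


def shot_signature(region_descriptors, codebook):
    """Histogram of codeword occurrences over the regions of a shot."""
    counts = Counter(nearest_codeword(d, codebook) for d in region_descriptors)
    return [counts.get(i, 0) for i in range(len(codebook))]


# Hypothetical codebook and region descriptors for one key-frame.
codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
regions = [(0.1, 0.0), (0.9, 1.1), (0.2, 0.1), (0.0, 0.8)]

signature = shot_signature(regions, codebook)
```

Signatures of this kind, stacked as columns of a codeword-by-shot matrix, are the sort of input to which a latent semantic analysis could then be applied to smooth out segmentation and quantization noise.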
Ontology based visual information processing
, 2004
Cited by 3 (3 self)
Despite the dramatic growth of digital image and video data in recent years, many challenges remain in enabling computers to interpret visual content. This thesis addresses the problem of how high-level representations of visual data may be automatically derived by integrating different kinds of evidence and incorporating prior knowledge. A central contribution of the thesis is the development and realisation of an inference framework for computer vision founded on the application of ontologies and ontological languages as a novel methodology for active knowledge representation. Three applications of the approach are presented: The first describes a novel query paradigm called OQUEL (ontological query language) for the content-based retrieval of images. The language is based on an extensible ontology which encompasses both high-level and low-level visual properties and relations. Query sentences are prescriptions of desired image content in terms of English language words. The second application extends the notion of ontological query languages to
Dynamic Learning from Multiple Examples for Semantic Object Segmentation and Search
- Computer Vision and Image Understanding
, 2004
Cited by 2 (0 self)
We present a novel “dynamic learning” approach for an intelligent image database system to automatically improve object segmentation and labeling without user intervention, as new examples become available, for object-based indexing. The proposed approach is an extension of our earlier work on “learning by example,” which addressed labeling of similar objects in a set of database images based on a single example. The proposed dynamic learning procedure utilizes multiple example object templates to improve the accuracy of existing object segmentations and labels. Multiple example templates may be images of the same object from different viewing angles, or images of related objects. This paper also introduces a new shape similarity metric called Normalized Area of Symmetric Differences (NASD), which has desired properties for use in the proposed “dynamic learning” scheme, and is more robust against boundary noise that results from automatic image segmentation. Performance of the dynamic learning procedures has
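The abstract names the NASD metric but does not give its formula, so the following is only a sketch of one plausible reading: the area of the symmetric difference of two binary shapes, divided by the sum of their areas. That normalization is an assumption chosen so the score falls in [0, 1]; the paper's actual definition (including any alignment step before comparison) may differ. Shapes are represented as sets of pixel coordinates, and the function name is hypothetical.

```python
def normalized_symmetric_difference(shape_a, shape_b):
    """Area of the symmetric difference divided by the sum of both areas.

    ASSUMPTION: this normalization is illustrative and is not taken from the
    paper. Returns 0.0 for identical shapes and 1.0 for disjoint shapes.
    """
    total_area = len(shape_a) + len(shape_b)
    if total_area == 0:
        return 0.0  # two empty shapes are treated as identical
    return len(shape_a ^ shape_b) / total_area


# Two 2x2 squares of pixels that overlap in one column.
square_a = {(0, 0), (0, 1), (1, 0), (1, 1)}
square_b = {(0, 1), (1, 1), (0, 2), (1, 2)}
square_far = {(5, 5), (5, 6), (6, 5), (6, 6)}

same = normalized_symmetric_difference(square_a, square_a)
partial = normalized_symmetric_difference(square_a, square_b)
disjoint = normalized_symmetric_difference(square_a, square_far)
```

Because the score is driven by area rather than by the boundary curve, a few misplaced boundary pixels change it only slightly, which is consistent with the robustness to boundary noise claimed in the abstract.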

