Emergent Semantics Through Interaction in Image Databases
IEEE Transactions on Knowledge and Data Engineering, 2001
Cited by 58 (6 self)
In this paper we briefly discuss some aspects of image semantics and the role they play in the design of image databases. We argue that images do not have an intrinsic meaning, but are endowed with meaning by being placed in the context of other images and through user interaction. From this observation, we conclude that in an image database users should be allowed to manipulate not only the individual images, but also the relations between them. We present an interface model based on the manipulation of configurations of images.
1 Introduction
In this paper we propose some new ideas on image semantics, and study some of their consequences for the interaction with, and the organization of, image databases. Many current Content-Based Image Retrieval (CBIR) systems follow a semantic model derived from traditional databases, according to which the meaning of a record is a compositional function of its syntactic structure and of the meaning of its elementary constituents. W...
PicSOM—Self-organizing image retrieval with MPEG-7 content descriptions
Networks, Special Issue on Intelligent Multimedia Processing, 2002
Cited by 55 (35 self)
Development of content-based image retrieval (CBIR) techniques has suffered from the lack of standardized ways for describing visual image content. Luckily, the MPEG-7, or formally “Moving Pictures Expert Group Multimedia Content Description Interface” international standard is now emerging as both a general framework for content description and a collection of specific agreed-upon content descriptors. We have developed a neural, self-organizing technique for CBIR. Our system is named PicSOM and it is based on pictorial examples and relevance feedback (RF). The name stems from “picture” and the self-organizing map (SOM). The PicSOM system is implemented by using tree structured SOMs. In this paper, we apply the visual content descriptors provided by MPEG-7 in the PicSOM system and compare our own image indexing technique with a reference system based on vector quantization (VQ). The results of our experiments show that the MPEG-7-defined content descriptors can be used as such in the PicSOM system even though Euclidean distance calculation, inherently used in the PicSOM system, is not optimal for all of them. Also, the results indicate that the PicSOM technique is a bit slower than the reference system in starting to find relevant images. However, when the strong RF mechanism of PicSOM begins to function, its retrieval precision exceeds that of the reference system.
Index Terms: content-based image retrieval (CBIR), MPEG-7, query by pictorial example (QBPE), relevance feedback (RF), self-organizing map (SOM), visual content description.
Learning Similarity Measure for Natural Image Retrieval With Relevance Feedback
IEEE Transactions on Neural Networks, 2002
Cited by 37 (2 self)
A new scheme for learning a similarity measure is proposed for content-based image retrieval (CBIR). It learns a boundary that separates the images in the database into two clusters. Images inside the boundary are ranked by their Euclidean distances to the query. The scheme is called the constrained similarity measure (CSM); it not only takes into consideration the perceptual similarity between images, but also significantly improves the retrieval performance of the Euclidean distance measure. Two machine learning techniques, support vector machine (SVM) and AdaBoost, are used to learn the boundary and are compared to show how their learned boundaries differ. The positive and negative examples used to learn the boundary are provided by the user through relevance feedback. The CSM metric is evaluated on a large database of 10,009 natural images with an accurate ground truth. Experimental results demonstrate the usefulness and effectiveness of the proposed similarity measure for image retrieval.
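The two-stage idea in this abstract (first keep only the images inside a learned boundary, then rank the survivors by Euclidean distance to the query) can be sketched as follows. This is a minimal illustration, not the paper's method: the abstract learns the boundary with SVM or AdaBoost, whereas the sketch swaps in a much simpler nearest-centroid rule so that it stays self-contained. All function names and the toy feature vectors are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def make_boundary(positives, negatives):
    """Stand-in boundary learner (assumption: NOT SVM or AdaBoost).

    An image counts as inside the boundary when it is at least as close
    to the centroid of the positive feedback examples as to the centroid
    of the negative ones.
    """
    pos_c, neg_c = centroid(positives), centroid(negatives)

    def inside(vector):
        return euclidean(vector, pos_c) <= euclidean(vector, neg_c)

    return inside


def constrained_rank(query, database, inside):
    """Drop images outside the boundary, then rank the rest by distance."""
    kept = [(name, vec) for name, vec in database.items() if inside(vec)]
    kept.sort(key=lambda item: euclidean(query, item[1]))
    return [name for name, _ in kept]


if __name__ == "__main__":
    database = {"a": [0, 0], "b": [1, 0], "c": [10, 10], "d": [0.5, 0.5]}
    inside = make_boundary(positives=[[0, 0], [1, 1]],
                           negatives=[[9, 9], [10, 10]])
    print(constrained_rank([0, 0], database, inside))
```

In a fuller version, `make_boundary` would be replaced by a classifier trained on the user's relevance feedback, while the ranking step would stay unchanged.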
CLUE: Cluster-based Retrieval of Images by Unsupervised Learning
IEEE Transactions on Image Processing, 2003
Cited by 34 (2 self)
In a typical content-based image retrieval (CBIR) system, query results are a set of images sorted by feature similarities with respect to the query. However, images with high feature similarities to the query may be very different from the query in terms of semantics. This discrepancy between low-level features and high-level concepts is known as the semantic gap. This paper introduces a novel image retrieval scheme, CLUster-based rEtrieval of images by unsupervised learning (CLUE), which attempts to tackle the semantic gap problem based on the hypothesis that images of the same semantics are similar in a way, while images of different semantics are different in their own ways. CLUE attempts to capture semantic concepts by learning the way that images of the same semantics are similar and retrieving image clusters instead of a set of ordered images. Clustering in CLUE is dynamic. In particular, the clusters formed depend on which images are retrieved in response to the query. Therefore, the clusters give the algorithm as well as the users semantically relevant clues as to where to navigate. CLUE is a general approach that can be combined with any real-valued symmetric similarity measure (metric or nonmetric). Thus it may be embedded in many current CBIR systems. An experimental image retrieval system using CLUE has been implemented. The performance of the system is evaluated on a database of about 60,000 images from COREL. Empirical results demonstrate improved performance compared with a typical CBIR system using the same image similarity measure. In addition, preliminary results on images returned by Google's Image Search reveal the potential of applying CLUE to real-world image data and integrating CLUE as a part of the interface for keyword-based image retrieval systems.
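The "dynamic clustering" idea described here (retrieve the images nearest to the query first, then cluster only those retrieved images and return the clusters) can be illustrated with a small sketch. Two things are assumptions made for brevity: Euclidean distance stands in for the similarity measure, and a simple distance-threshold grouping stands in for whatever clustering algorithm the paper actually uses, which the abstract does not specify. Function names and data are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve_neighbors(query, database, k):
    """Names of the k database images closest to the query."""
    ranked = sorted(database, key=lambda name: euclidean(query, database[name]))
    return ranked[:k]


def cluster_by_threshold(names, database, threshold):
    """Group retrieved images into connected components.

    Two images are linked when their distance is at most `threshold`.
    This is a stand-in clustering step, chosen only for simplicity.
    """
    clusters = []
    unvisited = list(names)
    while unvisited:
        seed = unvisited.pop(0)
        cluster, frontier = [seed], [seed]
        while frontier:
            current = frontier.pop()
            close = [n for n in unvisited
                     if euclidean(database[current], database[n]) <= threshold]
            for name in close:
                unvisited.remove(name)
                cluster.append(name)
                frontier.append(name)
        clusters.append(cluster)
    return clusters


if __name__ == "__main__":
    database = {"a": [0, 0], "b": [0, 1], "c": [5, 5],
                "d": [5, 6], "e": [50, 50]}
    retrieved = retrieve_neighbors([0, 0], database, k=4)
    print(cluster_by_threshold(retrieved, database, threshold=1.5))
```

Because clustering runs only on the retrieved neighborhood, a different query yields a different set of clusters, which is the sense in which the abstract calls the clustering dynamic.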
Efficient Video Similarity Measurement with Video Signature
IEEE Transactions on Circuits and Systems for Video Technology, 2003
Cited by 32 (5 self)
The proliferation of video content on the web makes similarity detection an indispensable tool in web data management, searching, and navigation. In this paper, we propose a number of algorithms to efficiently measure video similarity. We define video as a set of frames, which are represented as high-dimensional vectors in a feature space. Our goal is to measure Ideal Video Similarity (IVS), defined as the percentage of clusters of similar frames shared between two video sequences. Since IVS is too complex to be deployed in large database applications, we approximate it with Voronoi Video Similarity (VVS), defined as the volume of the intersection between Voronoi cells of similar clusters. We propose a class of randomized algorithms to estimate VVS by first summarizing each video with a small set of its sampled frames, called the Video Signature (ViSig), and then calculating the distances between corresponding frames from the two ViSigs. By generating samples with a probability distribution that describes the video statistics, and ranking them based upon their likelihood of making an error in the estimation, we show analytically that ViSig can provide an unbiased estimate of IVS. Experimental results on a large dataset of web video and a set of MPEG-7 test sequences with artificially generated similar versions are provided to demonstrate the retrieval performance of our proposed techniques.
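The signature-then-compare procedure in this abstract can be sketched roughly as below. The abstract does not say how frames are sampled, so the sketch assumes one common reading: a shared list of seed vectors is fixed in advance, each video's signature holds the frame closest to each seed, and similarity is the fraction of seeds whose two signature frames lie within a distance `epsilon`. The seed mechanism, the `epsilon` threshold, and all names are assumptions; the ranking of samples by error likelihood mentioned in the abstract is not modelled.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def signature(video, seeds):
    """For every seed vector, keep the frame of `video` closest to it."""
    return [min(video, key=lambda frame: euclidean(frame, seed))
            for seed in seeds]


def signature_similarity(sig_a, sig_b, epsilon):
    """Fraction of seeds whose corresponding signature frames are similar."""
    close = sum(1 for fa, fb in zip(sig_a, sig_b)
                if euclidean(fa, fb) <= epsilon)
    return close / len(sig_a)


if __name__ == "__main__":
    # In practice the seeds would be drawn at random; they are fixed here
    # so that the example is reproducible.
    seeds = [[0, 0], [10, 10]]
    video_a = [[0, 0], [10, 10]]
    video_b = [[0, 0.1], [30, 30]]
    sig_a = signature(video_a, seeds)
    sig_b = signature(video_b, seeds)
    print(signature_similarity(sig_a, sig_b, epsilon=0.5))
```

The cost of comparing two videos depends only on the number of seeds, not on the number of frames, which is what makes a signature of this kind attractive for large collections.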
Retrieval by Shape Similarity with Perceptual Distance and Effective Indexing
IEEE Transactions on Multimedia, 2000
Cited by 27 (0 self)
An important problem in accessing and retrieving visual information is to provide efficient similarity matching in large databases. Though much work is being done on the investigation of suitable perceptual models and the automatic extraction of features, little attention is given to the combination of useful representations and similarity models with efficient index structures. In this paper
Image Retrieval Using Wavelet-Based Salient Points
2001
Cited by 24 (4 self)
Content-based image retrieval (CBIR) has become one of the most active research areas in the past few years. Most research attention has focused on indexing techniques based on global feature distributions. However, these global distributions have limited discriminating power because they are unable to capture local image information. The use of interest points in content-based image retrieval allows the image index to represent local properties of the image. Classic corner detectors can be used for this purpose. However, they have drawbacks when applied to various natural images for image retrieval, because visual features need not be corners and corners may gather in small regions. In this paper, we present a salient point detector. The detector is based on the wavelet transform and detects global variations as well as local ones. The wavelet-based salient points are evaluated for image retrieval with a retrieval system using color and texture features. The results show that salient points with Gabor features perform better than the other point detectors from the literature and the randomly chosen points. Significant improvements are achieved in terms of retrieval accuracy and computational complexity when compared to the global feature approaches. © 2001 SPIE and IS&T. [DOI: 10.1117/1.1406945]
A Framework for Visual Information Retrieval
Journal of Visual Languages and Computing, 2002
Cited by 21 (17 self)
In this paper a visual information retrieval project (VizIR) is presented. The goal of the project is the implementation of an open Content-based Visual Retrieval (CBVR) prototype as a basis for further research on the major problems of CBVR. The motivation behind VizIR is that an open platform would make research (especially for smaller institutions) easier and more efficient. The intention of this paper is to let interested researchers know about VizIR's existence and design, as well as to invite them to take part in the design and implementation process of this open project. The authors describe the goals of the VizIR project, the intended design of the framework, and major implementation issues. The latter include a sketch of the advantages and drawbacks of the existing cross-platform media processing frameworks: Java Media Framework, OpenML and Microsoft's DirectX (DirectShow).
Real time pattern matching using projection kernels
2002
Cited by 17 (8 self)
A novel approach to pattern matching is presented, which reduces time complexity by two orders of magnitude compared to traditional approaches. The suggested approach uses an efficient projection scheme which bounds the distance between a pattern and an image window using very few operations. The projection framework is combined with a rejection scheme which allows rapid rejection of image windows that are distant from the pattern. Experiments show that the approach is effective even under very noisy conditions. The approach described here can also be used in classification schemes where the projection values serve as input features that are informative and fast to extract.
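The bound-and-reject idea in this abstract rests on a standard fact: projecting two vectors onto orthonormal kernels and summing the squared differences of the projections never overestimates their squared Euclidean distance. The sketch below applies that to a one-dimensional signal with a length-4 pattern. The choice of a normalized Walsh-Hadamard basis is an assumption (the abstract does not name the kernels), the names are hypothetical, and the projections are computed directly, so the sketch shows only the rejection logic and not the speed-up the abstract reports.

```python
# Orthonormal length-4 Walsh-Hadamard kernels (assumed basis for this sketch).
KERNELS = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, 0.5, -0.5, -0.5],
    [0.5, -0.5, -0.5, 0.5],
    [0.5, -0.5, 0.5, -0.5],
]


def project(vector, kernel):
    """Inner product of a window (or the pattern) with one kernel."""
    return sum(v * k for v, k in zip(vector, kernel))


def find_matches(signal, pattern, threshold):
    """Start indices of windows within Euclidean `threshold` of `pattern`.

    The lower bound on the squared distance grows with every kernel; a
    window is rejected as soon as the bound exceeds threshold squared.
    With all four kernels the bound equals the exact squared distance,
    so any window that survives every kernel is a true match.
    """
    pattern_proj = [project(pattern, kernel) for kernel in KERNELS]
    limit = threshold ** 2
    matches = []
    for start in range(len(signal) - len(pattern) + 1):
        window = signal[start:start + len(pattern)]
        lower_bound = 0.0
        for kernel, target in zip(KERNELS, pattern_proj):
            lower_bound += (project(window, kernel) - target) ** 2
            if lower_bound > limit:
                break  # window is provably too far from the pattern
        else:
            matches.append(start)
    return matches


if __name__ == "__main__":
    signal = [0, 0, 1, 2, 3, 4, 0, 0]
    print(find_matches(signal, pattern=[1, 2, 3, 4], threshold=0.5))
```

In this toy run most windows are rejected after the first or second kernel, which mirrors the abstract's point that distant windows can be discarded after very few operations.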

