Automatic Key Video Object Plane Selection Using the Shape Information in the MPEG-4 Compressed Domain
- IEEE Trans. Multimedia
, 2000
Cited by 17 (4 self)
Object-based video representation, such as the one suggested by the MPEG-4 standard, offers a framework that is better suited for object-based video indexing and retrieval. In such a framework, the concept of a "key frame" is replaced by that of a "key video object plane". In this paper, we propose a method for key video object plane selection using the shape information in the MPEG-4 compressed domain. The shape of the video object is approximated using information on the shape coding modes in the MPEG-4 bitstream. Two popular shape distance measures, the Hamming and Hausdorff distance measures, are modified to measure the similarities between the approximated shapes of the video objects. Although they feature different computational and implementation complexity tradeoffs, the corresponding algorithms achieve essentially the same performance levels in selecting key video object planes that represent efficiently the salient content of the video objects. Key words: Key video ...
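As a rough illustration of the two shape distance measures named in this abstract, the sketch below computes the plain (unmodified) Hamming and Hausdorff distances between two small binary shape masks. It is not the paper's method: the paper modifies both measures and applies them to shapes approximated from MPEG-4 shape coding modes, whereas here the masks are assumed to be ordinary 0/1 grids, and the function names are hypothetical.

```python
from math import hypot


def hamming_distance(a, b):
    """Count the cells where two equal-size binary masks differ."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("masks must have the same dimensions")
    return sum(x != y for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def _points(mask):
    """Return the (row, col) coordinates of every set cell."""
    return [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]


def _directed(p, q):
    """Largest distance from a point in p to its nearest point in q."""
    return max(min(hypot(pr - qr, pc - qc) for qr, qc in q) for pr, pc in p)


def hausdorff_distance(a, b):
    """Symmetric Hausdorff distance between the set cells of two masks."""
    pa, pb = _points(a), _points(b)
    if not pa or not pb:
        raise ValueError("both masks must contain at least one set cell")
    return max(_directed(pa, pb), _directed(pb, pa))


# Assumed toy data: a 2x2 block and the same block shifted one cell right.
shape_a = [[1, 1, 0],
           [1, 1, 0],
           [0, 0, 0]]
shape_b = [[0, 1, 1],
           [0, 1, 1],
           [0, 0, 0]]

print(hamming_distance(shape_a, shape_b))    # 4
print(hausdorff_distance(shape_a, shape_b))  # 1.0
```

The Hamming distance needs a single pass over the grid, while this brute-force Hausdorff distance compares every pair of set cells, which gives a feel for the kind of computational tradeoff the abstract mentions.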
A Hierarchical Access Control Model for Video Database Systems
- ACM TRANS. ON INFO. SYST
, 2003
Cited by 16 (7 self)
... In this paper, we propose a novel approach to support multilevel access control in video databases. Our access control technique combines a video database indexing mechanism with a hierarchical organization of visual concepts (i.e., video database indexing units), so that different classes of users can access different video elements or even the same video element with different quality levels according to their permissions. These video elements, which, in our access control mechanism, are used for specifying the authorization objects, can be a semantic cluster, a subcluster, a video scene, a video shot, a video frame, or even a salient object (i.e., region of interest). In the paper, we first introduce our techniques for obtaining these multilevel
VORTEX: Video retrieval and tracking from compressed multimedia databases -- multiple object tracking from MPEG-2 bitstream
- JOURNAL OF VISUAL COMMUNICATIONS AND IMAGE REPRESENTATION
, 2000
Cited by 14 (7 self)
Multimedia data are generally stored in compressed form in order to efficiently utilize the available storage facilities. Access to multimedia archives is thus dependent on our ability to browse compressed information. In this paper, a novel approach to multiple object tracking from compressed multimedia databases is presented. This approach is intended to operate in a distributed environment, where users initiate video searches and retrieve relevant video information simultaneously from multiple compressed video archives. The system operates on the compressed video to find and track objects of interest and determine their positions in the image. This enables more complex query formulations in terms of the relative positions of the target objects in the image. The filtering and analysis of motion information (motion vectors) is used to track objects in the video bit stream. Once the search has terminated, the system may decompress and display the query-relevant video sequences upon request.
Video Retrieval Using Spatio-Temporal Descriptors
Cited by 14 (0 self)
This paper describes a novel methodology for implementing video search functions such as retrieval of near-duplicate videos and recognition of actions in surveillance video. Videos are divided into half-second clips whose stacked frames produce 3D space-time volumes of pixels. Pixel regions with consistent color and motion properties are extracted from these 3D volumes by a threshold-free hierarchical space-time segmentation technique. Each region is then described by a high-dimensional point whose components represent the position, motion and, when possible, color of the region. In the indexing phase for a video database, these points are assigned labels that specify their video clip of origin. All the labeled points for all the clips are stored into a single binary tree for efficient k-nearest neighbor retrieval. The retrieval phase uses video segments as queries. Half-second clips of these queries are again segmented to produce sets of points, and for each point the labels of its nearest neighbors are retrieved. The labels that receive the largest numbers of votes correspond to the database clips that are the most similar to the query video segment. We illustrate this approach for video indexing and retrieval and for action recognition. First, we describe retrieval experiments for dynamic logos, and for video queries that differ from the indexed broadcasts by the addition of large overlays. Then we describe experiments in which office actions (such as pulling and closing drawers, taking and storing items, picking up and putting down a phone) are recognized. Color information is ignored to insure independence to people's appearance. One of the distinct advantages of using this approach for action recognition is that there is no need for detection or recognition of body pa...
Content-based Video Retrieval: An overview
, 2000
Cited by 8 (0 self)
Content-based Image Retrieval systems (CBIRS) start flourishing on the Web. Their performances are continuously improving and their base principles span a wide range of diversity. Content-based Video Retrieval systems (CBVRS) are less common and seem at a first glance to be a natural extension of CBIRS. In this document, we summarise advances made in the development of CBVRS and analyse their relationship to CBIRS. While doing so, we show that CBVRS are actually not so obvious extensions of CBIRS.
A Multiresolution Technique for Video Indexing and Retrieval
- in Proc. Intl. Conf. on Image Processing (ICIP’02), Vol.1
, 2002
Cited by 6 (2 self)
This paper presents a novel approach to the multiresolution analysis and scalability in video indexing and retrieval. A scalable algorithm for video parsing and key-frame extraction is introduced. The technique is based on real-time analysis of MPEG motion variables and scalable metrics simplification by discrete contour evolution. Furthermore, a hierarchical key-frame retrieval method using scalable colour histogram analysis is presented. It offers customisable levels of detail in the descriptor space, where the relevance order is determined by degradation of the image, and not by degradation of the image histogram. To assess the performance of the approach several experiments have been conducted. Selected results are reported in this paper.
Video Retrieval Of Near-Duplicates Using k-Nearest Neighbor Retrieval Of Spatio-Temporal Descriptors
, 2005
Cited by 4 (0 self)
This paper describes a novel methodology for retrieval of near-duplicate videos. Videos are divided into half-second clips whose stacked frames produce 3D space-time volumes of pixels. Pixel regions with consistent color and motion properties are extracted from these 3D volumes by a threshold-free hierarchical space-time segmentation technique. Each region is then described by a point in a 7D space whose components represent the average color, position and motion of the region. In the indexing phase of a video database, the 7D points obtained by segmentation for each half-second clip of the videos are assigned labels that specify the origin of the video clip. All the labeled points for all the clips are stored into a single binary tree for efficient k-nearest neighbor retrieval. The retrieval phase uses video segments as queries. Half-second clips of these queries are again segmented to produce sets of 7D points, and for each point the labels of its nearest neighbors are retrieved. The labels that receive the largest numbers of votes correspond to the clips that are the most similar to the query video segment.
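The label-voting retrieval step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a brute-force nearest-neighbor scan instead of the binary tree the abstract mentions, the descriptor points are made-up values of arbitrary dimension rather than real 7D segmentation output, and `rank_clips` is a hypothetical name.

```python
from collections import Counter
from math import dist


def rank_clips(index, query_points, k=3):
    """Rank clip labels by how many nearest-neighbor votes they receive.

    index: list of (point, clip_label) pairs built in the indexing phase.
    query_points: descriptor points extracted from the query segment.
    """
    votes = Counter()
    for q in query_points:
        # Brute-force k-nearest neighbors (assumption: a tree index
        # would replace this scan in a real system).
        nearest = sorted(index, key=lambda item: dist(item[0], q))[:k]
        votes.update(label for _, label in nearest)
    return votes.most_common()


# Assumed toy index: two clips, three descriptor points each.
index = [
    ((0.0, 0.0), "clip_A"), ((0.0, 1.0), "clip_A"), ((1.0, 0.0), "clip_A"),
    ((10.0, 10.0), "clip_B"), ((10.0, 11.0), "clip_B"), ((11.0, 10.0), "clip_B"),
]
query = [(0.2, 0.1), (0.5, 0.5)]

print(rank_clips(index, query, k=2))  # [('clip_A', 4)]
```

Each query point casts `k` votes, so the clip whose indexed points lie closest to the query descriptors accumulates the most votes and is ranked first.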
Partitioning of Video Objects into Temporal Segments Using Local Motion Information
- Proceedings of IEEE ICIP Conference
, 2000
Cited by 2 (0 self)
In this paper we propose an algorithm to partition MPEG-4 video objects into temporal segments with uniform activity levels. These segments can be used for efficient summarization of video objects and/or retrieval of video objects by their motion. The proposed algorithm is based on the shape and texture coding modes of inter coded video object planes. Because the employed filtering techniques are simple and full decoding of the MPEG-4 bitstream is not required, the algorithm is computationally efficient.
1. Introduction: Recent advancements in video coding technology have led to the emergence of the MPEG-4 [1] standard, which enables access to individual video objects within the video sequences. Consequently, object based video retrieval is replacing frame based video retrieval. Motion is one of the key low level features employed by video retrieval systems. Global motion of the video object that is represented by MPEG-4 can be easily obtained by extracting the location of the video ...
Shape-based retrieval of video objects
- IEEE Trans. Multimedia
, 2005
Cited by 2 (0 self)
The increasing availability of object-based video content requires new technologies for automatically extracting and matching of the low level features of arbitrarily shaped video. This paper proposes methods for shape retrieval of arbitrarily shaped video objects. Our methods take into account not only the still shape features but also the shape deformations that may occur in an object’s lifespan. In this paper, we compute the shape similarity of video objects by comparing the similarity of their representative temporal instances. We also describe motion of a video object via describing the deformations in an object’s shape. Experimental results show that our proposed methods offer very good retrieval performance and match closely with the human ranking. Keywords: object-based retrieval, MPEG-4, video databases, video retrieval, compressed-domain retrieval. EDICS: 4-FEAT, Feature Extraction and Representation
Video Content Modelling: An Overview
- In International Workshop on Frontiers of Information Technology
, 2003
Cited by 1 (0 self)
This paper provides an overview of different video content modeling techniques employed in existing content-based video indexing and retrieval (CBVIR) systems. Based on the modeling requirements of a hypothetical (somewhat ideal) CBVIR system, we analyze and categorize existing modeling approaches. Starting with a review of techniques to model raw video data, we study approaches used to describe physical objects, and conclude with a review on high-level semantic modeling of data with focus on the multimodal analysis. Based on the current status of research in CBVIR systems, we identify the growth potential, future directions, and open research issues. Finally, a hypothetical CBVIR system is outlined in the concluding remarks, which exploits object-based representation of MPEG-4 compressed bitstream and uses multimodal features based on high-level description of video.

