Results 1 - 9 of 9
A Formal Study of Shot Boundary Detection
- IEEE Transactions on Circuits and Systems for Video Technology, 2007
Cited by 12 (3 self)
Abstract: This paper conducts a formal study of the shot boundary detection problem. First, a general formal framework of shot boundary detection techniques is proposed. Three critical techniques, i.e., the representation of visual content, the construction of the continuity signal and the classification of continuity values, are identified and formulated from the perspective of pattern recognition. Meanwhile, the major challenges to the framework are identified. Second, a comprehensive review of the existing approaches is conducted. The representative approaches are categorized and compared according to their roles in the formal framework. Based on the comparison of the existing approaches, optimal criteria for each module of the framework are discussed, which will provide a practical guide for developing novel methods. Third, with all the above issues considered, we present a unified shot boundary detection system based on a graph partition model. Extensive experiments are carried out on the platform of TRECVID. The experiments not only verify the optimal criteria discussed above, but also show that the proposed approach is among the best in the evaluation of TRECVID 2005. Finally, we conclude the paper and present some further discussions on what shot boundary detection can learn from other related fields. Index Terms: Formal framework, graph partition model, multiresolution analysis, shot boundary detection, support vector machine (SVM).
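The three stages named in this abstract (content representation, continuity signal, classification of continuity values) can be illustrated with a minimal sketch. It is not the paper's system: the gray-level histogram, the histogram-intersection signal, the fixed threshold and all function names are assumptions chosen for brevity, and the threshold stands in for the graph partition model and SVM classifier that the abstract mentions.

```python
def histogram(frame, bins=4, max_value=256):
    """Stage 1, representation: normalized gray-level histogram of a flat pixel list."""
    counts = [0] * bins
    for pixel in frame:
        counts[pixel * bins // max_value] += 1
    total = float(len(frame))
    return [c / total for c in counts]


def continuity_signal(frames, bins=4):
    """Stage 2, continuity: histogram intersection of consecutive frames (1.0 = identical)."""
    hists = [histogram(f, bins) for f in frames]
    return [sum(min(a, b) for a, b in zip(h1, h2))
            for h1, h2 in zip(hists, hists[1:])]


def detect_cuts(frames, threshold=0.5, bins=4):
    """Stage 3, classification: a fixed threshold (an assumption, not the paper's classifier).

    Returns the index of the first frame of each new shot.
    """
    signal = continuity_signal(frames, bins)
    return [i + 1 for i, value in enumerate(signal) if value < threshold]


# Toy video: three dark frames followed by two bright frames.
dark = [10] * 16
bright = [200] * 16
video = [dark, dark, dark, bright, bright]
cuts = detect_cuts(video)
print(cuts)
```

On the toy video the continuity signal stays at 1.0 inside each shot and drops to 0.0 at the transition, so the only reported boundary is frame 3. A fixed threshold only handles abrupt cuts; gradual transitions need a more capable classifier.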
Motion Analysis and Segmentation through Spatio-temporal Slices Processing
- IEEE Trans. Image Processing, 2003
Cited by 11 (1 self)
This paper presents new approaches in characterizing and segmenting the content of video. These approaches are developed based upon the pattern analysis of spatio-temporal slices. While traditional approaches to motion sequence analysis tend to formulate computational methodologies on two or three adjacent frames, spatio-temporal slices provide rich visual patterns along a larger temporal scale. In this paper, we first describe a motion computation method based on a structure tensor formulation. This method encodes the visual patterns of spatio-temporal slices in a tensor histogram that, on one hand, characterizes the temporal changes of motion over time and, on the other hand, describes the motion trajectories of different moving objects. By analyzing the tensor histogram of an image sequence, we can temporally segment the sequence into several motion-coherent sub-units and, in addition, spatially segment the sequence into various motion layers. The temporal segmentation of image sequences expeditiously facilitates the motion annotation and content representation of a video, while the spatial decomposition of image sequences leads to a prominent way of reconstructing background panoramic images and computing foreground objects.
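To make the structure tensor idea concrete, the sketch below estimates motion from the dominant gradient orientation of a synthetic x-t slice. It is a simplification under stated assumptions: it accumulates one global tensor over the whole slice instead of building the tensor histogram described in the abstract, and the synthetic sinusoidal slice, the central-difference gradients and all function names are hypothetical.

```python
import math


def make_slice(width, frames, velocity):
    """Synthetic x-t slice: a sinusoid translating at `velocity` pixels per frame."""
    return [[math.sin(0.3 * (x - velocity * t)) for x in range(width)]
            for t in range(frames)]


def structure_tensor(slice_xt):
    """Sum the gradient products Ix*Ix, Ix*It, It*It over the slice interior."""
    jxx = jxt = jtt = 0.0
    for t in range(1, len(slice_xt) - 1):
        for x in range(1, len(slice_xt[0]) - 1):
            ix = (slice_xt[t][x + 1] - slice_xt[t][x - 1]) / 2.0  # spatial gradient
            it = (slice_xt[t + 1][x] - slice_xt[t - 1][x]) / 2.0  # temporal gradient
            jxx += ix * ix
            jxt += ix * it
            jtt += it * it
    return jxx, jxt, jtt


def estimate_velocity(slice_xt):
    """Velocity from the dominant gradient orientation of the x-t slice."""
    jxx, jxt, jtt = structure_tensor(slice_xt)
    phi = 0.5 * math.atan2(2.0 * jxt, jxx - jtt)  # orientation of the gradient
    return -math.tan(phi)  # the pattern moves perpendicular to its gradient


moving = estimate_velocity(make_slice(40, 20, 1))
static = estimate_velocity(make_slice(40, 20, 0))
print(moving, static)
```

A pattern moving at one pixel per frame appears as 45-degree stripes in the slice, and a static pattern as vertical stripes, so the recovered velocities are close to 1 and 0. Binning such orientations per pixel over time, rather than summing them globally, is what would give a histogram in which motion changes and separate motion layers become visible.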
Video summarization and scene detection by graph modeling
- IEEE Trans. Circuits Syst. Video Technol., 2005
Cited by 11 (0 self)
Abstract: In this paper, we propose a unified approach for video summarization based on the analysis of video structures and video highlights. Two major components in our approach are scene modeling and highlight detection. Scene modeling is achieved by the normalized cut algorithm and temporal graph analysis, while highlight detection is accomplished by motion attention modeling. In our proposed approach, a video is represented as a complete undirected graph and the normalized cut algorithm is carried out to globally and optimally partition the graph into video clusters. The resulting clusters form a directed temporal graph and a shortest path algorithm is proposed to efficiently detect video scenes. The attention values are then computed and attached to the scenes, clusters, shots, and subshots in a temporal graph. As a result, the temporal graph can inherently describe the evolution and perceptual importance of a video. In our application, video summaries that emphasize both content balance and perceptual quality can be generated directly from a temporal graph that embeds both the structure and attention information. Index Terms: Attention model, normalized cut, scene modeling, video summarization.
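The normalized cut criterion used for scene modeling can be illustrated on a tiny shot-similarity graph. This is a sketch under assumptions: the similarity matrix is invented, the function names are hypothetical, and the exhaustive search over bipartitions stands in for the eigenvector-based solution normally used with normalized cuts, which is only practical to replace like this on very small graphs.

```python
from itertools import combinations


def ncut_value(weights, group_a, group_b):
    """Ncut(A, B) = cut(A, B) / assoc(A, V) + cut(A, B) / assoc(B, V)."""
    all_nodes = range(len(weights))
    cut = sum(weights[i][j] for i in group_a for j in group_b)
    assoc_a = sum(weights[i][j] for i in group_a for j in all_nodes)
    assoc_b = sum(weights[i][j] for i in group_b for j in all_nodes)
    return cut / assoc_a + cut / assoc_b


def best_bipartition(weights):
    """Brute-force search for the two-way partition with the smallest Ncut value."""
    nodes = range(len(weights))
    best = None
    for size in range(1, len(weights) // 2 + 1):
        for group_a in combinations(nodes, size):
            group_b = tuple(n for n in nodes if n not in group_a)
            value = ncut_value(weights, group_a, group_b)
            if best is None or value < best[0]:
                best = (value, group_a, group_b)
    return best


# Hypothetical similarities between four shots: 0 and 1 look alike, as do 2 and 3.
similarity = [
    [0.0, 0.9, 0.1, 0.1],
    [0.9, 0.0, 0.1, 0.1],
    [0.1, 0.1, 0.0, 0.9],
    [0.1, 0.1, 0.9, 0.0],
]
value, cluster_a, cluster_b = best_bipartition(similarity)
print(value, cluster_a, cluster_b)
```

The search groups shots 0 and 1 against shots 2 and 3. Splitting off a single shot is penalised because the criterion normalises the cut weight by each side's total connection to the graph.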
Motion driven approaches to shot boundary detection, low-level feature extraction and BBC rush characterization at TRECVID 2005
- TRECVID proceedings, 2005
Cited by 4 (0 self)
This paper describes our experimental results on shot boundary detection (SB), low-level feature extraction (LLF), and BBC Rushes exploration (BR) at TRECVID 2005. The approaches presented in this paper are mostly based on our previous works [1, 2, 3] grounded on motion analysis with spatio-temporal slices, optical flows and tensor representation. This year, our aim is to explore and investigate the role of motion in various fundamental tasks including video structuring and characterization for both the edited (in SB and LLF) and unedited (in BR) videos. In SB (system C), we exploit the coherence and patterns of motion texture in spatio-temporal slices for boundary detection and classification. The cut and wipe detectors are based on our work in [1] which performs color-texture segmentation on three slices extracted from videos to determine boundaries. The dissolve detector is based on our work in [3] which is composed of two steps: multi-resolution cut detection and binary classification with Gabor features. We submit 10 runs, depending on the size of training data, flashlight detection capability, and additional statistical features (in addition to Gabor) for classification. Overall, the runs with additional
Detection of documentary scene changes by audio-visual fusion
- In Proc. CIVR, 2004
Cited by 3 (0 self)
Abstract. The concept of a documentary scene was inferred from the audio-visual characteristics of certain documentary videos. It was observed that the amount of information from the visual component alone was not enough to convey a semantic context to most portions of these videos, but a joint observation of the visual component and the audio component conveyed a better semantic context. From the observations that we made on the video data, we generated an audio score and a visual score. We later generated a weighted audio-visual score within an interval and adaptively expanded or shrunk this interval until we found a local maximum score value. The video is ultimately divided into a set of intervals that correspond to the documentary scenes in the video. After we obtained a set of documentary scenes, we checked for any redundant detections.
EMD-Based Video Clip Retrieval by Many-to-Many Matching
Cited by 2 (0 self)
Abstract. This paper presents a new approach for video clip retrieval based on Earth Mover’s Distance (EMD). Instead of imposing a one-to-one matching constraint as in [11, 14], our approach allows many-to-many matching and is capable of tolerating errors due to video partitioning and various video editing effects. We formulate clip-based retrieval as a graph matching problem in two stages. In the first stage, to allow the matching between a query and a long video, an online clip segmentation algorithm is employed to rapidly locate candidate clips for similarity measurement. In the second stage, a weighted graph is constructed to model the similarity between two clips. EMD is proposed to compute the minimum cost of the weighted graph as the similarity between two clips. Experimental results show that the proposed approach is better than some existing methods in terms of ranking capability.
Video Segmentation and Indexing Using Motion Estimation
- 2004
Cited by 1 (0 self)
Video indexing is a central component necessary to facilitate efficient content-based retrieval and browsing of visual information stored in large multimedia databases. This thesis presents work towards a unified framework for automated video indexing. To create an efficient index, a set of representative key frames are selected which capture and encapsulate the entire video content. This is achieved by, firstly, segmenting the video into its constituent shots and, secondly, selecting an optimal number of frames between the identified shot boundaries. The segmentation algorithm is designed to detect both abrupt shot transitions, or cuts, and gradual transitions, such as dissolves and fades. This is achieved by means of a two-component frame differencing metric taking both image structure and colour distributions into account. The application of hierarchical block-based normalised correlation and local colour histogram differences leads to a method which is both accurate and robust.
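A two-component frame difference of the kind described here can be sketched as follows. The sketch is single-level rather than hierarchical, works on flat lists of gray values rather than colour images, and its block size, equal weighting, handling of flat blocks and function names are all assumptions for illustration, not the thesis's actual metric.

```python
import math


def normalized_correlation(block_a, block_b):
    """Zero-mean normalised correlation of two equal-length pixel blocks, in [-1, 1]."""
    mean_a = sum(block_a) / len(block_a)
    mean_b = sum(block_b) / len(block_b)
    var_a = sum((a - mean_a) ** 2 for a in block_a)
    var_b = sum((b - mean_b) ** 2 for b in block_b)
    if var_a == 0 or var_b == 0:
        # Assumption: two flat blocks count as structurally identical,
        # a flat block against a textured one as uncorrelated.
        return 1.0 if var_a == var_b else 0.0
    covariance = sum((a - mean_a) * (b - mean_b) for a, b in zip(block_a, block_b))
    return covariance / math.sqrt(var_a * var_b)


def histogram_difference(block_a, block_b, bins=4, max_value=256):
    """Half the L1 distance between normalised intensity histograms, in [0, 1]."""
    def hist(block):
        counts = [0] * bins
        for pixel in block:
            counts[pixel * bins // max_value] += 1
        return [c / len(block) for c in counts]
    return 0.5 * sum(abs(a - b) for a, b in zip(hist(block_a), hist(block_b)))


def frame_difference(frame_a, frame_b, block_size=4, weight=0.5):
    """Weighted mix of per-block structural and histogram differences, in [0, 1]."""
    starts = range(0, len(frame_a), block_size)
    structure = colour = 0.0
    for s in starts:
        a, b = frame_a[s:s + block_size], frame_b[s:s + block_size]
        structure += (1.0 - normalized_correlation(a, b)) / 2.0
        colour += histogram_difference(a, b)
    return (weight * structure + (1.0 - weight) * colour) / len(starts)


frame = [0, 50, 100, 150, 10, 60, 110, 160]
mirrored = [150, 100, 50, 0, 160, 110, 60, 10]  # same intensities, reversed layout
same = frame_difference(frame, frame)
changed = frame_difference(frame, mirrored)
print(same, changed)
```

The mirrored frame has exactly the same local histograms as the original, so the histogram term alone would report no change, while the correlation term registers the rearranged structure. Combining the two components is what lets the metric respond to both kinds of change.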
Automatic Video Content Analysis and Management
The principal goal of this proposal is to design an automatic video processing architecture that will enable us to design and develop efficient multimedia applications such as summarization, indexing, retrieval and others. While there are many research projects in this area, many, if not all, rely on MPEG-7 for high-level semantic retrieval. The proposed architecture is compatible with MPEG-7 in that it can be annotated using the MPEG-7 standard, but we emphasize user relevance feedback in the high-level semantic retrieval task. The proposed research is built upon a substantial body of preliminary work that serves as proof of concept for the proposed architecture and other techniques. The preliminary work includes the development of an automatic clustering technique using Delaunay Triangulation, automatic video summary generation and automatic shot boundary detection. We address the problem of automatic video indexing and organization and video information retrieval through user interaction at multiple levels. Our work focuses on the following three important areas: (1) temporal video segmentation; (2) construction of
HIGH PERFORMANCE RECORD LINKAGE
In the current world, the immense size of data sets makes it difficult to find similar or identical data. In addition, the dirtiness of data, i.e., typos, missing/tilting information, and additional noise usually caused by careless editing or entry mistakes, makes it even harder to identify which entity a record belongs to. Therefore, we focus on faster detection of data referring to the same real-world entity in a large data set under error-prone conditions, while maintaining high detection accuracy. In this thesis, we study high-performance linkage algorithms using four different applications. First, we introduce the image linkage algorithm to find near-duplicate images with similar characteristics by bridging two seemingly unrelated fields: Multimedia Information Retrieval and Biology. Under this idea, we study how various image features and gene sequence generation methods affect the accuracy and performance of detecting near-duplicate images. Second, we develop the video linkage algorithm using record linkage methods to detect copied videos from a large multimedia database or sites such as YouTube and Yahoo Videos. The utilization of video characteristics is reflected in the hierarchical structure of

