Results 11 - 20 of 70
The MediaMill TRECVID 2006 semantic video search engine
In Proceedings of the 4th TRECVID Workshop, 2006
Cited by 22 (8 self)
In this paper we describe our TRECVID 2006 experiments. The MediaMill team participated in two tasks: concept detection and search. For concept detection we use the MediaMill Challenge as experimental platform. The MediaMill Challenge divides the generic video indexing problem into a visual-only, textual-only, early fusion, late fusion, and combined analysis experiment. We provide a baseline implementation for each experiment together with baseline results, which we made available to the TRECVID community. The Challenge package was downloaded more than 80 times and we anticipate that it has been used by several teams for their 2006 submission. Our Challenge experiments focus specifically on visual-only analysis of video (run id: B MM). We extract image features, on global, regional, …
Association and Temporal Rule Mining for Post-Filtering of Semantic Concept Detection in Video
Cited by 20 (1 self)
Automatic semantic concept detection in video is important for effective content-based video retrieval and mining and has gained great attention recently. In this paper, we propose a general post-filtering framework to enhance robustness and accuracy of semantic concept detection using association and temporal analysis for concept knowledge discovery. Co-occurrence of several semantic concepts could imply the presence of other concepts. We use association mining techniques to discover such inter-concept association relationships from annotations. With discovered concept association rules, we propose a strategy to combine associated concept classifiers to improve detection accuracy. In addition, because video is often visually smooth and semantically coherent, detection results from temporally adjacent shots could be used for the detection of the current shot. We propose temporal filter designs for inter-shot temporal dependency mining to further improve detection accuracy. Experiments on the TRECVID 2005 dataset show our post-filtering framework is both efficient and effective in improving the accuracy of semantic concept detection in video. Furthermore, it is easy to integrate our framework with existing classifiers to boost their performance.
Index Terms—Semantic concept detection, association rule mining, temporal rule mining, post-filtering, content-based video retrieval and mining.
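The association-rule idea in this abstract can be sketched briefly: mine high-confidence co-occurrence rules from shot annotations, then raise a concept's detector score when a rule's antecedent fires. The single-antecedent rule form, the support/confidence thresholds, and the score-blending rule below are illustrative assumptions, not the paper's exact formulation.

```python
# Sketch of association-based post-filtering for concept detection scores.
# Thresholds and the blending rule are illustrative assumptions.
from itertools import combinations

def mine_rules(annotations, min_support=0.3, min_confidence=0.8):
    """Mine simple one-antecedent rules (ant -> cons) from binary shot annotations,
    given as a list of sets of concept names per shot."""
    n = len(annotations)
    concepts = sorted({c for shot in annotations for c in shot})
    count = {c: sum(1 for s in annotations if c in s) for c in concepts}
    rules = []
    for a, b in combinations(concepts, 2):
        for ant, cons in ((a, b), (b, a)):
            both = sum(1 for s in annotations if ant in s and cons in s)
            if (both / n >= min_support
                    and count[ant] > 0
                    and both / count[ant] >= min_confidence):
                rules.append((ant, cons, both / count[ant]))
    return rules

def post_filter(scores, rules, weight=0.5):
    """Blend a consequent concept's raw detector score with rule evidence
    (confidence times the antecedent's score), keeping the higher value."""
    adjusted = dict(scores)
    for ant, cons, conf in rules:
        evidence = conf * scores.get(ant, 0.0)
        blended = (1 - weight) * scores.get(cons, 0.0) + weight * evidence
        adjusted[cons] = max(adjusted.get(cons, 0.0), blended)
    return adjusted
```

A two-concept rule such as "car implies road" would then lift a weak road score whenever the car detector fires strongly on the same shot.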
The MediaMill Semantic Video Search Engine
Cited by 18 (0 self)
www.mediamill.nl
In this paper we present the methods underlying the MediaMill semantic video search engine. The basis for the engine is a semantic indexing process currently based on a lexicon of 491 concept detectors. To support the user in navigating the collection, the system defines a visual similarity space, a semantic similarity space, a semantic thread space, and browsers to explore them. We compare the different browsers and their utility within the TRECVID benchmark. In 2005 we obtained a top-3 result for 19 out of 24 search topics; in 2006, for 14 out of 24.
Index Terms—Video indexing, visualization, retrieval, performance evaluation, semantic threads.
The MediaMill TRECVID 2009 Semantic Video Search Engine
Cited by 17 (6 self)
In this paper we describe our TRECVID 2009 video retrieval experiments. The MediaMill team participated in three tasks: concept detection, automatic search, and interactive search. The starting point for the MediaMill concept detection approach is our top-performing bag-of-words system of last year, which uses multiple color descriptors, codebooks with soft assignment, and kernel-based supervised learning. We improve upon this baseline system by exploring two novel research directions. First, we study a multi-modal extension that includes 20 audio concepts, fused using two novel multi-kernel supervised learning methods. Second, with the help of recently proposed algorithmic refinements of bag-of-words, a bag-of-words GPU implementation, and compute clusters, we scale up the amount of visual information analyzed by an order of magnitude, to a total of 1,000,000 i-frames. Our experiments evaluate the merit of these new components, ultimately leading to 64 robust concept detectors for video retrieval. For retrieval, a robust but limited set of concept detectors makes it necessary to rely on as many auxiliary information channels as possible. For automatic search we therefore explore how we can learn to rank various information channels simultaneously to maximize video search results for a given topic. To improve the video retrieval results further, our interactive search experiments investigate the role of visualizing preview results for a given browse dimension, and of relevance feedback mechanisms that learn to solve complex search topics by analyzing user browsing behavior. The 2009 edition of the TRECVID benchmark has again been a fruitful participation for the MediaMill team, resulting in the top ranking for both concept detection and interactive search.
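The soft-assignment codebook step mentioned in this abstract can be illustrated in isolation: rather than hard-assigning each local descriptor to its nearest codeword, every descriptor contributes fractional votes to all codewords via a Gaussian kernel on descriptor-to-codeword distance. The kernel width and the toy codebook below are assumptions for illustration, not the MediaMill settings.

```python
# Illustrative soft-assignment bag-of-words encoding with a kernel codebook.
# sigma and the codebook contents are toy assumptions for this sketch.
import math

def soft_assign_bow(descriptors, codebook, sigma=1.0):
    """Return an L1-normalized histogram over codewords. Each descriptor casts
    fractional votes on all codewords, weighted by a Gaussian kernel on its
    Euclidean distance to each codeword (soft assignment)."""
    hist = [0.0] * len(codebook)
    for desc in descriptors:
        weights = [math.exp(-math.dist(desc, word) ** 2 / (2 * sigma ** 2))
                   for word in codebook]
        total = sum(weights)
        for i, w in enumerate(weights):
            hist[i] += w / total  # fractional vote instead of a hard argmax
    s = sum(hist)
    return [h / s for h in hist]
```

With hard assignment, a descriptor lying midway between two codewords arbitrarily picks one; the soft variant splits its vote, which is what makes the encoding more robust to codeword boundary effects.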
Stages as models of scene geometry
IEEE Transactions on Pattern Analysis and Machine Intelligence (in press)
Cited by 16 (0 self)
Reconstruction of 3D scene geometry is an important element for scene understanding, autonomous vehicle and robot navigation, visual inspection, and 3D television. We propose accounting for the inherent structure of the visual world when trying to solve the scene reconstruction problem. Consequently, we identify scene categorization as the first step towards robust and efficient depth estimation from single images. We introduce 15 typical 3D scene geometries called stages, each with a unique depth profile, which roughly correspond to a large majority of all images. Stage information serves as the first approximation of global depth, narrowing down the search space in depth estimation and object localization. We propose different sets of low-level features for depth estimation, and perform stage classification on two diverse datasets of television broadcasts. Classification results demonstrate that stages can be efficiently learned from low-dimensional image representations.
Index Terms—Scene geometry, scene structure, depth estimation, scene categorization, stages.
Depth Information by Stage Classification, 2007
Cited by 15 (3 self)
Recently, methods for estimating 3D scene geometry or absolute scene depth information from 2D image content have been proposed. However, these methods may not be generally applicable to depth estimation, as the large variety of possible pictorial content can introduce inconsistencies. We identify scene categorization as the first step towards efficient and robust depth estimation from single images. To that end, we describe a limited number of typical 3D scene geometries, called stages, each having a unique depth pattern and thus providing a specific context for stage objects. This type of scene information narrows down the possibilities with respect to individual objects' locations, scales, and identities. We show how these stage types can be efficiently learned and how they can lead to robust extraction of depth information. Our results indicate that stages without much variation and object clutter can be detected robustly, with up to a 60% success rate.
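One minimal way to picture stage classification is a nearest-prototype classifier over low-level features: each stage is summarized by the mean feature vector of its labeled examples, and a new image takes the stage of the closest prototype. The feature vectors and stage names below are invented for illustration; the papers above use richer low-level features and learned classifiers.

```python
# Toy nearest-prototype stage classifier. Stage names and 2-D "features"
# are illustrative assumptions standing in for real low-level image features.
import math

def train_prototypes(examples):
    """examples: dict mapping stage name -> list of feature vectors.
    Returns one mean-feature prototype per stage."""
    protos = {}
    for stage, feats in examples.items():
        n = len(feats)
        protos[stage] = [sum(col) / n for col in zip(*feats)]
    return protos

def classify_stage(protos, feat):
    """Assign the stage whose prototype is nearest in Euclidean distance."""
    return min(protos, key=lambda s: math.dist(protos[s], feat))
```

The predicted stage then fixes a coarse global depth profile, narrowing the search space for per-pixel depth estimation as the abstract describes.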
RoleNet: Movie analysis from the perspective of social networks
IEEE Transactions on Multimedia, 2009
Cited by 11 (0 self)
With the idea of social network analysis, we propose a novel way to analyze movie videos from the perspective of social relationships rather than audiovisual features. To appropriately describe the relationships between roles in movies, we devise a method to quantify relations and construct the roles' social networks, called RoleNet. Based on RoleNet, we are able to perform semantic analysis that goes beyond conventional feature-based approaches. In this work, social relations between roles serve as the context information of video scenes, and leading roles and the corresponding communities can be automatically determined. The results of community identification provide new alternatives in media management and browsing. Moreover, by describing video scenes with role context, a social-relation-based story segmentation method is developed to pave a new way for this widely studied topic. Experimental results show the effectiveness of leading role determination and community identification. We also demonstrate that the social-based story segmentation approach works much better than the conventional tempo-based method. Finally, we give extensive discussions and state that the proposed ideas provide insights into context-based video analysis.
Index Terms—Community analysis, movie understanding, social network analysis, story segmentation.
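The RoleNet construction can be sketched as a weighted co-appearance graph: an edge between two roles gains weight each time they share a scene, and leading roles then stand out by weighted degree. The scene representation and the degree-based ranking below are illustrative assumptions, not the paper's exact quantification.

```python
# Toy RoleNet sketch: co-appearance graph plus weighted-degree ranking.
# The scene sets and the centrality choice are illustrative assumptions.
from collections import defaultdict
from itertools import combinations

def build_rolenet(scenes):
    """scenes: list of sets of role names appearing together in one scene.
    Returns edge weights: (role_a, role_b) -> number of shared scenes."""
    weights = defaultdict(int)
    for scene in scenes:
        for a, b in combinations(sorted(scene), 2):
            weights[(a, b)] += 1
    return weights

def leading_roles(weights):
    """Rank roles by weighted degree (sum of incident edge weights)."""
    degree = defaultdict(int)
    for (a, b), w in weights.items():
        degree[a] += w
        degree[b] += w
    return sorted(degree, key=degree.get, reverse=True)
```

Communities then correspond to densely connected subgraphs of this network, which is what enables the social-relation-based scene context and story segmentation the abstract describes.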
The MediaMill TRECVID 2007 Semantic Video Search Engine
Cited by 11 (3 self)
In this paper we describe our TRECVID 2007 experiments. The MediaMill team participated in two tasks: concept detection and search. For concept detection we extract region-based image features, on grid, keypoint, and segmentation level, which we combine with various supervised learners. In addition, we explore the utility of temporal image features. A late fusion approach of all region-based analysis methods using the geometric mean was our most successful run. What is more, using MediaMill Challenge and LSCOM annotations, our visual-only approach generalizes to a set of 572 concept detectors. To handle such a large thesaurus in retrieval, an engine is developed which automatically selects a set of relevant concept detectors based on text matching, ontology querying, and visual concept likelihood. The suggestion engine is evaluated as part of the automatic search task and forms the entry point for our interactive search experiments. For this task we experiment with two browsers for interactive exploration: the well-known CrossBrowser and the novel ForkBrowser. We found that, while retrieval performance varies substantially per topic, the ForkBrowser is able to produce the same overall results as the CrossBrowser. However, the ForkBrowser obtains top performance for most topics with less user interaction, indicating the potential of this browser for interactive search. Similar to previous years, our best interactive search runs yield high overall performance, ranking 3rd and 4th.
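The geometric-mean late fusion mentioned in this abstract is simple enough to show directly: for each shot, the fused concept score is the geometric mean of the scores produced by the individual analysis methods. The epsilon floor guarding against zero scores is an added assumption of this sketch.

```python
# Minimal late-fusion sketch: geometric mean over per-method detector scores.
# The eps floor (avoiding log(0)) is an assumption, not from the paper.
import math

def geometric_mean_fusion(score_lists, eps=1e-6):
    """score_lists: one list of shot scores per analysis method.
    Returns one fused score per shot, computed in log space for stability."""
    fused = []
    for scores in zip(*score_lists):  # scores for the same shot across methods
        logs = [math.log(max(s, eps)) for s in scores]
        fused.append(math.exp(sum(logs) / len(logs)))
    return fused
```

Compared with an arithmetic mean, the geometric mean rewards shots on which all methods agree and penalizes shots where any single method scores near zero, which is one common rationale for this fusion choice.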
Measuring concept similarities in multimedia ontologies: Analysis and evaluations
IEEE Transactions on Multimedia, 2007
Cited by 9 (2 self)
The recent development of large-scale multimedia concept ontologies has provided new momentum for research in the semantic analysis of multimedia repositories. Different methods for generic concept detection have been extensively studied, but the question of how to exploit the structure of a multimedia ontology and existing inter-concept relations has not received similar attention. In this paper, we present a clustering-based method for modeling semantic concepts on low-level feature spaces and study the evaluation of the quality of such models with entropy-based methods. We cover a variety of methods for assessing the similarity of different concepts in a multimedia ontology. We study three ontologies and apply the proposed techniques in experiments involving the visual and semantic similarities, manual annotation of video, and concept detection. The results show that modeling inter-concept relations can provide a promising resource for many different application areas in semantic multimedia processing.
Index Terms—Clustering-based analysis, concept detection, inter-concept relations, multimedia ontology.
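As one concrete instance of the inter-concept similarity measures this abstract surveys, the overlap between two concepts' annotation sets can be scored with the Jaccard coefficient. This particular measure is an illustrative choice for the sketch; the paper evaluates several alternatives.

```python
# Jaccard similarity between two concepts, computed over the sets of shot ids
# each concept is annotated in. An illustrative measure, not the paper's only one.
def jaccard_similarity(shots_a, shots_b):
    """Intersection over union of the two annotation sets, in [0, 1]."""
    if not shots_a and not shots_b:
        return 0.0  # convention: two never-annotated concepts are dissimilar
    return len(shots_a & shots_b) / len(shots_a | shots_b)
```

A pairwise matrix of such scores is exactly the kind of input that clustering-based concept modeling and inter-concept relation analysis can operate on.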