The MediaMill TRECVID 2006 semantic video search engine
- In Proceedings of the 4th TRECVID Workshop
, 2006
Cited by 15 (8 self)
In this paper we describe our TRECVID 2006 experiments. The MediaMill team participated in two tasks: concept detection and search. For concept detection we use the MediaMill Challenge as experimental platform. The MediaMill Challenge divides the generic video indexing problem into a visual-only, textual-only, early fusion, late fusion, and combined analysis experiment. We provide a baseline implementation for each experiment together with baseline results, which we made available to the TRECVID community. The Challenge package was downloaded more than 80 times and we anticipate that it has been used by several teams for their 2006 submission. Our Challenge experiments focus specifically on visual-only analysis of video (run id: B MM). We extract image features, on global, regional,
Ontology-enriched semantic space for video search
- in ACM Multimedia
, 2007
Cited by 9 (4 self)
Multimedia-based ontology construction and reasoning have recently been recognized as two important issues in video search, particularly for bridging the semantic gap. The lack of coincidence between low-level features and user expectation makes concept-based ontology reasoning an attractive mid-level framework for interpreting high-level semantics. In this paper, we propose a novel model, namely ontology-enriched semantic space (OSS), to provide a computable platform for modeling and reasoning concepts in a linear space. OSS enlightens the possibility of answering conceptual questions such as a high coverage of semantic space with minimal set of concepts, and the set of concepts to be developed for video search. More importantly, the query-to-concept mapping can be more reasonably conducted by guaranteeing the uniform and consistent comparison of concept scores for video search. We explore OSS for several tasks including concept-based video search, word sense disambiguation and multimodality fusion. Our empirical findings show that OSS is a feasible solution to timely issues such as the measurement of concept combination and query-concept dependent fusion.
The MediaMill TRECVID 2007 Semantic Video Search Engine
Cited by 8 (3 self)
In this paper we describe our TRECVID 2007 experiments. The MediaMill team participated in two tasks: concept detection and search. For concept detection we extract region-based image features, on grid, keypoint, and segmentation level, which we combine with various supervised learners. In addition, we explore the utility of temporal image features. A late fusion approach of all region-based analysis methods using geometric mean was our most successful run. What is more, using MediaMill Challenge and LSCOM annotations, our visual-only approach generalizes to a set of 572 concept detectors. To handle such a large thesaurus in retrieval, an engine is developed which automatically selects a set of relevant concept detectors based on text matching, ontology querying, and visual concept likelihood. The suggestion engine is evaluated as part of the automatic search task and forms the entry point for our interactive search experiments. For this task we experiment with two browsers for interactive exploration: the well-known CrossBrowser and the novel ForkBrowser. It was found that, while retrieval performance varies substantially per topic, the ForkBrowser is able to produce the same overall results as the CrossBrowser. However, the ForkBrowser obtains top performance for most topics with less user interaction, indicating the potential of this browser for interactive search. Similar to previous years, our best interactive search runs yield high overall performance, ranking 3rd and 4th.
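The abstract above names late fusion by geometric mean but gives no formula. As an illustration only, and not the MediaMill implementation, the sketch below fuses the scores that several analysis methods assign to one shot for one concept. The function names, the clamping constant `eps`, and the assumption that scores lie in (0, 1] are all hypothetical choices made for this example.

```python
import math


def geometric_mean_fusion(scores, eps=1e-12):
    """Fuse per-method concept scores for one shot by their geometric mean.

    `scores` is a non-empty sequence of detector outputs, assumed to be
    probabilities in (0, 1]. Values below `eps` are clamped so that the
    logarithm stays defined (an assumption of this sketch).
    """
    if not scores:
        raise ValueError("need at least one score to fuse")
    # Averaging logs and exponentiating is equivalent to the n-th root of
    # the product, but avoids underflow when many small scores are multiplied.
    log_sum = sum(math.log(max(s, eps)) for s in scores)
    return math.exp(log_sum / len(scores))


def rank_shots(shot_scores):
    """Return shot ids ordered by fused score, best first.

    `shot_scores` maps a shot id to the list of scores that the individual
    analysis methods produced for that shot.
    """
    fused = {shot: geometric_mean_fusion(s) for shot, s in shot_scores.items()}
    return sorted(fused, key=fused.get, reverse=True)
```

Compared with an arithmetic mean, the geometric mean penalizes a shot on which any single method is very unconfident, so a shot scored (0.5, 0.5) ranks above one scored (0.9, 0.1).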
Investigating keyframe selection methods in the novel domain of passively captured visual lifelogs
- In: CIVR ’08: Proceedings of the 2008 international conference on Content-based image and video retrieval, Niagara Falls
, 2008
Cited by 6 (4 self)
The SenseCam is a passive capture wearable camera, worn around the neck, and when worn continuously it takes an average of 1,900 images per day. It can be used to create a personal lifelog or visual recording of the wearer’s life which can be helpful as an aid to human memory. For such a large amount of visual information to be useful, it needs to be structured into “events”, which can be achieved through automatic segmentation. An important component of this structuring process is the selection of keyframes to represent individual events. This work investigates a variety of techniques for the selection of a single representative keyframe image from each event, in order to provide the user with an instant visual summary of that event. In our experiments we use a large test set of 2,232 lifelog events collected by 5 users over a time period of one month each (equating to 194,857 images). We propose a novel keyframe selection technique which seeks to select the image with the highest “quality” as the keyframe. The inclusion of “quality” approaches in keyframe selection is demonstrated to be useful owing to the high variability in image visual quality within passively captured image collections.
The search behavior of media professionals at an audiovisual archive: A transaction log analysis
- J. American Society for Information Science and Technology
Cited by 5 (3 self)
Finding audiovisual material for reuse in new programs is an important activity for news producers, documentary makers, and other media professionals. Such professionals are typically served by an audiovisual broadcast archive. We report on a study of the transaction logs of one such archive. The analysis includes an investigation of commercial orders made by the media professionals and a characterization of sessions, queries, and the content of terms recorded in the logs. One of our key findings is that there is a strong demand for short pieces of audiovisual material in the archive. In addition, while searchers are generally able to quickly navigate to a usable audiovisual broadcast, it takes them longer to place an order when purchasing a subsection of a broadcast than when purchasing an entire broadcast. Another key finding is that queries predominantly consist of (parts of) broadcast titles and of proper names. Our observations imply that it may be beneficial to increase support for fine-grained access to audiovisual material, for example, through manual segmentation or content-based analysis.
Event Mining in Multimedia Streams
, 2008
Cited by 5 (0 self)
Events are real-world occurrences that unfold over space and time. Event mining from multimedia streams improves the access and reuse of large media collections, and it has been an active area of research with notable recent progress. This paper contains a survey on the problems and solutions in event mining, approached from three aspects: event description, event-modeling components, and current event mining systems. We present a general characterization of multimedia events, motivated by the maxim of five "W"s and one "H" for reporting real-world events in journalism: when, where, who, what, why, and how. We discuss the causes for semantic variability in real-world descriptions, including multilevel
The Importance of Query-Concept-Mapping for Automatic Video Retrieval
Cited by 4 (0 self)
A new video retrieval paradigm, query-by-concept, has recently emerged. However, it remains unclear how to exploit the detected concepts in retrieval given a multimedia query. In this paper, we point out that it is important to map the query to a few relevant concepts instead of searching with all concepts. In addition, we show that solving this problem through both text and image inputs is effective for search, and that it is possible to determine the number of related concepts by a language modeling approach. Experimental evidence is obtained on the automatic search task of TRECVID 2006 using a large lexicon of 311 learned semantic concept detectors.
Measuring the impact of temporal context on video retrieval
- In CIVR 2008 - ACM International Conference on Image and Video Retrieval
, 2008
Cited by 4 (4 self)
In this paper we describe the findings from the K-Space interactive video search experiments in TRECVid 2007, which examined the effects of including temporal context in video retrieval. The traditional approach to presenting video search results is to maximise recall by offering a user as many potentially relevant shots as possible within a limited amount of time. ‘Context’-oriented systems opt to allocate a portion of the results presentation space to providing additional contextual cues about the returned results. In video retrieval these cues often include temporal information such as a shot’s location within the overall video broadcast and/or its neighbouring shots. We developed two interfaces with identical retrieval functionality in order to measure the effects of such context on user performance. The first system had a ‘recall-oriented’ interface, where results from a query were presented as a ranked list of shots. The second was ‘context-oriented’, with results presented as a ranked list of broadcasts. 10 users participated in the experiments, of which 8 were novices and 2 experts. Participants completed a number of retrieval topics using both the recall-oriented and context-oriented systems.
Web-based information content and its application to concept-based video retrieval
- In ACM CIVR
, 2008
Cited by 4 (1 self)
Semantic similarity between words or phrases is frequently used to find matching correlations between search queries and documents when straightforward matching of terms fails. This is particularly important for searching in visual databases, where pictures or video clips have been automatically tagged with a small set of semantic concepts based on analysis and classification of the visual content. Here, the textual description of documents is very limited, and semantic similarity based on WordNet’s cognitive synonym structure, along with information content derived from term frequencies, can help to bridge the gap between an arbitrary textual query and a limited vocabulary of visual concepts. This approach, termed concept-based retrieval, has received significant attention over the last few years, and its success is highly dependent on the quality of the similarity
Semantic annotation and retrieval of video events using multimedia ontologies
Cited by 3 (0 self)
Effective usage of multimedia digital libraries has to deal with the problem of building efficient content annotation and retrieval tools. In this paper Multimedia Ontologies, which include both linguistic and dynamic visual ontologies, are presented and their implementation for the soccer video domain is shown. The structure of the proposed ontology itself, together with reasoning, can be used to perform higher-level annotation of the clips, to generate complex queries that comprise actions and their temporal evolutions and relations, and to create extended text commentaries of video sequences.

