Overview of VideoCLEF 2009: New perspectives on speech-based multimedia content enrichment (2010)

by M. Larson, E. Newman, G. J. F. Jones
Venue: In C. Peters

Results 1 - 7 of 7

Crowdsourcing for affective annotation of video: Development of a viewer-reported boredom corpus.

by Mohammad Soleymani, Martha Larson - In SIGIR 2010 Workshop on Crowdsourcing for Search Evaluation (CSE), 2010
"... ABSTRACT Predictions of viewer affective response to video are an important source of information that can be used to enhance the performance of multimedia retrieval and recommendation systems. The development of algorithms for robust prediction of viewer affective response requires corpora accompa ..."
Abstract - Cited by 26 (5 self) - Add to MetaCart
Predictions of viewer affective response to video are an important source of information that can be used to enhance the performance of multimedia retrieval and recommendation systems. The development of algorithms for robust prediction of viewer affective response requires corpora accompanied by appropriate ground truth. We report on the development of a new corpus to be used to evaluate algorithms for prediction of viewer-reported boredom. We make use of crowdsourcing in order to address two shortcomings of previous affective video corpora: a small number of annotators and a gap between annotators and the target viewer group. We describe the design of the Mechanical Turk setup that we used to generate the affective annotations for the corpus. We discuss specific issues that arose and how we resolved them, and then present an analysis of the annotations collected. The paper closes with a list of recommended practices for the collection of self-reported affective annotations using crowdsourcing techniques and an outlook on future work.
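
As a rough illustration of the aggregation step such a corpus requires, the following Python sketch turns raw crowdsourced ratings into per-video scores. The data layout, the 1-9 boredom scale, and the minimum-annotator filter are illustrative assumptions, not the authors' actual pipeline.

    # Hypothetical aggregation of crowdsourced affective ratings into
    # per-video ground truth; layout and scale are assumptions.
    from collections import defaultdict
    from statistics import mean, pstdev

    # (worker_id, video_id, boredom) tuples; 1-9 scale is an assumption
    ratings = [
        ("w1", "v1", 7), ("w2", "v1", 6), ("w3", "v1", 2),
        ("w1", "v2", 3), ("w2", "v2", 4), ("w3", "v2", 8),
    ]

    def aggregate(ratings, min_workers=2):
        by_video = defaultdict(list)
        for worker, video, score in ratings:
            by_video[video].append(score)
        # Report spread alongside the mean so downstream evaluation can
        # treat low-agreement videos with caution.
        return {
            v: {"mean": mean(s), "stdev": pstdev(s), "n": len(s)}
            for v, s in by_video.items() if len(s) >= min_workers
        }

    print(aggregate(ratings))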

Citation Context

...tems. The research in the field of multimedia content analysis for affective understanding of videos lacks significant user studies and relies only on feedback from a limited number of participants [5][10][13]. Multimedia corpora with affective annotations make it possible to investigate interesting research questions and develop useful algorithms, but are time-consuming to generate. The number of ...

Content-Based Analysis Improves Audiovisual Archive Retrieval

by Bouke Huurnink, Cees G. M. Snoek, Maarten de Rijke, Arnold W. M. Smeulders, 2012
"... Content-based video retrieval is maturing to the point where it can be used in real-world retrieval practices. One such practice is the audiovisual archive, whose users increasingly require fine-grained access to broadcast television content. In this paper, we take into account the information needs ..."
Abstract - Cited by 1 (0 self) - Add to MetaCart
Content-based video retrieval is maturing to the point where it can be used in real-world retrieval practices. One such practice is the audiovisual archive, whose users increasingly require fine-grained access to broadcast television content. In this paper, we take into account the information needs and retrieval data already present in the audiovisual archive, and demonstrate that retrieval performance can be significantly improved when content-based methods are applied to search. To the best of our knowledge, this is the first time that the practice of an audiovisual archive has been taken into account for quantitative retrieval evaluation. To arrive at our main result, we propose an evaluation methodology tailored to the specific needs and circumstances of the audiovisual archive, which are typically missed by existing evaluation initiatives. We utilize logged searches, content purchases, session information, and simulators to create realistic query sets and relevance judgments. To reflect the retrieval practice of both the archive and the video retrieval community as closely as possible, our experiments with three video search engines incorporate archive-created catalog entries as well as state-of-the-art multimedia content analysis results. A detailed query-level analysis indicates that individual content-based retrieval methods such as transcript-based retrieval and concept-based retrieval yield approximately equal performance gains. When combined, we find that content-based video retrieval incorporated into the archive’s practice results in significant performance increases for shot retrieval and for retrieving entire television programs. The time has come for audiovisual archives to start accommodating content-based video retrieval methods into their daily practice.
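
The combination result described above suggests a score-level fusion of retrieval runs. The sketch below shows one hypothetical way to fuse a transcript-based run with a concept-based run by weighted linear combination; the min-max normalisation and the alpha weight are assumptions, not the paper's reported method.

    # Hypothetical late fusion of two retrieval runs, each a
    # {doc_id: score} mapping; not the paper's actual fusion scheme.
    def fuse_runs(transcript_scores, concept_scores, alpha=0.5):
        def normalise(run):
            if not run:
                return {}
            lo, hi = min(run.values()), max(run.values())
            span = (hi - lo) or 1.0
            return {d: (s - lo) / span for d, s in run.items()}
        t, c = normalise(transcript_scores), normalise(concept_scores)
        docs = set(t) | set(c)
        # Weighted sum of normalised scores, ranked best-first
        return sorted(
            ((d, alpha * t.get(d, 0.0) + (1 - alpha) * c.get(d, 0.0)) for d in docs),
            key=lambda x: x[1], reverse=True,
        )

    ranked = fuse_runs({"shot1": 2.1, "shot2": 0.4}, {"shot2": 0.9, "shot3": 0.7})
    print(ranked)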

Citation Context

...tomatically generated, labels. Then, we measure the effect of combining them for queries typical of professionals searching an archive. Existing evaluation initiatives such as TRECVID [38], VideoCLEF [25], and MediaEval [26] have been valuable instigators in the advancement of techniques for content-based video retrieval. However, they are unsuited to assessing the potential impact of such techniques...

Exploiting Speech Recognition Transcripts for Narrative Peak Detection in Short-Form Documentaries

by Martha Larson, Bart Jochems, Ewine Smits, Roeland Ordelman
"... Narrative peaks are points at which the viewer perceives a spike in the level of dramatic tension within the narrative flow of a video. This paper reports on four approaches to narrative peak detection in television documentaries that were developed by a joint team consisting of mem-bers from Delft ..."
Abstract - Cited by 1 (1 self) - Add to MetaCart
Narrative peaks are points at which the viewer perceives a spike in the level of dramatic tension within the narrative flow of a video. This paper reports on four approaches to narrative peak detection in television documentaries that were developed by a joint team consisting of members from Delft University of Technology and the University of Twente within the framework of the VideoCLEF 2009 Affect Detection task. The approaches make use of speech recognition transcripts and seek to exploit various sources of evidence in order to automatically identify narrative peaks. These sources include speaker style (word choice), stylistic devices (use of repetitions), strategies strengthening viewers' feelings of involvement (direct audience address) and emotional speech. These approaches are compared to a challenging baseline that predicts the presence of narrative peaks at fixed points in the video, presumed to be dictated by natural narrative rhythm or production convention. Two approaches are tied in delivering top narrative peak detection results. One uses counts of first and second person pronouns to identify points in the video where viewers feel most directly involved. The other uses affective word ratings to calculate scores reflecting emotional language.
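
The pronoun-based approach lends itself to a simple sliding-window formulation. The sketch below scores transcript windows by the density of first- and second-person pronouns and returns the top candidates; the window size, step, and abbreviated Dutch pronoun list are illustrative assumptions, not the authors' exact configuration.

    # Sketch of pronoun-density peak detection over an ASR transcript;
    # parameters and pronoun list are assumptions for illustration.
    import re

    PRONOUNS = {"ik", "we", "wij", "je", "jij", "u", "jullie", "me", "mij", "ons"}

    def peak_windows(words, window=50, step=25, top_k=3):
        scores = []
        for start in range(0, max(len(words) - window, 1), step):
            chunk = words[start:start + window]
            # Fraction of window tokens that directly address or involve
            # the viewer/speaker
            density = sum(w in PRONOUNS for w in chunk) / len(chunk)
            scores.append((density, start))
        return sorted(scores, reverse=True)[:top_k]

    transcript = "ik denk dat we samen iets moois kunnen maken"  # ASR output
    words = re.findall(r"\w+", transcript.lower())
    print(peak_windows(words, window=5, step=2))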

Citation Context

...reated according to different experimental conditions) could be submitted. Further details about the data set and the Affect Detection task for VideoCLEF 2009 can be found in the track overview paper [4]. Participants were provided with additional resources accompanying the test data, including transcripts generated by an automatic speech recognition system [3]. Our approaches, described in the next ...

Identification of Narrative Peaks in Video Clips: Text Features Perform Best

by Joep J. M. Kierkels, Mohammad Soleymani
"... A methodology is proposed to identify narrative peaks in video clips. Three basic clip properties are evaluated which reflect on video, audio and text related features in the clip. Furthermore, the expected distribution of narrative peaks throughout the clip is determined and exploited for future pr ..."
Abstract - Add to MetaCart
A methodology is proposed to identify narrative peaks in video clips. Three basic clip properties are evaluated, reflecting video-, audio- and text-related features of the clip. Furthermore, the expected distribution of narrative peaks throughout the clip is determined and exploited for future predictions. Results show that only the text-related feature, based on the usage of distinct words throughout the clip, and the expected peak distribution are of use when finding the peaks. On the training set, our best detector had an accuracy of 47% in finding narrative peaks. On the test set, this accuracy dropped to 24%.
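
The distinct-word feature and the peak-position prior can be combined as in the following hypothetical sketch: each window is scored by the fraction of previously unseen words, then reweighted by a prior over peak positions. The window size, prior shape, and multiplicative weighting are assumptions for illustration, not the paper's method.

    # Sketch: distinct-word rate per window, reweighted by a positional
    # prior; parameters and prior values are illustrative assumptions.
    def distinct_word_rate(words, window=40):
        seen, scores = set(), []
        for start in range(0, len(words), window):
            chunk = words[start:start + window]
            new = [w for w in chunk if w not in seen]  # previously unseen words
            seen.update(chunk)
            scores.append(len(new) / max(len(chunk), 1))
        return scores

    def apply_prior(scores, prior):
        # prior: one weight per window, e.g. estimated from training-set
        # peak positions
        return [s * p for s, p in zip(scores, prior)]

    words = "the story begins slowly then a sudden twist changes everything".split()
    scores = distinct_word_rate(words, window=3)
    prior = [0.5, 1.0, 1.5, 1.0][:len(scores)]
    print(apply_prior(scores, prior))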

Citation Context

...e will attempt to compare and show results for all three modalities. The proposed methodology to identify narrative peaks in video clips was presented at VideoCLEF 2009 subtask on “Affect and Appeal” [4]. The clips that were given in this subtask were all taken from a Dutch program called “Beeldenstorm”. They were in Dutch, had durations between seven and nine minutes, consisted of video and audio, a...

Narrative Peak Detection in Short-Form Documentaries Using Speech Recognition Transcripts

by Bart Jochems
"... Narrative peaks are points at which the viewer perceives a spike in the level of dramatic tension within the narrative flow of a video. In this paper we describe two approaches for automatic identification of narrative peaks in short-form documentaries, within the framework of the VideoCLEF 2009 Aff ..."
Abstract - Add to MetaCart
Narrative peaks are points at which the viewer perceives a spike in the level of dramatic tension within the narrative flow of a video. In this paper we describe two approaches for the automatic identification of narrative peaks in short-form documentaries, within the framework of the VideoCLEF 2009 Affect Detection task. Both approaches exploit the speech recognition transcript in order to identify the narrative peaks. Our first approach is based on the idea that speech rate increases with high arousal and intensity, and the second on the idea that narrative peaks are perceived where particular lexical items are used. These approaches are compared to a challenging baseline that predicts the presence of narrative peaks at fixed points in the video, presumed to be dictated by natural narrative rhythm or production convention. The second approach easily outperforms the challenging baseline, while the first approach fails to achieve the performance level of the random baseline detector.
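
The speech-rate cue presumes word-level timing from the ASR system. A minimal sketch, assuming a plain list of word start times in seconds rather than any particular transcript format:

    # Sketch: words-per-second over fixed time windows; the fastest
    # windows are candidate peaks. Timestamp format is an assumption.
    def speech_rate(word_times, window_s=30.0):
        """word_times: word start times in seconds, sorted ascending."""
        if not word_times:
            return []
        end = word_times[-1]
        rates, t = [], 0.0
        while t < end:
            count = sum(t <= w < t + window_s for w in word_times)
            rates.append((count / window_s, t))
            t += window_s
        return sorted(rates, reverse=True)  # fastest windows first

    times = [0.5, 1.1, 1.6, 2.0, 31.0, 31.2, 31.4, 31.5, 31.7, 62.0]
    print(speech_rate(times, window_s=30.0)[:3])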

Citation Context

...cipants were required to identify the three highest peaks in each episode. Further details about the data set and the Affect Detection task for VideoCLEF 2009 can be found in the track overview paper [3]. Participants were provided with speech transcripts generated by an automatic speech recognition system [4]. 2. AFFECTIVE LEVEL OF VIDEO CONTENT In video content there are two basic levels of percept...

Identification of Narrative Peaks in Video Clips: Text Features Perform Best

by Joep J. M. Kierkels, Mohammad Soleymani, Thierry Pun
"... Abstract. A methodology is proposed to identify narrative peaks in video clips. Three basic clip properties are evaluated which reflect on video, audio and text related features in the clip. Furthermore, the expected distribution of narrative peaks throughout the clip is determined and exploited for ..."
Abstract - Add to MetaCart
A methodology is proposed to identify narrative peaks in video clips. Three basic clip properties are evaluated, reflecting video-, audio- and text-related features of the clip. Furthermore, the expected distribution of narrative peaks throughout the clip is determined and exploited for future predictions. Results show that only the text-related feature, based on the usage of distinct words throughout the clip, and the expected peak distribution are of use when finding the peaks. On the training set, our best detector had an accuracy of 47% in finding narrative peaks. On the test set, this accuracy dropped to 24%.

Citation Context

...e will attempt to compare and show results for all three modalities. The proposed methodology to identify narrative peaks in video clips was presented at VideoCLEF 2009 subtask on “Affect and Appeal” [4]. The clips that were given in this subtask were all taken from a Dutch program called “Beeldenstorm”. They were in Dutch, had durations between seven and nine minutes, consisted of video and audio, a...

Chemnitz at VideoCLEF 2009: Experiments and Observations on Treating Classification as IR Task

by Jens Kürsten, Maximilian Eibl
"... [ jens.kuersten ∣ maximilian.eibl] at cs.tu-chemnitz.de This paper describes the participation of the Chemnitz University of Technology in the Video-CLEF 2009 classification task. Our motivation lies in its close relation to our research project sachsMedia1. In our second participation in the task w ..."
Abstract - Add to MetaCart
This paper describes the participation of the Chemnitz University of Technology in the VideoCLEF 2009 classification task. Our motivation lies in its close relation to our research project sachsMedia. In our second participation in the task we experimented with treating the task as an IR problem and used the Xtrieval framework [3] to run our experiments. We proposed an automatic threshold estimation to limit the number of documents per label, since too many returned documents hurt the overall correct classification rate. Although the experimental setup was enhanced this year and the data sets were changed, we found that the IR approach still works quite well. Our query expansion approach performed better than the baseline experiments in terms of mean average precision. We also showed that combining the ASR transcriptions and the archival metadata improves the classification performance, unless query expansion is used in the retrieval phase.
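
The classification-as-IR idea can be sketched as follows: build one query per class label, rank documents against it, and cut the ranked list with a per-label score threshold so that low-scoring documents do not dilute the classification. In this sketch a TF-IDF model via scikit-learn and a relative cutoff stand in for the Xtrieval framework and the paper's estimated thresholds.

    # Sketch of classification treated as retrieval: one query per label,
    # thresholded ranked lists. TF-IDF and the 0.5 relative cutoff are
    # stand-in assumptions, not the paper's configuration.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    docs = ["interview about modern sculpture", "football match highlights",
            "museum exhibition on dutch painting"]
    labels = {"art": "sculpture painting exhibition museum",
              "sports": "football match goal team"}

    vec = TfidfVectorizer()
    doc_m = vec.fit_transform(docs)

    for label, query in labels.items():
        scores = cosine_similarity(vec.transform([query]), doc_m)[0]
        threshold = 0.5 * scores.max()  # assumed relative cutoff per label
        hits = [i for i, s in enumerate(scores) if s >= threshold and s > 0]
        print(label, "->", hits)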

Citation Context

... and its configuration that was used for our participation in the VideoCLEF classification task. The task was to categorize dual-language video into 46 given classes based on provided ASR transcripts [5] and additional archival metadata. In a mandatory experiment only the ASR transcripts of the videos had to be used as source for classification. Furthermore each of the given video documents can have ...
