Results 1 - 7 of 7
Crowdsourcing for affective annotation of video: Development of a viewer-reported boredom corpus.
- In SIGIR 2010 Workshop on Crowdsourcing for Search Evaluation (CSE), 2010
"... ABSTRACT Predictions of viewer affective response to video are an important source of information that can be used to enhance the performance of multimedia retrieval and recommendation systems. The development of algorithms for robust prediction of viewer affective response requires corpora accompa ..."
Abstract
-
Cited by 26 (5 self)
- Add to MetaCart
(Show Context)
Predictions of viewer affective response to video are an important source of information that can be used to enhance the performance of multimedia retrieval and recommendation systems. The development of algorithms for robust prediction of viewer affective response requires corpora accompanied by appropriate ground truth. We report on the development of a new corpus to be used to evaluate algorithms for prediction of viewer-reported boredom. We make use of crowdsourcing in order to address two shortcomings of previous affective video corpora: the small number of annotators and the gap between annotators and the target viewer group. We describe the design of the Mechanical Turk setup that we used to generate the affective annotations for the corpus. We discuss specific issues that arose and how we resolved them, and then present an analysis of the annotations collected. The paper closes with a list of recommended practices for the collection of self-reported affective annotations using crowdsourcing techniques and an outlook on future work.
Content-Based Analysis Improves Audiovisual Archive Retrieval
, 2012
"... Content-based video retrieval is maturing to the point where it can be used in real-world retrieval practices. One such practice is the audiovisual archive, whose users increasingly require fine-grained access to broadcast television content. In this paper, we take into account the information needs ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
(Show Context)
Content-based video retrieval is maturing to the point where it can be used in real-world retrieval practices. One such practice is the audiovisual archive, whose users increasingly require fine-grained access to broadcast television content. In this paper, we take into account the information needs and retrieval data already present in the audiovisual archive, and demonstrate that retrieval performance can be significantly improved when content-based methods are applied to search. To the best of our knowledge, this is the first time that the practice of an audiovisual archive has been taken into account for quantitative retrieval evaluation. To arrive at our main result, we propose an evaluation methodology tailored to the specific needs and circumstances of the audiovisual archive, which are typically missed by existing evaluation initiatives. We utilize logged searches, content purchases, session information, and simulators to create realistic query sets and relevance judgments. To reflect the retrieval practice of both the archive and the video retrieval community as closely as possible, our experiments with three video search engines incorporate archive-created catalog entries as well as state-of-the-art multimedia content analysis results. A detailed query-level analysis indicates that individual content-based retrieval methods such as transcript-based retrieval and concept-based retrieval yield approximately equal performance gains. When combined, we find that content-based video retrieval incorporated into the archive's practice results in significant performance increases for shot retrieval and for retrieving entire television programs. The time has come for audiovisual archives to start accommodating content-based video retrieval methods into their daily practice.
Exploiting Speech Recognition Transcripts for Narrative Peak Detection in Short-Form Documentaries
"... Narrative peaks are points at which the viewer perceives a spike in the level of dramatic tension within the narrative flow of a video. This paper reports on four approaches to narrative peak detection in television documentaries that were developed by a joint team consisting of mem-bers from Delft ..."
Abstract
-
Cited by 1 (1 self)
- Add to MetaCart
(Show Context)
Narrative peaks are points at which the viewer perceives a spike in the level of dramatic tension within the narrative flow of a video. This paper reports on four approaches to narrative peak detection in television documentaries that were developed by a joint team consisting of members from Delft University of Technology and the University of Twente within the framework of the VideoCLEF 2009 Affect Detection task. The approaches make use of speech recognition transcripts and seek to exploit various sources of evidence in order to automatically identify narrative peaks. These sources include speaker style (word choice), stylistic devices (use of repetitions), strategies strengthening viewers' feelings of involvement (direct audience address) and emotional speech. These approaches are compared to a challenging baseline that predicts the presence of narrative peaks at fixed points in the video, presumed to be dictated by natural narrative rhythm or production convention. Two approaches are tied in delivering top narrative peak detection results. One uses counts of first and second person pronouns to identify points in the video where viewers feel most directly involved. The other uses affective word ratings to calculate scores reflecting emotional language.
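The pronoun-based approach described above can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: the pronoun list, the scoring by simple word-fraction, and the sample segments are all assumptions made for the example.

```python
# Hypothetical sketch: score transcript segments by the density of first
# and second person pronouns, the cue the paper uses to find points where
# viewers feel most directly involved.
PRONOUNS = {"i", "me", "my", "we", "us", "our", "you", "your"}

def involvement_score(segment: str) -> float:
    """Fraction of words that directly address or involve the audience."""
    words = segment.lower().split()
    if not words:
        return 0.0
    return sum(w.strip(".,!?") in PRONOUNS for w in words) / len(words)

segments = ["The factory was built in 1900.",
            "Can you imagine what we would do without it?"]
# The segment with the highest score is the narrative-peak candidate.
peak = max(segments, key=involvement_score)
```

A real system would of course score fixed-length transcript windows rather than whole sentences, but the ranking step is the same.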
Identification of Narrative Peaks in Video Clips: Text Features Perform Best
"... A methodology is proposed to identify narrative peaks in video clips. Three basic clip properties are evaluated which reflect on video, audio and text related features in the clip. Furthermore, the expected distribution of narrative peaks throughout the clip is determined and exploited for future pr ..."
Abstract
- Add to MetaCart
(Show Context)
A methodology is proposed to identify narrative peaks in video clips. Three basic clip properties are evaluated, reflecting video-, audio-, and text-related features in the clip. Furthermore, the expected distribution of narrative peaks throughout the clip is determined and exploited for future predictions. Results show that only the text-related feature, based on the usage of distinct words throughout the clip, and the expected peak distribution are of use when finding the peaks. On the training set, our best detector had an accuracy of 47% in finding narrative peaks. On the test set, this accuracy dropped to 24%.
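One plausible reading of the distinct-word text feature is sketched below. The windowing scheme and the "occurs only inside this window" criterion are assumptions for illustration; the paper does not specify its exact formula here.

```python
# Illustrative sketch (not the authors' exact feature): measure how many
# words in a transcript window appear nowhere else in the clip, on the
# assumption that narrative peaks introduce distinctive vocabulary.
from collections import Counter

def distinct_word_score(window_words, clip_counts):
    """Share of window words whose occurrences all fall inside this window."""
    if not window_words:
        return 0.0
    win = Counter(window_words)
    unique = sum(c for w, c in win.items() if clip_counts[w] == c)
    return unique / len(window_words)

clip = "the dam broke and the flood swept the village away".split()
window = clip[2:6]  # "broke and the flood"
score = distinct_word_score(window, Counter(clip))
```

Sliding such a window over the clip and ranking windows by this score would yield peak candidates comparable to the paper's text-feature detector.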
Narrative Peak Detection in Short-Form Documentaries Using Speech Recognition Transcripts
"... Narrative peaks are points at which the viewer perceives a spike in the level of dramatic tension within the narrative flow of a video. In this paper we describe two approaches for automatic identification of narrative peaks in short-form documentaries, within the framework of the VideoCLEF 2009 Aff ..."
Abstract
- Add to MetaCart
(Show Context)
Narrative peaks are points at which the viewer perceives a spike in the level of dramatic tension within the narrative flow of a video. In this paper we describe two approaches for automatic identification of narrative peaks in short-form documentaries, within the framework of the VideoCLEF 2009 Affect Detection task. Both approaches exploit the speech recognition transcript in order to identify the narrative peaks. Our first approach is based on the idea that speech rate increases with high arousal and intensity, and the second on the idea that narrative peaks are perceived where particular lexical items are used. These approaches are compared to a challenging baseline that predicts the presence of narrative peaks at fixed points in the video, presumed to be dictated by natural narrative rhythm or production convention. The second approach easily outperformed the challenge baseline, while the first approach failed to achieve the performance level of the random baseline detector.
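The speech-rate idea can be sketched in a few lines. The (text, start, end) segment format standing in for ASR output, and ranking whole segments rather than sliding windows, are assumptions made for this example.

```python
# Hedged sketch of the speech-rate approach: estimate words per second
# from ASR transcript timestamps and flag the fastest-spoken segment as
# a narrative-peak candidate.
def speech_rate(text: str, start: float, end: float) -> float:
    """Words per second for a timed transcript segment."""
    duration = end - start
    return len(text.split()) / duration if duration > 0 else 0.0

segments = [("and then everything changed at once", 10.0, 12.0),
            ("the river flows calmly through the valley", 0.0, 8.0)]
candidate = max(segments, key=lambda s: speech_rate(*s))
```

As the abstract notes, this cue alone did not beat the random baseline, so it would likely serve only as one feature among several.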
Chemnitz at VideoCLEF 2009: Experiments and Observations on Treating Classification as IR Task
"... [ jens.kuersten ∣ maximilian.eibl] at cs.tu-chemnitz.de This paper describes the participation of the Chemnitz University of Technology in the Video-CLEF 2009 classification task. Our motivation lies in its close relation to our research project sachsMedia1. In our second participation in the task w ..."
Abstract
- Add to MetaCart
(Show Context)
[jens.kuersten | maximilian.eibl] at cs.tu-chemnitz.de
This paper describes the participation of the Chemnitz University of Technology in the VideoCLEF 2009 classification task. Our motivation lies in its close relation to our research project sachsMedia. In our second participation in the task we experimented with treating the task as an IR problem and used the Xtrieval framework [3] to run our experiments. We proposed an automatic threshold estimation to limit the number of documents per label, since too many returned documents hurt the overall correct classification rate. Although the experimental setup was enhanced this year and the data sets were changed, we found that the IR approach still works quite well. Our query expansion approach performed better than the baseline experiments in terms of mean average precision. We also showed that combining the ASR transcriptions and the archival metadata improves the classification performance, unless query expansion is used in the retrieval phase.
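One common way to realize the per-label cutoff described above is a score-relative threshold on the ranked result list. The 0.5 fraction and the data below are illustrative assumptions, not the paper's tuned values.

```python
# Minimal sketch of limiting the number of documents per label: keep only
# results whose retrieval score stays within a fraction of the top score,
# so each label receives a bounded, score-dependent document count.
def cut_ranked_list(scored_docs, fraction=0.5):
    """scored_docs: list of (doc_id, score), sorted descending by score."""
    if not scored_docs:
        return []
    threshold = scored_docs[0][1] * fraction
    return [(d, s) for d, s in scored_docs if s >= threshold]

ranked = [("d1", 9.0), ("d2", 5.0), ("d3", 4.0), ("d4", 1.0)]
kept = cut_ranked_list(ranked)
```

A dynamic cutoff like this adapts to each query's score distribution, which is the property the authors exploit to keep the correct-classification rate from degrading.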