Results 1 - 10 of 56
Video manga: Generating semantically meaningful video summaries
, 1999
Cited by 87 (6 self)
This paper presents methods for automatically creating pictorial video summaries that resemble comic books. The relative importance of video segments is computed from their length and novelty. Image and audio analysis is used to automatically detect and emphasize meaningful events. Based on this importance measure, we choose relevant keyframes. Selected keyframes are sized by importance, and then efficiently packed into a pictorial summary. We present a quantitative measure of how well a summary captures the salient events in a video, and show how it can be used to improve our summaries. The result is a compact and visually pleasing summary that captures semantically important events, and is suitable for printing or Web access. Such a summary can be further enhanced by including text captions derived from OCR or other methods. We describe how the automatically generated summaries are used to simplify access to a large collection of videos. Keywords: Video summarization and analysis, keyframe selection and layout.
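As a rough illustration of scoring segments by length and novelty, the sketch below weights each segment's length by how rare its cluster is. The exact formula (length times the log of the inverse cluster weight), the function name `segment_importance`, and the `(cluster_id, length)` input format are assumptions made for this example; the abstract does not specify them.

```python
import math
from collections import defaultdict

def segment_importance(segments):
    """Score segments given as (cluster_id, length_seconds) tuples.

    Assumed scoring: length * log(1 / cluster_weight), where cluster_weight
    is the share of total video time occupied by the segment's cluster.
    Long segments from rare clusters score highest.
    """
    total = sum(length for _, length in segments)
    if total <= 0:
        return [0.0] * len(segments)

    # Accumulate total time per cluster.
    cluster_len = defaultdict(float)
    for cluster_id, length in segments:
        cluster_len[cluster_id] += length

    scores = []
    for cluster_id, length in segments:
        weight = cluster_len[cluster_id] / total
        scores.append(length * math.log(1.0 / weight))
    return scores

# Two segments share cluster "a"; the lone "b" segment is more novel.
example_scores = segment_importance([("a", 10), ("a", 10), ("b", 10)])
```

Under this assumed scoring, a video made of a single cluster gives every segment a score of zero, since nothing in it is novel relative to the rest.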
An Interactive Comic Book Presentation for Exploring Video
- In CHI 2000 Conference Proceedings
, 2000
Cited by 38 (2 self)
This paper presents a method for generating compact pictorial summarizations of video. We developed a novel approach for selecting still images from a video suitable for summarizing the video and for providing entry points into it. Images are laid out in a compact, visually pleasing display reminiscent of a comic book or Japanese manga. Users can explore the video by interacting with the presented summary. Links from each keyframe start video playback and/or present additional detail. Captions can be added to presentation frames to include commentary or descriptions such as the minutes of a recorded meeting. We conducted a study to compare variants of our summarization technique. The study participants judged the manga summary to be significantly better than the other two conditions with respect to their suitability for summaries and navigation, and their visual appeal.
Browsing Digital Video
, 1999
Cited by 28 (3 self)
Video in digital format coupled with digital/programmable playback devices presents opportunities for significantly enhancing the user's viewing experience. For example, time compression can shorten the viewing length of a video and shot boundary frames can provide a visual index into the content. Such features have primarily been evaluated in isolation with a narrow set of video content types. We designed and implemented a software video browsing application that combines many such features. In addition, we evaluated its use in watching six different video content types and present the resulting data for analysis and discussion. The participants in the evaluation found the browser to be useful and effective for watching the different types of video in a limited amount of time. Also, the results show that both the experience of using the browser and the value of each feature vary depending on the content type.
Adjustable Filmstrips and Skims as Abstractions for a Digital Video Library
- IEEE Advances in Digital Libraries Conference
, 1999
Cited by 27 (2 self)
Filmstrips and video skims are two presentation schemes for abstracting information in a digital video segment. Filmstrips present information all at once in a static form, while video skims are played and disclose information temporally. This paper discusses the evolution of the filmstrip and skim interfaces in the Informedia Digital Video Library. Filmstrips are commonly deployed as interfaces for video and image libraries, but we found initial Informedia filmstrips and skims received little use. We discuss the interface considerations motivating the redesign of filmstrips and skims to adjust their presentations dynamically based on user context and preference.
An intelligent media browser using automatic multimodal analysis
- ACM Multimedia
, 1998
Cited by 23 (2 self)
Many techniques can extract information from a multimedia stream, such as speaker identity or shot boundaries. We present a browser that uses this information to navigate through stored media. Because automatically-derived information is not wholly reliable, it is transformed into a time-dependent “confidence score.” When presented graphically, confidence scores enable users to make informed decisions about regions of interest in the media, so that non-interesting areas may be skipped. Additionally, index points may be determined automatically for easy navigation, selection, editing, and annotation and will support analysis types other than the speaker identification and shot detection used here. Keywords: Content-based retrieval, video, speaker identification, automatic analysis, visualization, skimming.
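One simple way to turn a time-dependent confidence score into candidate index points is to threshold it and report the contiguous runs that stay above the threshold. The sketch below does only that; the function name `regions_of_interest`, the per-time-step list input, and the fixed threshold are assumptions for illustration and are not described in the abstract.

```python
def regions_of_interest(confidence, threshold):
    """Return (start, end) index pairs, end exclusive, for each contiguous
    run of time steps whose confidence is at or above the threshold."""
    regions = []
    start = None
    for i, score in enumerate(confidence):
        if score >= threshold and start is None:
            start = i                      # a region opens here
        elif score < threshold and start is not None:
            regions.append((start, i))     # the region closed just before i
            start = None
    if start is not None:
        regions.append((start, len(confidence)))  # region runs to the end
    return regions

# Hypothetical per-second confidence values for one speaker.
example_regions = regions_of_interest([0.1, 0.8, 0.9, 0.2, 0.7], 0.5)
```

The start of each returned region could serve as an index point, and everything outside the regions is a candidate for skipping.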
How fast is too fast? Evaluating fast forward surrogates for digital video
, 2003
Cited by 19 (1 self)
To support effective browsing, interfaces to digital video libraries should include video surrogates (i.e., smaller objects that can stand in for the videos in the collection, analogous to abstracts standing in for documents). The current study investigated four variations (i.e., speeds) of one form of video surrogate: a fast forward created by selecting every Nth frame from the full video. In addition, it tested the validity of six measures of user performance when interacting with video surrogates. Forty-five study participants interacted with all four versions of the fast forward surrogate, and completed all six performance tasks with each. Surrogate speed affected performance on four of the measures: object recognition (graphical), action recognition, linguistic gist comprehension (full text), and visual gist comprehension. Based on these results, we recommend a fast forward default speed of 1:64 of the original video keyframes. In addition, users should control the choice of fast forward speed to adjust for content characteristics and personal preferences.
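Selecting every Nth frame is straightforward to sketch. In the snippet below, the function name `fast_forward` and the use of a plain list to stand in for decoded frames are assumptions for illustration; reading the recommended 1:64 speed as N = 64 is likewise an interpretation of the abstract, not something it states.

```python
def fast_forward(frames, n):
    """Build a fast forward surrogate by keeping every n-th frame,
    starting with the first one."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return frames[::n]

# Frame indices stand in for decoded frames in this example.
video = list(range(200))
surrogate = fast_forward(video, 64)   # keeps frames 0, 64, 128, 192
```

Exposing `n` as a parameter matches the recommendation that users control the fast forward speed themselves.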
Accessing multimodal meeting data: Systems, problems and possibilities
- Lecture Notes in Computer Science, Machine Learning for Multimodal Interaction
, 2004
Cited by 17 (1 self)
As the amount of multimodal meetings data being recorded increases, so does the need for sophisticated mechanisms for accessing this data. This process is complicated by the different informational needs of users, as well as the range of data collected from meetings. This paper examines the current state of the art in meeting browsers. We examine both systems specifically designed for browsing multimodal meetings data and those designed to browse data collected from different environments, for example broadcast news and lectures. As a result of this analysis, we highlight potential directions for future research: semantic access, filtered presentation, limited display environments, browser evaluation, and user requirements capture.
Lessons for the Future from a Decade of Informedia Video Analysis Research
- Proc. CIVR (Singapore, July 2005), LNCS 3568
, 2005
Cited by 16 (0 self)
The overarching goal of the Informedia Digital Video Library project has been to achieve machine understanding of video media, including all aspects of search, retrieval, visualization and summarization in both contemporaneous and archival content collections. The base technology developed by the Informedia project combines speech, image and natural language understanding to automatically transcribe, segment and index broadcast video for intelligent search and image retrieval. While speech processing has been the most influential component in the success of the Informedia project, other modalities can be critical in various situations. Evaluations done in the context of the TRECVID benchmarks show that while some progress has been made, there is still a lot of work ahead. The fundamental “semantic gap” still exists, but there are a number of promising approaches to bridging it. 1 A Brief History of the Informedia Digital Library Project Vast amounts of video have been archived, and more is produced daily, yet remains untapped as an archival information source for on-demand access because of the difficulty
Creating Music Videos Using Automatic Media Analysis
- ACM Multimedia
, 2002
Cited by 14 (0 self)
We present methods for automatic and semi-automatic creation of music videos, given an arbitrary audio soundtrack and source video. Significant audio changes are automatically detected; similarly, the source video is automatically segmented and analyzed for suitability based on camera motion and exposure. Video with excessive camera motion or poor contrast is penalized with a high unsuitability score, and is more likely to be discarded in the final edit. High quality video clips are then automatically selected and aligned in time with significant audio changes. Video clips are adjusted to match the audio segments by selecting the most suitable region of the desired length. Besides a fully automated solution, our system can also start with clips manually selected and ordered using a graphical interface. The video is then created by truncating the selected clips (preserving the high quality portions) to produce a video digest that is synchronized with the soundtrack music, thus enhancing the impact of both.
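Selecting "the most suitable region of the desired length" can be sketched as a sliding-window search over per-frame unsuitability scores. The snippet below assumes each frame already has a numeric unsuitability score and that a region's cost is the sum of its frame scores; the function name `best_region` and this cost definition are illustrative assumptions, not the method described in the abstract.

```python
def best_region(unsuitability, length):
    """Return (start, end), end exclusive, of the window of `length` frames
    with the lowest total unsuitability. Ties keep the earliest window."""
    if length < 1 or length > len(unsuitability):
        raise ValueError("length must be between 1 and the clip length")

    # Cost of the first window, then slide one frame at a time.
    window = sum(unsuitability[:length])
    best_cost, best_start = window, 0
    for start in range(1, len(unsuitability) - length + 1):
        window += unsuitability[start + length - 1] - unsuitability[start - 1]
        if window < best_cost:
            best_cost, best_start = window, start
    return best_start, best_start + length

# Hypothetical scores: high values mean shaky or poorly exposed frames.
clip_scores = [5, 1, 1, 1, 9, 0]
region = best_region(clip_scores, 3)
```

Updating the window sum incrementally keeps the search linear in the clip length, which matters when many clips must be trimmed to fit audio segments.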
Informedia - Search and Summarization in the Video Medium
, 2000
Cited by 14 (0 self)
The Informedia system provides "full-content" search and retrieval of current and past TV and radio news and documentary broadcasts. The system implements a fully automatic intelligent process to enable daily content capture, analysis and storage in on-line archives. The current library holds approximately 2,000 hours (1.5 terabytes) of daily CNN News captured over the last 3 years, along with documentaries from public television and government agencies. This database allows for rapid retrieval of individual "video paragraphs" which satisfy an arbitrary spoken or typed subject area query based on a combination of the words in the soundtrack, images recognized in the video, closed-captioning when available, and informational text overlaid on the screen images. There are also capabilities for matching similar faces and images and for generating related map-based displays. The latest work attempts to produce a visualization and summarization of the content across all the stories ...

