Content-based representation and retrieval of visual media: A state-of-the-art review
- Multimedia Tools and Applications
, 1996
Cited by 117 (2 self)
This paper reviews a number of recently available techniques in content analysis of visual media and their application to the indexing, retrieval, abstracting, relevance assessment, interactive perception, annotation and re-use of visual documents. 1. Background A few years ago, the problems of representation and retrieval of visual media were confined to specialized image databases (geographical, medical, pilot experiments in computerized slide libraries), in the professional applications of the audiovisual industries (production, broadcasting and archives), and in computerized training or education. The present development of multimedia technology and information highways has put content processing of visual media at the core of key application domains: digital and interactive video, large distributed digital libraries, multimedia publishing. Though the most important investments have been targeted at the information infrastructure (networks, servers, coding and compression, delivery models, multimedia systems architecture), a growing number of researchers have realized that content processing will be a key asset in putting together successful applications. The need for content processing techniques has been made evident from a variety of angles, ranging from achieving better quality in compression, allowing user choice of programs in video-on-demand, achieving better productivity in video production, providing access to large still image databases or integrating still images and video in multimedia publishing and cooperative work. Content-based retrieval of visual media and representation of visual documents in human-computer interfaces are based on the availability of content representation data (time-structure for
MUSART: Music Retrieval Via Aural Queries
- INDIANA UNIVERSITY
, 2001
Cited by 41 (3 self)
MUSART is a research project developing and studying new techniques for music information retrieval. The MUSART architecture uses a variety of representations to support multiple search modes. Progress is reported on the use of Markov modeling, melodic contour, and phonetic streams for music retrieval. To enable large-scale databases and more advanced searches, musical abstraction is studied. The MME subsystem performs theme extraction, and two other analysis systems are described that discover structure in audio representations of music. Theme extraction and structure analysis promise to improve search quality and support better browsing and audio thumbnailing. Integration of these components within a single architecture will enable scientific comparison of different techniques and, ultimately, their use in combination for improved performance and functionality.
Summarisation Of Spoken Audio Through Information Extraction
, 1999
Cited by 28 (0 self)
Automatic summarisation of spoken audio is a fairly new research pursuit, in large part due to the relative novelty of technology for accurately decoding audio into text. Techniques that account for the peculiarities and potential ambiguities of decoded audio (high error rates, lack of syntactic boundaries) appear promising for culling summary information from audio for content-based browsing and skimming. This paper combines acoustic confidence measures with simple information retrieval and extraction techniques in order to obtain accurate, readable summaries of broadcast news programs. It also demonstrates how extracted summaries, full-text speech recogniser output and audio files can be usefully linked together through an audio-visual interface. The results suggest that information extraction based on statistical information can produce viable summaries of decoded audio. 1. APPLICATION CONTEXT Managing this contemporary explosion of audio and video materials calls for intelligent s...
A Survey on Video Indexing
- JOURNAL OF VISUAL COMMUNICATIONS AND IMAGE REPRESENTATION
, 1996
Cited by 23 (0 self)
Extracting information from the ever growing stream of multimedia data is becoming increasingly difficult. One of the main reasons lies within the unstructured way multimedia data are usually presented. Audio-visual material represents a large part of current multimedia material and can be structured in meaningful ways due to the nature of visual communication. This paper surveys several approaches and algorithms that have been recently developed to help in automatically structuring audio-visual data, both for annotation and access
A Multi-Similarity Algebra
- In Proc. ACM SIGMOD98
, 1998
Cited by 13 (0 self)
The need to automatically extract and classify the contents of multimedia data archives such as images, video, and text documents has led to significant work on similarity based retrieval of data. To date, most work in this area has focused on the creation of index structures for similarity based retrieval. There is very little work on developing formalisms for querying multimedia databases that support similarity based computations and optimizing such queries, even though it is well known that feature extraction and identification algorithms in media data are very expensive. We introduce a similarity algebra that brings together relational operators and results of multiple similarity implementations in a uniform language. The algebra can be used to specify complex queries that combine different interpretations of similarity values and multiple algorithms for computing these values. We prove equivalence and containment relationships between similarity algebra expressions and develop qu...
Incorporating domain knowledge with video and voice data analysis in news broadcasts
- In ACM International Conference on Knowledge Discovery and Data Mining
, 2000
Cited by 4 (0 self)
This paper addresses the area of video annotation, indexing and retrieval, and shows how a set of tools can be employed, along with domain knowledge, to detect narrative structure in broadcast news. The initial structure is detected using low-level audio-visual processing in conjunction with domain knowledge. Higher-level processing may then use the initial structure to direct further processing that improves and extends the initial classification. The detected structure breaks a news broadcast into segments, each of which contains a single topic of discussion. Further, the segments are labeled as a) anchor person or reporter, b) footage with a voice-over or c) sound bite. This labeling may be used to provide a summary, for example by presenting a thumbnail for each reporter present in a section of the video. The inclusion of domain knowledge in computation allows more directed application of high-level processing, giving much greater efficiency of effort expended. This allows valid deductions to be made about the structure and semantics of the contents of a news video stream, as demonstrated by our experiments on CNN news broadcasts.
Language Technology: “A Survey of the State of the Art Language Resources
- Multimodal Language Resources”, MITRE Corporation
, 2002
Cited by 3 (0 self)
This article provides an overview of research in multimodal language processing and associated resources. It defines multimodal processing, describes key challenges, identifies potential benefits, and outlines the major tasks, including multimodal input interpretation, multimodal output generation, and multimodal information access. The article exemplifies the state of the art in multimedia and multimodal processing, describes multimodal language resources and annotation, and finally presents a 2003-2006 roadmap that points the way to the future.
A Web Interface for a Sound Database and Processing System
, 1997
Cited by 2 (0 self)
The World Wide Web has brought the possibility of distributing multimedia objects through the Internet, becoming a very interesting platform for the development of innovative music and audio applications. Apart from the explosion of commercial plug-ins and compression techniques aimed at real-time transfer of music and audio data, there are some projects that go beyond that by attempting to use the Internet as a musical production environment. In this article we will discuss our view on the technological requirements for an ideal Web-based studio and, as a particular application, we will show a Web front-end to the Spectral Modeling Synthesis (SMS) system.
A proposal for the description of audio in the context of MPEG-7
- Proceedings of the CBMI'99 European Workshop on Content-Based Multimedia Indexing
, 1999
Cited by 2 (0 self)
Sound content description is one of the aims of the MPEG-7 initiative. Although MPEG-7 focuses on indexing and retrieval of audio, there are other sound content-based processing applications waiting to be developed once we have a robust set of descriptors and structures for putting them into relation, and for expressing semantic concerns about sound. Spectral Modeling techniques provide one usable framework for extracting and organizing sound content descriptions. In this paper we will introduce one particular approach to spectral modeling, then we will present some sound descriptors that can be derived from it in order to develop sound descriptions, and we will discuss the features of a structure for organizing the information that can be derived from them (a so-called "Description Scheme"). All of our current descriptors can be considered low- or mid-level, thus we will not cover the high-level description of music (musical forms and styles, roles of characters in a movie, etc.), which is also relevant in MPEG-7. The descriptors proposed are the result of a sound analysis based on a spectral modeling technique, and for all of them we have devised automatic extraction procedures. The Description Scheme we present is intended to be a generic one that, based on a hierarchical (and in some places recursive) structure, can describe sound at multiple levels of detail, addressing both syntactic (structural) and semantic (content) ways of describing sound.
CAVES: A Configurable Application View Storage System
Storage management is a well-known method for improving the efficiency of data intensive and networked applications. Today's data management systems handle many non-traditional data formats, ranging from spatial data to images, video and other hybrid representations. This requires the use of specialized methods to query, extract and transform data from multiple, possibly distributed sources. There is a great need to develop efficient and scalable methodologies for storing and reusing the results of computations in such applications. In this paper, we introduce a configurable storage management system that allows programmers to specify a collection of storage management protocols for managing different types of data requests on top of a shared and possibly distributed pool of resources. In addition, dynamic protocol change rules allow the system to shift from one storage management method to another depending on the availability of system resources. Furthermore, the storage management system can be expanded with application specific methods to look-up and re-use stored items. We show how the storage system can be tuned to specific workload specifications with the help of a simulation model that takes into account both the cost of storage management protocols as well as the methods to look-up and re-use stored items.

