MUVIS: A content-based multimedia indexing and retrieval framework
- Proc. of the Seventh International Symposium on Signal Processing and its Applications, ISSPA 2003, 2003
Cited by 9 (7 self)
MUVIS is a series of CBIR systems. The first one was developed in the late 1990s to support indexing and retrieval in large image databases using visual and semantic features such as color, texture and shape. In recent years, MUVIS has been reworked into a PC-based framework that supports indexing, browsing and querying of various multimedia types such as audio, video, audio/video interlaced and several image formats. The MUVIS system allows real-time audio and video capturing and encoding by last-generation codecs such as MPEG-4, H.263+, MP3 and AAC. It supports several audio/video file formats such as AVI, MP4, MP3 and AAC. Furthermore, the MUVIS system provides a well-defined interface for third parties to integrate their own feature extraction algorithms into the framework, and for this reason it has recently been adopted by COST 211quat as the COST framework for CBIR. In this paper, we describe the general system features with underlying applications and outline the main philosophy.
A generic audio classification and segmentation approach for multimedia indexing and retrieval
- In Proceedings of the European Workshop on the Integration of Knowledge, Semantics and Digital Media Technology, EWIMT 2004, 2004
Cited by 5 (2 self)
We focus on generic and automatic audio classification and segmentation for audio-based multimedia indexing and retrieval applications. In particular, we present a fuzzy approach towards a hierarchic audio classification and global segmentation framework based on automatic audio analysis, providing robust, bi-modal, efficient and parameter-invariant classification over global audio segments. The input audio is split into segments, which are classified as speech, music, fuzzy or silent. The proposed method minimizes critical misclassification errors by fuzzy region modeling, thus increasing the efficiency of both pure and fuzzy classification. The experimental results show that the critical errors are minimized and that the proposed framework significantly increases the efficiency and accuracy of audio-based retrieval, especially in large multimedia databases. Index Terms: automatic audio classification and segmentation, perceptual rule-based approach, fuzzy modeling, multimedia indexing and retrieval.
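As a rough illustration of the four-way labelling described in this abstract (speech, music, fuzzy or silent), the following Python sketch labels a single segment. It is not the authors' method: the inputs `energy` and `speech_score`, the function name `label_segment`, and every threshold value are hypothetical, chosen only to show how a fuzzy region between two confident decisions avoids forcing a speech/music call on an ambiguous segment.

```python
def label_segment(energy, speech_score,
                  silence_energy=1e-4, low=0.35, high=0.65):
    """Label one audio segment as 'silent', 'speech', 'music' or 'fuzzy'.

    energy        -- mean signal energy of the segment (hypothetical input)
    speech_score  -- score in [0, 1]; higher means more speech-like
                     (hypothetical input, e.g. from some upstream analysis)
    The thresholds are illustrative assumptions, not published values.
    """
    # Segments with almost no energy are treated as silence first.
    if energy < silence_energy:
        return "silent"
    # Confident decisions lie outside the [low, high] band.
    if speech_score >= high:
        return "speech"
    if speech_score <= low:
        return "music"
    # Anything in between is left undecided rather than risk a
    # critical speech/music confusion.
    return "fuzzy"


if __name__ == "__main__":
    for energy, score in [(0.0, 0.9), (0.1, 0.9), (0.1, 0.1), (0.1, 0.5)]:
        print(energy, score, label_segment(energy, score))
```

Widening the band between `low` and `high` trades coverage for safety: more segments end up as fuzzy, but fewer are assigned to the wrong pure class.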
Building Audio Classifiers for Broadcast News Retrieval
- 2004
Cited by 3 (0 self)
The process of building audio classifiers for high-level content descriptors, especially in large datasets, is not trivial. In this paper we describe the design and development of audio classification algorithms for broadcast news retrieval in the context of the TREC 2003 video retrieval evaluation. The main focus of this paper is the building process itself rather than the final results, although some representative results are provided. We believe that the insights obtained and the tools developed for working with large real-world audio collections are important and frequently go unmentioned in published work. A critical aspect of this process is obtaining ground truth annotations for training the classifiers. Therefore, tools and techniques that assist the human annotation of news audio are described.
Audio-based multimedia indexing and retrieval scheme in MUVIS framework
- International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS), Awaji Island, 2003
Cited by 2 (1 self)
MUVIS is a PC-based framework that supports indexing, browsing and querying of various multimedia types such as audio, video and audio/video interlaced in several formats. It allows real-time audio and video capturing and encoding by last-generation codecs such as MPEG-4, H.263+, MP3 and AAC. MUVIS also supports several audio/video file formats such as AVI, MP4, MP3 and AAC. Almost all image types in a PC environment, including JPEG-2000, can be rendered, indexed and converted within the MUVIS framework. Along with such wide multimedia coverage, MUVIS has been developed to achieve a global and unified solution to the content-based indexing and retrieval problem and to provide user-friendly applications and a generic framework, especially for third parties to develop their own feature extraction modules. In this paper, we present an overview of the MUVIS system, focusing especially on the overall audio-based multimedia indexing and retrieval scheme within the MUVIS framework.
A fuzzy approach towards perceptual classification and segmentation of MP3/AAC audio
- International Symposium on Control, Communications and Signal Processing, 2004
Cited by 1 (1 self)
The paper presents a novel perceptual-based fuzzy approach towards classification and segmentation for MP3 and AAC audio in the compressed domain. The input audio is split into segments, which are classified as speech, music, fuzzy or silent. The proposed method minimizes critical misclassification errors by fuzzy region modeling, thus increasing the efficiency of both pure and fuzzy classification. The experimental results show that the critical errors are minimized and that the method is robust to the capturing and encoding parameters of MP3 and AAC bit streams. Owing to the efficiency obtained from fuzzy-region modeling and the improved accuracy of the rule-based semantic approach, the method is designed specifically for audio-based multimedia indexing and retrieval systems.
Optimal Short-Time Features for Music/Speech Classification of Compressed Audio Data
Cited by 1 (1 self)
This paper deals with the music/speech classification problem, starting from a set of features extracted directly from compressed audio data. The proposed classification system is able to label audio sequences stored as compressed MPEG layer III files. Decoding and analyzing in a single stage is a fundamental tool for audio streaming applications, such as real-time classification. Moreover, the techniques described herein provide useful tools for the management (data tagging, summarization, etc.) of a digital music library. The adopted set of short-time features is computed from the spectral information available in the decoding stage. In this paper, we show that for the classification problem at hand this set of features is redundant and can be dramatically pruned. To this aim we used an optimization strategy based on principal component analysis and genetic algorithms. The results show a very interesting classification accuracy using just one short-time feature. [...] of Fig. 1: a first block performs the feature extraction, then these features feed a classifier whose output is the membership class of the input. An optional optimization block can be used to improve the performance of the system. Some systems use only a few features computed in the time and/or frequency domain. Other approaches use more complicated features, several of which are motivated by perceptual properties of audio. Regarding the classification model, they apply advanced techniques, including Gaussian mixture models (GMM) and support vector machines (SVM).
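The two-block structure mentioned in this entry (a feature extraction block feeding a classifier) can be sketched in Python with a single short-time feature. This is an illustration under stated assumptions, not the paper's system: it works on raw samples rather than on MPEG layer III spectral data, the feature used here (the fraction of frames whose energy falls below half the mean frame energy) is a stand-in that the abstract does not name, and the function names, frame length and threshold are all hypothetical.

```python
def frame_energies(samples, frame_len):
    """Mean energy of each non-overlapping frame of `frame_len` samples."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def low_energy_ratio(samples, frame_len=4):
    """Feature extraction block: fraction of frames with energy below
    half of the mean frame energy (an assumed stand-in feature)."""
    energies = frame_energies(samples, frame_len)
    if not energies:
        return 0.0
    mean_energy = sum(energies) / len(energies)
    quiet = sum(1 for e in energies if e < 0.5 * mean_energy)
    return quiet / len(energies)


def classify(samples, frame_len=4, threshold=0.3):
    """Classifier block: a one-feature threshold rule.
    The threshold is an illustrative assumption, not a tuned value."""
    ratio = low_energy_ratio(samples, frame_len)
    return "speech" if ratio > threshold else "music"


if __name__ == "__main__":
    steady = [1.0, -1.0] * 8                          # constant energy
    bursty = ([1.0, -1.0, 1.0, -1.0] + [0.0] * 4) * 2  # bursts and pauses
    print(classify(steady), classify(bursty))
```

The design mirrors the block diagram only in shape: an optimization step such as the feature pruning described in the abstract would sit between the two functions and decide which features reach the classifier.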
A General Audio Classifier based on human perception motivated model
Cited by 1 (0 self)
The audio channel conveys rich clues for content-based multimedia indexing. Interesting audio analysis tasks include, besides the widely known speech recognition and speaker identification problems, speech/music segmentation, speaker gender detection, special effect recognition such as gun shots or car pursuits, and so on. All these problems can be considered audio classification problems, which need to generate a label from low-level audio signal analysis. While most audio analysis techniques in the literature are problem-specific, we propose in this paper a general framework for audio classification. The proposed technique uses a perceptually motivated model of the human perception of audio classes, in the sense that it makes judicious use of certain psychophysical results, and relies on a neural network for classification. To assess the effectiveness of the proposed approach, large experiments on several audio classification problems have been carried out, including speech/music discrimination in radio/TV programs, gender recognition on a subset of the Switchboard database, highlights detection in sports videos, and musical genre recognition. The classification accuracies of the proposed technique are comparable to those obtained by problem-specific techniques, while offering the basis of a general approach for audio classification.
Genre Classification of Compressed Audio Data
This paper deals with the musical genre classification problem, starting from a set of features extracted directly from MPEG-1 layer III compressed audio data. The automatic classification of compressed audio signals into a short hierarchy of musical genres is explored. More specifically, three feature sets for representing timbre, rhythmic content and energy content are proposed for a four-leaf tree genre hierarchy. The adopted set of features is computed from the spectral information available in the MPEG decoding stage. The performance and relative importance of the proposed approach are investigated by training a classification model using the audio collections proposed in musical genre contests. We also used an optimization strategy based on genetic algorithms. The results are comparable to those obtained by PCM-based musical genre classification systems.

