Automatic Musical Genre Classification Of Audio Signals
- IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING
, 2002
Cited by 422 (22 self)
... describe music. They are commonly used to structure the increasing amounts of music available in digital form on the Web and are important for music information retrieval. Genre categorization for audio has traditionally been performed manually. A particular musical genre is characterized by statistical properties related to the instrumentation, rhythmic structure and form of its members. In this work, algorithms for the automatic genre categorization of audio signals are described. More specifically, we propose a set of features for representing texture and instrumentation. In addition, a novel set of features for representing rhythmic structure and strength is proposed. The performance of those feature sets has been evaluated by training statistical pattern recognition classifiers using real-world audio collections. Based on the automatic hierarchical genre classification, two graphical user interfaces for browsing and interacting with large audio collections have been developed.
MARSYAS: A framework for audio analysis
, 2000
Cited by 89 (16 self)
Existing audio tools handle the increasing amount of computer audio data inadequately. The typical tape-recorder paradigm for audio interfaces is inflexible and time consuming, especially for large data sets. On the other hand, completely automatic audio analysis and annotation is impossible using current techniques.
The way it sounds : Timbre models for analysis and retrieval of polyphonic music signals
- IEEE Transactions on Multimedia
, 2005
Cited by 28 (5 self)
Abstract—Electronic Music Distribution is in need of robust and automatically extracted music descriptors. An important attribute of a piece of polyphonic music is what is commonly referred to as “the way it sounds”. While there has been a large quantity of research done to model the timbre of individual instruments, little work has been done to analyze “real world” timbre mixtures such as the ones found in popular music. In this paper, we present our research about such “polyphonic timbres”. We describe an effective way to model the textures found in a given music signal, and show that such timbre models provide new solutions to many issues traditionally encountered in music signal processing and music information retrieval. Notably, we describe their applications for music similarity, segmentation and pattern induction. Index Terms—Feature extraction, information retrieval, multimedia database, music, pattern recognition.
Audio Information Retrieval (AIR) Tools
- IN PROC. INT. SYMPOSIUM ON MUSIC INFORMATION RETRIEVAL (ISMIR)
, 2000
Cited by 24 (4 self)
The majority of work in music information retrieval (IR) has been focused on symbolic representations of music. However, most of the digitally available music is in the form of raw audio signals. Although various attempts at monophonic and polyphonic transcription have been made, none has been successful and general enough to work with real world signals. In this paper
Sound analysis using mpeg compressed audio
, 2000
Cited by 23 (4 self)
There is a huge amount of audio data available that is compressed using the MPEG audio compression standard. Sound analysis is based on the computation of short-time feature vectors that describe the instantaneous spectral content of the sound. An interesting possibility is the calculation of features directly from compressed data. Since the bulk of the feature calculation is performed during the encoding stage, this process has a significant performance advantage if the available data is compressed. Combining decoding and analysis in one stage is also very important for audio streaming applications. In this paper, we describe the calculation of features directly from MPEG audio compressed data. Two of the basic processes of analyzing sound are segmentation and classification. To illustrate the effectiveness of the calculated features, we have implemented two case studies: a general audio segmentation algorithm and a Music/Speech classifier. Experimental data is provided to show that the results obtained are comparable with sound analysis algorithms working directly with audio samples.
Audio Analysis using the Discrete Wavelet Transform
- in Proc. Conf. in Acoustics and Music Theory Applications. WSES
, 2001
Cited by 18 (4 self)
Abstract:- The Discrete Wavelet Transform (DWT) is a transformation that can be used to analyze the temporal and spectral properties of non-stationary signals like audio. In this paper we describe some applications of the DWT to the problem of extracting information from non-speech audio. More specifically, automatic classification of various types of audio using the DWT is described and compared with other traditional feature extractors proposed in the literature. In addition, a technique for detecting the beat attributes of music is presented. Both synthetic and real-world stimuli were used to evaluate the performance of the beat detection algorithm. Key-Words:- audio analysis, wavelets, classification, beat extraction
TotalRecall: Visualization and Semi-Automatic Annotation of Very Large Audio-Visual Corpora
Cited by 11 (2 self)
We introduce a system for visualizing, annotating, and analyzing very large collections of longitudinal audio and video recordings. The system, TotalRecall, is designed to address the requirements of projects like the Human Speechome Project [18], for which more than 100,000 hours of multitrack audio and video have been collected over a twenty-two-month period. Our goal in this project is to transcribe speech in over 10,000 hours of audio recordings, and to annotate the position and head orientation of multiple people in the 10,000 hours of corresponding video. Higher-level behavioral analysis of the corpus will be based on these and other annotations. To efficiently cope with this huge corpus, we are developing semi-automatic data coding methods that are integrated into TotalRecall. Ultimately, this system and the underlying methodology may enable new forms of multimodal behavioral analysis grounded in ultradense longitudinal data.
Toward Automated Holistic Beat Tracking, Music Analysis And Understanding
, 2005
Cited by 10 (0 self)
Most music processing attempts to focus on one particular feature or structural element such as pitch, beat location, tempo, or genre. This hierarchical approach, in which music is separated into elements that are analyzed independently, is convenient for the scientific researcher, but is at odds with intuition about music perception. Music is
MIR IN MATLAB (II): A TOOLBOX FOR MUSICAL FEATURE EXTRACTION FROM AUDIO
Cited by 8 (2 self)
We present the MIRtoolbox, an integrated set of functions written in Matlab, dedicated to the extraction of musical features from audio files. The design is based on a modular framework: the different algorithms are decomposed into stages, formalized using a minimal set of elementary mechanisms, and integrating different variants proposed by alternative approaches (including new strategies we have developed) that users can select and parametrize. This paper offers an overview of the set of features, related, among others, to timbre, tonality, rhythm or form, that can be extracted with the MIRtoolbox. One particular analysis is provided as an example. The toolbox also includes functions for statistical analysis, segmentation and clustering. Particular attention has been paid to the design of a syntax that offers both simplicity of use and transparent adaptiveness to a multiplicity of possible input types. Each feature extraction method can accept as argument an audio file, or any preliminary result from intermediary stages of the chain of operations. Also, the same syntax can be used for analyses of single audio files, batches of files, series of audio segments, multi-channel signals, etc. For that purpose, the data and methods of the toolbox are organised in an object-oriented architecture.
Feature extraction and database design for music software
- Proceedings of the International Computer Music Conference
, 2004
Cited by 5 (1 self)
Persistent storage and access of sound/music meta-data is an increasingly relevant topic to the developers of multimedia software. This paper focuses on the design of music signal analysis tools and database formats for modern applications. It is partly tutorial in nature, and partly a discussion of design issues. We begin with a high-level overview of the dimensions of music database (MDB) software, and then walk through the common feature extraction techniques. A requirements analysis of several application categories will allow us to carefully determine which features might be most useful for them. This leads us to suggest concrete architectural and design criteria, and to close by introducing several of our recent implemented systems. The authors believe that much current MDB software suffers due to ad-hoc design of analysis systems and feature vectors, which often incorporate only low-level features and are not tuned for the application at hand. Our goal is to advance the state of the art of music meta-data extraction and database design by fostering a better engineering practice in the construction of high-level feature vectors and analysis engines for music software.

