Improving timbre similarity: How high is the sky
- Results in Speech and Audio Sciences
Cited by 181 (14 self)
Abstract. We report on experiments done in an attempt to improve the performance of a music similarity measure which we introduced earlier. The technique aims at comparing music titles on the basis of their global “timbre”, which has many applications in the field of Music Information Retrieval. Such measures of timbre similarity have seen a growing interest lately, and every contribution (including ours) is yet another instantiation of the same basic pattern recognition architecture, only with different algorithm variants and parameters. Most give encouraging results with a little effort, and imply that near-perfect results would follow just by fine-tuning the algorithms' parameters. However, such systematic testing over large, interdependent parameter spaces is both difficult and costly, as it requires working on a whole general meta-database architecture. This paper contributes in two ways to the current state of the art. We report on extensive tests over very many parameters and algorithmic variants, either already envisioned in the literature or not. This leads to an improvement over existing algorithms of about 15% R-precision. But most importantly, we describe many variants that surprisingly do not lead to any substantial improvement. Moreover, our simulations suggest the existence of a “glass ceiling” at R-precision about 65%, which probably cannot be overcome by pursuing such variations on the same theme.
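The abstract above reports its results as R-precision. As a minimal sketch of the standard definition (precision over the top R ranked items, where R is the number of items relevant to the query), assuming that "relevant" means whatever ground truth the evaluation uses; the function name and the example identifiers are hypothetical and not taken from the paper:

```python
def r_precision(ranked_ids, relevant_ids):
    """Precision within the top R results, where R = number of relevant items.

    ranked_ids: result identifiers, best match first (hypothetical input).
    relevant_ids: identifiers considered relevant to the query.
    """
    relevant = set(relevant_ids)
    r = len(relevant)
    if r == 0:
        return 0.0  # convention chosen here: no relevant items gives 0
    top_r = ranked_ids[:r]
    hits = sum(1 for item in top_r if item in relevant)
    return hits / r


# Three relevant titles, two of which appear in the top three results.
score = r_precision(["a", "b", "c", "d"], {"a", "c", "x"})
```

In a similarity evaluation, this score would typically be averaged over all queries in the test collection.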
Aggregate features and AdaBoost for music classification
- Machine Learning, 2006
Cited by 84 (16 self)
Abstract. We present an algorithm that predicts musical genre and artist from an audio waveform. Our method uses the ensemble learner AdaBoost to select from a set of audio features that have been extracted from segmented audio and then aggregated. Our classifier proved to be the most effective method for genre classification at the recent MIREX 2005 international contests in music information extraction, and the second-best method for recognizing artists. This paper describes our method in detail, from feature extraction to song classification, and presents an evaluation of our method on three genre databases and two artist-recognition databases. Furthermore, we present evidence collected from a variety of popular features and classifiers that the technique of classifying features aggregated over segments of audio is better than classifying either entire songs or individual short-timescale features.
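The idea of aggregating short-timescale features over segments can be illustrated with a small sketch. The abstract does not say which statistics are used, so the per-segment mean and variance below are an assumption, as are the function name, the fixed non-overlapping segment length, and the choice to drop a trailing partial segment:

```python
from statistics import fmean, pvariance


def aggregate_segments(frames, segment_len):
    """Summarize frame-level feature vectors over fixed-length segments.

    frames: list of equal-length feature vectors, one per short frame.
    segment_len: number of consecutive frames per segment (assumed fixed).
    Returns one vector per segment: per-feature means, then variances.
    """
    if segment_len <= 0:
        raise ValueError("segment_len must be positive")
    segments = []
    # Step through non-overlapping segments; a trailing partial one is dropped.
    for start in range(0, len(frames) - segment_len + 1, segment_len):
        chunk = frames[start:start + segment_len]
        columns = list(zip(*chunk))  # one tuple of values per feature
        means = [fmean(col) for col in columns]
        variances = [pvariance(col) for col in columns]
        segments.append(means + variances)
    return segments


frames = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0], [9.0, 10.0]]
segment_features = aggregate_segments(frames, segment_len=2)
```

Each aggregated vector would then be passed to a classifier in place of the individual frames; the sketch shows only the aggregation step, not the AdaBoost stage.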
Evaluation of Feature Extractors and Psycho-Acoustic Transformations for Music Genre Classification
Cited by 77 (20 self)
We present a study on the importance of psycho-acoustic transformations for effective audio feature calculation. From the results, both crucial and problematic parts of the algorithm for Rhythm Patterns feature extraction are identified. We furthermore introduce two new feature representations in this context: Statistical Spectrum Descriptors and Rhythm Histogram features. Evaluation on both the individual and combined feature sets is accomplished through a music genre classification task, involving 3 reference audio collections. Results are compared to published measures on the same data sets. Experiments confirmed that in all settings the inclusion of psycho-acoustic transformations provides significant improvement of classification accuracy.
Automatic genre classification using large high-level musical feature sets
- In Int. Conf. on Music Information Retrieval, ISMIR 2004, 2004
Cited by 69 (3 self)
This paper presents a system that extracts 109 musical features from symbolic recordings (MIDI, in this case) and uses them to classify the recordings by genre. The features used here are based on instrumentation, texture, rhythm, dynamics, pitch statistics, melody and chords. The classification is performed hierarchically using different sets of features at different levels of the hierarchy. Which features are used at each level, and their relative weightings, are determined using genetic algorithms. Classification is performed using a novel ensemble of feedforward neural networks and k-nearest neighbour classifiers. Arguments are presented emphasizing the importance of using high-level musical features, something that has been largely neglected in automatic classification systems to date in favour of low-level features. The effect on classification performance of varying the number of candidate features is examined in order to empirically demonstrate the importance of using a large variety of musically meaningful features. Two differently sized hierarchies are used in order to test the performance of the system under different conditions. Very encouraging classification success rates of 98% for root genres and 90% for leaf genres are obtained for a hierarchical taxonomy consisting of 9 leaf genres.
Social tagging and music information retrieval
- Journal of New Music Research
Cited by 68 (1 self)
Social tags are free text labels that are applied to items such as artists, albums and songs. Captured in these tags is a great deal of information that is highly relevant to Music Information Retrieval (MIR) researchers, including information about genre, mood, instrumentation, and quality. Unfortunately, there is also a great deal of irrelevant information and noise in the tags. Imperfect as they may be, social tags are a source of human-generated contextual knowledge about music that may become an essential part of the solution to many MIR problems. In this article, we describe the state of the art in commercial and research social tagging systems for music. We describe how tags are collected and used in current systems. We explore some of the issues that are encountered when using tags, and we suggest possible areas of exploration for future research.
Instrument recognition in polyphonic music based on automatic taxonomies
- IEEE Transactions on Speech and Audio Processing, 2006
Cited by 63 (9 self)
We propose a new approach to instrument recognition in the context of real music orchestrations ranging from solos to quartets. The strength of our approach is that it does not require prior musical source separation. Thanks to a hierarchical clustering algorithm exploiting robust probabilistic distances, we obtain a taxonomy of musical ensembles which is used to efficiently classify possible combinations of instruments played simultaneously. Moreover, a wide set of acoustic features is studied including some new proposals. In particular, Signal to Mask Ratios are found to be useful features for audio classification. This study focuses on a single music genre (i.e. jazz) but combines a variety of instruments among which are percussion and singing voice. Using a varied database of sound excerpts from commercial recordings, we show that the segmentation of music with respect to the instruments played can be achieved with an average accuracy of 53%.
Evaluating Rhythmic Descriptors for Musical Genre Classification
- 2004
Cited by 45 (11 self)
Organising or browsing music collections in a musically meaningful way calls for tagging the data in terms of e.g. rhythmic, melodic or harmonic aspects, among others. In some cases, such metadata can be extracted automatically from musical files; in others, a trained listener must extract it by hand. In this article, we consider a specific set of rhythmic descriptors for which we provide procedures of automatic extraction from audio signals. Evaluating the relevance of such descriptors is a difficult task that can easily become highly subjective. To avoid this pitfall, we assessed the relevance of these descriptors by measuring their rate of success in genre classification experiments. We conclude on the particular relevance of the tempo and a set of 15 MFCC-like descriptors.
Evaluation Methods for Musical Audio Beat Tracking Algorithms
- 2009
Cited by 44 (13 self)
A fundamental research topic in music information retrieval is the automatic extraction of beat locations from music signals. In this paper we address the under-explored topic of beat tracking evaluation. We present a review of existing evaluation models and, given their strengths and weaknesses, we propose a new method based on a novel visualisation for beat tracking performance, the beat error histogram. To investigate the properties of evaluation methods we undertake a large scale beat tracking experiment. We conduct experiments using a new annotated test database which we make available to the research community. We demonstrate that the choice of evaluation method can have a significant impact on the relative performance of different beat tracking algorithms. On this basis we make a set of recommendations for ...
With the development of the Music Information Retrieval Evaluation eXchange (MIREX) [1], evaluation has become a fundamental aspect of research in music information retrieval. Robust evaluation is crucial, not only to determine the individual successes and failures of a given algorithm, but also to measure the relative performance among different algorithms.
Using cultural metadata for artist recommendation
- In Proc. WedelMusic Conf., 2003
Cited by 40 (6 self)
Our approach to generating recommendations for similar artists follows a recent tradition of authors who tackle the problem without content-based audio analysis. Following this novel procedure, we rely on the acquisition, filtering and condensing of unstructured text-based information that can be found on the web. The beauty of this approach lies in the possibility of accessing so-called cultural metadata, which is the agglomeration of several independent (originally subjective) perspectives about music.
A Hierarchical Approach To Automatic Musical Genre Classification
- In Proc. of the 6th Int. Conf. on Digital Audio Effects (DAFx), 2003
Cited by 39 (1 self)
A system for the automatic classification of audio signals according to audio category is presented. The signals are recognized as speech, background noise and one of 13 musical genres. A large number of audio features are evaluated for their suitability in such a classification task, including well-known physical and perceptual features, audio descriptors defined in the MPEG-7 standard, as well as new features proposed in this work. These are selected with regard to their ability to distinguish between a given set of audio types and to their robustness to noise and bandwidth changes. In contrast to previous systems, the feature selection and the classification process itself are carried out in a hierarchical way. This is motivated by the numerous advantages of such a tree-like structure, which include easy expansion capabilities, flexibility in the design of genre-dependent features and the ability to reduce the probability of costly errors. The resulting application is evaluated with respect to classification accuracy and computational costs.