Aggregate features and AdaBoost for music classification
- Machine Learning, 2006
Cited by 34 (11 self)
Abstract. We present an algorithm that predicts musical genre and artist from an audio waveform. Our method uses the ensemble learner AdaBoost to select from a set of audio features that have been extracted from segmented audio and then aggregated. Our classifier proved to be the most effective method for genre classification at the recent MIREX 2005 international contests in music information extraction, and the second-best method for recognizing artists. This paper describes our method in detail, from feature extraction to song classification, and presents an evaluation of our method on three genre databases and two artist-recognition databases. Furthermore, we present evidence collected from a variety of popular features and classifiers that the technique of classifying features aggregated over segments of audio is better than classifying either entire songs or individual short-timescale features.
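The aggregation step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say which statistics are used, so the per-segment mean and variance below are assumptions, as are the function name `aggregate_segments` and the parameter `segment_len`.

```python
from statistics import fmean, pvariance


def aggregate_segments(frames, segment_len):
    """Collapse frame-level feature vectors into per-segment summaries.

    frames: list of equal-length feature vectors, one per short audio frame.
    segment_len: number of consecutive frames per segment (hypothetical parameter).
    Returns one vector per segment: all means followed by all variances.
    Mean and variance are assumed statistics, not taken from the abstract.
    """
    if segment_len <= 0:
        raise ValueError("segment_len must be positive")
    segments = []
    # Trailing frames that do not fill a whole segment are dropped.
    for start in range(0, len(frames) - segment_len + 1, segment_len):
        block = frames[start:start + segment_len]
        columns = list(zip(*block))  # transpose: one tuple per feature dimension
        means = [fmean(col) for col in columns]
        variances = [pvariance(col) for col in columns]
        segments.append(means + variances)
    return segments


# Four frames with two features each, grouped into segments of two frames.
frames = [[1.0, 10.0], [3.0, 10.0], [5.0, 20.0], [7.0, 40.0]]
segment_features = aggregate_segments(frames, segment_len=2)
```

Each resulting segment vector could then be passed to any classifier; the abstract names AdaBoost, which is not implemented in this sketch.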
A supervised classification algorithm for note onset detection
- EURASIP Journal on Applied Signal Processing, 2007
Cited by 9 (1 self)
This paper presents a novel approach to detecting onsets in music audio files. We use a supervised learning algorithm to classify spectrogram frames extracted from digital audio as being onsets or non-onsets. Frames classified as onsets are then treated with a simple peak-picking algorithm based on a moving average. In this paper we present two versions of this approach. The first version uses a single neural network classifier. The second version combines the predictions of several networks trained using different hyperparameters. In the paper we describe the details of the algorithm and summarize the performance of both variants on several datasets. We also examine our choice of hyperparameters by describing results of cross-validation experiments done on a custom dataset. We conclude that a supervised learning approach to note onset detection performs well and warrants further investigation.
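A peak-picking step based on a moving average, as described in this abstract, might look roughly like the following sketch. The abstract gives no details, so the thresholding rule, the function name `pick_peaks`, and the `window` and `offset` values are all assumptions.

```python
def pick_peaks(activation, window=2, offset=0.1):
    """Return frame indices that are local maxima above a moving-average threshold.

    activation: per-frame onset scores, e.g. classifier outputs in [0, 1].
    window: frames on each side included in the moving average (assumed value).
    offset: margin by which a peak must exceed the local average (assumed value).
    """
    peaks = []
    n = len(activation)
    for i in range(1, n - 1):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        local_mean = sum(activation[lo:hi]) / (hi - lo)
        is_local_max = (activation[i] > activation[i - 1]
                        and activation[i] >= activation[i + 1])
        if is_local_max and activation[i] > local_mean + offset:
            peaks.append(i)
    return peaks


# Hypothetical per-frame onset scores with two clear peaks.
scores = [0.0, 0.1, 0.9, 0.2, 0.1, 0.1, 0.8, 0.3, 0.0]
onset_frames = pick_peaks(scores)
```

Comparing each candidate against a local average rather than a fixed threshold lets the detector adapt to passages where the classifier output is uniformly high or low.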
A Supervised Approach for Detecting Boundaries in Music Using Difference Features and Boosting
- In Proceedings of the 5th International Conference on Music Information Retrieval (ISMIR), 2007
Cited by 8 (2 self)
A musical boundary is a transition between two musical segments such as a verse and a chorus. Our goal is to automatically detect musical boundaries using temporally local audio features. We develop a set of difference features that indicate when there are changes in perceptual aspects (e.g., timbre, harmony, melody, rhythm) of the music. We show that many individual difference features are useful for detecting boundaries. By combining these features and formulating the problem as a supervised learning problem, we can further improve performance. This is an alternative to previous work on music segmentation which has focused on unsupervised approaches based on notions of self-similarity computed over an entire song. We evaluate performance using a publicly available data set of 100 copyright-cleared pop/rock songs, each of which has been segmented by a human expert.
Meta-features and AdaBoost for music classification
- Machine Learning Journal: Special Issue on Machine Learning in Music, 2006
Cited by 3 (0 self)
Abstract. One of the biggest challenges facing current methods for classifying music by genre or artist is that features of the sound are computed on very small temporal scales (20 to 50 milliseconds), while the labels need to be assigned at relatively large temporal scales (3 to 5 minutes). We address this challenge by partitioning songs into smaller pieces and classifying each one separately. Our choice of features together with an AdaBoost.MH classifier proved to be the most effective method for genre classification at the recent MIREX 2005 international contests in music information extraction, and the second-best method for recognizing artists. This paper describes our method in detail, from feature extraction to song classification, and presents an evaluation of our method on three genre databases and two artist-recognition databases. Furthermore, we present evidence that the method of partitioning songs is better than classifying either entire songs or individual features, using a variety of popular features and classifiers.
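One way to turn per-piece predictions into a single song-level label is a majority vote, sketched below. The abstract does not state how the pieces are combined, so the voting rule and the function name `song_label_from_segments` are assumptions made for illustration.

```python
from collections import Counter


def song_label_from_segments(segment_labels):
    """Majority vote over per-segment predictions.

    Ties go to the label that appears first in the input.
    This combination rule is an assumption, not the authors' stated method.
    """
    if not segment_labels:
        raise ValueError("need at least one segment prediction")
    return Counter(segment_labels).most_common(1)[0][0]


# Hypothetical predictions for five pieces of one song.
predicted = song_label_from_segments(["rock", "jazz", "rock", "rock", "blues"])
```

A weighted variant that sums classifier confidences per label instead of counting votes would follow the same structure.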
Temporal feature integration for music organisation
- Ph.D. dissertation, Informatics and Mathematical Modelling, University of Denmark
Cited by 3 (0 self)
This Ph.D. thesis focuses on temporal feature integration for music organisation. Temporal feature integration is the process of combining all the feature vectors of a given time-frame into a single new feature vector in order to capture relevant information in the frame. Several existing methods for handling sequences of features are formulated in the temporal feature integration framework. Two datasets for music genre classification have been considered as valid test-beds for music organisation. Human evaluations of these have been obtained to assess the subjectivity of the datasets. Temporal feature integration has been used for ranking various short-time features at different time-scales. These include short-time features such as the Mel frequency cepstral coefficients (MFCC), linear predictive coding coefficients (LPC) and various MPEG-7 short-time features. The ‘consensus sensitivity ranking’ approach is proposed for ranking the short-time features at larger time-scales according to their discriminative power in a music genre classification task.
Improving music genre classification using collaborative tagging data
- Nejdl, WSDM, 2009
Cited by 2 (0 self)
As a fundamental and critical component of music information retrieval (MIR) systems, music genre classification has attracted considerable research attention. Automatically classifying music by genre is, however, a challenging problem due to the fact that music is an evolving art. While most of the existing work categorizes music using features extracted from music audio signals, in this paper, we propose to exploit the semantic information embedded in tags supplied by users of social networking websites. In particular, we consider the tag information by creating a graph of tracks so that tracks are neighbors if they are similar in terms of their associated tags. Two classification methods based on the track graph are developed. The first one employs a classification scheme which simultaneously considers the audio content and neighborhood of tracks. In contrast, the second one is a two-level classifier which initializes genre labels for unknown tracks using their audio content, and then iteratively updates the genres considering the influence from their neighbors. A set of optimizing strategies is designed to further enhance the quality of the two-level classifier. Extensive experiments are conducted on real-world data collected from Last.fm. Promising experimental results demonstrate the benefit of using tags for accurate music genre classification.
MIREX AUDIO GENRE CLASSIFICATION
This extended abstract details a submission to the Music Information Retrieval Evaluation eXchange in the Audio Genre classification task. This submission is very similar to the system that placed second in the 2004 ISMIR Audio description contest. A novel feature set and segmentation of features is introduced and modifications to the Decision Tree based model used in the 2004 submission are detailed. Finally, the results achieved in the evaluation are analysed.
Keywords: MIREX, Audio, Genre.
1 FEATURE SET
Two feature sets are calculated in this submission, one describing the timbre of the audio and another describing the rhythmic content. 22 kHz audio is taken as input,
Speech and Music Classification and Separation: A Review
- 2006
Abstract. The classification and separation of speech and music signals have attracted the attention of many researchers. The classification process is needed to build two different libraries, a speech library and a music library, from a stream of sounds. The separation process, however, is needed in a cocktail-party problem to separate speech from music and remove the undesired one. In this paper, a review of the existing classification and separation algorithms is presented and discussed. The classification algorithms are divided into three categories: time-domain, frequency-domain, and time-frequency domain approaches. The time-domain approaches used in the literature are: the zero-crossing rate (ZCR), the short-time energy (STE), the ZCR and the STE with positive derivative, with some of their modified versions, the variance of the roll-off, and the neural networks. The frequency-domain approaches are mainly based on: spectral centroid, variance of the spectral centroid, spectral flux, variance of the spectral flux, roll-off of the spectrum, cepstral residual, and the delta pitch. The time-frequency domain approaches have not yet been tested thoroughly in the literature, so the spectrogram and the evolutionary spectrum will be introduced. Also, some new algorithms dealing with music and speech separation and segregation processes will be presented.
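The two basic time-domain features named in this abstract, the zero-crossing rate (ZCR) and the short-time energy (STE), can be computed per frame as in the sketch below. These are common textbook definitions; the normalisation choices and the function names are assumptions and may differ from the variants covered in the review.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (zero counts as positive)."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def short_time_energy(frame):
    """Mean squared amplitude of the frame."""
    if not frame:
        return 0.0
    return sum(x * x for x in frame) / len(frame)


# Two hypothetical five-sample frames: one rapidly alternating, one slowly varying.
noisy = [1.0, -1.0, 1.0, -1.0, 1.0]
smooth = [0.5, 0.5, 0.5, -0.5, -0.5]
noisy_features = (zero_crossing_rate(noisy), short_time_energy(noisy))
smooth_features = (zero_crossing_rate(smooth), short_time_energy(smooth))
```

In a speech/music discriminator these values would typically be tracked over many frames, since it is their variation over time rather than a single frame value that separates the two classes.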
fulfilment of the requirements for the degree of
Audio content processing for automatic music genre classification: descriptors,

