Aggregate features and AdaBoost for music classification
- Machine Learning, 2006
Cited by 34 (11 self)
Abstract. We present an algorithm that predicts musical genre and artist from an audio waveform. Our method uses the ensemble learner AdaBoost to select from a set of audio features that have been extracted from segmented audio and then aggregated. Our classifier proved to be the most effective method for genre classification at the recent MIREX 2005 international contests in music information extraction, and the second-best method for recognizing artists. This paper describes our method in detail, from feature extraction to song classification, and presents an evaluation of our method on three genre databases and two artist-recognition databases. Furthermore, we present evidence collected from a variety of popular features and classifiers that the technique of classifying features aggregated over segments of audio is better than classifying either entire songs or individual short-timescale features.
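The idea of classifying features aggregated over segments can be illustrated with a minimal sketch. It is not the authors' implementation: the choice of mean and variance as aggregation statistics, the majority vote across segments, and the names `aggregate_segments`, `classify_song` and `toy_classifier` are all assumptions made for illustration, and the toy rule stands in for a trained segment-level classifier such as AdaBoost.

```python
from collections import Counter
from statistics import fmean, pvariance


def aggregate_segments(frames, frames_per_segment):
    """Summarise each run of frames by per-dimension mean and variance.

    frames: list of equal-length feature vectors, one per short audio frame.
    Returns one vector per full segment: [means..., variances...].
    Mean/variance aggregation is an assumption of this sketch.
    """
    segments = []
    for start in range(0, len(frames) - frames_per_segment + 1, frames_per_segment):
        chunk = frames[start:start + frames_per_segment]
        columns = list(zip(*chunk))  # transpose: one tuple per feature dimension
        means = [fmean(col) for col in columns]
        variances = [pvariance(col) for col in columns]
        segments.append(means + variances)
    return segments


def classify_song(segment_features, segment_classifier):
    """Label a song by majority vote over per-segment predictions (assumed rule)."""
    votes = Counter(segment_classifier(seg) for seg in segment_features)
    return votes.most_common(1)[0][0]


# Toy usage: 6 frames of 2-D features, 3 frames per segment.
frames = [[1.0, 0.0], [2.0, 0.0], [3.0, 0.0],
          [10.0, 1.0], [10.0, 1.0], [10.0, 1.0]]
segments = aggregate_segments(frames, frames_per_segment=3)


def toy_classifier(seg):
    """Hypothetical stand-in for a trained segment-level classifier."""
    return "rock" if seg[0] > 5.0 else "jazz"


label = classify_song(segments + [segments[1]], toy_classifier)
```

The point of the sketch is the granularity: the classifier sees one vector per segment, neither a single vector per song nor one per frame.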
Content-Based Music Information Retrieval: Current Directions and Future Challenges
- 2008
IMPROVING GENRE CLASSIFICATION BY COMBINATION OF AUDIO AND SYMBOLIC DESCRIPTORS USING A TRANSCRIPTION SYSTEM
Cited by 15 (2 self)
Recent research in music genre classification hints at a glass ceiling being reached using timbral audio features. To overcome this, the combination of multiple different feature sets bearing diverse characteristics is needed. We propose a new approach to extend the scope of the features: we transcribe audio data into a symbolic form using a transcription system, extract symbolic descriptors from that representation, and combine them with audio features. With this method, we are able to surpass the glass ceiling and further improve music genre classification, as shown in experiments on three reference music databases and by comparison with previously published performance results.
Analysis of Minimum Distances in High-Dimensional Musical Spaces
Cited by 15 (3 self)
We propose an automatic method for measuring music similarity using audio features so we can enhance the current generation of taxonomy-based music search engines and recommender systems. Efficiency is important in an Internet-connected world, where users have access to millions of tracks. Brute-force algorithms for searching through this content are not practical. Many previous approaches to track similarity require pair-wise processing between all audio features in a database and therefore are generally not practical for large collections. Our features are time-ordered overlapping fixed-length subsequences of equal-temperament pitch-class profiles and log-frequency cepstral coefficients; the technique is analogous to the technique of shingling used for text retrieval. We use locality sensitive hashing (LSH) to implement approximate matching for our high-dimensional audio shingles. This approach retrieves near neighbors within a specified distance of the query rather than retrieving only the nearest neighbors; the degree of approximation, ɛ, is a parameter. LSH achieves sublinear query time performance with respect to the number of tracks in a collection but requires an accurate threshold on retrieval distance for efficient performance. In this paper, we present a new method for estimating the optimal search radius for LSH retrieval tasks by modeling the between-shingle distance distributions for non-similar audio shingles. We derive an estimator for a minimum distance for two shingles to be considered drawn from different tracks; shingles closer than this distance are, therefore, considered to be drawn from similar tracks. We evaluate our proposed methods on three contrasting music similarity tasks: retrieval of mis-attributed recordings (Apocrypha), retrieval of the same work performed by different artists (Opus) and retrieval of edited and sampled versions of a query track by remix artists (Remixes). Our results achieve near-perfect performance in the first two tasks and 80% precision at 70% recall in the third task.
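Audio shingling and radius-based retrieval, as described in this abstract, can be sketched roughly as follows. This is an illustration under stated assumptions: the function names `make_shingles` and `within_radius`, the Euclidean distance, and the toy one-dimensional features are hypothetical. An exhaustive scan is used purely for readability and stands in for the locality sensitive hashing index, which is what gives the paper its sublinear query time.

```python
import math


def make_shingles(frames, width, hop=1):
    """Concatenate `width` consecutive feature frames into one shingle vector.

    Consecutive shingles overlap whenever hop < width.
    """
    shingles = []
    for start in range(0, len(frames) - width + 1, hop):
        window = frames[start:start + width]
        shingles.append([value for frame in window for value in frame])
    return shingles


def within_radius(query, database, radius):
    """Return (track_id, distance) for every stored shingle within `radius`.

    Exhaustive scan for clarity only; it stands in for an LSH index.
    """
    hits = []
    for track_id, shingle in database:
        distance = math.dist(query, shingle)
        if distance <= radius:
            hits.append((track_id, distance))
    return sorted(hits, key=lambda hit: hit[1])


# Toy data: 1-D "features" per frame, shingles of 3 frames.
track_a = [[0.0], [1.0], [2.0], [3.0]]
track_b = [[9.0], [9.0], [9.0], [9.0]]
database = [
    (name, shingle)
    for name, track in (("a", track_a), ("b", track_b))
    for shingle in make_shingles(track, width=3)
]

query = make_shingles([[0.0], [1.0], [2.1]], width=3)[0]
hits = within_radius(query, database, radius=0.5)
```

The `radius` argument plays the role of the search radius whose estimation is the subject of the paper; here it is simply picked by hand.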
Audio information retrieval using semantic similarity
- In IEEE ICASSP, 2007
Cited by 13 (9 self)
We improve upon query-by-example for content-based audio information retrieval by ranking items in a database based on semantic similarity, rather than acoustic similarity, to a query example. The retrieval system is based on semantic concept models that are learned from a training data set containing both audio examples and their text captions. Using the concept models, the audio tracks are mapped into a semantic feature space, where each dimension indicates the strength of the semantic concept. Audio retrieval is then based on ranking the database tracks by their similarity to the query in the semantic space. We experiment with both semantic- and acoustic-based retrieval systems on a sound effects database and show that the semantic-based system improves retrieval both quantitatively and qualitatively. Index Terms: computer audition, audio retrieval, semantic similarity
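Ranking in a semantic space can be illustrated with a small sketch. It assumes each track has already been mapped to a vector of concept strengths; the concept names, file names, and the use of cosine similarity are assumptions for illustration, since the abstract does not state which similarity measure the system uses.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_by_semantics(query_vec, database):
    """Sort track ids by decreasing similarity to the query in semantic space."""
    scored = [(cosine(query_vec, vec), track_id) for track_id, vec in database.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [track_id for _, track_id in scored]


# Hypothetical concept order: ["dog", "rain", "engine"]; values are concept strengths.
database = {
    "bark.wav": [0.9, 0.05, 0.05],
    "storm.wav": [0.05, 0.9, 0.05],
    "truck.wav": [0.1, 0.1, 0.8],
}
query = [0.8, 0.1, 0.1]  # semantic vector of a query sound dominated by "dog"
ranking = rank_by_semantics(query, database)
```

An acoustic-similarity baseline would have the same shape, with the concept-strength vectors replaced by acoustic feature vectors.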
Speeding Up Music Similarity
- In Proceedings of the MIREX Annual Music Information Retrieval eXchange, 2005
Cited by 9 (2 self)
This paper describes (1) the submission to the ISMIR’04 genre classification contest and (2) the submission to the MIREX’05 (Music Information Retrieval eXchange) audio-based genre classification and artist identification tasks. The main difference between the submissions is the reduction of computation time by orders of magnitude. This paper concludes with a discussion of the relationship between genre classification and artist identification, the relationship between similarity and classification, and references to related MIREX’05 submissions.
1 IMPLEMENTATION OVERVIEW
Features are extracted from 22 kHz mono WAV input (2 minutes from the center of each piece are used for further analysis). For the 2004 submission these features are cluster models of MFCC spectra. The 2005 submission additionally uses fluctuation patterns and two descriptors derived from them: Gravity and Focus. For each piece in the test set the distance to all pieces in the training set is computed. A nearest neighbor classifier is used. There is no training other than storing the features of the training data. Each piece in the test set is assigned the genre label (or artist’s name) of the piece closest to it.
1.1 M2K Specific
The functions are implemented in Matlab 7 and submitted with an M2K wrapper. The 2004 submission requires the Netlab Toolbox and the Signal Processing Toolbox. The 2005 submission does not require any additional toolboxes. The same functions are used for the genre classification and artist identification tasks.
1.2 Computation Time
The CPU times given in Table 1 are measured on a 1.3 GHz Intel Centrino laptop. The 2004 submission does not fulfill the MIREX’05 time constraints (72 hours per task). For example, it takes 10 days to compute the (symmetric) distance matrix on a collection with 3000 pieces. The 2005 submission completes this in less than 4 hours.
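The nearest neighbor step described in the implementation overview can be sketched as follows. This is a minimal illustration, not the submitted Matlab code: the function name `nearest_neighbor_label`, the toy feature vectors, and the Euclidean distance are assumptions, whereas the submissions compare cluster models of MFCC spectra and fluctuation patterns.

```python
import math


def nearest_neighbor_label(test_features, training_set, distance=math.dist):
    """Return the label of the training piece closest to the test piece.

    training_set: list of (feature_vector, label) pairs. There is no training
    step beyond storing these pairs.
    """
    best_label, best_distance = None, math.inf
    for features, label in training_set:
        d = distance(test_features, features)
        if d < best_distance:
            best_label, best_distance = label, d
    return best_label


# Toy 2-D features; Euclidean distance is an assumption of this sketch.
training_set = [
    ([0.1, 0.2], "classical"),
    ([0.9, 0.8], "metal"),
    ([0.5, 0.1], "jazz"),
]
predicted = nearest_neighbor_label([0.8, 0.9], training_set)
```

Because every test piece is compared against every stored piece, the cost of one distance computation dominates the total run time. A symmetric distance matrix over 3000 pieces covers 3000 × 2999 / 2 = 4,498,500 distinct pairs.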
Audio-based music similarity and retrieval: Combining a spectral similarity model with information extracted from fluctuation patterns. Implementation submitted to the 3rd annual Music Information Retrieval Evaluation eXchange
- In Proceedings of the International Symposium on Music Information Retrieval, 2006
Cited by 9 (0 self)
This paper describes the implementation submitted by the author to the MIREX’06 (Music Information Retrieval eXchange) evaluation track on audio-based music similarity and retrieval. In addition, this paper summarizes the optimization of this implementation and its evaluation prior to submission. Finally, a detailed analysis and discussion of the MIREX results is presented. Overall, this implementation performed slightly better in terms of quality and computation time than the other implementations. However, the measured differences were not significant.
RHYME AND STYLE FEATURES FOR MUSICAL GENRE CLASSIFICATION BY SONG LYRICS
Cited by 9 (3 self)
How individuals perceive music is influenced by many different factors. The audible part of a piece of music, its sound, certainly contributes, but it is only one aspect to be taken into account. Cultural information, for example, also shapes how we experience music. Alongside symbolic and audio-based music information retrieval, which focus on the sound of music, song lyrics can be used to improve classification or similarity ranking of music. Song lyrics exhibit specific properties different from traditional text documents: many lyrics are, for example, composed in rhyming verses, and may have different frequencies for certain parts of speech when compared to other text documents. Further, lyrics may use ‘slang’ language or differ greatly in the length and complexity of the language used, which can be measured by statistical features such as word and verse length and the amount of repeating text. In this paper, we present a novel set of features developed from textual analysis of song lyrics, and combine them with and compare them to classical bag-of-words indexing approaches. We present results for musical genre classification on a test collection in order to demonstrate our analysis.
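The kind of style statistics mentioned in this abstract (word length, verse length, repeated text, rhyme) can be approximated with a short sketch. The feature names, the function names `style_features` and `rhyming_pairs`, and the spelling-based rhyme test are assumptions for illustration only and are not the paper's feature set; real rhyme detection would need phonetic information.

```python
from collections import Counter

PUNCTUATION = ".,!?;:'\""


def style_features(lyrics):
    """Compute simple style statistics for a lyric given as a multi-line string."""
    lines = [line.strip().lower() for line in lyrics.splitlines() if line.strip()]
    words = [word.strip(PUNCTUATION) for line in lines for word in line.split()]
    words = [word for word in words if word]
    if not lines or not words:
        return {"avg_word_length": 0.0, "avg_words_per_line": 0.0,
                "repeated_line_ratio": 0.0}
    line_counts = Counter(lines)
    repeated = sum(count for count in line_counts.values() if count > 1)
    return {
        "avg_word_length": sum(len(word) for word in words) / len(words),
        "avg_words_per_line": len(words) / len(lines),
        "repeated_line_ratio": repeated / len(lines),
    }


def rhyming_pairs(lyrics, suffix_len=2):
    """Count adjacent lines whose last words differ but share their final letters.

    Spelling-based approximation of end rhyme (an assumption of this sketch).
    """
    last_words = []
    for line in lyrics.splitlines():
        tokens = line.strip().lower().strip(PUNCTUATION).split()
        if tokens:
            last_words.append(tokens[-1])
    return sum(
        1
        for a, b in zip(last_words, last_words[1:])
        if a != b and a[-suffix_len:] == b[-suffix_len:]
    )


lyrics = """The sun is going down
All over this old town
Sing it again
Sing it again"""
features = style_features(lyrics)
pairs = rhyming_pairs(lyrics)
```

Features like these would typically be appended to a bag-of-words vector before classification, which is the combination the abstract describes.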
Simac: Semantic interaction with music audio contents
- Journal of Intelligent Information Systems (accepted), 2005
SMARTER THAN GENIUS? HUMAN EVALUATION OF MUSIC RECOMMENDER SYSTEMS.
Cited by 7 (3 self)
Genius is a popular commercial music recommender system that is based on collaborative filtering of huge amounts of user data. To understand the aspects of music similarity that collaborative filtering can capture, we compare Genius to two canonical music recommender systems: one based purely on artist similarity, the other purely on similarity of acoustic content. We evaluate this comparison with a user study of 185 subjects. Overall, Genius produces the best recommendations. We demonstrate that collaborative filtering can actually capture similarities between the acoustic content of songs. However, when evaluators can see the names of the recommended songs and artists, we find that artist similarity can account for the performance of Genius. A system that combines these musical cues could generate music recommendations that are as good as Genius, even when collaborative filtering data is unavailable.

