Automatic Musical Genre Classification Of Audio Signals
- IEEE Transactions on Speech and Audio Processing, 2002
Cited by 829 (35 self)
Abstract: ... describe music. They are commonly used to structure the increasing amounts of music available in digital form on the Web and are important for music information retrieval. Genre categorization for audio has traditionally been performed manually. A particular musical genre is characterized by statistical properties related to the instrumentation, rhythmic structure and form of its members. In this work, algorithms for the automatic genre categorization of audio signals are described. More specifically, we propose a set of features for representing texture and instrumentation. In addition, a novel set of features for representing rhythmic structure and strength is proposed. The performance of these feature sets has been evaluated by training statistical pattern recognition classifiers on real-world audio collections. Based on the automatic hierarchical genre classification, two graphical user interfaces for browsing and interacting with large audio collections have been developed.
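The timbral-texture features this abstract refers to can be illustrated with a minimal sketch. The block below computes just two common descriptors, spectral centroid and rolloff, in plain NumPy; the paper's actual feature set (statistics over texture windows, MFCCs, rhythm features) is considerably richer, and the frame length, sample rate, and 85% rolloff threshold here are illustrative assumptions.

```python
import numpy as np

def texture_features(frame, sr=22050, rolloff_frac=0.85):
    """Spectral centroid and rolloff for one audio frame.

    Two common timbral-texture descriptors; the paper's full feature
    set is richer than this sketch, and rolloff_frac=0.85 is a typical
    but assumed threshold.
    """
    mag = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
    centroid = float((freqs * mag).sum() / (mag.sum() + 1e-12))  # spectral "center of mass"
    cum = np.cumsum(mag)
    k = int(np.searchsorted(cum, rolloff_frac * cum[-1]))
    rolloff = float(freqs[min(k, len(freqs) - 1)])  # frequency below which 85% of magnitude lies
    return centroid, rolloff

# A pure 1 kHz tone: both descriptors should land near 1 kHz.
sr = 22050
t = np.arange(1024) / sr
centroid, rolloff = texture_features(np.sin(2 * np.pi * 1000.0 * t), sr)
```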
Mel Frequency Cepstral Coefficients for Music Modeling
- In International Symposium on Music Information Retrieval, 2000
Cited by 299 (3 self)
Abstract: We examine in some detail Mel Frequency Cepstral Coefficients (MFCCs) - the dominant features used for speech recognition - and investigate their applicability to modeling music. In particular, we examine two of the main assumptions of the process of forming MFCCs: the use of the Mel frequency scale to model the spectra; and the use of the Discrete Cosine Transform (DCT) to decorrelate the Mel-spectral vectors.
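The MFCC pipeline this abstract examines can be sketched in a few lines: power spectrum, triangular mel filterbank, log, then DCT-II. This is a textbook implementation, not a bit-exact reproduction of any particular toolkit; the filter count, FFT size, and 13 retained coefficients are conventional but assumed choices.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc(frame, sr=22050, n_filters=20, n_ceps=13):
    """Textbook MFCC for a single frame."""
    n_fft = len(frame)
    power = np.abs(np.fft.rfft(frame * np.hanning(n_fft))) ** 2
    # Triangular filters spaced uniformly on the mel scale -- the first
    # assumption the paper questions for music.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_filters, len(power)))
    for i in range(n_filters):
        lo, mid, hi = bins[i], bins[i + 1], bins[i + 2]
        for k in range(lo, mid):
            fbank[i, k] = (k - lo) / max(mid - lo, 1)
        for k in range(mid, hi):
            fbank[i, k] = (hi - k) / max(hi - mid, 1)
    log_energies = np.log(fbank @ power + 1e-10)
    # DCT-II decorrelates the log mel energies -- the second assumption
    # the paper questions for music.
    n = np.arange(n_filters)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), (2 * n + 1) / (2 * n_filters)))
    return dct @ log_energies

coeffs = mfcc(np.random.default_rng(0).standard_normal(1024))
```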
A Music Similarity Function Based On Signal Analysis
- 2001
Cited by 198 (7 self)
Abstract: We present a method to compare songs based solely on their audio content. Our technique forms a signature for each song based on K-means clustering of spectral features. The signatures can then be compared using the Earth Mover's Distance [1], which allows comparison of histograms with disparate bins. Preliminary objective and subjective results on a database of over 8000 songs are encouraging. For 20 songs judged by two users, on average 2.5 of the top 5 songs returned were judged similar. We also found that our measure is robust to simple corruption of the audio signal.
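The signature-and-distance scheme described above can be sketched as follows. Note the substitutions: the paper clusters real spectral features and compares signatures with the exact Earth Mover's Distance of Rubner et al. [1], whereas this sketch uses 2-D toy point clouds and a greedy transport as an approximation to the EMD.

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns a (centroids, weights) signature."""
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None, :] - C[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                C[j] = X[labels == j].mean(axis=0)
    w = np.bincount(labels, minlength=k) / len(X)
    return C, w

def greedy_emd(sig_a, sig_b):
    """Greedy approximation to the Earth Mover's Distance between two
    (centroids, weights) signatures: move mass along the cheapest
    centroid pairs first. The paper uses the exact EMD; this greedy
    transport is a simplification for illustration."""
    (Ca, wa), (Cb, wb) = sig_a, sig_b
    wa, wb = wa.copy(), wb.copy()
    D = np.linalg.norm(Ca[:, None, :] - Cb[None, :, :], axis=-1)
    cost = 0.0
    for i, j in sorted(np.ndindex(D.shape), key=lambda ij: D[ij]):
        flow = min(wa[i], wb[j])
        cost += flow * D[i, j]
        wa[i] -= flow
        wb[j] -= flow
    return cost

rng = np.random.default_rng(1)
song_a = rng.normal(0.0, 1.0, (200, 2))  # stand-ins for per-frame
song_b = rng.normal(0.0, 1.0, (200, 2))  # spectral feature vectors
song_c = rng.normal(5.0, 1.0, (200, 2))  # a "different-sounding" song
sig_a, sig_b, sig_c = (kmeans(x, 3) for x in (song_a, song_b, song_c))
d_same = greedy_emd(sig_a, sig_b)
d_diff = greedy_emd(sig_a, sig_c)
```

Because the two same-distribution songs yield nearby centroids, `d_same` comes out much smaller than `d_diff`.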
Improving timbre similarity: How high is the sky?
- Results in Speech and Audio Sciences
Cited by 181 (14 self)
Abstract: We report on experiments done in an attempt to improve the performance of a music similarity measure which we introduced earlier. The technique aims at comparing music titles on the basis of their global “timbre”, which has many applications in the field of Music Information Retrieval. Such measures of timbre similarity have seen growing interest lately, and every contribution (including ours) is yet another instantiation of the same basic pattern recognition architecture, only with different algorithm variants and parameters. Most give encouraging results with little effort, and imply that near-perfect results could be reached simply by fine-tuning the algorithms' parameters. However, such systematic testing over large, interdependent parameter spaces is both difficult and costly, as it requires a whole general meta-database architecture. This paper contributes in two ways to the current state of the art. We report on extensive tests over very many parameters and algorithmic variants, some already envisioned in the literature and some not. This leads to an improvement of about 15% R-precision over existing algorithms. But most importantly, we describe many variants that surprisingly do not lead to any substantial improvement. Moreover, our simulations suggest the existence of a “glass ceiling” at about 65% R-precision which probably cannot be overcome by pursuing such variations on the same theme.
Visualizing Music and Audio using Self-Similarity
- In Proc. ACM Multimedia 99, 1999
Cited by 154 (6 self)
Abstract: This paper presents a novel approach to visualizing the time structure of music and audio. The acoustic similarity between any two instants of an audio recording is displayed in a 2D representation, allowing identification of structural and rhythmic characteristics. Examples are presented for classical and popular music. Applications include content-based analysis and segmentation, as well as tempo and structure extraction.
Keywords: music visualization, audio analysis, audio similarity
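The 2D representation this abstract describes is a self-similarity matrix: a pairwise similarity between feature vectors for every pair of time instants. A minimal sketch, assuming cosine similarity of magnitude spectra (the paper also considers other spectral features) and a toy A-B-A tone sequence chosen to make the block structure visible:

```python
import numpy as np

def self_similarity(frames):
    """Cosine-similarity matrix S[i, j] between per-frame magnitude
    spectra. Repeated material shows up as bright off-diagonal blocks."""
    feats = np.abs(np.fft.rfft(frames, axis=1))
    v = feats / (np.linalg.norm(feats, axis=1, keepdims=True) + 1e-12)
    return v @ v.T

# Toy signal: tone A, then tone B, then tone A again.
sr = 8000
t = np.arange(400) / sr
seg_a = np.sin(2 * np.pi * 440.0 * t)
seg_b = np.sin(2 * np.pi * 880.0 * t)
frames = np.stack([seg_a] * 5 + [seg_b] * 5 + [seg_a] * 5)
S = self_similarity(frames)
```

The first and last five frames are mutually similar (`S[0, 14]` near 1) while A-vs-B entries such as `S[0, 5]` are near 0, giving the checkerboard-like structure the paper visualizes.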
Automatic audio segmentation using a measure of audio novelty
- In Proceedings of IEEE International Conference on Multimedia and Expo (ICME), 2000
Cited by 154 (11 self)
Abstract: This paper describes methods for automatically locating points of significant change in music or audio by analyzing local self-similarity. This method can find individual note boundaries or natural segment boundaries such as verse/chorus or speech/music transitions, even in the absence of cues such as silence. The approach uses the signal to model itself, and thus does not rely on particular acoustic cues, nor does it require training. We present a wide variety of applications, including indexing, segmenting, and beat tracking of music and audio. The method works well on a wide variety of audio sources.
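The novelty measure described above correlates a checkerboard kernel along the main diagonal of a self-similarity matrix; peaks appear where within-segment similarity is high but cross-segment similarity is low. A minimal sketch, assuming a binary toy similarity matrix and a plain (untapered) kernel; the kernel size is an illustrative choice:

```python
import numpy as np

def novelty_curve(S, L=4):
    """Checkerboard-kernel novelty along the diagonal of a
    self-similarity matrix S (2L x 2L kernel, here untapered)."""
    kernel = np.kron(np.array([[1.0, -1.0], [-1.0, 1.0]]), np.ones((L, L)))
    n = len(S)
    nov = np.zeros(n)
    for i in range(L, n - L):
        nov[i] = np.sum(kernel * S[i - L:i + L, i - L:i + L])
    return nov

# Block-diagonal toy similarity matrix: two segments of 8 frames each,
# so the true boundary sits at frame index 8.
S = np.kron(np.eye(2), np.ones((8, 8)))
nov = novelty_curve(S, L=4)
boundary = int(np.argmax(nov))
```

On this toy input the curve peaks exactly at the segment boundary.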
An overview of audio information retrieval
- 1999
Cited by 153 (1 self)
Abstract: The problem of audio information retrieval is familiar to anyone who has returned from vacation to find an answering machine full of messages. While there is not yet an “AltaVista” for the audio data type, many workers are finding ways to automatically locate, index, and browse audio using recent advances in speech recognition and machine listening. This paper reviews the state of the art in audio information retrieval and presents recent advances in automatic speech recognition, word spotting, speaker and music identification, and audio similarity, with a view towards making audio less “opaque”. A special section addresses intelligent interfaces for navigating and browsing audio and multimedia documents, using automatically derived information to go beyond the tape recorder metaphor.
A Large-Scale Evaluation of Acoustic and Subjective Music Similarity Measures
- Computer Music Journal, 2003
Cited by 139 (7 self)
Abstract: In this paper, we examine both acoustic and subjective approaches for calculating similarity between artists, comparing their performance on a common database of 400 popular artists. Specifically, we evaluate acoustic techniques based on Mel-frequency cepstral coefficients and an intermediate “anchor space” of genre classification, and subjective techniques which use data from The All Music Guide, from a survey, from playlists and personal collections, and from web-text mining.
Representing musical genre: a state of the art
- Journal of New Music Research
Cited by 128 (5 self)
Abstract: Musical genre is probably the most popular music descriptor. In the context of large musical databases and Electronic Music Distribution, genre is therefore a crucial piece of metadata for the description of music content. However, genre is intrinsically ill-defined, and attempts at defining genre precisely have a strong tendency to end up in circular, ungrounded projections of fantasies. Is genre an intrinsic attribute of music titles, as, say, tempo is? Or is genre an extrinsic description of the whole piece? In this article, we discuss the various approaches to representing musical genre, and propose to classify these approaches into three main categories: manual, prescriptive and emergent approaches. We discuss the pros and cons of each approach, and illustrate our study with results of the Cuidado IST project.
Music Similarity Measures: What's the Use?
- 2002
Cited by 127 (5 self)
Abstract: Electronic Music Distribution (EMD) demands robust, automatically extracted music descriptors. We introduce a timbral similarity measure for comparing music titles. This measure is based on a Gaussian model of cepstrum coefficients. We describe the timbre extractor and the corresponding timbral similarity relation. We describe experiments assessing the quality of the similarity relation, and show that the measure is able to yield interesting similarity relations, in particular when used in conjunction with other similarity relations. We illustrate the use of the descriptor in several EMD applications developed in the context of the Cuidado European project.
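A Gaussian model of cepstrum coefficients, as in this abstract, admits closed-form comparisons. The sketch below fits one full-covariance Gaussian per title and compares models with a symmetrized Kullback-Leibler divergence, a common choice for such timbre models; the abstract does not specify this exact distance, so treat it as an assumption, as are the random 3-D stand-ins for cepstral frames.

```python
import numpy as np

def fit_gaussian(X):
    """Single full-covariance Gaussian over frame-level (cepstral) features."""
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False) + 1e-6 * np.eye(X.shape[1])  # regularized
    return mu, cov

def sym_kl(ga, gb):
    """Symmetrized KL divergence KL(a||b) + KL(b||a) between two
    Gaussians; the log-determinant terms of the two directions cancel."""
    (ma, Sa), (mb, Sb) = ga, gb
    d = len(ma)
    Sa_inv, Sb_inv = np.linalg.inv(Sa), np.linalg.inv(Sb)
    diff = ma - mb
    return 0.5 * (np.trace(Sb_inv @ Sa) + np.trace(Sa_inv @ Sb)
                  + diff @ (Sa_inv + Sb_inv) @ diff) - d

rng = np.random.default_rng(2)
A = rng.normal(0.0, 1.0, (500, 3))  # stand-ins for cepstral frames
B = rng.normal(0.0, 1.0, (500, 3))  # same "timbre" as A
C = rng.normal(3.0, 1.0, (500, 3))  # shifted "timbre"
d_same = sym_kl(fit_gaussian(A), fit_gaussian(B))
d_diff = sym_kl(fit_gaussian(A), fit_gaussian(C))
```

Two titles drawn from the same distribution yield a distance near zero, while the shifted one scores far higher, which is the behavior a similarity relation needs.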