Content-Based Music Information Retrieval: Current Directions and Future Challenges
, 2008
Chroma binary similarity and local alignment applied to cover song identification
- IEEE Trans. on Audio, Speech, and Language Processing
, 2008
Cited by 16 (6 self)
We present a new technique for audio signal comparison based on tonal subsequence alignment and its application to detect cover versions (i.e., different performances of the same underlying musical piece). Cover song identification is a task whose popularity has increased in the Music Information Retrieval (MIR) community over the past years, as it provides a direct and objective way to evaluate music similarity algorithms. This article first presents a series of experiments carried out with two state-of-the-art methods for cover song identification. We have studied several components of these methods (such as chroma resolution and similarity, transposition, beat tracking, and Dynamic Time Warping constraints) in order to discover which characteristics would be desirable for a competitive cover song identifier. After analyzing many cross-validated results, the importance of these characteristics is discussed, and the best-performing ones are finally applied to the newly proposed method. Multiple evaluations of the new method confirm a large increase in identification accuracy compared with alternative state-of-the-art approaches.
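The two ingredients named in the title, a binary similarity between chroma frames and a local alignment over it, can be illustrated with a toy example. The sketch below is only an illustration under stated assumptions: each frame is reduced to a single hypothetical label (for example a dominant pitch class), the recurrence is a generic Smith-Waterman-style local alignment, and the `match`, `mismatch`, and `gap` values are arbitrary. The similarity measure and alignment constraints actually used in the paper are not given in this listing.

```python
def binary_similarity(a, b):
    """Return S with S[i][j] = 1 if frame a[i] equals frame b[j], else 0.

    Assumption: frames are hashable labels; real chroma comparison is richer.
    """
    return [[1 if x == y else 0 for y in b] for x in a]


def local_alignment_score(sim, match=1.0, mismatch=-0.9, gap=-0.7):
    """Best local alignment score over a binary similarity matrix.

    Generic Smith-Waterman-style recurrence; scores are clamped at zero so
    that an alignment can start and end anywhere in either sequence.
    """
    rows = len(sim)
    cols = len(sim[0]) if sim else 0
    h = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    best = 0.0
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            step = match if sim[i - 1][j - 1] else mismatch
            h[i][j] = max(
                0.0,
                h[i - 1][j - 1] + step,  # extend along the diagonal
                h[i - 1][j] + gap,       # skip a frame in the first sequence
                h[i][j - 1] + gap,       # skip a frame in the second sequence
            )
            best = max(best, h[i][j])
    return best
```

A higher score means a longer shared subsequence, so two recordings that share a passage score well even when the rest of the material differs.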
Normalized Cuts for predominant melodic source separation
- Proceedings of the International Conference on Music Information Retrieval
, 2007
Cited by 3 (2 self)
The predominant melodic source, frequently the singing voice, is an important component of musical signals. In this paper, we describe a method for extracting the predominant source and corresponding melody from “real-world” polyphonic music. The proposed method is inspired by ideas from computational auditory scene analysis. We formulate predominant melodic source tracking and formation as a graph partitioning problem and solve it using the normalized cut, which is a global criterion for segmenting graphs that has been used in computer vision. Sinusoidal modeling is used as the underlying representation. A novel harmonicity cue, which we term harmonically wrapped peak similarity, is introduced. Experimental results supporting the use of this cue are presented. In addition, we show results for automatic melody extraction using the proposed approach. Index Terms: computational auditory scene analysis (CASA), music information retrieval (MIR), normalized cut, sinusoidal modeling, spectral clustering.
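To make the "global criterion" concrete, the following sketch evaluates the standard normalized-cut value of a given two-way partition of a weighted graph. It is a minimal illustration, not the paper's pipeline: the function name `ncut_value` is hypothetical, the weights are assumed to form a symmetric matrix, and every node is assumed to have at least one nonzero edge. How the paper builds its similarity graph from sinusoidal peaks is not described in this listing.

```python
def ncut_value(weights, part_a):
    """Normalized-cut value of the partition (part_a, remaining nodes).

    Ncut(A, B) = cut(A, B) / assoc(A, V) + cut(A, B) / assoc(B, V),
    where cut sums the weights crossing the partition and assoc sums all
    weights attached to one side. Assumes a symmetric weight matrix.
    """
    nodes = range(len(weights))
    a = set(part_a)
    b = set(nodes) - a
    if not a or not b:
        raise ValueError("both sides of the partition must be non-empty")
    cut = sum(weights[i][j] for i in a for j in b)
    assoc_a = sum(weights[i][j] for i in a for j in nodes)
    assoc_b = sum(weights[i][j] for i in b for j in nodes)
    return cut / assoc_a + cut / assoc_b


# Toy graph: two tight pairs (0-1 and 2-3) joined by one weak edge (1-2).
toy_weights = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.1, 0.0],
    [0.0, 0.1, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]
balanced = ncut_value(toy_weights, [0, 1])    # cuts only the weak edge
unbalanced = ncut_value(toy_weights, [0])     # isolates a single node
```

Because each term is normalized by the total connectivity of its side, splitting off a single node is penalized, which is what distinguishes this criterion from a plain minimum cut.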
Evaluation of Multiple-F0 Estimation and Tracking Systems
Cited by 2 (0 self)
Multi-pitch estimation of sources in music is an ongoing research area that has a wealth of applications in music information retrieval systems. This paper presents a systematic evaluation of over a dozen competing methods and algorithms for extracting the fundamental frequencies of pitched sound sources in polyphonic music. The evaluations were carried out as part of the Music Information Retrieval Evaluation eXchange (MIREX) over the course of two years, from 2007 to 2008. The generation of the dataset and its corresponding ground truth, the methods by which systems can be evaluated, and the evaluation results of the different systems are presented and discussed.
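One common way to score a multiple-F0 estimator is frame by frame, counting estimated pitches that match a reference pitch. The sketch below assumes precision, recall, and an accuracy defined as TP / (TP + FP + FN), with a half-semitone matching tolerance and greedy one-to-one matching. These choices and all function names are illustrative assumptions; the abstract does not state which metrics or tolerance the evaluation used.

```python
import math


def f0_match(ref_hz, est_hz, tol_semitones=0.5):
    """True if two frequencies lie within the tolerance, in semitones."""
    return abs(12.0 * math.log2(est_hz / ref_hz)) <= tol_semitones


def frame_counts(ref, est, tol_semitones=0.5):
    """Return (true positives, false positives, false negatives) for a frame.

    Greedy matching: each reference F0 can be claimed by one estimate only.
    """
    unmatched = list(ref)
    tp = 0
    for e in est:
        for r in unmatched:
            if f0_match(r, e, tol_semitones):
                unmatched.remove(r)
                tp += 1
                break
    return tp, len(est) - tp, len(ref) - tp


def evaluate(ref_frames, est_frames, tol_semitones=0.5):
    """Aggregate counts over all frames and return the three scores."""
    tp = fp = fn = 0
    for ref, est in zip(ref_frames, est_frames):
        a, b, c = frame_counts(ref, est, tol_semitones)
        tp, fp, fn = tp + a, fp + b, fn + c
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return {"precision": precision, "recall": recall, "accuracy": accuracy}
```

Each frame is a list of frequencies in Hz, and an empty list means no pitched source is active in that frame.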
Automatic Transcription
Cited by 1 (1 self)
This article proposes a method for the automatic transcription of the melody, bass line, and chords in polyphonic pop music. The method uses a frame-wise pitch-salience estimator as a feature extraction front-end. For the melody and bass-line transcription, this is followed by acoustic modeling of note events and musicological modeling of note transitions. The acoustic models include a model for the target notes (i.e., melody or bass notes) and a background model. The musicological model involves key estimation and note bigrams that determine probabilities for transitions between target notes. A transcription of the melody or the bass line is obtained using Viterbi search via the target and the background note models. The performance of the melody and the bass-line transcription is evaluated using approximately 8.5 hours of realistic polyphonic music. The chord transcription maps the pitch salience estimates to a pitch-class representation and uses trained chord models and chord-transition probabilities to produce a transcription consisting of major and minor triads. For chords, the evaluation material consists of the first eight Beatles albums. The method is computationally efficient and allows causal implementation, so it can process streaming audio. Transcription of music refers to the analysis of an acoustic music signal for producing a parametric representation of the signal. The representation may be a music score with a meticulous arrangement for each instrument or an approximate description of melody and chords in the piece, for example. The latter type of transcription is commonly used in commercial songbooks of pop music and is usually sufficient for musicians or music hobbyists to play the piece. On the other hand, more detailed transcriptions are often employed in classical music to preserve the exact arrangement of the composer.
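The decoding step described in the abstract, a Viterbi search that combines per-frame acoustic evidence with note-bigram transition probabilities, can be sketched as follows. This is a generic log-domain Viterbi decoder under simplifying assumptions: states are plain note labels, the background model and key estimation are omitted, and all names and data layouts are hypothetical rather than taken from the article.

```python
def viterbi(obs_loglik, log_trans, log_init):
    """Most likely state sequence under a first-order (bigram) model.

    obs_loglik: list of dicts, one per frame, mapping state -> log-likelihood
    log_trans:  dict mapping (previous_state, state) -> log transition prob
    log_init:   dict mapping state -> log initial probability
    """
    states = list(log_init)
    score = {s: log_init[s] + obs_loglik[0][s] for s in states}
    back = []  # one dict of back-pointers per frame after the first
    for frame in obs_loglik[1:]:
        new_score, pointers = {}, {}
        for cur in states:
            # Best predecessor for `cur` given the bigram transition scores.
            prev = max(states, key=lambda p: score[p] + log_trans[(p, cur)])
            new_score[cur] = score[prev] + log_trans[(prev, cur)] + frame[cur]
            pointers[cur] = prev
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: score[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]
```

With transition probabilities that favour staying on the same note, a single frame of weak contrary evidence is smoothed over, which is the practical benefit of adding a transition model on top of frame-wise pitch salience.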
Signal Processing for Music Analysis
, 2011
Cited by 1 (0 self)
Music signal processing may appear to be the junior relation of the large and mature field of speech signal processing, not least because many techniques and representations originally developed for speech have been applied to music, often with good results. However, music signals possess specific acoustic and structural characteristics that distinguish them from spoken language or other nonmusical signals. This paper provides an overview of some signal analysis techniques that specifically address musical dimensions such as melody, harmony, rhythm, and timbre. We will examine how particular characteristics of music signals impact and determine these techniques, and we highlight a number of novel music analysis and retrieval tasks that such processing makes possible. Our goal is to demonstrate that, to be successful, music audio signal processing techniques must be informed by a deep and thorough insight into the nature of music itself.
unknown title
The drum machine has been an important tool in music production for decades. However, its flawless way of playing drum patterns is often perceived as mechanical and rigid, far from the groove provided by a human drummer. This paper presents research towards enhancing the drum machine with learning capabilities. The drum machine learns user-specific variations (i.e., the groove) from human drummers, and stores the groove as attractors in Echo State Networks (ESNs). The ESNs are purely generative (i.e., not driven by an input signal) and the output is used by the drum machine to imitate the playing style of human drummers, making it a cost-effective way of achieving life-like drums.
A Drum Machine that Learns to Groove
Music production relies increasingly on advanced hardware and software tools that make the creative process more flexible and versatile. The advancement of these tools helps reduce both the time and money required to create music. This paper presents research towards enhancing the functionality of a key tool, the drum machine. We add the ability to learn how to groove from human drummers, an important human quality when it comes to drumming. We show how the learning drum machine overcomes limitations of traditional drum machines.
Automatic Transcription of Pitch Content in Music and Selected Applications
Transcription of music refers to the analysis of a music signal in order to produce a parametric representation of the sounding notes in the signal. This is conventionally carried out by listening to a piece of music and writing down the symbols of common musical notation to represent the occurring notes in the piece. Automatic transcription of music refers to the extraction of such representations using signal-processing methods. This thesis concerns the automatic transcription of pitched notes in musical audio and its applications. Emphasis is laid on the transcription of realistic polyphonic music, where multiple pitched and percussive instruments are sounding simultaneously. The methods included in this thesis are based on a framework which combines both low-level acoustic modeling and high-level musicological modeling. The emphasis in the acoustic modeling has been set to note events so that the methods produce discrete-pitch notes with onset times and durations
A Chroma-Based Salience Function for Melody and Bass Line Estimation from Music Audio Signals
In this paper we present a salience function for melody and bass line estimation based on chroma features. The salience function is constructed by adapting the Harmonic Pitch Class Profile (HPCP) and used to extract a mid-level representation of melodies and bass lines which uses pitch classes rather than absolute frequencies. We show that our salience function has comparable performance to alternative state-of-the-art approaches, suggesting it could be successfully used as a first stage in a complete melody and bass line estimation system.
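The idea of measuring salience over pitch classes instead of absolute frequencies can be illustrated by folding spectral peaks into twelve bins. The sketch below is a heavily simplified stand-in, not the HPCP adaptation from the paper: it assumes equal temperament with A4 at 440 Hz, hard assignment of each peak to the nearest semitone, and a hypothetical harmonic weighting in which a peak also supports the fundamentals it could be a harmonic of. The parameters `n_harmonics` and `decay` are illustrative choices.

```python
import math


def pitch_class(freq_hz, ref_hz=440.0):
    """Pitch class 0..11 (0 = C) of a frequency, assuming A4 = ref_hz."""
    midi = 69 + 12 * math.log2(freq_hz / ref_hz)
    return int(round(midi)) % 12


def chroma_salience(peaks, n_harmonics=3, decay=0.6):
    """Accumulate peak energy into 12 pitch-class bins.

    peaks: list of (frequency in Hz, linear magnitude) pairs.
    Each peak votes for the pitch class of freq / h for h = 1..n_harmonics,
    with the vote weighted by decay ** (h - 1) times the squared magnitude.
    """
    bins = [0.0] * 12
    for freq_hz, magnitude in peaks:
        for h in range(1, n_harmonics + 1):
            weight = decay ** (h - 1)
            bins[pitch_class(freq_hz / h)] += weight * magnitude ** 2
    return bins
```

For a harmonic series built on 220 Hz, such as `[(220.0, 1.0), (440.0, 0.5), (660.0, 0.3)]`, most of the energy lands in the bin for A, and octave information is deliberately discarded.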

