Results 1 - 10 of 68
Automatic Segmentation of Acoustic Musical Signals Using Hidden Markov Models
IEEE Transactions on Pattern Analysis and Machine Intelligence, 1998
Cited by 74 (9 self)
In this paper we address an important step towards our goal of automatic musical accompaniment --- the segmentation problem. Given a score to a piece of monophonic music and a sampled recording of a performance of that score, we attempt to segment the data into a sequence of contiguous regions corresponding to the notes and rests in the score. Within the framework of a hidden Markov model, we model our prior knowledge, perform unsupervised learning of the data model parameters, and compute the segmentation that globally minimizes the posterior expected number of segmentation errors. We also show how to produce "on-line" estimates of score position. We present examples of our experimental results, and readers are encouraged to access the actual sound data we have made available from these experiments.
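The segmentation criterion in this abstract can be illustrated with a small sketch. It assumes that a "segmentation error" is counted per frame, in which case the minimizing labeling picks, for each frame, the state with the highest posterior marginal from the forward-backward recursions. The function names, the three-state left-to-right model, and the likelihood values used in the usage note are illustrative assumptions, not taken from the paper.

```python
def forward_backward(init, trans, likes):
    """Per-frame posterior state marginals of an HMM.

    init[i]     : probability of starting in state i
    trans[i][j] : probability of moving from state i to state j
    likes[t][i] : likelihood of frame t's observation under state i
    """
    n_frames, n_states = len(likes), len(init)

    # Forward pass, normalised at every frame to avoid underflow.
    alpha = []
    for t in range(n_frames):
        if t == 0:
            raw = [init[j] * likes[0][j] for j in range(n_states)]
        else:
            raw = [
                likes[t][j]
                * sum(alpha[t - 1][i] * trans[i][j] for i in range(n_states))
                for j in range(n_states)
            ]
        total = sum(raw)
        alpha.append([r / total for r in raw])

    # Backward pass, also normalised.
    beta = [[1.0] * n_states for _ in range(n_frames)]
    for t in range(n_frames - 2, -1, -1):
        raw = [
            sum(trans[i][j] * likes[t + 1][j] * beta[t + 1][j]
                for j in range(n_states))
            for i in range(n_states)
        ]
        total = sum(raw)
        beta[t] = [r / total for r in raw]

    # Posterior marginals are proportional to alpha * beta.
    post = []
    for t in range(n_frames):
        raw = [alpha[t][i] * beta[t][i] for i in range(n_states)]
        total = sum(raw)
        post.append([r / total for r in raw])
    return post


def segment(init, trans, likes):
    """Label each frame with its most probable state (assumed per-frame loss)."""
    post = forward_backward(init, trans, likes)
    return [max(range(len(row)), key=row.__getitem__) for row in post]
```

With a left-to-right transition matrix such as `[[0.8, 0.2, 0.0], [0.0, 0.8, 0.2], [0.0, 0.0, 1.0]]`, each state can stand for one note or rest of the score, and `segment` returns one state index per audio frame. An on-line estimate would use only the forward quantities `alpha` up to the current frame.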
On Tempo Tracking: Tempogram Representation and Kalman Filtering
2000
Cited by 63 (8 self)
We formulate tempo tracking in a Bayesian framework where a tempo tracker is modeled as a stochastic dynamical system. The tempo is modeled as a hidden state variable of the system and is estimated by a Kalman filter. The Kalman filter operates on a Tempogram, a wavelet-like multiscale expansion of a real performance. An important advantage of our approach is that it is possible to formulate both off-line and real-time algorithms. The simulation results on a systematically collected set of MIDI piano performances of Yesterday and Michelle by the Beatles show accurate tracking of approximately 90% of the beats.
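The idea of treating tempo as a hidden state estimated by a Kalman filter can be sketched in a few lines. This is a simplified illustration, not the authors' system: it skips the Tempogram entirely and assumes the filter directly observes noisy beat onset times. The state layout (beat time, beat period), the noise parameters `q` and `r`, and the function name are assumptions made for the example.

```python
def kalman_tempo_track(onsets, period0, q=1e-4, r=1e-3, p0=1.0):
    """Track (beat time, beat period) from observed beat onset times.

    onsets  : observed beat times in seconds, one per beat
    period0 : initial guess of the beat period in seconds
    q, r    : assumed process and observation noise variances
    p0      : assumed initial variance of the period estimate
    """
    tau, delta = onsets[0], period0
    P = [[r, 0.0], [0.0, p0]]          # state covariance (2x2)
    estimates = [(tau, delta)]
    for y in onsets[1:]:
        # Predict: x <- A x with A = [[1, 1], [0, 1]], P <- A P A^T + Q.
        tau = tau + delta
        p00 = P[0][0] + P[0][1] + P[1][0] + P[1][1] + q
        p01 = P[0][1] + P[1][1]
        p10 = P[1][0] + P[1][1]
        p11 = P[1][1] + q
        # Update with observation y = tau + noise, i.e. H = [1, 0].
        s = p00 + r                    # innovation variance
        k0, k1 = p00 / s, p10 / s      # Kalman gain
        innovation = y - tau
        tau += k0 * innovation
        delta += k1 * innovation
        # P <- (I - K H) P
        P = [
            [(1.0 - k0) * p00, (1.0 - k0) * p01],
            [p10 - k1 * p00, p11 - k1 * p01],
        ]
        estimates.append((tau, delta))
    return estimates
```

Because the recursion only uses past observations, the same loop serves as a real-time tracker; an off-line variant would add a backward smoothing pass over the stored estimates.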
Tracking Musical Beats in Real Time
1990
Cited by 53 (3 self)
Identifying the temporal location of downbeats is a fundamental musical skill. Observing that previous attempts to automate this process are constrained to hold a single current notion of beat timing and placement, we find that they will fail to predict beats and will not recover beyond the point at which the first mistake is made. We propose a new model that uses beam search to consider multiple interpretations of the performance. At any time, predictions of beat timing and placement are made according to the most credible of many interpretations under consideration.
Introduction. Identifying the temporal location of downbeats is a fundamental musical skill. Even musically untrained humans can tap their foot with the beat when they hear a musical performance. Humans generally perform this task with ease and precision even in the face of unusual rhythms, musical expressiveness, and imprecise performances. A fully general, automatic beat tracker would be of great value in many task...
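A rough sketch of how beam search can keep several beat interpretations alive at once is given below. It is not the model from the paper: the hypothesis fields, the cost terms (`skip_cost`, the normalised timing error), and the period update rate are all assumptions chosen to make the idea concrete. Each onset either lies on a beat or is ignored, and only the `beam_width` cheapest interpretations survive each step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    period: float     # current beat period estimate (seconds)
    last_beat: float  # time of the most recent onset taken as a beat
    cost: float       # accumulated penalty; lower means more credible
    beats: tuple      # onset times interpreted as beats so far


def extend(h, onset, skip_cost=0.3, rate=0.5):
    """Branch one hypothesis on a new onset: off the beat, or on the beat."""
    # Branch 1: the onset is not a beat; charge a fixed penalty.
    branches = [Hypothesis(h.period, h.last_beat, h.cost + skip_cost, h.beats)]
    # Branch 2: the onset is a beat, n periods after the last one (n >= 1).
    n = max(1, round((onset - h.last_beat) / h.period))
    error = onset - (h.last_beat + n * h.period)
    # Charge the timing error relative to the period, plus each silent beat.
    cost = h.cost + abs(error) / h.period + skip_cost * (n - 1)
    # Nudge the period toward the interval implied by this onset.
    new_period = h.period + rate * error / n
    branches.append(Hypothesis(new_period, onset, cost, h.beats + (onset,)))
    return branches


def beam_track(onsets, periods, beam_width=8):
    """Return the most credible interpretation after seeing all onsets."""
    # One starting hypothesis per candidate period, anchored on the first onset.
    beam = [Hypothesis(p, onsets[0], 0.0, (onsets[0],)) for p in periods]
    for onset in onsets[1:]:
        candidates = [b for h in beam for b in extend(h, onset)]
        candidates.sort(key=lambda h: h.cost)
        beam = candidates[:beam_width]  # prune to the best interpretations
    return beam[0]
```

Unlike a tracker that commits to a single interpretation, a hypothesis that misjudges one onset here can be overtaken later by a competitor that is still in the beam, which is the recovery property the abstract argues for.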
Monte Carlo Methods for Tempo Tracking and Rhythm Quantization
Journal of Artificial Intelligence Research, 2003
Cited by 44 (7 self)
We present a probabilistic generative model for timing deviations in expressive music performance. The structure of the proposed model is equivalent to a switching state space model. The switch variables correspond to discrete note locations as in a musical score. The continuous hidden variables denote the tempo. We formulate two well-known music recognition problems, namely tempo tracking and automatic transcription (rhythm quantization), as filtering and maximum a posteriori (MAP) state estimation tasks. Exact computation of posterior features such as the MAP state is intractable in this model class, so we introduce Monte Carlo methods for integration and optimization. We compare Markov Chain Monte Carlo (MCMC) methods (such as Gibbs sampling, simulated annealing and iterative improvement) and sequential Monte Carlo methods (particle filters). Our simulation results suggest better results with sequential methods. The methods can be applied in both online and batch scenarios such as tempo tracking and transcription and are thus potentially useful in a number of music applications such as adaptive automatic accompaniment, score typesetting and music information retrieval.
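To make the sequential Monte Carlo idea concrete, here is a minimal bootstrap particle filter for a heavily simplified tempo model. It is a sketch under stated assumptions, not the switching state space model of the paper: there are no discrete switch variables, the hidden state is just the beat period following a random walk in the log domain, and each observation is one inter-beat interval with Gaussian noise. The parameter names and default values are invented for the example.

```python
import math
import random


def particle_filter_tempo(intervals, n_particles=500, drift=0.02,
                          obs_sd=0.03, seed=0):
    """Estimate the beat period after each observed inter-beat interval.

    intervals : observed intervals between successive beats (seconds)
    drift     : assumed std. dev. of the log-period random walk per step
    obs_sd    : assumed std. dev. of the interval observation noise
    """
    rng = random.Random(seed)
    # Initial particles: periods drawn from a broad uniform prior.
    particles = [rng.uniform(0.2, 1.5) for _ in range(n_particles)]
    estimates = []
    for y in intervals:
        # Propagate: multiplicative random walk keeps periods positive.
        particles = [p * math.exp(rng.gauss(0.0, drift)) for p in particles]
        # Weight: Gaussian likelihood of the observed interval.
        weights = [math.exp(-0.5 * ((y - p) / obs_sd) ** 2) for p in particles]
        total = sum(weights)
        if total == 0.0:
            # All likelihoods underflowed; fall back to uniform weights.
            weights = [1.0] * n_particles
            total = float(n_particles)
        weights = [w / total for w in weights]
        # Filtered estimate: posterior mean of the period.
        estimates.append(sum(w * p for w, p in zip(weights, particles)))
        # Multinomial resampling concentrates particles on likely periods.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates
```

The loop processes one observation at a time, so it fits the online scenario mentioned in the abstract; the MCMC alternatives it is compared against would instead revisit the whole sequence repeatedly.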
Pattern discovery techniques for music audio
In Proc. International Conference on Music Information Retrieval, 2002
Cited by 33 (3 self)
Human listeners are able to recognize structure in music through the perception of repetition and other relationships within a piece of music. This work aims to automate the task of music analysis. Music is “explained” in terms of embedded relationships, especially repetition of segments or phrases. The steps in this process are the transcription of audio into a representation with a similarity or distance metric, the search for similar segments, forming clusters of similar segments, and explaining music in terms of these clusters. Several transcription methods are considered: monophonic pitch estimation, chroma (spectral) representation, and polyphonic transcription followed by harmonic analysis. Also, several algorithms that search for similar segments are described. These techniques can be used to perform an analysis of musical structure, as illustrated by examples.
A Comparison of Melodic Database Retrieval Techniques Using Sung Queries
2002
Cited by 32 (6 self)
Query-by-humming systems search a database of music for good matches to a sung, hummed, or whistled melody. Errors in transcription and variations in pitch and tempo can cause substantial mismatch between queries and targets. Thus, algorithms for measuring melodic similarity in query-by-humming systems should be robust. We compare several variations of search algorithms in an effort to improve search precision. In particular, we describe a new frame-based algorithm that significantly outperforms note-by-note algorithms in tests using sung queries and a database of MIDI-encoded music.
Score-Performance Matching using HMMs
In Proceedings of the ICMC, 1999
Cited by 29 (7 self)
In this paper we describe an implementation of score-performance matching, capable of score following, based on a stochastic approach using Hidden Markov Models.
Name that tune: A pilot study in finding a melody from a sung query
Journal of the American Society for Information Science and Technology, 2004
Cited by 25 (7 self)
We have created a system for music search and retrieval. A user sings a theme from the desired piece of music. The sung theme (query) is converted into a sequence of pitch-intervals and rhythms. This sequence is compared to musical themes (targets) stored in a database. The top pieces are returned to the user in order of similarity to the sung theme. We describe, in detail, two different approaches to measuring similarity between database themes and the sung query. In the first, queries are compared to database themes using standard string-alignment algorithms. Here, similarity between target and query is determined by edit cost. In the second approach, pieces in the database are represented as hidden Markov models (HMMs). In this approach, the query is treated as an observation sequence and a target is judged similar to the query if its HMM has a high likelihood of generating the query. In this article we report our approach to the construction of a target database of themes, the encoding and transcription of user queries, and the results of preliminary experimentation with a set of sung queries. Our experiments show that while neither approach is clearly superior to the other, string matching has a slight advantage. Moreover, neither approach surpasses human performance.
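The first approach, string alignment over pitch-interval sequences, can be sketched as follows. The example uses plain Levenshtein edit distance with unit costs, which is an assumption: the abstract does not state the cost scheme, and rhythm is ignored here. The function names and the representation of melodies as lists of MIDI pitch numbers are also illustrative.

```python
def to_intervals(pitches):
    """Convert a pitch sequence (e.g. MIDI numbers) to successive intervals."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


def edit_cost(query, target, sub_cost=1, indel_cost=1):
    """Levenshtein edit cost between two sequences (assumed unit costs)."""
    rows, cols = len(query) + 1, len(target) + 1
    dist = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dist[i][0] = i * indel_cost          # delete all of query[:i]
    for j in range(1, cols):
        dist[0][j] = j * indel_cost          # insert all of target[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            same = query[i - 1] == target[j - 1]
            dist[i][j] = min(
                dist[i - 1][j] + indel_cost,                    # deletion
                dist[i][j - 1] + indel_cost,                    # insertion
                dist[i - 1][j - 1] + (0 if same else sub_cost), # match/substitute
            )
    return dist[-1][-1]


def rank_targets(query_pitches, database):
    """Return (name, cost) pairs for all targets, most similar first."""
    query = to_intervals(query_pitches)
    scored = [
        (name, edit_cost(query, to_intervals(pitches)))
        for name, pitches in database.items()
    ]
    return sorted(scored, key=lambda item: item[1])
```

Working on intervals rather than absolute pitches makes the comparison independent of the key the user happens to sing in: a transposed but otherwise exact query has edit cost 0 against its target.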
A Probabilistic Expert System for Automatic Musical Accompaniment
Journal of Computational and Graphical Statistics, 1999
Cited by 23 (5 self)
A methodology is presented that allows a computer to play the role of musical accompanist in a non-improvised musical composition for soloist and accompaniment. The modeling of the accompaniment incorporates a number of distinct knowledge sources including timing information extracted in real-time from the soloist's acoustic signal, an understanding of the soloist's interpretation learned from rehearsals, and prior knowledge that guides the accompaniment toward musically plausible renditions. The solo and accompaniment parts are represented collectively as a large number of Gaussian random variables with a specified conditional independence structure --- a Bayesian Belief Network. Within this framework a principled and computationally feasible method for generating real-time accompaniment is presented that incorporates the relevant knowledge sources. The EM algorithm is used to adapt the accompaniment to the soloist's interpretation through a series of rehearsals. A demonstration is provided from J.S. Bach's Cantata 12.
Score Following: State of the Art and New Developments
In New Interfaces for Musical Expression (NIME), 2003
Cited by 23 (7 self)
Score following is the synchronisation of a computer with a performer playing a known musical score. It now has a history of about twenty years as a research and musical topic, and is an ongoing project at Ircam. We present an overview of existing and historical score following systems, followed by fundamental definitions and terminology, and considerations about score formats, evaluation of score followers, and training. The score follower that we developed at Ircam is based on a Hidden Markov Model and on the modeling of the expected signal received from the performer. The model has been implemented in an audio and a Midi version, and is now being used in production. We report here our first experiences and our first steps towards a complete evaluation of system performances. Finally, we indicate directions in which score following can go beyond the artistic applications known today.

