A Robust Mid-level Representation for Harmonic Content in Music Signals
, 2005
Cited by 53 (5 self)
When considering the problem of audio-to-audio matching, determining musical similarity using low-level features such as Fourier transforms and MFCCs is an extremely difficult task, as there is little semantic information available. Full semantic transcription of audio is an unreliable and imperfect task in the best case, an unsolved problem in the worst. To this end we propose a robust mid-level representation that incorporates both harmonic and rhythmic information, without attempting full transcription. We describe a process for creating this representation automatically, directly from multi-timbral and polyphonic music signals, with an emphasis on popular music. We also offer various evaluations of our techniques. More so than most approaches working from raw audio, we incorporate musical knowledge into our assumptions, our models, and our processes. Our hope is that by utilizing this notion of a musically-motivated mid-level representation we may help bridge the gap between symbolic and audio research.
Polyphonic Audio Matching and Alignment for Music Retrieval
- in Proc. IEEE WASPAA
, 2003
Cited by 52 (5 self)
We describe a method that aligns polyphonic audio recordings of music to symbolic score information in standard MIDI files without the difficult process of polyphonic transcription. By using this method, we can search through the MIDI database to find the MIDI file corresponding to a polyphonic audio recording.
Polyphonic Music Modeling with Random Fields
- MM'03
, 2003
Cited by 14 (1 self)
Recent interest in the area of music information retrieval and related technologies is exploding. However, very few of the existing techniques take advantage of recent developments in statistical modeling. In this paper we discuss an application of Random Fields to the problem of creating accurate yet flexible statistical models of polyphonic music. With such models in hand, the challenges of developing effective searching, browsing and organization techniques for the growing bodies of music collections may be successfully met. We offer an evaluation of these models in terms of perplexity and prediction accuracy, and show that random fields not only outperform Markov chains, but are much more robust in terms of overfitting.
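As a rough illustration of the perplexity measure mentioned in this abstract, the sketch below scores a first-order Markov chain (the baseline the abstract compares against, not the random field model itself) on a sequence of symbols. The function names, the add-alpha smoothing, and the use of integer symbols such as pitch classes are assumptions made for this example and are not taken from the paper.

```python
import math
from collections import Counter


def train_bigram(seq, vocab_size, alpha=1.0):
    """Return a smoothed first-order Markov model P(next | prev).

    alpha is an add-alpha smoothing constant (an assumption of this sketch).
    """
    pair_counts = Counter(zip(seq, seq[1:]))
    context_counts = Counter(seq[:-1])

    def prob(prev, nxt):
        return (pair_counts[(prev, nxt)] + alpha) / (
            context_counts[prev] + alpha * vocab_size
        )

    return prob


def perplexity(prob, seq):
    """Perplexity of seq under the model: exp of the mean negative log-likelihood."""
    n = len(seq) - 1
    if n <= 0:
        raise ValueError("need at least two symbols to score a transition")
    log_sum = sum(math.log(prob(p, q)) for p, q in zip(seq, seq[1:]))
    return math.exp(-log_sum / n)
```

A lower perplexity on held-out data means the model predicts the next symbol better; a model that knows nothing about a 12-symbol vocabulary scores exactly 12.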
A fast, randomised, maximal subset matching algorithm for document-level music retrieval
- Ministry of Energy, Telecommunications and Posts
, 2006
Cited by 8 (3 self)
We present MSM, a new maximal subset matching algorithm, for MIR at score level with polyphonic texts and patterns. First, we argue that the problem MSM and its ancestors, the SIA family of algorithms, solve is 3SUM-hard and, therefore, subquadratic solutions must involve approximation. MSM is such a solution; we describe it, and argue that, at O(n log n) time with no large constants, it is orders of magnitude more time-efficient than its closest competitor. We also evaluate MSM’s performance on a retrieval problem addressed by the OMRAS project, and show that it outperforms OMRAS on this task by a considerable margin.
Simac: Semantic interaction with music audio contents
- Journal of Intelligent Information Systems (accepted)
, 2005
Searching Musical Audio Using Symbolic Queries
Cited by 6 (2 self)
Finding a piece of music based on its content is a key problem in music information retrieval. For example, a user may be interested in finding music based on knowledge of only a small fragment of the overall tune. In this paper, we consider the searching of musical audio using symbolic queries. We first propose a relative pitch approach for representing queries and pieces. Experiments show that this technique, while effective, works best when the whole tune is used as a query. We then present an algorithm for matching based on a pitch classes approach, using the longest common subsequence between a query and target. Experimental evaluation shows that our technique is highly effective, with a mean average precision of 0.77 on a collection of 1808 recordings. Significantly, our technique is robust for truncated queries, being able to maintain effectiveness and to retrieve correct answers whether the query fragment is taken from the beginning, middle, or end of a piece. This represents a significant reduction in the burden placed on users when formulating queries.
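The pitch-class matching described in this abstract can be sketched roughly as follows. This is a minimal illustration only: it assumes notes are given as MIDI pitch numbers, reduces them to pitch classes with a modulo-12 operation, and scores a query by the length of its longest common subsequence with the target, normalised by query length. The function names and the normalisation are hypothetical and are not the authors' implementation.

```python
def pitch_classes(midi_pitches):
    """Map MIDI pitch numbers to pitch classes 0..11 (octave information is dropped)."""
    return [p % 12 for p in midi_pitches]


def lcs_length(a, b):
    """Length of the longest common subsequence of sequences a and b.

    Standard dynamic programme using two rows, O(len(a) * len(b)) time.
    """
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_score(query_pitches, target_pitches):
    """Fraction of the query's pitch classes matched, in order, in the target."""
    query = pitch_classes(query_pitches)
    target = pitch_classes(target_pitches)
    if not query:
        return 0.0
    return lcs_length(query, target) / len(query)
```

Because a subsequence may skip elements of the target, extra notes in the target do not lower the score, which is one reason this kind of measure tolerates noisy transcriptions.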
Interdisciplinary research issues in music information retrieval: ISMIR 2000–2002
- Journal of New Music Research
, 2003
Cited by 5 (1 self)
Music Information Retrieval (MIR) is an interdisciplinary research area that has grown out of the need to manage burgeoning collections of music in digital form. Its diverse disciplinary communities, exemplified by the recently established ISMIR conference series, have yet to articulate a common research agenda or agree on methodological principles and metrics of success. In order for MIR to succeed, researchers need to work with real user communities and develop research resources such as reference music collections, so that the wide variety of techniques being developed in MIR can be meaningfully compared with one another. Out of these efforts, a common MIR practice can emerge.
Effective retrieval of polyphonic audio with polyphonic symbolic queries
- Proceedings of the 9th ACM SIGMM International Workshop on Multimedia Information Retrieval
, 2007
Cited by 3 (2 self)
Accurately finding audio recordings in response to symbolic queries is one of the key challenges in the field of music information retrieval. Pitch is one of the main features of music; in this paper we propose and evaluate approaches for using pitch information in polyphonic symbolic queries to retrieve full tracks of audio recordings. The audio data is first converted into symbolic data, using an automated transcription process. This is a noisy process, adding up to three times as many notes to the transcription as are actually present. Nevertheless, recordings can be accurately retrieved by manually-constructed queries (either in full or truncated) using the longest common subsequence algorithm (and a sliding window if the queries are truncated). Precision at 1 of about 80% was achieved, and around 85% of queries return correct answers in the top 10 from a collection of 1808 recordings. Truncated queries are as effective as untruncated queries for retrieving correct answers in the first rank position. Thus, the burden on users is reduced as they only need to produce a small fraction of a song as a query.
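One way to combine the longest common subsequence with a sliding window for truncated queries is sketched below. It is an assumption-laden illustration, not the paper's method: the window is taken to be a fixed multiple of the query length (the hypothetical `window_factor` parameter, loosely motivated by the up-to-threefold note inflation mentioned above), the window advances by `hop` notes, and the best normalised score over all windows is returned.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of sequences a and b."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def sliding_window_lcs(query, target, window_factor=3, hop=1):
    """Best LCS match of a short query against any window of the target.

    window_factor and hop are hypothetical tuning parameters of this sketch.
    Returns the matched fraction of the query, between 0.0 and 1.0.
    """
    if not query or not target:
        return 0.0
    width = max(1, window_factor * len(query))
    # If the target is shorter than one window, a single window covers it all.
    last_start = max(0, len(target) - width)
    best = 0
    for start in range(0, last_start + 1, hop):
        window = target[start:start + width]
        best = max(best, lcs_length(query, window))
    return best / len(query)
```

Restricting the match to a window keeps a short query from being matched against notes scattered across an entire noisy transcription, at the cost of one LCS computation per window position.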

