Results 1 - 10 of 16
Harmonic/percussive separation using median filtering
In Proc. of the 13th Int. Conference on Digital Audio Effects (DAFx-10), 2010
"... ABSTRACT In this paper, we present a fast, simple and effective method to separate the harmonic and percussive parts of a monaural audio signal. The technique involves the use of median filtering on a spectrogram of the audio signal, with median filtering performed across successive frames to suppr ..."
Cited by 39 (1 self)
Abstract: In this paper, we present a fast, simple and effective method to separate the harmonic and percussive parts of a monaural audio signal. The technique involves the use of median filtering on a spectrogram of the audio signal, with median filtering performed across successive frames to suppress percussive events and enhance harmonic components, while median filtering is also performed across frequency bins to enhance percussive events and suppress harmonic components. The two resulting median-filtered spectrograms are then used to generate masks that are applied to the original spectrogram to separate the harmonic and percussive parts of the signal. We illustrate the use of the algorithm in the context of remixing audio material from commercial recordings.
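As a rough illustration of the technique, the sketch below applies the two median filters and builds Wiener-style soft masks with numpy/scipy; the input file name, kernel sizes, and mask exponent are illustrative choices, not the paper's exact settings.

```python
import numpy as np
import scipy.signal
import librosa

# Load a monaural mixture and compute its complex spectrogram.
y, sr = librosa.load("mix.wav")           # hypothetical input file
S = librosa.stft(y)
mag = np.abs(S)

# Median filter across time frames: suppresses percussive events,
# leaving a harmonic-enhanced spectrogram.
H = scipy.signal.medfilt2d(mag, kernel_size=(1, 17))
# Median filter across frequency bins: suppresses harmonic partials,
# leaving a percussive-enhanced spectrogram.
P = scipy.signal.medfilt2d(mag, kernel_size=(17, 1))

# Turn the two filtered spectrograms into soft masks, apply them to the
# original spectrogram, then invert back to the time domain.
eps = 1e-10
mask_h = H**2 / (H**2 + P**2 + eps)
mask_p = P**2 / (H**2 + P**2 + eps)
y_harmonic = librosa.istft(S * mask_h)
y_percussive = librosa.istft(S * mask_p)
```

librosa's librosa.decompose.hpss implements this median-filtering approach directly, so the hand-rolled version above is only for exposition.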
ACTIVE MUSIC LISTENING INTERFACES BASED ON SIGNAL PROCESSING
2007
"... This paper introduces our research aimed at building “active music listening interfaces”. This research approach is intended to enrich end-users’ music listening experiences by applying music-understanding technologies based on signal processing. Active music listening is a way of listening to music ..."
Cited by 28 (12 self)
Abstract: This paper introduces our research aimed at building “active music listening interfaces”. This research approach is intended to enrich end-users’ music listening experiences by applying music-understanding technologies based on signal processing. Active music listening is a way of listening to music through active interactions. We have developed seven interfaces for active music listening, such as interfaces for skipping sections of no interest within a musical piece while viewing a graphical overview of the entire song structure, for displaying virtual dancers or song lyrics synchronized with the music, for changing the timbre of instrument sounds in compact-disc recordings, and for browsing a large music collection to encounter interesting musical pieces or artists. These interfaces demonstrate the importance of music-understanding technologies and the benefit they offer to end users. Our hope is that this work will help change music listening into a more active, immersive experience.
Drum Sound Detection in Polyphonic Music with Hidden Markov Models
2009
"... This paper proposes a method for transcribing drums from polyphonic music using a network of connected hidden Markov models (HMMs). The task is to detect the temporal locations of unpitched percussive sounds (such as bass drum or hi-hat) and recognise the instruments played. Contrary to many earlier ..."
Cited by 6 (0 self)
Abstract: This paper proposes a method for transcribing drums from polyphonic music using a network of connected hidden Markov models (HMMs). The task is to detect the temporal locations of unpitched percussive sounds (such as bass drum or hi-hat) and recognise the instruments played. Contrary to many earlier methods, a separate sound event segmentation is not done, but connected HMMs are used to perform the segmentation and recognition jointly. Two ways of using HMMs are studied: modelling combinations of the target drums and a detector-like modelling of each target drum. Acoustic feature parametrisation is done with mel-frequency cepstral coefficients and their first-order temporal derivatives. The effect of lowering the feature dimensionality with principal component analysis and linear discriminant analysis is evaluated. Unsupervised acoustic model parameter adaptation with maximum likelihood linear regression is evaluated for compensating the differences between the training and target signals. The performance of the proposed method is evaluated on a publicly available data set containing signals with and without accompaniment, and compared with two reference methods. The results suggest that the transcription is possible using connected HMMs, and that using detector-like models for each target drum provides a better performance than modelling drum combinations.
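A minimal sketch of the acoustic front end described above, assuming librosa, scikit-learn, and hmmlearn are available: MFCCs with first-order temporal derivatives, optional PCA reduction, and a small Gaussian HMM used detector-style for one target drum. The file name, frame settings, dimensionalities, and state count are illustrative, not the paper's exact configuration.

```python
import numpy as np
import librosa
from sklearn.decomposition import PCA
from hmmlearn.hmm import GaussianHMM

y, sr = librosa.load("drums_mix.wav")                # hypothetical input file
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)   # (13, n_frames)
delta = librosa.feature.delta(mfcc)                  # first-order derivatives
feats = np.vstack([mfcc, delta]).T                   # (n_frames, 26)

# Optional dimensionality reduction, one of the options evaluated in the paper.
feats = PCA(n_components=10).fit_transform(feats)

# Detector-style modelling: one small HMM per target drum.
# In practice the model is trained on labelled segments, not the test signal.
model = GaussianHMM(n_components=3, covariance_type="diag")
model.fit(feats)
states = model.predict(feats)                        # per-frame state sequence
```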
USING TENSOR FACTORISATION MODELS TO SEPARATE DRUMS FROM POLYPHONIC MUSIC
"... This paper describes the use of Non-negative Tensor Factorisation models for the separation of drums from polyphonic audio. Im-proved separation of the drums is achieved through the incorpo-ration of Gamma Chain priors into the Non-negative Tensor Fac-torisation framework. In contrast to many previo ..."
Cited by 5 (1 self)
Abstract: This paper describes the use of Non-negative Tensor Factorisation models for the separation of drums from polyphonic audio. Improved separation of the drums is achieved through the incorporation of Gamma Chain priors into the Non-negative Tensor Factorisation framework. In contrast to many previous approaches, the method used in this paper requires little or no pre-training or use of drum templates. The utility of the technique is shown on real-world audio examples.
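The paper's model (Non-negative Tensor Factorisation with Gamma Chain priors) is more elaborate than anything shown here; as a simplified stand-in, the sketch below separates a drum estimate from a mixture with plain NMF and a soft mask. The input file, component count, and the choice of which components count as "percussive" are hypothetical.

```python
import numpy as np
import librosa
from sklearn.decomposition import NMF

y, sr = librosa.load("mix.wav")                # hypothetical input file
S = librosa.stft(y)
V = np.abs(S)                                  # magnitude spectrogram

# Factorise the spectrogram into spectral templates W and activations H.
nmf = NMF(n_components=8, init="nndsvd", max_iter=400)
W = nmf.fit_transform(V)                       # (n_freq, 8)
H = nmf.components_                            # (8, n_frames)

# Rebuild a drum-only estimate from components judged percussive
# (selected by hand here; the paper's priors make this automatic).
drum_idx = [0, 3]                              # hypothetical choice
V_drums = W[:, drum_idx] @ H[drum_idx, :]
mask = V_drums / (W @ H + 1e-10)
y_drums = librosa.istft(S * mask)
```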
An error correction framework based on drum pattern periodicity for improving drum sound detection
In Proc. ICASSP'06, 2006
"... This paper presents a framework for correcting errors of automatic drum sound detection focusing on the periodicity of drum patterns. We define drum patterns as periodic structures found in onset sequences of bass and snare drum sounds. Our framework extracts periodic drum patterns from imperfect on ..."
Cited by 4 (0 self)
Abstract: This paper presents a framework for correcting errors of automatic drum sound detection, focusing on the periodicity of drum patterns. We define drum patterns as periodic structures found in onset sequences of bass and snare drum sounds. Our framework extracts periodic drum patterns from imperfect onset sequences of detected drum sounds (bottom-up processing) and corrects errors using the periodicity of the drum patterns (top-down processing). We implemented this framework on our drum-sound detection system. We first obtained onset sequences of the drum sounds with our system and extracted drum patterns. On the basis of our observation that the same drum patterns tend to be repeated, we detected time points that deviate from the periodicity as error candidates. Finally, we verified each error candidate to judge whether it is an actual onset or not. Experiments on drum sound detection for polyphonic audio signals of popular CD recordings showed that our correction framework improved the average detection accuracy from 77.4% to 80.7%.
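A minimal sketch of the top-down stage, assuming a drum-pattern period has already been extracted: onsets that deviate from the periodic grid by more than a tolerance become error candidates. The period, tolerance, and onset list are invented for illustration; the paper's pattern extraction and verification stages are not shown.

```python
import numpy as np

def error_candidates(onsets, period, tol=0.05):
    """Return onset times (s) that sit more than tol away from the periodic grid."""
    phase = np.mod(onsets, period)
    deviation = np.minimum(phase, period - phase)  # distance to nearest grid point
    return onsets[deviation > tol]

# Detected bass-drum onsets, one of which (1.37 s) is off the 0.5 s grid.
onsets = np.array([0.00, 0.50, 1.00, 1.37, 2.00, 2.50])
print(error_candidates(onsets, period=0.5))        # -> [1.37]
```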
SIMULTANEOUS PROCESSING OF SOUND SOURCE SEPARATION AND MUSICAL INSTRUMENT IDENTIFICATION USING BAYESIAN SPECTRAL MODELING
"... This paper presents a method of both separating audio mixtures into sound sources and identifying the musical instruments of the sources. A statistical tone model of the power spectrogram, called an integrated model, is defined and source separation and instrument identification are carried out on t ..."
Cited by 3 (1 self)
Abstract: This paper presents a method of both separating audio mixtures into sound sources and identifying the musical instruments of the sources. A statistical tone model of the power spectrogram, called an integrated model, is defined, and source separation and instrument identification are carried out on the basis of Bayesian inference. Since the parameter distributions of the integrated model depend on the instrument, the instrument name is identified by selecting the one with the maximum relative instrument weight. Experimental results showed that correct instrument identification enables precise source separation even when many overtones overlap. Index Terms: source separation, instrument identification, Bayesian methods, spectrogram
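A minimal sketch of just the identification step, assuming Bayesian inference has already produced a relative weight per candidate instrument for a separated source; the instrument list and weight values are hypothetical placeholders for the inferred quantities.

```python
import numpy as np

instruments = ["piano", "violin", "flute"]    # hypothetical candidates
weights = np.array([0.12, 0.71, 0.17])        # inferred relative instrument weights
print(instruments[int(np.argmax(weights))])   # -> violin
```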
Music listening in the future: Augmented Music Understanding Interfaces and Crowd Music Listening
In Proc. of the AES 42nd International Conf. on Semantic Audio, 2011
"... Correspondence should be addressed to Masataka Goto (m.goto[at]aist.go.jp) In the future, music listening can be more active, more immersive, richer, and deeper by using automatic music-understanding technologies (semantic audio analysis). In the rst half of this invited talk, four Aug-mented Music- ..."
Cited by 3 (1 self)
Abstract: In the future, music listening can be more active, more immersive, richer, and deeper by using automatic music-understanding technologies (semantic audio analysis). In the first half of this invited talk, four Augmented Music-Understanding Interfaces that facilitate deeper understanding of music are introduced. In our interfaces, visualization of music content and music touch-up (customization) play important roles in augmenting people's understanding of music because understanding is deepened through seeing and editing. In the second half, a new style of music listening called Crowd Music Listening is discussed. By posting, sharing, and watching time-synchronous comments (semantic information), listeners can enjoy music together with the crowd. Such Internet-based music listening with shared semantic information also helps music understanding because understanding is deepened through communication. Two systems that deal with new trends in music listening (time-synchronous comments and mashup music videos) are finally introduced.
Automatic Transcription of Pitch Content in Music and Selected Applications
"... Transcription of music refers to the analysis of a music signal in order to produce a parametric representation of the sounding notes in the signal. This is conventionally carried out by listening to a piece of music and writing down the symbols of common musical notation to represent the occurring ..."
Cited by 2 (0 self)
Abstract: Transcription of music refers to the analysis of a music signal in order to produce a parametric representation of the sounding notes in the signal. This is conventionally carried out by listening to a piece of music and writing down the symbols of common musical notation to represent the occurring notes in the piece. Automatic transcription of music refers to the extraction of such representations using signal-processing methods. This thesis concerns the automatic transcription of pitched notes in musical audio and its applications. Emphasis is laid on the transcription of realistic polyphonic music, where multiple pitched and percussive instruments sound simultaneously. The methods included in this thesis are based on a framework that combines low-level acoustic modeling and high-level musicological modeling. The emphasis in the acoustic modeling is on note events, so that the methods produce discrete-pitch notes with onset times and durations.
DRUM TRANSCRIPTION USING PARTIALLY FIXED NON-NEGATIVE MATRIX FACTORIZATION
"... In this paper, a drum transcription algorithm using partially fixed non-negative matrix factorization is presented. The pro-posed method allows users to identify percussive events in complex mixtures with a minimal training set. The algorithm decomposes the music signal into two parts: percussive pa ..."
Cited by 2 (0 self)
Abstract: In this paper, a drum transcription algorithm using partially fixed non-negative matrix factorization is presented. The proposed method allows users to identify percussive events in complex mixtures with a minimal training set. The algorithm decomposes the music signal into two parts: a percussive part with pre-defined drum templates and a harmonic part with undefined entries. The harmonic part is able to adapt to the music content, allowing the algorithm to work in polyphonic mixtures. Drum event times can simply be picked from the percussive activation matrix with onset detection. The system is efficient and robust even with a minimal training set. The recognition rates for the ENST dataset vary from 56.7% to 78.9% for three percussive instruments extracted from polyphonic music.
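A sketch of the partially fixed factorization under stated assumptions: the leading columns of W hold pre-defined drum templates and stay fixed, while the remaining "harmonic" columns and all of H adapt to the music via multiplicative KL-divergence updates on a magnitude spectrogram. Template source, component counts, and iteration count are illustrative.

```python
import numpy as np

def partially_fixed_nmf(V, W_drums, n_free=10, n_iter=200, eps=1e-10):
    """V: (n_freq, n_frames) magnitude spectrogram; W_drums: fixed drum templates."""
    n_freq, _ = V.shape
    rng = np.random.default_rng(0)
    W = np.hstack([W_drums, rng.random((n_freq, n_free))])  # fixed + free columns
    H = rng.random((W.shape[1], V.shape[1]))
    n_fixed = W_drums.shape[1]
    for _ in range(n_iter):
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / W.sum(axis=0)[:, None]      # KL update for H
        WH = W @ H + eps
        W_new = W * ((V / WH) @ H.T) / H.sum(axis=1)[None, :]
        W[:, n_fixed:] = W_new[:, n_fixed:]                 # drum templates stay fixed
    return W, H

# Drum event times can then be picked from the first n_fixed rows of H
# with any standard onset (peak) detector.
```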
AN OPEN-SOURCE DRUM TRANSCRIPTION SYSTEM FOR PURE DATA AND MAX MSP
"... This paper presents a drum transcription algorithm adjusted to the constraints of real-time audio. We introduce an instance filtering (IF) method using sub-band onset detection, which improves the performance of a system having at its core a feature-based K-nearest neighbor classifier (KNN). The ar- ..."
Cited by 2 (1 self)
Abstract: This paper presents a drum transcription algorithm adjusted to the constraints of real-time audio. We introduce an instance filtering (IF) method using sub-band onset detection, which improves the performance of a system having at its core a feature-based K-nearest neighbor (KNN) classifier. The proposed architecture allows different parts of the algorithm to be adapted for either bass drum, snare drum, or hi-hat cymbals. The open-source system is implemented in the graphical programming languages Pure Data (PD) and Max MSP, and aims to work with a large variety of drum sets. We evaluated its performance on a database of audio samples generated from a well-known collection of MIDI drum loops randomly matched with a diverse collection of drum sets. Both of the evaluation stages, testing and validation, show a significant improvement in performance when using the instance filtering algorithm. Index Terms: drum transcription, feature-based classification, real-time audio, Pure Data, Max MSP
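The system itself runs in Pure Data / Max MSP; the offline Python analogue below sketches the two stages named above: onset detection restricted to a low sub-band (instance filtering for the bass drum) followed by KNN classification of the surviving instances on MFCC features. The band edge, feature choice, and k are illustrative assumptions.

```python
import numpy as np
import librosa
from sklearn.neighbors import KNeighborsClassifier

y, sr = librosa.load("drum_loop.wav")               # hypothetical input file

# Instance filtering: onset detection on a low-frequency sub-band only.
S = np.abs(librosa.stft(y))
freqs = librosa.fft_frequencies(sr=sr)
low_band = S[freqs < 150, :]                        # hypothetical 0-150 Hz band
env = librosa.onset.onset_strength(S=librosa.amplitude_to_db(low_band), sr=sr)
frames = librosa.onset.onset_detect(onset_envelope=env, sr=sr)

# Feature-based classification: one MFCC vector per surviving instance.
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
X = mfcc[:, frames].T

knn = KNeighborsClassifier(n_neighbors=5)
# knn.fit(X_train, y_train)   # trained beforehand on labelled drum hits
# labels = knn.predict(X)
```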