Analysis Of Reassigned Spectrograms For Musical Transcription
- In IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, Mohonk, 2001
Cited by 8 (4 self)
The reassignment method for the short-time Fourier transform is proposed as a technique for improving the time and frequency estimates of musical audio data. Based on this representation, four classes of expected objects (sinusoid, unresolved sinusoid, transient and noise) are proposed and explained. Pattern classification methods are then used to extract objects conforming to these classes from individual frames of the reassigned spectrogram, with each frame being examined independently. Results for several simple real-world examples are presented, showing the capability of this method even without the aid of tracking from frame to frame. The main benefits of the proposed reassignment stage are that it yields an improved time-frequency localisation estimate relative to standard methods, and that it produces a measure of the variance of these estimates to be used as an aid in later processing.
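The frequency part of the reassignment step described in this abstract can be illustrated with a minimal sketch. It assumes the commonly used formulation in which each STFT bin frequency is corrected by the imaginary part of the ratio between a spectrum computed with the window's time derivative and a spectrum computed with the window itself. The function name `reassigned_peak_frequency`, the Hann window, and the test-tone parameters are illustrative assumptions, not details from the paper; time reassignment and the variance measure are not shown.

```python
import numpy as np

def reassigned_peak_frequency(x, fs):
    """Return (bin_frequency, reassigned_frequency) in Hz for the strongest
    bin of a single frame. Hypothetical helper, for illustration only."""
    N = len(x)
    # Sample positions centred on the middle of the frame.
    n = np.arange(N) - (N - 1) / 2.0
    # Hann window and its analytic time derivative (per sample).
    h = 0.5 * (1.0 + np.cos(2.0 * np.pi * n / N))
    dh = -(np.pi / N) * np.sin(2.0 * np.pi * n / N)
    # Spectra of the frame under the window and under its derivative.
    X_h = np.fft.rfft(x * h)
    X_dh = np.fft.rfft(x * dh)
    # Strongest bin and its nominal frequency in rad/sample.
    k = int(np.argmax(np.abs(X_h)))
    omega_k = 2.0 * np.pi * k / N
    # Reassigned frequency: bin frequency minus Im(X_dh / X_h).
    omega_hat = omega_k - np.imag(X_dh[k] / X_h[k])
    to_hz = fs / (2.0 * np.pi)
    return omega_k * to_hz, omega_hat * to_hz

# Assumed test tone: 1003 Hz sampled at 8 kHz, one 1024-sample frame.
fs = 8000.0
f0 = 1003.0
t = np.arange(1024) / fs
f_bin, f_hat = reassigned_peak_frequency(np.cos(2.0 * np.pi * f0 * t), fs)
print(f_bin, f_hat)
```

With these assumed parameters the bin spacing is about 7.8 Hz, so the plain bin frequency can be off by several hertz, while the reassigned value should land much closer to the true tone frequency.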
Estimating Partial Frequency and Frequency Slope Using Reassignment Operators
- in Proc. of the International Computer Music Conference (ICMC’02), 2002
Cited by 5 (1 self)
Estimating the frequency slope of a partial from its peak in the DFT spectrum is currently possible only if a Gaussian window is used. We derive a new method, based on the reassignment operators, for estimating the frequency slope of a partial from its DFT spectral peak. Compared with the Gaussian-window method, the new method can be used with a much larger variety of windows and often achieves better accuracy at equal resolution. After a short introduction to the reassignment method, we present a short analytical derivation of the method and investigate how its analysis properties depend on the window properties. Based on this derivation, we explain the basic requirements that a window must meet to achieve high-accuracy estimates of frequency and frequency slope.
Analysis of Musical Audio for Polyphonic Transcription - 1st Year Report
- 2001
Cited by 4 (2 self)
This report centres on some of the issues involved in automatic transcription of polyphonic musical audio signals, that is, representing the information contained in the audio in a form that a musician can recognise and use. First, the fields that bear on the subject are reviewed, including music, music psychology, auditory psychology and signal processing. Then a thorough appraisal of previous work on automated polyphonic transcription is presented. Next, original work on the use of time-frequency reassignment as a front end is described, and finally ideas for future work are set out and a timetable for forthcoming research is given.
The Timbre Model
- in Workshop on Current Research Directions in Computer Music, 2001
Cited by 2 (0 self)
This paper presents the timbre model, a signal model which has been built to better understand the relationship between the perception of timbre and the musical sounds most commonly associated with timbre. In addition, an extension to the timbre model incorporating expressions is introduced. The work therefore relates to a broad range of fields, including auditory perception, signal processing, physical models and the acoustics of musical instruments, music expression, and other computer music research. The timbre model is based on a sinusoidal model, and it consists of a spectral envelope, frequencies, a temporal envelope and different irregularity parameters. The paper is divided into four parts: an overview of the research done on the perception of timbre, an overview of the signal processing aspects of sinusoidal modeling, the timbre model, and an introduction to some expressive extensions to the timbre model.
Adaptive additive modeling with continuous parameter trajectories
Cited by 1 (0 self)
This article investigates the estimation of time-varying amplitude and phase trajectories of sinusoidal signal components. The new algorithm adaptively optimizes the parameters of a smoothly connected piecewise polynomial trajectory model. A mathematical analysis is presented that relates the user-selected meta-parameters of the trajectory model (polynomial order, segment size, and smoothness at the junctions) to the analysis properties of the adaptive algorithm. It reveals new insights into the relationships between the meta-parameters and the resulting time/frequency resolution of the estimate. Moreover, it is shown that for efficient optimization the phase trajectory needs to be represented in a specific form. A new approach to addressing the bias/variance tradeoff of the polynomial phase trajectory model by means of regularization is presented, and a complete adaptive analysis/synthesis system for sinusoidal sound components is proposed. The adaptive analysis system is investigated by means of simple tracking experiments that demonstrate the effect of the smoothness constraints and compare the results with a standard STFT-based frequency estimation technique and the known Cramér-Rao bounds. The potential of the adaptive strategy for modeling sinusoidal transients is discussed, and it is shown that it achieves transient quality similar to that of a previously proposed method, but with considerably lower model error. Two examples of modeling real-world signals are discussed.
Sound Morphing using Loris and the Reassigned Bandwidth-Enhanced Additive Sound Model: Practice and Applications
- 2002
The reassigned bandwidth-enhanced additive sound model is a high-fidelity representation that allows manipulations and transformations to be applied to a great variety of sounds, including noisy and nonharmonic sounds. Combining sinusoidal and noise energy in a homogeneous representation, the reassigned bandwidth-enhanced model is ideally suited to sound morphing, and is implemented in the open source software library, Loris. This paper presents methods for using Loris and the reassigned bandwidth-enhanced additive model to achieve high-fidelity sound representations and manipulations, and introduces new software tools allowing non-programmers to avail themselves of the sound modeling and manipulation capabilities of the Loris package.
Adaptive Additive Synthesis Using Spline Based Parameter Trajectory Models
- 2001
We present the results of an analytical study concerned with the frequency resolution of our adaptive additive synthesis model. First, we derive the relation between the characteristics of the piecewise polynomial parameter trajectories of the model and the frequency resolution that can be obtained by adapting the model using a minimum error objective. Second, we present an analytical investigation of the problem of modeling signal resonances beyond the frequency resolution of the model. Based on this analytical description, a new solution is proposed that leads to high-quality additive models of non-stationary sounds with dense resonances, e.g. choir or drum sounds, and provides increased robustness with respect to sound transformations.
Time-scale Modification using the Phase Vocoder
- 2001
The phase vocoder has been used as a time-scale-modification tool for several decades. When large positive modification factors are applied to different kinds of sounds (time-stretching), the result always sounds "phasy" or "reverberant". The sound quality can be improved by "locking" the phases. Phase-locking preserves the phase relations around a local maximum in the magnitude spectrum. For large modification factors, locking the entire phase spectrum sounds "rigid".
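The phase-locking idea in this abstract, keeping the phase relations around each local maximum of the magnitude spectrum, can be sketched for a single frame. This is a minimal illustration that assumes an identity-style locking rule in which every bin follows its nearest spectral peak. The function name `lock_phases_to_peaks`, the nearest-peak assignment, and the toy input are assumptions and are not taken from the paper.

```python
import numpy as np

def lock_phases_to_peaks(X, synth_phase):
    """Rebuild one frame so that each bin keeps its analysis phase offset
    relative to the nearest magnitude peak. Hypothetical helper.

    X           : complex analysis spectrum of the frame
    synth_phase : unlocked synthesis phase per bin (only peak values are used)
    """
    mag = np.abs(X)
    ana_phase = np.angle(X)
    # Local maxima of the magnitude spectrum (interior bins only).
    interior = np.arange(1, len(X) - 1)
    is_peak = (mag[1:-1] > mag[:-2]) & (mag[1:-1] >= mag[2:])
    peaks = interior[is_peak]
    if peaks.size == 0:
        # Nothing to lock to: fall back to the unlocked phases.
        return mag * np.exp(1j * synth_phase)
    # Assign every bin to its nearest peak (ties go to the lower peak).
    bins = np.arange(len(X))
    nearest = peaks[np.argmin(np.abs(bins[:, None] - peaks[None, :]), axis=1)]
    # Peak bins take the synthesis phase; neighbours keep their analysis
    # phase difference to that peak.
    locked = synth_phase[nearest] + ana_phase - ana_phase[nearest]
    return mag * np.exp(1j * locked)

# Toy frame with two peaks, at bins 2 and 6 (assumed values).
mag = np.array([0.1, 1.0, 3.0, 1.0, 0.5, 1.0, 4.0, 1.0, 0.1])
X = mag * np.exp(1j * np.linspace(0.0, 2.0, 9))
synth_phase = np.linspace(1.0, 5.0, 9)
Y = lock_phases_to_peaks(X, synth_phase)
```

In this sketch the magnitudes are left untouched, and only the phases at the peaks come from the synthesis side; all other bins inherit their relation to the peak from the analysis frame.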

