“Computational auditory induction as a missing-data model-fitting problem with Bregman divergence,” Speech Communication, vol
, 2010
Cited by 11 (2 self)
The human auditory system has the ability, known as auditory induction, to estimate the missing parts of a continuous auditory stream briefly covered by noise and perceptually resynthesize them. Humans are thus able to simultaneously analyze an auditory scene and reconstruct the underlying signal. In this article, we formulate this ability as a nonnegative matrix factorization (NMF) problem with unobserved data, and show how to solve it using an auxiliary function method. We explain how this method can also be generally related to the EM algorithm, enabling the use of prior distributions on the parameters. We show how sparseness is a key to global feature extraction, and that our method is ideally able to extract patterns which never occur completely. We finally illustrate on an example how our method is able to simultaneously analyze a scene and interpolate the gaps into it.
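The missing-data formulation in this abstract can be illustrated with a small sketch. It is not the authors' algorithm: it assumes a plain squared Euclidean error in place of the Bregman divergence, uses no sparseness prior, and relies on the standard masked multiplicative updates. The names `masked_nmf`, `observed_error`, and the binary mask `M` (1 = observed, 0 = missing) are hypothetical.

```python
import numpy as np

def masked_nmf(V, M, rank, n_iter=200, eps=1e-9, seed=0):
    """Fit V ~= W @ H using only observed entries (M == 1); M == 0 marks gaps."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    MV = M * V
    for _ in range(n_iter):
        # Multiplicative updates for the masked squared Euclidean error.
        # Missing entries are weighted by zero, so they never drive the fit.
        H *= (W.T @ MV) / (W.T @ (M * (W @ H)) + eps)
        W *= (MV @ H.T) / ((M * (W @ H)) @ H.T + eps)
    return W, H

def observed_error(V, M, W, H):
    """Squared error restricted to the observed entries."""
    return float(np.sum(M * (V - W @ H) ** 2))

# Toy nonnegative "spectrogram" of exact rank 2 with a hidden patch.
rng = np.random.default_rng(1)
V = rng.random((6, 2)) @ rng.random((2, 8))
M = np.ones_like(V)
M[2:4, 3:5] = 0.0  # assumed gap: two frequency rows in two frames

W1, H1 = masked_nmf(V, M, rank=2, n_iter=1)
W, H = masked_nmf(V, M, rank=2, n_iter=200)

# Keep observed values, fill the gap with the model's estimate.
V_filled = np.where(M == 1, V, W @ H)
print(observed_error(V, M, W1, H1), observed_error(V, M, W, H))
```

Note that a gap covering entire frames leaves the corresponding columns of `H` unconstrained in this plain form, which is one reason additional structure (such as the priors mentioned in the abstract) matters for interpolation.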
Consistent Wiener Filtering: Generalized Time-Frequency Masking Respecting Spectrogram Consistency
Cited by 10 (2 self)
Wiener filtering is one of the most widely used methods in audio source separation. It is often applied on time-frequency representations of signals, such as the short-time Fourier transform (STFT), to exploit their short-term stationarity, but so far the design of the Wiener time-frequency mask has not taken into account the need for the output spectrograms to be consistent, i.e., to correspond to the STFT of a time-domain signal. In this paper, we generalize the concept of Wiener filtering to time-frequency masks which can also involve manipulation of the phase, by formulating the problem as a consistency-constrained maximum-likelihood one. We present two methods to solve the problem, one looking for the optimal time-domain signal, the other promoting consistency through a penalty function directly in the time-frequency domain. We show through experimental evaluation that, both in oracle conditions and combined with spectral subtraction, our method outperforms classical Wiener filtering.
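For reference, the classical Wiener time-frequency mask that this abstract takes as its baseline can be sketched as follows. This is a minimal illustration under oracle conditions (the true source and noise power spectrograms are assumed known); the function name `wiener_mask` and the toy random arrays standing in for spectrograms are hypothetical. The point it shows is that the classical mask is real-valued and nonnegative, so the estimate keeps the mixture phase.

```python
import numpy as np

def wiener_mask(power_target, power_interf, eps=1e-12):
    """Classical Wiener gain per time-frequency bin: P_s / (P_s + P_n)."""
    return power_target / (power_target + power_interf + eps)

# Toy complex arrays standing in for STFT spectrograms (freq x time).
rng = np.random.default_rng(0)
shape = (5, 4)
S = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)          # target
N = 0.5 * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))  # interference
X = S + N                                                                 # mixture

# Oracle mask: built from the true power spectrograms (an assumption).
G = wiener_mask(np.abs(S) ** 2, np.abs(N) ** 2)
S_hat = G * X  # magnitudes are scaled, phases are those of the mixture

print(G.min(), G.max())
```

Because `G` only rescales magnitudes, nothing in this computation guarantees that `S_hat` is the STFT of any time-domain signal, which is the gap the consistency constraint addresses.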
Consistent Wiener Filtering for Audio Source Separation
, 2012
Cited by 8 (0 self)
Wiener filtering is one of the most ubiquitous tools in signal processing, in particular for signal denoising and source separation. In the context of audio, it is typically applied in the time-frequency domain by means of the short-time Fourier transform (STFT). Such processing generally does not take into account the relationship between STFT coefficients in different time-frequency bins due to the redundancy of the STFT, which we refer to as consistency. We propose to enforce this relationship in the design of the Wiener filter, either as a hard constraint or as a soft penalty. We derive two conjugate gradient algorithms for the computation of the filter coefficients and show improved audio source separation performance compared to the classical Wiener filter, both in oracle and in blind conditions.
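One common way to quantify consistency is to measure how much an array changes when it is inverted to a signal and transformed again. The sketch below assumes that measure; the abstract does not spell out the penalty actually used, and the helper names `stft`, `istft`, and `inconsistency`, as well as the window and hop choices, are hypothetical. A true STFT gives a residual near zero, while a masked one generally does not.

```python
import numpy as np

def stft(x, win, hop):
    """STFT as a (freq, time) array; frames start every `hop` samples."""
    n = len(win)
    starts = range(0, len(x) - n + 1, hop)
    frames = np.array([x[s:s + n] * win for s in starts])
    return np.fft.rfft(frames, axis=1).T

def istft(X, win, hop):
    """Least-squares inverse STFT: windowed overlap-add with normalization."""
    n = len(win)
    frames = np.fft.irfft(X.T, n=n, axis=1) * win
    length = hop * (X.shape[1] - 1) + n
    x = np.zeros(length)
    norm = np.zeros(length)
    for t, frame in enumerate(frames):
        x[t * hop:t * hop + n] += frame
        norm[t * hop:t * hop + n] += win ** 2
    return x / np.maximum(norm, 1e-12)

def inconsistency(X, win, hop):
    """Squared norm of STFT(iSTFT(X)) - X; near zero iff X is consistent."""
    R = stft(istft(X, win, hop), win, hop) - X
    return float(np.sum(np.abs(R) ** 2))

n, hop = 64, 16
win = np.hanning(n + 1)[:-1]            # periodic Hann window (assumed choice)
t = np.arange(hop * 40 + n)
x = np.sin(0.3 * t) + 0.5 * np.sin(1.1 * t)

X = stft(x, win, hop)                   # consistent by construction
rng = np.random.default_rng(0)
X_masked = X * (rng.random(X.shape) > 0.5)  # arbitrary binary mask

res_true = inconsistency(X, win, hop)
res_masked = inconsistency(X_masked, win, hop)
print(res_true, res_masked)
```

A soft-penalty approach could add a weighted term of this kind to the filtering objective, while a hard constraint would restrict the search to arrays with zero residual.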
Fast signal reconstruction from magnitude STFT spectrogram based on spectrogram consistency
 Proc. of International Conference on Digital Audio Effects (DAFx ’10)
, 2010
Cited by 4 (0 self)
The modification of magnitude spectrograms is at the core of many audio signal processing methods, from source separation to sound modification or noise canceling, and reconstructing a natural-sounding signal in such situations is thus a very important issue. This article presents recent theoretical and experimental developments on applying, to signal reconstruction from a modified magnitude spectrogram, the constraints that an array of complex numbers must satisfy to be a consistent short-time Fourier transform (STFT) spectrogram, i.e., to be the STFT spectrogram of an actual real-valued signal. We give further theoretical insights, present several potential variations on our previously introduced algorithm, investigate various techniques to speed up the signal reconstruction process, and present a thorough experimental comparison of the performance of all the considered algorithms.
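As a point of comparison, the classical Griffin-Lim style iteration for reconstructing a signal from a magnitude spectrogram can be sketched as below. This is the well-known baseline, not the fast algorithm the abstract refers to; the helper names, window, hop, and iteration counts are assumptions made for the illustration. Each iteration projects onto consistent spectrograms and then restores the target magnitude.

```python
import numpy as np

def stft(x, win, hop):
    """STFT as a (freq, time) array; frames start every `hop` samples."""
    n = len(win)
    starts = range(0, len(x) - n + 1, hop)
    frames = np.array([x[s:s + n] * win for s in starts])
    return np.fft.rfft(frames, axis=1).T

def istft(X, win, hop):
    """Least-squares inverse STFT: windowed overlap-add with normalization."""
    n = len(win)
    frames = np.fft.irfft(X.T, n=n, axis=1) * win
    length = hop * (X.shape[1] - 1) + n
    x = np.zeros(length)
    norm = np.zeros(length)
    for t, frame in enumerate(frames):
        x[t * hop:t * hop + n] += frame
        norm[t * hop:t * hop + n] += win ** 2
    return x / np.maximum(norm, 1e-12)

def griffin_lim(mag, win, hop, n_iter=50, seed=0):
    """Alternate between consistency and the target magnitude."""
    rng = np.random.default_rng(seed)
    X = mag * np.exp(2j * np.pi * rng.random(mag.shape))  # random initial phase
    for _ in range(n_iter):
        Y = stft(istft(X, win, hop), win, hop)  # nearest consistent spectrogram
        X = mag * np.exp(1j * np.angle(Y))      # keep its phase, restore magnitude
    return istft(X, win, hop)

def spectral_error(x, mag, win, hop):
    """Relative mismatch between |STFT(x)| and the target magnitude."""
    return float(np.linalg.norm(np.abs(stft(x, win, hop)) - mag) / np.linalg.norm(mag))

n, hop = 64, 16
win = np.hanning(n + 1)[:-1]            # periodic Hann window (assumed choice)
t = np.arange(hop * 40 + n)
x = np.sin(0.3 * t) + 0.5 * np.sin(1.1 * t)
mag = np.abs(stft(x, win, hop))         # phase is discarded here

x_1 = griffin_lim(mag, win, hop, n_iter=1)
x_50 = griffin_lim(mag, win, hop, n_iter=50)
err_1 = spectral_error(x_1, mag, win, hop)
err_50 = spectral_error(x_50, mag, win, hop)
print(err_1, err_50)
```

Every iteration here costs a full inverse and forward STFT, which is the kind of expense that motivates faster schemes working directly in the time-frequency domain.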
On the use of masking filters in sound source separation
 in Proc. of 15th International Conference on Digital Audio Effects
, 2012
Cited by 4 (2 self)
Phase spectrum prediction of audio signals
 in International Symposium on Communications, Control and Signal Processing (ISCCSP)
, 2012
Cited by 2 (0 self)
Modeling the phases of audio signals has received significantly less attention in comparison to the modeling of magnitudes. This paper proposes to use linear least squares and neural networks to predict phases from the neighboring points only in the phase spectrum. The simulation results show that there is a structure in the phase components which could be used in further analysis algorithms based on the phase spectrum.
Index Terms: STFT, phase spectrum prediction, phase unwrapping, linear least squares, neural networks
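The linear least squares idea can be illustrated on a deliberately simple case. The sketch assumes the "neighboring points" are the two previous frames of a single frequency bin and that the phase has already been unwrapped; the paper's actual neighborhood and features are not given in the abstract, and `fit_phase_predictor` is a hypothetical name. For a stationary sinusoid the unwrapped phase advances linearly, so the exact recursion is phi[t] = 2*phi[t-1] - phi[t-2].

```python
import numpy as np

def fit_phase_predictor(phase, order=2):
    """Least-squares weights predicting phase[t] from the previous `order` values."""
    n = len(phase)
    # Column k holds phase[t - 1 - k] for t = order .. n - 1.
    A = np.column_stack([phase[order - 1 - k:n - 1 - k] for k in range(order)])
    b = phase[order:]
    coef, *_ = np.linalg.lstsq(A, b, rcond=None)
    return coef

# Wrapped phase of one bin across frames for a stationary sinusoid
# (assumed phase advance of 0.7 rad per frame, initial phase 0.3 rad).
frames = np.arange(50)
wrapped = np.angle(np.exp(1j * (0.3 + 0.7 * frames)))
phase = np.unwrap(wrapped)

coef = fit_phase_predictor(phase, order=2)

# One-step-ahead prediction from the two most recent unwrapped phases.
next_phase = coef[0] * phase[-1] + coef[1] * phase[-2]
print(coef, next_phase)
```

The fitted weights recover the linear-extrapolation recursion, which is a small instance of the structure in the phase spectrum that the abstract points to.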
Nonnegative Matrix Factorization based Algorithms to cluster Frequency Basis Functions for Monaural Sound Source Separation
, 2013
A Sparse Auditory Envelope Representation with Iterative Reconstruction for Audio Coding
, 2011
Modern audio coding exploits the properties of the human auditory system to efficiently code speech and music signals. Perceptual domain coding is a branch of audio coding in which the signal is stored and transmitted as a set of parameters derived directly from the modeling of the human auditory system. Often, the perceptual representation is designed such that reconstruction can be achieved with limited resources, but this usually means that some perceptually irrelevant information is included. In this thesis, we investigate perceptual domain coding by using a representation designed to contain only the audible information, regardless of whether reconstruction can be performed efficiently. The perceptual representation we use is based on a multichannel basilar membrane model, where each channel is decomposed into envelope and carrier components. We assume that the information in the carrier is also present in the envelopes and therefore discard the carrier components. The envelope components are sparsified using a transmultiplexing masking model and form our basic
Consistent Wiener Filtering for Audio Source Separation (IEEE Signal Processing Letters)
, 2012
Wiener filtering is one of the most ubiquitous tools in signal processing, in particular for signal denoising and source separation. In the context of audio, it is typically applied in the time-frequency domain by means of the short-time Fourier transform (STFT). Such processing generally does not take into account the relationship between STFT coefficients in different time-frequency bins due to the redundancy of the STFT, which we refer to as consistency. We propose to enforce this relationship in the design of the Wiener filter, either as a hard constraint or as a soft penalty. We derive two conjugate gradient algorithms for the computation of the filter coefficients and show improved audio source separation performance compared to the classical Wiener filter, both in oracle and in blind conditions.
Phase-controlled sound transfer based on maximally-inconsistent spectrograms