## Denoising Source Separation

### Download Links

- [cogprints.ecs.soton.ac.uk]
- [cogprints.org]
- [jmlr.csail.mit.edu]
- [www.jmlr.org]
- [jmlr.org]
- DBLP

Citations: 30 (6 self)

### BibTeX

@MISC{Särelä_denoisingsource,

author = {Jaakko Särelä and Harri Valpola},

title = {Denoising Source Separation},

year = {}

}

### Abstract

A new algorithmic framework called denoising source separation (DSS) is introduced. The main benefit of this framework is that it allows for easy development of new source separation algorithms which are optimised for specific problems. In this framework, source separation algorithms are constructed around denoising procedures. The resulting algorithms can range from almost blind to highly specialised source separation algorithms. Both simple linear and more complex nonlinear or adaptive denoising schemes are considered. Some existing independent component analysis algorithms are reinterpreted within the DSS framework and new, robust blind source separation algorithms are suggested. Although DSS algorithms need not be explicitly based on objective functions, there is often an implicit objective function that is optimised. The exact relation between the denoising procedure and the objective function is derived and a useful approximation of the objective function is presented. In the experimental section, various DSS schemes are applied extensively to artificial data, to real magnetoencephalograms and to simulated CDMA mobile network signals. Finally, various extensions to the proposed DSS algorithms are considered. These include nonlinear observation mappings, hierarchical models and overcomplete, nonorthogonal feature spaces. With these extensions, DSS appears to have relevance to many existing models of neural information processing.
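To illustrate how a source separation algorithm can be built around a denoising procedure, here is a minimal sketch of a one-unit iteration: project the data onto the current vector, denoise the resulting source estimate, re-estimate the vector, and normalise. It is not taken from this page. The update steps follow a common reading of the DSS paper, and the names `moving_average` and `dss_one_unit`, the circular moving-average denoiser and the toy two-channel data are all assumptions made for illustration. The toy data are constructed so that they are already zero-mean and sphered.

```python
import math


def moving_average(s, half_width=1):
    """Circular moving-average low-pass filter, used here as the denoiser (assumed choice)."""
    n = len(s)
    width = 2 * half_width + 1
    return [
        sum(s[(t + k) % n] for k in range(-half_width, half_width + 1)) / width
        for t in range(n)
    ]


def dss_one_unit(X, denoise, n_iter=50):
    """One-unit DSS sketch. X is a list of channels (rows), assumed zero-mean and sphered."""
    dim, n = len(X), len(X[0])
    w = [1.0] + [0.0] * (dim - 1)  # arbitrary starting vector
    for _ in range(n_iter):
        # Source estimate: s = w^T X
        s = [sum(w[i] * X[i][t] for i in range(dim)) for t in range(n)]
        # Denoising step: s_plus = f(s)
        s_plus = denoise(s)
        # Re-estimate the separating vector: w_plus = X s_plus^T
        w_plus = [sum(X[i][t] * s_plus[t] for t in range(n)) for i in range(dim)]
        # Normalise to unit length
        norm = math.sqrt(sum(v * v for v in w_plus))
        w = [v / norm for v in w_plus]
    return w


# Toy data (hypothetical): one slow and one fast source, both unit variance
# and mutually uncorrelated, mixed by a rotation so the mixtures stay sphered.
T = 200
slow = [math.sqrt(2.0) * math.sin(2.0 * math.pi * t / 100.0) for t in range(T)]
fast = [1.0 if t % 2 == 0 else -1.0 for t in range(T)]
theta = 0.6
c, sn = math.cos(theta), math.sin(theta)
X = [
    [c * slow[t] - sn * fast[t] for t in range(T)],
    [sn * slow[t] + c * fast[t] for t in range(T)],
]

w = dss_one_unit(X, moving_average)
print(w)
```

With a low-pass denoiser the iteration should settle on the mixing direction of the slow source, here approximately (cos 0.6, sin 0.6). Swapping in a different `denoise` function is the only change needed to target a different kind of source.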

### Citations

8613 | Maximum Likelihood from Incomplete Data via the EM algorithm
- Dempster, Laird, et al.
- 1977
Citation Context ...ting several sources are reviewed in Sec. 2.4. Sec. 2.5 discusses a speedup technique called spectral shift. 2.1 One-Unit Algorithm for Source Separation. The EM algorithm (Dempster et al., 1977) is a method for performing maximum likelihood estimation when part of the data is missing. One way to perform EM estimation in the case of linear models is to assume that the missing data consists o...

1609 | Independent Component Analysis - Hyvärinen, Karhunen, et al. - 2001 |

975 | Emergence of Simple-Cell Receptive Field Properties by Learning a Sparse Code for Natural Images
- Olshausen, Field
- 1996
Citation Context ...r, important field of application is feature extraction. ICA has been used for example in the extraction of features from natural images, similar to those that are found in the primary visual cortex (Olshausen and Field, 1996). It is reasonable to consider DSS extensions that have been suggested in the field of feature extraction as well. For instance, until now we have only considered the extraction of multiple component...

830 | Optimization by Vector Space Methods
- Luenberger
- 1969
Citation Context ...to estimate several sources by iteratively applying the DSS algorithm several times. The convergence to previously extracted sources is prevented by making their eigenvalues zero: $w_{\mathrm{orth}} = w - A A^T w$ (Luenberger, 1969), where $A$ now contains the already estimated mixing vectors. Note that in this deflation scheme, it is possible to use different kinds of denoising procedures when the sources differ in characteristi...
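The deflation step quoted above, which removes from w its components along the already estimated mixing vectors (w minus A Aᵀ w), can be sketched as follows. The function name `deflate` and the list-of-columns representation of A are assumptions for illustration, and the columns of A are assumed to be orthonormal, as they would be for sphered data.

```python
def deflate(w, A_cols):
    """Return w - A A^T w, where A_cols holds the columns of A (assumed orthonormal)."""
    w_orth = list(w)
    for a in A_cols:
        # Component of the original w along this already-estimated direction
        proj = sum(ai * wi for ai, wi in zip(a, w))
        # Subtract that component
        w_orth = [wo - proj * ai for wo, ai in zip(w_orth, a)]
    return w_orth


# Hypothetical example: one source already found along the first axis.
w_orth = deflate([3.0, 4.0, 5.0], [[1.0, 0.0, 0.0]])
print(w_orth)
```

Applying this after every update keeps the iteration from converging again to a source that has already been extracted.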

753 | Optimal Filtering
- Anderson, Moore
- 1979
Citation Context ...unction f(s) while the other parts of the algorithm remain mostly the same. Denoising is useful as such and therefore there is a wide literature of sophisticated denoising methods to choose from (see Anderson and Moore, 1979). Moreover, one usually has some knowledge about the signals of interest and thus possesses the information needed for denoising. In fact, quite often the signals extracted by BSS techniques would be...

559 | Fast and robust fixed-point algorithms for independent component analysis
- Hyvärinen
- 1999
Citation Context ...s can be used for incorporating complex prior information (see Valpola et al., 2001). However, the Bayesian approach does not always result in simple or computationally efficient algorithms. FastICA (Hyvärinen, 1999) provides a set of algorithms for performing ICA based on optimising easily calculable contrast functions. The algorithms are fast but often more accurate results can be achieved by computationally m...

552 | CDMA: Principles of Spread Spectrum Communication
- Viterbi
- 1995
Citation Context ...gnal analysis (see Gazzaniga, 2000; Rangayyan, 2002), careful design of experimental setups provides us with presumed signal characteristics. In man-made technology, such as a CDMA mobile system (see Viterbi, 1995), the transmitted signals are even more restricted. The Bayesian approach provides a sound framework for including prior information into inferences about the signals. Recently, several Bayesian ICA ...

490 | Wavelets and Subband Coding
- Vetterli, Kovacevic
- 1995
Citation Context ...be directly applied to source separation, producing better results than purely blind techniques. There are also very general noise reduction techniques such as wavelet denoising (Donoho et al., 1995; Vetterli and Kovacevic, 1995) or median filtering (Kuosmanen and Astola, 1997) which can be applied in exploratory data analysis. In this section, we discuss denoising functions ranging from simple but powerful linear ones to so...

294 | Real-time computing without stable states: A new framework for neural computation based on perturbations
- Maass, Natschläger, et al.
- 2002
Citation Context ...There are several possibilities for the nonlinear feature expansion in hierarchical DSS. For instance kernel PCA (Schölkopf et al., 1998), sparse coding or liquid state machines (Maass et al., 2002) can be used. The hierarchical DSS can be used in a fully supervised setting by fixing the activations of the topmost layer to target outputs. Supervised learning often suffers from slow learning in ...

247 | Wavelet shrinkage: asymptopia
- Donoho, Johnstone, et al.
- 1995
Citation Context ...enoising methods can be directly applied to source separation, producing better results than purely blind techniques. There are also very general noise reduction techniques such as wavelet denoising (Donoho et al., 1995; Vetterli and Kovacevic, 1995) or median filtering (Kuosmanen and Astola, 1997) which can be applied in exploratory data analysis. In this section, we discuss denoising functions ranging from simple ...

247 | Learning invariance from transformation sequences - Földiák - 1991 |

233 | Independent factor analysis
- Attias
- 1999
Citation Context ...ed. The Bayesian approach provides a sound framework for including prior information into inferences about the signals. Recently, several Bayesian ICA algorithms have been suggested (see Knuth, 1998; Attias, 1999; Lappalainen, 1999; Miskin and MacKay, 2001; Choudrey and Roberts, 2001; Højen-Sørensen et al., 2002; Chan et al., 2003). They offer accurate estimations for the linear model parameters. For...

200 | High-order contrasts for independent component analysis
- Cardoso
- 1999
Citation Context ...romotes the separation of the sources. This bears resemblance to proposals of the role of divisive normalisation on cortex (Schwartz and Simoncelli, 2001) and to the classical ICA method called JADE (Cardoso, 1999). The problems related to kurtosis are well known and several other improved nonlinear functions f(s) have been proposed. However, some aspects of the above denoising, especially smoothing of the tot...

158 | Separation of a mixture of independent signals using time delayed correlations
- Molgedey, Schuster
- 1994
Citation Context ...hich is now aligned with the (sphered) mixing vector of the slow source. The sources can then be recovered by $s = w^T X$. There are other algorithms for separating Gaussian sources (Tong et al., 1991; Molgedey and Schuster, 1994; Belouchrani et al., 1997; Ziehe and Müller, 1998) and, although functionally different, they yield similar results for the example given above. All these algorithms assume that the autocovariance st...

157 | Slow Feature Analysis: Unsupervised Learning of Invariances
- Wiskott, Sejnowski
- 2002
Citation Context ...mework as well. Throughout this paper, we have considered linear mapping from the sources to the observations but nonlinear mappings can be used, too. One such approach is slow feature analysis (SFA, Wiskott and Sejnowski, 2002) where the observations are first expanded nonlinearly and sphered. The expanded data are then high-pass filtered and projections minimising the variance are estimated. Due to the nonlinear expansion...

151 | Forming sparse representations by local anti-Hebbian learning. Biol Cybern 64:165–170
- Földiák
- 1990
Citation Context ...iple components by forcing the projections to be orthogonal. However, nonorthogonal projections resulting from over-complete representations provide some clear advantages, especially in sparse codes (Földiák, 1990), and may be found useful in the DSS framework as well. Throughout this paper, we have considered linear mapping from the sources to the observations but nonlinear mappings can be used, too. One such...

143 | Natural signal statistics and sensory gain control
- Schwartz, Simoncelli
- 2001
Citation Context ... Särelä (2004) showed that decorrelating the variance-based masks actively promotes the separation of the sources. This bears resemblance to proposals of the role of divisive normalisation on cortex (Schwartz and Simoncelli, 2001) and to the classical ICA method called JADE (Cardoso, 1999). The problems related to kurtosis are well known and several other improved nonlinear functions f(s) have been proposed. However, some asp...

134 | Self-organizing neural network that discovers surfaces in random-dot stereograms, Nature 355 - Becker, Hinton - 1992 |

125 | Principal components, minor components, and linear neural networks
- Oja
- 1992
Citation Context ...d to denoise the speech signals. Often it would be useful to be able to separate the sources online, i.e., in real time. Since there exist online sphering algorithms (see Douglas and Cichocki, 1997; Oja, 1992), real time DSS can be considered as well. One simple case of online denoising is presented by moving-average filters. Such online filters are typically not symmetric and the eigenvalues (14) of the ...

122 | Indeterminacy and identifiability of blind identification
- Tong, Liu, et al.
- 1991
Citation Context ...irst eigenvector, which is now aligned with the (sphered) mixing vector of the slow source. The sources can then be recovered by $s = w^T X$. There are other algorithms for separating Gaussian sources (Tong et al., 1991; Molgedey and Schuster, 1994; Belouchrani et al., 1997; Ziehe and Müller, 1998) and, although functionally different, they yield similar results for the example given above. All these algorithms assu...

109 | The Cognitive Neurosciences - Gazzaniga, ed - 1995 |

101 | A blind source separation technique based on second order statistics
- Belouchrani, Meraim, et al.
- 1997
Citation Context ...(sphered) mixing vector of the slow source. The sources can then be recovered by $s = w^T X$. There are other algorithms for separating Gaussian sources (Tong et al., 1991; Molgedey and Schuster, 1994; Belouchrani et al., 1997; Ziehe and Müller, 1998) and, although functionally different, they yield similar results for the example given above. All these algorithms assume that the autocovariance structure of the sources is ...

92 | Representation and separation of signals using nonlinear PCA type learning - Karhunen, Joutsensalo - 1994 |

89 | Spectral Analysis for Physical Applications: Multitaper and Conventional Univariate Techniques
- Percival, Walden
- 1993
Citation Context ... sinusoids used in Fourier transform or DCT, is used to divide the resources of time and frequency behaviour optimally in some sense. Another possibility is to use the so-called multitaper technique (Percival and Walden, 1993, Ch. 7). Here we apply an overcomplete-basis approach related to the above methods. Instead of having just one spectrogram, we use several time-frequency analyses with different $T_t$'s and $T_f$'s. Then ...

89 | An unsupervised ensemble learning method for nonlinear dynamic state-space models - Valpola, Karhunen |

82 | New Approximations of Differential Entropy for Independent Component Analysis and Projection Pursuit - Hyvärinen - 1998 |

79 | A neurodynamical cortical model of visual attention and invariant object recognition - Deco, Rolls - 2004 |

64 | Independent Component Approach to the Analysis of EEG and MEG Recordings - Vigário, Särelä, et al. |

61 | Fundamentals of nonlinear digital filtering
- Astola, Kuosmanen
- 1997
Citation Context ... better results than purely blind techniques. There are also very general noise reduction techniques such as wavelet denoising (Donoho et al., 1995; Vetterli and Kovacevic, 1995) or median filtering (Kuosmanen and Astola, 1997) which can be applied in exploratory data analysis. In this section, we discuss denoising functions ranging from simple but powerful linear ones to sophisticated nonlinear ones with the goal of inspi...

45 | Ensemble learning for independent component analysis
- Lappalainen
- 1999
Citation Context ...an approach provides a sound framework for including prior information into inferences about the signals. Recently, several Bayesian ICA algorithms have been suggested (see Knuth, 1998; Attias, 1999; Lappalainen, 1999; Miskin and MacKay, 2001; Choudrey and Roberts, 2001; Højen-Sørensen et al., 2002; Chan et al., 2003). They offer accurate estimations for the linear model parameters. For instance, universa...

39 | Mean-Field Approaches to Independent Component Analysis Neural - Hojen-Sorensen, Hansen - 2002 |

39 | Kernel PCA pattern reconstruction via approximate pre-images
- Schölkopf, Mika, et al.
- 1998
Citation Context ...This is similar to earlier proposals, e.g., by Földiák (1991). There are several possibilities for the nonlinear feature expansion in hierarchical DSS. For instance kernel PCA (Schölkopf et al., 1998), sparse coding or liquid state machines (Maass et al., 2002) can be used. The hierarchical DSS can be used in a fully supervised setting by fixing the activations of the topmost layer to target outp...

38 | Bayesian Source Separation and Localization
- Knuth
- 1998
Citation Context ...more restricted. The Bayesian approach provides a sound framework for including prior information into inferences about the signals. Recently, several Bayesian ICA algorithms have been suggested (see Knuth, 1998; Attias, 1999; Lappalainen, 1999; Miskin and MacKay, 2001; Choudrey and Roberts, 2001; Højen-Sørensen et al., 2002; Chan et al., 2003). They offer accurate estimations for the linear model p...

38 | Biomedical Signal Analysis; A case study approach
- Rangayyan
- 2002
Citation Context ... information available in the experimental setup, other design specifications or from accumulated knowledge due to scientific research. For example in biomedical signal analysis (see Gazzaniga, 2000; Rangayyan, 2002), careful design of experimental setups provides us with presumed signal characteristics. In man-made technology, such as a CDMA mobile system (see Viterbi, 1995), the transmitted signals are even mo...

37 | Neural networks for blind decorrelation of signals
- Douglas, Cichocki
- 1997
Citation Context ...els which can be readily used to denoise the speech signals. Often it would be useful to be able to separate the sources online, i.e., in real time. Since there exist online sphering algorithms (see Douglas and Cichocki, 1997; Oja, 1992), real time DSS can be considered as well. One simple case of online denoising is presented by moving-average filters. Such online filters are typically not symmetric and the eigenvalues (...

30 | Independent components of magnetoencephalography: single-trial response onset times - Tang, Pearlmutter, et al. - 2002 |

25 | Flexible Bayesian independent component analysis for blind source separation
- Choudrey, Roberts
- 2001
Citation Context ...ncluding prior information into inferences about the signals. Recently, several Bayesian ICA algorithms have been suggested (see Knuth, 1998; Attias, 1999; Lappalainen, 1999; Miskin and MacKay, 2001; Choudrey and Roberts, 2001; Højen-Sørensen et al., 2002; Chan et al., 2003). They offer accurate estimations for the linear model parameters. For instance, universal density approximation using a mixture of Gaussians ...

24 | Independent component analysis of biomedical signals - Jung, Makeig, et al. - 2000 |

22 | A hierarchical neural system with attentional top-down enhancement of the spatial resolution for object recognition - Deco, Schurmann - 2000 |

21 | The Algebraic Eigenvalue Problem (Monographs on Numerical Analysis)
- Wilkinson
Citation Context ...where $D^* = V \Lambda^* V^T$. Further, let us denote $Z = X D^*$. This brings the DSS algorithm for estimating one separating vector into the form $w^+ = Z Z^T w$ (15). This is the classical power method (see Wilkinson, 1965) implementation for principal component analysis (PCA). Note that $Z Z^T$ is the unnormalised covariance matrix. The algorithm converges to the fixed point $w^*$ satisfying $\lambda w^* = Z Z^T w^* / T$ (16), where...
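The power-method form of the linear algorithm quoted above (repeatedly multiply w by Z Zᵀ and normalise) can be sketched as follows. The function name `power_method`, the starting vector and the tiny example matrix are assumptions for illustration. The returned eigenvalue is computed from the normalised covariance, Z Zᵀ divided by the number of samples T, to match the fixed-point condition in the excerpt.

```python
import math


def power_method(Z, n_iter=100):
    """Iterate w <- Z Z^T w / ||Z Z^T w||. Z is a list of channels (rows)."""
    dim, n = len(Z), len(Z[0])
    w = [1.0 / math.sqrt(dim)] * dim  # arbitrary unit-length start
    for _ in range(n_iter):
        # Projection: s = Z^T w
        s = [sum(w[i] * Z[i][t] for i in range(dim)) for t in range(n)]
        # Update: w_plus = Z s = Z Z^T w
        w_plus = [sum(Z[i][t] * s[t] for t in range(n)) for i in range(dim)]
        norm = math.sqrt(sum(v * v for v in w_plus))
        w = [v / norm for v in w_plus]
    # Eigenvalue of Z Z^T / T at the fixed point (Rayleigh quotient)
    s = [sum(w[i] * Z[i][t] for i in range(dim)) for t in range(n)]
    lam = sum(v * v for v in s) / n
    return w, lam


# Hypothetical data: two uncorrelated channels with variances 2 and 0.5.
Z = [[2.0, 0.0, -2.0, 0.0], [0.0, 1.0, 0.0, -1.0]]
w_star, lam = power_method(Z)
print(w_star, lam)
```

For this example the iteration should align with the first channel, the direction of largest variance, which is what makes the scheme a PCA implementation.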

18 | Transform invariant recognition by association in a recurrent network - Parga, Rolls - 1998 |

16 | Approximate Likelihood for Noisy Mixtures - Bermond, Cardoso - 1999 |

15 | Fetal Electrocardiogram Extraction based on Non-Stationary
- Vigneron, Paraschiv-Ionescu, et al.
- 2003
Citation Context ...ithm. We suggest that various kinds of prior knowledge can be easily formulated in terms of denoising. In some cases a denoising scheme has been used to post-process the results after separation (see Vigneron et al., 2003), but in the DSS framework this denoising can be used for the source separation itself. The paper is organised as follows: After setting the general problem of linear source separation in Sec. 2, we ...

14 | Experimental comparison of neural algorithms for independent component analysis and blind separation
- Giannakopoulos, Karhunen, et al.
- 1999
Citation Context ...thms for performing ICA based on optimising easily calculable contrast functions. The algorithms are fast but often more accurate results can be achieved by computationally more demanding algorithms (Giannakopoulos et al., 1999), for example by the Bayesian ICA algorithms. Valpola and Pajunen (2000) analysed the factors behind the speed of FastICA. The analysis suggested that the nonlinearity used in FastICA can be interpre...

14 | Magnetoencephalography: theory, instrumentation, and applications to noninvasive studies of the working human brain. Reviews of modern - 1993 |

13 | Variational Bayesian learning of ICA with missing data - Chan, Lee, et al. |

12 | Dynamical Factor Analysis of Rhythmic Magnetoencephalographic Activity - Särelä, Valpola, et al. |

12 | Overlearning in Marginal Distribution-Based ICA: Analysis and Solutions - Särelä, Vigário |

11 | ICASSO: Software for Investigating the reliability of ICA estimates by Clustering and
- Himberg, Hyvärinen
- 2003
Citation Context ...ation of the first sources. Instead one source was extracted with each of the masks several times using different initial vector w until five sufficiently different source estimates were reached (see Himberg and Hyvärinen, 2003; Meinecke et al., 2002, for further possibilities along these lines). Deflation was only used if no estimate could be found for all the 5 sources. This was often the case for poor SNR under 0dB. To g...

9 | Independent component analysis for artefact separation in astrophysical images
- Funaro, Valpola
- 2003
Citation Context ...d in Sec. 2 that the index t of different samples s(t) might refer as well to space as to time. In space it becomes natural to apply filtering in 2D or even in 3D. For example, the astrophysical ICA (Funaro et al., 2003) would clearly benefit from multi-dimensional filtering. Source separation is not the only application of ICA-like algorithms. Another, important field of application is feature extraction. ICA has b...