## Probabilistic Independent Component Analysis (2003)

Citations: | 208 - 13 self |

### Citations

11964 | Maximum likelihood from incomplete data via the EM algorithm
- Dempster, Laird, et al.
- 1977
Citation Context ... two Gamma distributions to model the probability density of background noise, positive and negative BOLD effects. The model of equation 15 is fitted using the expectation-maximization (EM) algorithm =-=[17]-=-. In order to infer the appropriate number of components in the mixture model we successively fit models with an increasing number of mixtures and use an approximation to the Bayesian model evidence t... |

6605 |
Neural Networks for Pattern Recognition
- Bishop
- 1995
Citation Context ... clustered and localised inside visual cortical areas, do not lend themselves to easy interpretation. This is the classical problem of over-fitting a noise-free generative model to noisy observations =-=[10]-=- and needs to be resolved by setting up a suitable probabilistic model that controls the balance between what is attributable to ’real effects’ of interest and what simply is due to observational nois... |

2308 | Independent Component Analysis - Hyvärinen, Karhunen, et al. - 2001 |

1851 |
Independent component analysis, a new concept?
- Comon
- 1994
Citation Context ...ations only after modelling has completed (e.g. Gaussian Random Field Theory-based inference [39]). As one alternative to hypothesis–driven analytical techniques, Independent Component Analysis (ICA, =-=[15]-=-) has been applied to FMRI data as an exploratory data analysis technique in order to find independently distributed spatial patterns that depict source processes in the data [36, 8]. The basic goal o... |

1564 |
Modeling by shortest data description
- Rissanen
- 1978
Citation Context ...dence. Other possible choices for model order selection for PPCA include the Bayesian Information Criterion (BIC, [29]) the Akaike Information Criterion (AIC, [1]) or Minimum Description Length (MDL, =-=[43]-=-). Note that the estimation of the model order in the case of the probabilistic PCA model is based on the assumption of Gaussian source distribution. [37], however, provides some empirical evidence th... |

1495 | An information-maximization approach to blind separation and blind deconvolution
- Bell, Sejnowski
- 1995
Citation Context ...eoretic principles were proposed by [32]. These learning rules are based on the principle of redundancy reduction as a coding strategy for neurons of the perceptual system [4]. More recently, [5] and =-=[9]-=- introduced a surprisingly simple blind source separation algorithm for a non-linear feed-forward network from an information maximization viewpoint. This algorithm was subsequently improved, extended... |

961 | Functional data analysis
- Ramsay, Silverman
- 1997
Citation Context ... a certain neighbourhood of fixed size. In addition to spatial information, assumptions on the nature of the time courses can be incorporated using regularized principal component analysis techniques =-=[40]-=-. Instead of filtering the data, constraints can be imposed on the eigenvectors, e.g. constraints on the smoothness can be included by penalizing the roughness using the integrated square of the secon... |

642 | Some informational aspects of visual perception
- Attneave
- 1954
Citation Context ...ules based on information theoretic principles were proposed by [32]. These learning rules are based on the principle of redundancy reduction as a coding strategy for neurons of the perceptual system =-=[4]-=-. More recently, [5] and [9] introduced a surprisingly simple blind source separation algorithm for a non-linear feed-forward network from an information maximization viewpoint. This algorithm was sub... |

559 |
Blind separation of sources, part I: An adaptive algorithm based on neuromimetic architecture.
- Jutten, Hérault
- 1991
Citation Context ... available to allow for informed estimation, this is a challenging problem. A possible solution is to employ what is known in the area of signal processing as blind source separation (BSS) techniques =-=[27]-=-. The signal of functional magnetic resonance imaging studies is a prime example, comprising different sources of variability, possibly including machine artefacts, physiological pulsation, head motio... |

532 | Mixtures of probabilistic Principal Component Analyzers.
- Tipping, Bishop
- 1999
Citation Context ...endent components in the signal + noise sub–space and (iii) assessing the statistical significance of estimated sources. At the first stage we employ probabilistic Principal Component Analysis (PPCA, =-=[49]-=-) in order to find an appropriate linear sub-space which contains the sources. The choice of the number of components to extract is a problem of model order selection. Underestimation of the dimension... |

456 |
Latent Variable Models and Factor Analysis, Kendall’s Library of Statistics
- Bartholomew, Knott
- 1999
Citation Context ...atial source signals correspond to parameter estimates in the GLM with the additional constraint of being statistically independent. The model of equation 2 is closely related to Factor Analysis (FA) =-=[6]-=-. There, the sources are assumed to have a Gaussian distribution and the noise is assumed to have a diagonal covariance matrix. In Factor Analysis, the sources are known as common factors and η is a v... |

400 | A unified statistical approach for determining significant signals in images of cerebral activation.
- Worsley, Marrett, et al.
- 1996
Citation Context ...ring steps can also be used for data pre-processing for ICA. In the case of spatial smoothing note that since the inferential steps (see section 4 below) are not based on Gaussian Random Field theory =-=[52]-=-, we have the additional freedom of choosing more sophisticated smoothing techniques that do not simply convolve the data using a Gaussian kernel. Non-linear smoothing like the SUSAN filter [47] allow... |

364 | Self-organization in a perceptual network
- Linsker
- 1988
Citation Context ...nts, a technique that was also employed by other authors [16]. In parallel to blind source separation studies, unsupervised learning rules based on information theoretic principles were proposed by =-=[32]-=-. These learning rules are based on the principle of redundancy reduction as a coding strategy for neurons of the perceptual system [4]. More recently, [5] and [9] introduced a surprisingly simple bli... |

317 | Analysis of fMRI data by blind separation into independent spatial components
- McKeown
- 1998
Citation Context ...th cutoff of 90s [34]) and spatially smoothed with a Gaussian kernel of 3mm (FWHM). ICA maps were thresholded at Z > 2.3 after transformation into ”Z-scores” across the spatial domain as described in =-=[36]-=-. Note that this should not be confused with GLM or PICA Z-score thresholding as described in section 4, where Z-scores are formed by dividing by the voxel-wise estimated standard deviation of the noi... |

277 | Independent Factor Analysis.
- Attias
- 1999
Citation Context ... by work from [18] and [22], where mixture models were used for statistical maps generated from parametric FMRI activation modelling and links to the work on explicit source density modelling for ICA =-=[3, 13, 45]-=-. The proposed methodology can be extended in various ways. In the present implementation, we chose to discard an explicit source model from the estimation stages and use the Gaussian mixture model on... |

273 | SUSAN - a new approach to low level image processing - Smith, Brady - 1997 |

176 |
Statistical methods of estimation and inference for functional MR image analysis.
- Bullmore, Brammer, et al.
- 1996
Citation Context ...c noise is computationally much more involved than estimation in the standard noise free setting. The form of Σi needs to be constrained, e.g. we can use the common approaches to FMRI noise modelling =-=[11, 50]-=-, and restrict ourselves to autoregressive noise. However, since the exploratory approach allows modelling of various sources of variability, e.g. temporally consistent physiological noise, as part of... |

169 | Analysis of fMRI time-series revisited:
- Worsley, Friston
- 1995
Citation Context ...ied to FMRI data test specific hypotheses about the expected BOLD response at the individual voxel locations using simple regression or more sophisticated models like the General Linear Model (GLM) =-=[51]-=-. There, the expected signal changes are specified as regressors of interest in a multiple linear regression framework and the estimated regression coefficients are tested against a null hypothesis th... |

126 | Source separation using higher order moments
- Cardoso
- 1989
Citation Context ...roblem. The seminal work into BSS [27] looked at extensions to standard principal component analysis (PCA). Theoretical work on high order moments provided one of the first solutions to a BSS problem =-=[12]-=-. [27] published a concise presentation of their adaptive algorithm and outlined the transition from PCA to ICA very clearly. Their approach has been further developed by [28] and [14]. Exact conditio... |

124 | Bayes factors and model uncertainty.
- Kass, Raftery
- 1995
Citation Context ...nd replace Λ by its adjusted eigenspectrum $\Lambda / G^{-1}(\nu)$ prior to evaluating the model evidence. Other possible choices for model order selection for PPCA include the Bayesian Information Criterion (BIC, =-=[29]-=-) the Akaike Information Criterion (AIC, [1]) or Minimum Description Length (MDL, [43]). Note that the estimation of the model order in the case of the probabilistic PCA model is based on the assumpti... |

117 | Maximum likelihood and covariant algorithms for independent component analysis
- MacKay
- 1996
Citation Context ... an information maximization viewpoint. This algorithm was subsequently improved, extended and modified [2] and its relation to maximum likelihood estimation and redundancy reduction was investigated =-=[33]-=-. There now exists a variety of alternative algorithms and principled extensions that include work on non-linear mixing, non-instantaneous mixing, incorporation of source structure and observational n... |

107 |
Combining spatial extent and peak intensity to test for activations in functional imaging.
- Poline, Worsley, et al.
- 1997
Citation Context ...ges known as statistical parametric maps which are commonly assessed for statistical significance using voxel-wise null-hypothesis testing or testing for the size or mass of suprathresholded clusters =-=[39]-=-. These approaches are confirmatory in nature and make strong prior assumptions about the spatiotemporal characteristics of signals contained in the data. Naturally, the inferred spatial patterns of a... |

102 | Automatic choice of dimensionality for pca
- Minka
- 2001
Citation Context ...monstrate that the number of source processes can be inferred from the covariance matrix of the observations using a Bayesian framework that approximates the posterior distribution of the model order =-=[37]-=- and extending this approach to take account of the limited amount of data and the particular structure of FMRI noise [7]. At the second stage the source signals are estimated within the lower-dimens... |

99 | Bayesian Approaches to Gaussian Mixture Modelling
- Roberts, Husmeier, et al.
- 1998
Citation Context ...riate number of components in the mixture model we successively fit models with an increasing number of mixtures and use an approximation to the Bayesian model evidence to define a stopping rule (see =-=[44]-=- for details). Our experiments suggest that this typically results in a model with 2-3 mixtures. In cases where the number of ’active’ voxels is small, however, a single Gaussian mixture may actually ... |

74 |
Functional MRI: An introduction to methods.
- Jezzard, Matthews, et al.
- 2002
Citation Context ... of the noise variance $\sigma_i^2$ at each voxel location is $\hat{\sigma}_i^2 = \hat{\eta}_i^t \hat{\eta}_i / \mathrm{trace}(P)$, which, if p − q is reasonably large, will approximately equal $\sigma_i^2$, i.e. equal the true variance of the noise =-=[25]-=-. We can thus convert the individual spatial IC maps $s_{r\cdot}$ into Z-statistic maps $z_{r\cdot}$ by dividing the raw IC estimate by the estimate of the voxel-wise noise standard deviation. Under the null-hypothesis... |

68 |
Generalizations of principal component analysis, optimization problems, and neural networks
- Karhunen, Joutsensalo
- 1995
Citation Context ...lutions to a BSS problem [12]. [27] published a concise presentation of their adaptive algorithm and outlined the transition from PCA to ICA very clearly. Their approach has been further developed by =-=[28]-=- and [14]. Exact conditions for the identifiability of the model can be found in [15] together with an algorithm that approximates source signal distributions using their first few moments, a techniqu... |

63 |
SUSAN - a new approach to low level image processing
- Smith, Brady
- 1995
Citation Context ...heory [52], we have the additional freedom of choosing more sophisticated smoothing techniques that do not simply convolve the data using a Gaussian kernel. Non-linear smoothing like the SUSAN filter =-=[47]-=- allow for the reduction of noise whilst preserving the underlying spatial structure and as a consequence reduce the commonly observed effect of estimated spatial pattern of activation ’bleeding’ into... |

61 | On the distribution of the largest principal component
- Johnstone
- 2001
Citation Context ...tribution and we can utilise results from random matrix theory on the empirical distribution function $G_n(\nu)$ for the eigenvalues of the covariance matrix of a single random p × n-dimensional matrix $\tilde{X}$ =-=[26]-=-. Suppose that p/n → γ as n → ∞ and 0 < γ ≤ 1, then $G_n(\nu) \to G_\gamma(\nu)$ almost surely, where the limiting distribution has a density $g(\nu) = \frac{1}{2\pi\gamma\nu}\sqrt{(\nu - b_-)(b_+ - \nu)}$, $b_- \le \nu \le b_+$, (9) and where b± = (1 ± √γ... |

59 | Neural learning in structured parameter spaces—Natural Riemannian gradient
- Amari
- 1997
Citation Context ...surprisingly simple blind source separation algorithm for a non-linear feed-forward network from an information maximization viewpoint. This algorithm was subsequently improved, extended and modified =-=[2]-=- and its relation to maximum likelihood estimation and redundancy reduction was investigated [33]. There now exists a variety of alternative algorithms and principled extensions that include work on n... |

54 |
Variability in fMRI: an examination of intersession differences.
- McGonigle, Howseman, et al.
- 2000
Citation Context ...re 7(iv)) appears to work well. 6.4 Real FMRI data For the first example, we used data courtesy of Dr. Dave McGonigle that previously had been used to evaluate the between-session variability in FMRI =-=[35]-=-. In brief, the experiment involved 33 sessions of runs under motor, cognitive and visual stimulation. The data presented here is one of the two visual stimulation sessions of 36 volumes each that pro... |

46 | A new statistical approach to detecting significant activation in functional
- Marchini, Ripley
- 2000
Citation Context ...the extracted time course with the expected BOLD response exceeds 0.3. Both for GLM analysis and ICA, the data was high-pass filtered (Gaussian-weighted local straight line fitting with cutoff of 90s =-=[34]-=-) and spatially smoothed with a Gaussian kernel of 3mm (FWHM). ICA maps were thresholded at Z > 2.3 after transformation into ”Z-scores” across the spatial domain as described in [36]. Note that this ... |

45 | Brain Mapping – The Methods - Toga, Mazziotta - 1996 |

43 | Spatiotemporal independent component analysis of event-related fMRI data using skewed probability density functions.
- Stone, Porrill, et al.
- 2002
Citation Context ...smaller cross-covariance structure than the true time courses. This is an effect quite different to the assumptions that lead into the investigation of ’Spatiotemporal Independent Component Analysis’ =-=[48]-=-. There, the authors speculated that in the case of an ICA decomposition based on optimising spatial independence between estimated source signals, suboptimal solutions emerge since the decomposit... |

37 | Spatial mixture modeling of fMRI data
- Hartvig, Jensen
- 2000
Citation Context ...crepancy between the assumed and the ’true’ signal space will render the analysis sub–optimal. Furthermore, while there is a growing number of models that explicitly include prior spatial information =-=[22]-=-, the standard GLM approach is univariate and essentially discards information about the spatial properties of the data, only inducing spatial smoothness by convolving the individual volumes with a Ga... |

34 |
Mixture model mapping of brain activation in functional magnetic resonance images
- Everitt, Bullmore
- 1999
Citation Context ...erent voxel locations than at others and this change in ’specificity’ is reflected in the relative value of residual noise. In order to assess the Z-maps for significantly activated voxels, we follow =-=[18]-=- and [22] and employ mixture modelling of the probability density for spatial map of Z-scores. Equation 6 implies that $\hat{s}_i = \hat{W} A s_i + \hat{W} \eta_i$, i.e. in the signal space defined by the mixing matrix A, ... |

27 | Flexible Bayesian Independent Component Analysis for Blind Source Separation
- Choudrey, Roberts
- 2001
Citation Context ... by work from [18] and [22], where mixture models were used for statistical maps generated from parametric FMRI activation modelling and links to the work on explicit source density modelling for ICA =-=[3, 13, 45]-=-. The proposed methodology can be extended in various ways. In the present implementation, we chose to discard an explicit source model from the estimation stages and use the Gaussian mixture model on... |

26 | Plurality and resemblance in fMRI data analysis.
- Lange, Strother, et al.
- 1999
Citation Context ...the clusters was ∼ 0.25 of the peak activation level. In the real activation data, the highest activation was ∼ 3% peak to peak. Note that this is more realistic than the artificial data presented in =-=[31]-=- where all activated voxels have identical Z scores. The above procedure was carried out for auditory and visual ’activation’ using a separate spatial activation mask and activation timecourses. 5.3 A... |

24 | Inferring the eigenvalues of covariance matrices from limited, noisy data
- Everson, Roberts
- 2000
Citation Context ...of the observations $R_x = \langle x_i x_i^t \rangle = A A^t$, which is of rank q. In the presence of isotropic noise, however, the covariance matrix of the observations will be the sum of $A A^t$ and the noise covariance =-=[19]-=- $R_x = A A^t + \sigma^2 I_p$, (8) [Figure: eigenspectrum plots for original data, variance normalised data and adjusted eigenspectrum; key: eigenspectrum, Lap, BIC, MDL, AIC] ... |

20 | Bayesian source separation for reference function determination in fMRI.
- Rowe
- 2001
Citation Context ...the eigenspectrum. Note that equation 9 is only satisfied for 0 < γ ≤ 1, i.e. when the number of samples is equal or larger than the dimensionality of the problem at hand. This approach is similar to =-=[45]-=-, where an inverse Wishart prior is placed on the noise covariance matrix in a fully Bayesian source separation model. If we assume that the source distributions p(s) are Gaussian, the probabilistic I... |

18 |
Temporal autocorrelation in univariate linear modelling of FMRI data
- Woolrich, Ripley, et al.
- 2001
Citation Context ...nge processes than observations in time. The covariance of the noise is allowed to be voxel dependent in order to allow for the vastly different noise covariances observed in different tissue types =-=[50]-=-. The vector µ defines the mean of the observations xi where the index i is over the set of all voxel locations V and the p × q matrix A is assumed to be non-degenerate, i.e. of rank q. Solving the bl... |

15 |
FSL: New tools for functional and structural brain image analysis.
- Smith, Bannister, et al.
- 2001
Citation Context ... rigid-body motion correction. The corrected data was temporally high pass filtered (Gaussian-weighted LSF straight line subtraction, with σ = 75.0s [34]) and masked of non-brain voxels using BET =-=[46]-=-. 5.2 Artificial activation in real FMRI resting state data The activation data set (iii) was analysed using standard GLM techniques as implemented in FEAT [46]. Final Z statistic maps were used to de... |

14 |
Investigating the intrinsic dimensionality of FMRI data for ICA
- Beckmann, Noble, et al.
- 2001
Citation Context ... is attributable to ’real effects’ of interest and what simply is due to observational noise. In order to address these issues we examine the probabilistic Independent Component Analysis (PICA) model =-=[38, 7]-=- for FMRI data that allows for a non-square mixing process and assumes that the data are confounded by additive Gaussian noise. In the case of isotropic noise covariance the task of blind source separ... |

14 | Real-time independent component analysis of fMRI time-series
- Esposito, Seifritz, et al.
- 2003
Citation Context ...f voxel i being within gray-matter we can choose wi = pi and the covariance is weighted by the probability of gray-matter membership. Simple approaches to performing ICA on the cortical surface (e.g. =-=[20]-=-) are special cases of this, binarising p and therefore losing valuable partial volume information. In this more general setting, however, the uncertainty in the segmentation will also be incorporated... |

14 |
Improved optimisation for the robust and accurate linear registration and motion correction of brain images. NeuroImage 17: 825–841
- Jenkinson, Bannister, et al.
- 2002
Citation Context ...versing at 8Hz), (iii) 30s on/off visual stimulus (coloured checkerboard reversing at 8Hz) and 45s on/off auditory stimulus (radio recording). The data were corrected for subject motion using MCFLIRT =-=[24]-=- to perform 6 parameter rigid-body motion correction. The corrected data was temporally high pass filtered (Gaussian-weighted LSF straight line subtraction, with σ = 75.0s [34]) and masked of non- ... |

14 | ICA: Model order selection and dynamic source models
- Penny, Roberts, et al.
- 2001
Citation Context ... is attributable to ’real effects’ of interest and what simply is due to observational noise. In order to address these issues we examine the probabilistic Independent Component Analysis (PICA) model =-=[38, 7]-=- for FMRI data that allows for a non-square mixing process and assumes that the data are confounded by additive Gaussian noise. In the case of isotropic noise covariance the task of blind source separ... |

10 |
Fitting autoregressive models for regression
- Akaike
- 1969
Citation Context ...$G^{-1}(\nu)$ prior to evaluating the model evidence. Other possible choices for model order selection for PPCA include the Bayesian Information Criterion (BIC, [29]) the Akaike Information Criterion (AIC, =-=[1]-=-) or Minimum Description Length (MDL, [43]). Note that the estimation of the model order in the case of the probabilistic PCA model is based on the assumption of Gaussian source distribution. [37], ho... |

9 |
A new on-line adaptive algorithm for blind separation of source signals
- Cichocki, Unbehauen, et al.
- 1994
Citation Context ...o a BSS problem [12]. [27] published a concise presentation of their adaptive algorithm and outlined the transition from PCA to ICA very clearly. Their approach has been further developed by [28] and =-=[14]-=-. Exact conditions for the identifiability of the model can be found in [15] together with an algorithm that approximates source signal distributions using their first few moments, a technique that wa... |

8 |
Density Shaping by Neural Networks with Application to Classification, Estimation and Forecasting
- Baram, Roth
- 1994
Citation Context ...ation theoretic principles were proposed by [32]. These learning rules are based on the principle of redundancy reduction as a coding strategy for neurons of the perceptual system [4]. More recently, =-=[5]-=- and [9] introduced a surprisingly simple blind source separation algorithm for a non-linear feed-forward network from an information maximization viewpoint. This algorithm was subsequently improved, ... |

7 |
Linear redundancy reduction learning
- Deco, Obradovic
- 1995
Citation Context ...fiability of the model can be found in [15] together with an algorithm that approximates source signal distributions using their first few moments, a technique that was also employed by other authors =-=[16]-=-. In parallel to blind source separation studies, unsupervised learning rules based on information theoretic principles were proposed by [32]. These learning rules are based on the principle of redu... |

5 |
The physiological noise in oxygen-sensitive magnetic resonance imaging
- Kruger, Glover
- 2001
Citation Context ...ally (temporally) correlated, they can still be separated from each other in the PICA unmixing as long as they are spatially distinct and not perfectly temporally correlated. If (as e.g. suggested by =-=[30]-=-) a noise component combines non-additively with a signal source, then indeed the linear mixing model used here will be imperfect. In this case, however, the nonlinear interaction should (to first ord... |

3 |
Combining ICA and GLM: a hybrid approach to FMRI analysis
- Beckmann, Tracey, et al.
- 2000
Citation Context ...onent Analysis (ICA, [15]) has been applied to FMRI data as an exploratory data analysis technique in order to find independently distributed spatial patterns that depict source processes in the data =-=[36, 8]-=-. The basic goal of ICA is to solve the BSS problem by expressing a set of random variables (observations) as linear combinations of statistically independent latent component variables (source signal... |

3 |
A decomposition theorem for vector variables with a linear structure
- Rao
- 1969
Citation Context ...or of random variables called specific factors. In FA the assumption of independence between the individual source processes reduces to assuming that sources are mutually uncorrelated. 2.1 Uniqueness =-=[42]-=- extends the standard factor analysis model such that the common and specific variables are independent non-degenerate random variables and examines the implication for the minimum rank of the mixing ... |

2 |
Characterization of the distribution of random variables in linear structural relations
- Rao
- 1966
Citation Context ... model such that the common and specific variables are independent non-degenerate random variables and examines the implication for the minimum rank of the mixing matrix A in equation 2. Earlier work =-=[41]-=- characterised the multivariate normal distribution through the non-uniqueness of its linear structure, a result which within the ICA literature has been restated as the limitation that only one Gauss... |

1 | Spatial mixture modelling of fMRI data - Hartvig, Jensen |

1 | Analysis of fMRI data by blind separation into independent spatial components - McKeown, Makeig, et al. - 1998 |

1 | A family of fixed-point algorithms for independent component analysis - Hyvärinen - 1997 |