## Model selection in electromagnetic source analysis with an application to VEF’s (2002)

Venue: | IEEE Transactions on Biomedical Engineering |

Citations: | 7 - 4 self |

### BibTeX

@ARTICLE{Waldorp02modelselection,

author = {Lourens J. Waldorp and Hilde M. Huizenga and Raoul P. P. P. Grasman and Koen B. E. Böcker and Jan C. De Munck and Peter C. M. Molenaar},

title = {Model selection in electromagnetic source analysis with an application to VEF’s},

journal = {IEEE Transactions on Biomedical Engineering},

year = {2002},

volume = {49},

pages = {1121--1129}

}

### Abstract

In electromagnetic source analysis it is necessary to determine how many sources are required to describe the EEG or MEG adequately. Model selection procedures (MSP’s, or goodness of fit procedures) give an estimate of the required number of sources. Existing and new MSP’s are evaluated in different source and noise settings: two sources which are close or distant, and noise which is uncorrelated or correlated. The commonly used MSP residual variance is seen to be ineffective; that is, it often selects too many sources. Alternatives like the adjusted Hotelling’s test, Bayes information criterion, and the Wald test on source amplitudes are seen to be effective. The adjusted Hotelling’s test is recommended if a conservative approach is taken, and MSP’s such as Bayes information criterion or the Wald test on source amplitudes are recommended if a more liberal approach is desirable. The MSP’s are applied to empirical data (visual evoked fields).
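As an illustration only, the residual-based criteria quoted in the citation contexts on this page (RV in Eq. 2, LOF in Eq. 4, AIC in Eq. 9 and BIC in Eq. 10, all in their white pure error form) can be sketched in Python. The function names are hypothetical, and the sketch assumes that `e` is the length-m residual vector of the trial-averaged data, `y_bar` the averaged data, `s2` the pure error variance estimate and `p` the number of source parameters. It is not the authors' implementation, and the F test that turns the LOF statistic into a decision is left out.

```python
import math


def sum_sq(v):
    """Return v'v for a vector given as a sequence of floats."""
    return sum(x * x for x in v)


def residual_variance(e, y_bar):
    """RV = 100 * e'e / (y_bar' y_bar), in percent (Eq. 2 as quoted)."""
    return 100.0 * sum_sq(e) / sum_sq(y_bar)


def lack_of_fit(e, s2, p):
    """LOF = (e'e / (m - p)) / s^2, assuming m = len(e) (Eq. 4 as quoted)."""
    m = len(e)
    return sum_sq(e) / (m - p) / s2


def aic(e, s2, p):
    """AIC = ln(pi s^2) + e'e / s^2 + 2p (Eq. 9 as quoted)."""
    return math.log(math.pi * s2) + sum_sq(e) / s2 + 2 * p


def bic(e, s2, p):
    """BIC = ln(pi s^2) + e'e / s^2 + ln(m) p, assuming m = len(e) (Eq. 10)."""
    m = len(e)
    return math.log(math.pi * s2) + sum_sq(e) / s2 + math.log(m) * p


def select_by_threshold(rv_values, threshold=5.0):
    """Index of the first (smallest) model with RV at or below the threshold."""
    for index, value in enumerate(rv_values):
        if value <= threshold:
            return index
    return None  # no candidate model reaches the threshold


def select_by_minimum(values):
    """Index of the model with the smallest criterion value (AIC or BIC rule)."""
    return min(range(len(values)), key=lambda i: values[i])
```

Candidate models are assumed to be ordered by increasing number of sources, so `select_by_threshold` follows the RV rule of taking the smallest model at or below the 5% threshold, while `select_by_minimum` follows the AIC and BIC rule of taking the smallest criterion value. The two penalties differ only in the factor multiplying `p` (2 versus ln m), so BIC penalises extra parameters more heavily than AIC whenever ln m exceeds 2.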

### Citations

1137 | An Introduction to Multivariate Statistical Analysis, Second Edition - Anderson - 1984 |

323 |
Estimating the dimension of a model
- Schwarz
- 1978
Citation Context ...nformation criterion. The Bayes information criterion (BIC, or minimum description length) is similar to the AIC except for the penalty term. The BIC is defined in terms of the loglikelihood function [31], [32]. If the pure error is white, and if its variance is estimated by $s^2$, then $\mathrm{BIC} = \ln(\pi s^2) + e'e/s^2 + \ln(m)\,p$ (10). The source model with the smallest BIC value is selected. Criterion C1 is sa... |

316 |
Nonlinear Regression
- Seber, Wild
- 2003
Citation Context ... Lack of fit. The lack of fit (LOF) tests whether the total error variance exceeds the pure error variance. If the pure error is white, then the LOF is defined as [13], [14], [21] $\mathrm{LOF} = \frac{e'e/(m - p)}{s^2}$ (4). If the pure error is white and multivariate normally distributed, then the LOF has an F distribution with $m - p$ and $m(n - 1)$ df. A model fits if the LOF is no... |

308 |
Detection Of Signals by Information Theoretic Criteria
- Wax, Kailath
- 1985
Citation Context ...nd no noise is modelled. Model selection procedures (MSP’s) can be used to give an estimate of the required number of sources. MSP’s based on an eigenvalue decomposition have been studied extensively [3], [4], [5], [6], [7], [8], [9], [10]. A disadvantage of these MSP’s is that they only give an estimate of the number of uncorrelated sources. Other MSP’s, based on the residuals of a source analysis, ... |

176 |
Model selection and Akaike’s Information Criterion (AIC): The general theory and its analytical extensions
- Bozdogan
- 1987
Citation Context ... a penalty term for different models [28]. If the pure error is white and its variance is estimated by $s^2$, then $\mathrm{AIC} = \ln(\pi s^2) + e'e/s^2 + 2p$ (9). The model with the smallest AIC value is selected [29]. Criterion C1 is satisfied by $s^2$. However, C2 is not satisfied. C3 is met, since the decision rule is a minimization over source models. The penalty term contains the number of parameters and so C4... |

60 | Matrix Analysis for Statistics - Schott - 1995 |

55 |
Fundamentals of dipole source potential analysis
- Scherg
- 1990
Citation Context ...se MSP’s is that they only give an estimate of the number of uncorrelated sources. Other MSP’s, based on the residuals of a source analysis, have also been reported, for example the residual variance [11], the chisquare [1], [2], and lack of fit statistic [12]. These MSP’s give an estimate of the number of uncorrelated or correlated sources. In this paper we study the second class of MSP’s more *Loure... |

55 | Modern Regression Methods - Ryan - 1997 |

51 |
An asymptotic theory for linear model selection
- Shao
- 1997
Citation Context ... over fit is already present when pure error is white, and becomes dominant when the pure error is colored. This is a common finding. The AIC has been found to over fit in a variety of settings [29], [36]. The cause of this is the tolerant ‘penalty’ of $2p$. Note that the performance of the AIC is also similar to that of the LR (see Sec. II-C). The performance of the BIC can be considered as good. It un... |

30 | Theory of multivariate statistics - Bilodeau, Brenner - 1999 |

21 |
On Information theoretic criteria for determining the number of signals in high resolution array processing
- Wong, Zhang, et al.
- 1990
Citation Context ...modelled. Model selection procedures (MSP’s) can be used to give an estimate of the required number of sources. MSP’s based on an eigenvalue decomposition have been studied extensively [3], [4], [5], [6], [7], [8], [9], [10]. A disadvantage of these MSP’s is that they only give an estimate of the number of uncorrelated sources. Other MSP’s, based on the residuals of a source analysis, have also been ... |

19 |
Recursive MUSIC: a framework for EEG and MEG source localization
- Mosher, Leahy
- 1998
Citation Context ...ction procedures (MSP’s) can be used to give an estimate of the required number of sources. MSP’s based on an eigenvalue decomposition have been studied extensively [3], [4], [5], [6], [7], [8], [9], [10]. A disadvantage of these MSP’s is that they only give an estimate of the number of uncorrelated sources. Other MSP’s, based on the residuals of a source analysis, have also been reported, for example... |

18 |
Cautionary note about R²
- Kvålseth
- 1985
Citation Context ...ce models, the model with the smallest number of sources which has a value equal to or below the threshold is selected. In the statistical literature the RV is in general not considered as a good MSP [18], [19]. Note that the RV is high if the modelling error is high, but also if the pure error is high. As a consequence, the MSP has a tendency to model pure error (i.e. over fit). This is due to the fact tha... |

16 |
Testing the fit of a parametric function
- Aerts, Claeskens, et al.
Citation Context ... is colored and the data and model are prewhitened, then C5 is met but not C6. Note that the AIC resembles the LR. The acceptance/rejection region of the hypothesis test is translated by $2(p_{d+1} - p_d)$ [30]. The AIC also resembles Cp, since the number of sensors $m$ in Cp is fixed, and the first term in the AIC is also constant. Bayes information criterion. The Bayes information criterion (BIC, or minimum... |

14 |
Magnetoencephalography: theory, instrumentation, and applications to noninvasive studies of the working human brain. Reviews of Modern Physics
- Hämäläinen, et al.
- 1993
Citation Context ...e applied to empirical data (visual evoked fields). I. Introduction. Electromagnetic source analysis yields an estimate of the sources of the electro-encephalogram (EEG) or magneto-encephalogram (MEG) [1]. Precise estimates of the sources can only be obtained if the number of sources is correct [2]. That is, if activity related to the stimulus, and no noise is modelled. Model selection procedures (MSP... |

14 |
A comparison of the information and posterior probability criteria for model selection
- Chow
- 1981
Citation Context ...tion criterion. The Bayes information criterion (BIC, or minimum description length) is similar to the AIC except for the penalty term. The BIC is defined in terms of the loglikelihood function [31], [32]. If the pure error is white, and if its variance is estimated by $s^2$, then $\mathrm{BIC} = \ln(\pi s^2) + e'e/s^2 + \ln(m)\,p$ (10). The source model with the smallest BIC value is selected. Criterion C1 is satisfie... |

13 |
Functional imaging and localization of electromagnetic brain activity
- Scherg
- 1992
Citation Context ...plained by the model. The RV is defined as (e.g. [12], [16]) $\mathrm{RV} = 100\,\frac{e'e}{\bar{y}'\bar{y}}$ (2). A model is said to fit the data if its value is close to 0. In practice a threshold value is chosen (e.g. [17]). The threshold of 5% is chosen here. When comparing several source models, the model with the smallest number of sources which has a value equal to or below the threshold is selected. In the statist... |

13 | Estimated generalized least squares electromagnetic source analysis based on a parametric noise covariance model - Waldorp, Huizenga, et al. - 2001 |

11 |
On detection of the number of signals when the noise covariance matrix is arbitrary
- Zhao, Krishnaiah, et al.
- 1986
Citation Context ... noise is modelled. Model selection procedures (MSP’s) can be used to give an estimate of the required number of sources. MSP’s based on an eigenvalue decomposition have been studied extensively [3], [4], [5], [6], [7], [8], [9], [10]. A disadvantage of these MSP’s is that they only give an estimate of the number of uncorrelated sources. Other MSP’s, based on the residuals of a source analysis, have ... |

11 |
On the distributional properties of model selection criteria
- Zhang
- 1992
Citation Context ... if less sensors are used, then the penalty of the BIC can reduce to that of the AIC. Therefore, to avoid over fitting the BIC should be used only when the penalty term is larger than that of the AIC [37]. The WA is also good: a tendency to under fit when the sources are close and correct decisions when sources are distant. The tendency to under fit for close sources is less for colored pure error. Th... |

8 |
Detection tests for array processing in unknown correlated noise fields
- Stoica, Cedervall
- 1997
Citation Context ...Model selection procedures (MSP’s) can be used to give an estimate of the required number of sources. MSP’s based on an eigenvalue decomposition have been studied extensively [3], [4], [5], [6], [7], [8], [9], [10]. A disadvantage of these MSP’s is that they only give an estimate of the number of uncorrelated sources. Other MSP’s, based on the residuals of a source analysis, have also been reported, ... |

8 |
Estimating stationary dipoles from MEG/EEG data contaminated with spatially and temporally correlated background noise
- Munck, Huizenga, et al.
- 2002
Citation Context ...ad his eyes open. The data consisted of 100 trials of 100 samples. One sensor was excluded from analysis. The 150×150 spatial pure error covariance matrix was computed with the algorithm described in [35]. The resulting pure error absolute correlations ranged from 0.00 to 0.80 with an average of 0.16. Inverse computations. The correct head model was used for the inverse computations. When the pure err... |

8 | Spatiotemporal EEG/MEG source analysis based on a parametric noise covariance model - Huizenga, Munck, et al. - 2002 |

7 |
Simulation studies of multiple dipole neuromagnetic source localization: model order and limits of source resolution
- Supek, Aine
- 1993
Citation Context ...ysis yields an estimate of the sources of the electro-encephalogram (EEG) or magneto-encephalogram (MEG) [1]. Precise estimates of the sources can only be obtained if the number of sources is correct [2]. That is, if activity related to the stimulus, and no noise is modelled. Model selection procedures (MSP’s) can be used to give an estimate of the required number of sources. MSP’s based on an eigenv... |

7 |
Information and an Extension of the Maximum Likelihood Principle
- Akaike
- 1973
Citation Context ...ed, then Cp satisfies C5 but not C6. Akaike information criterion. The Akaike information criterion (AIC) compares the loglikelihood function of the total error to a penalty term for different models [28]. If the pure error is white and its variance is estimated by $s^2$, then $\mathrm{AIC} = \ln(\pi s^2) + e'e/s^2 + 2p$ (9). The model with the smallest AIC value is selected [29]. Criterion C1 is satisfied by $s^2$. ... |

7 |
Optimal measurement conditions for spatiotemporal EEG/MEG source analysis
- Huizenga, Heslenfeld, et al.
- 2002
Citation Context ... $s^2(\hat\theta)\,R(\hat\theta)\,C\,R(\hat\theta)'$, with $R(\hat\theta)$ a $q \times p$ matrix of first order partial derivatives of $r(\hat\theta)$ with respect to the $p$ source parameters. The Wald test for white pure error is then defined as [34] $W = \frac{(r(\hat\theta) - r_h)'\,[R(\hat\theta)\,C\,R(\hat\theta)']^{-1}\,(r(\hat\theta) - r_h)}{q\,s^2(\hat\theta)}$ (11). $W$ is approximately F distributed with $q$ and $m - p$ df. The Wald amplitude (WA) test, tests if source amplitudes deviate fro... |

6 |
Estimating and testing the sources of evoked potentials
- Huizenga, Molenaar
- 1994
Citation Context ...r of uncorrelated sources. Other MSP’s, based on the residuals of a source analysis, have also been reported, for example the residual variance [11], the chisquare [1], [2], and lack of fit statistic [12]. These MSP’s give an estimate of the number of uncorrelated or correlated sources. In this paper we study the second class of MSP’s more *Lourens J. Waldorp is with the Department of Psychology, Univ... |

6 |
Equivalent source estimation of scalp potential fields contaminated by heteroscedastic and correlated noise
- Huizenga, Molenaar
- 1995
Citation Context .../mn(n − 1) [13, p. 33]. If the pure error is ‘colored’, that is, is correlated and has unequal variances (heteroscedastic), then source parameters can be estimated by generalized least squares (GLS) [14]. In GLS both data and model are prewhitened by the pure error covariance matrix $\Sigma$. Generally, $\Sigma$ is unknown and is therefore estimated by $S = \sum_{i=1}^{n} (y_i - \bar{y})(y_i - \bar{y})'/n(n - 1)$, that is from trial... |

5 |
The estimation of time varying dipoles on the basis of evoked potentials
- de Munck
Citation Context ...e is modelled. Model selection procedures (MSP’s) can be used to give an estimate of the required number of sources. MSP’s based on an eigenvalue decomposition have been studied extensively [3], [4], [5], [6], [7], [8], [9], [10]. A disadvantage of these MSP’s is that they only give an estimate of the number of uncorrelated sources. Other MSP’s, based on the residuals of a source analysis, have also ... |

5 |
A comparison of moving dipole inverse solutions using EEG's and MEG's
- Cuffin
- 1985
Citation Context ...for MSP’s. The RV compares the squared total error to the squared data (data power) to give an estimate of how much of the data power remains unexplained by the model. The RV is defined as (e.g. [12], [16]) $\mathrm{RV} = 100\,\frac{e'e}{\bar{y}'\bar{y}}$ (2). A model is said to fit the data if its value is close to 0. In practice a threshold value is chosen (e.g. [17]). The threshold of 5% is chosen here. When comparing s... |

5 |
Ordinary least squares dipole localization is influenced by the reference, Electroencephalography and clinical neurophysiology 99
- Huizenga, Molenaar
- 1996
Citation Context ... $(\hat\theta)\,C = s^2(\hat\theta)(F'F)^{-1}$ be the covariance matrix of $\hat\theta$, where the $m \times p$ matrix $F$ contains the first order partial derivatives of the model $f(\hat\theta)$ with respect to the parameters [13, p. 26], [33]. The covariance matrix of $r(\hat\theta) - r_h$ is then $s^2(\hat\theta)\,R(\hat\theta)\,C\,R(\hat\theta)'$, with $R(\hat\theta)$ a $q \times p$ matrix of first order partial derivatives of $r(\hat\theta)$ with respect to the $p$ source parameters. The Wal... |

4 |
Determining the number of independent sources of the EEG: a simulation study on information criteria. Brain Topography
- Knösche, Berends, et al.
Citation Context ... selection procedures (MSP’s) can be used to give an estimate of the required number of sources. MSP’s based on an eigenvalue decomposition have been studied extensively [3], [4], [5], [6], [7], [8], [9], [10]. A disadvantage of these MSP’s is that they only give an estimate of the number of uncorrelated sources. Other MSP’s, based on the residuals of a source analysis, have also been reported, for e... |

4 |
Wald, likelihood ratio, and Lagrange multiplier tests in econometrics
- Engle
- 1984
Citation Context ... The likelihood ratio (LR) test evaluates if one model is more likely than another. The LR is defined as -2 times the difference between two log-likelihood functions corresponding to different models [25]. If the pure error is white and multivariate normal, and if the pure error variance is estimated by $s^2$ for both models, then the LR for a source model with $d$ and $d + 1$ sources is LR = e(θ̂_d)′ e(θ̂_d... |

4 |
On the processing of spatial frequencies as revealed by evoked-...
- Kenemans, Baas, et al.
- 1982
Citation Context ... VEF’s. To illustrate the use of MSP’s in experimental research, the procedures are applied to visual evoked field (VEF) data. From previous research it was expected that two sources generated the VEF [38]. A. Method. MEG data were recorded with a CTF-gradiometer system with 151 sensors. Head position was monitored with the same 151 sensors and repeatedly activated coils at the fiducial points (nasion, ... |

3 |
Another cautionary note about R²: its use in weighted least squares regression analysis
- Willett, Singer
- 1988
Citation Context ...al variance. As was shown above, the RV given in (2) does not meet C1-C6. If the pure error is colored, and the source estimates are obtained by GLS, then the data and model in (2) can be prewhitened [20]. In this manner C5 is satisfied but C6 is not. Chi-square statistic. The chi-square statistic tests whether the total error variance exceeds the pure error variance. The chi-square statistic is defin... |

3 |
Testing for lack of fit in nonlinear regression
- Neill
- 1988
Citation Context ... Lack of fit. The lack of fit (LOF) tests whether the total error variance exceeds the pure error variance. If the pure error is white, then the LOF is defined as [13], [14], [21] $\mathrm{LOF} = \frac{e'e/(m - p)}{s^2}$ (4). If the pure error is white and multivariate normally distributed, then the LOF has an F distribution with $m - p$ and $m(n - 1)$ df. A model fits if the LOF is not significan... |

3 |
Dipole separability in a neuromagnetic source analysis
- Lütkenhöner
- 1998
Citation Context ... a conservative approach is taken, and that the BIC or WA are used if a more liberal approach is desirable. The sources at angles 10°–20° were difficult to separate. This was also found in [2] and [39] in similar circumstances. In [2] the authors suggest that modelling two close sources as one can be advantageous, in the sense that standard errors of the estimates are smaller. In this respect, it i... |

2 | A course in large sample theory. Bury st Edmunds - Ferguson - 1996 |

1 |
Model comparison and R²
- Anderson-Sprecher
- 1994
Citation Context ...els, the model with the smallest number of sources which has a value equal to or below the threshold is selected. In the statistical literature the RV is in general not considered as a good MSP [18], [19]. Note that the RV is high if the modelling error is high, but also if the pure error is high. As a consequence, the MSP has a tendency to model pure error (i.e. over fit). This is due to the fact tha... |

1 |
Statistical model selection procedures
- Waldorp
- 2002
Citation Context ...e degrees of freedom. To correct this, $m$ in the denominator can be replaced by $m - p$ in Hotelling’s $T^2$, which gives the adjusted Hotelling’s $T^2$ ($aT^2$). $aT^2$ is F distributed with $m - p$ and $n - m$ df [24]. Criteria C1-C6 are now all satisfied. Likelihood ratio test. The likelihood ratio (LR) test evaluates if one model is more likely than another. The LR is defined as -2 times the difference between t... |