## Machine learning techniques for brain-computer interfaces (2004)

Venue: BIOMEDICAL ENGINEERING

Citations: 16 (3 self)

### BibTeX

```bibtex
@ARTICLE{Müller04machinelearning,
  author  = {K.-R. Müller and M. Krauledat and G. Dornhege and G. Curio and B. Blankertz},
  title   = {Machine learning techniques for brain-computer interfaces},
  journal = {BIOMEDICAL ENGINEERING},
  year    = {2004},
  pages   = {11--22}
}
```


### Abstract

This review discusses machine learning methods and their application to Brain-Computer Interfacing. A particular focus is placed on feature selection. We also point out common flaws in validating machine learning methods in the context of BCI. Finally, we provide a brief overview of the Berlin Brain-Computer Interface (BBCI).

### Citations

9441 | The Nature of Statistical Learning Theory
- Vapnik
- 1995
Citation Context: …error cannot be obtained by simply minimizing the training error (2). One way to avoid the overfitting dilemma is to restrict the complexity of the function class F from which one chooses the function f [12]. The intuition, which will be formalized in the following, is that a "simple" (e.g. linear) function that explains most of the data is preferable to a complex one (Occam's razor). Typically one introd…

5073 | Neural Networks for Pattern Recognition
- Bishop
- 1995
Citation Context: …nds to a non-linear ellipsoidal decision boundary (left). From [2]. …or regression in F is to be found. This is also implicitly done for (one hidden layer) neural networks, radial basis networks (e.g. [22, 23, 24, 20]) or boosting algorithms [25], where the input data is mapped to some representation given by the hidden layer, the RBF bumps or the hypothesis space respectively. The so-called curse of dimensionality…

3878 | Neural Networks: A Comprehensive Foundation
- Haykin
- 1994
Citation Context: …nds to a non-linear ellipsoidal decision boundary (left). From [2]. …or regression in F is to be found. This is also implicitly done for (one hidden layer) neural networks, radial basis networks (e.g. [22, 23, 24, 20]) or boosting algorithms [25], where the input data is mapped to some representation given by the hidden layer, the RBF bumps or the hypothesis space respectively. The so-called curse of dimensionality…

2775 | Introduction to Statistical Pattern Recognition, 2nd edition
- Fukunaga
- 1990
Citation Context: …and how big the variance of the data in this direction is (should be small). This can be achieved by maximizing the so-called Rayleigh coefficient of between- and within-class variance with respect to w [15, 16]. The slightly stronger assumptions have been fulfilled in several of our BCI experiments, e.g. in [17, 18]. When the optimization to obtain (regularized) Fisher's discriminant is formulated as a mathe…

2412 | A decision-theoretic generalization of on-line learning and an application to boosting
- Freund, Schapire
- 1995
Citation Context: …boundary (left). From [2]. …or regression in F is to be found. This is also implicitly done for (one hidden layer) neural networks, radial basis networks (e.g. [22, 23, 24, 20]) or boosting algorithms [25], where the input data is mapped to some representation given by the hidden layer, the RBF bumps or the hypothesis space respectively. The so-called curse of dimensionality from statistics says essenti…

2011 | Pattern Classification
- Duda, Hart, et al.
- 2001
Citation Context: …the pitfalls and point out ways around them. Let us first fix the notation and introduce the linear hyperplane classification model upon which we will rely mostly in the following (cf. Fig. 2, see e.g. [14]). In a BCI set-up we measure k = 1…K samples x_k, where x are some appropriate feature vectors in n-dimensional space. In the training data we have a class label, e.g. y_k ∈ {−1, +1}, for each sample p…
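The linear hyperplane model from this snippet, f(x) = sign(w⊤x + b) for samples x_k with labels y_k ∈ {−1, +1}, can be sketched on toy data. The Gaussian toy data and the mean-difference choice of w below are illustrative assumptions, not the classifier used in the paper:

```python
import numpy as np

# Hypothetical toy data: K = 40 samples x_k in n = 2 dimensions,
# one Gaussian cloud per class, with labels y_k in {-1, +1}.
rng = np.random.default_rng(0)
X_pos = rng.normal(loc=+1.0, size=(20, 2))
X_neg = rng.normal(loc=-1.0, size=(20, 2))
X = np.vstack([X_pos, X_neg])
y = np.array([+1] * 20 + [-1] * 20)

# Linear hyperplane classifier f(x) = sign(w^T x + b).
# w is chosen here as the difference of the class means and b places the
# hyperplane through their midpoint -- a crude illustrative choice.
m_pos, m_neg = X_pos.mean(axis=0), X_neg.mean(axis=0)
w = m_pos - m_neg
b = -w @ (m_pos + m_neg) / 2

predictions = np.sign(X @ w + b)
accuracy = np.mean(predictions == y)
```

With well-separated class means, this simple hyperplane already classifies most of the toy samples correctly; the paper's point is how to choose w and b well when data are scarce and noisy.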

1360 | A training algorithm for optimal margin classifiers
- Boser, Guyon, et al.
- 1992
Citation Context: …algorithm in this space. Fortunately, for certain feature spaces F and corresponding mappings Φ there is a highly effective trick for computing scalar products in feature spaces using kernel functions [27, 28, 29, 12]. Let us come back to the example from Eq. (5). Here, the computation of a scalar product between two feature space vectors can be readily reformulated in terms of a kernel function k: Φ(x)⊤Φ(y) = (…

1283 | Adaptive Filter Theory
- Haykin
- 2001
Citation Context: …on the 'learning machine', which also holds the potential of adapting to specific tasks and changing environments, given that suitable machine learning (e.g. [2]) and adaptive signal processing (e.g. [10]) algorithms are used. Short training times, however, imply the challenge that only few data samples are available for… The studies were partly supported by the Bundesministerium für Bildung und Forsch…

1092 | Nonlinear component analysis as a kernel eigenvalue problem
- Schölkopf, Smola, et al.
- 1998
Citation Context: …=: k(x, y). This finding generalizes: for x, y ∈ ℝ^N and d ∈ ℕ, the kernel function k(x, y) = (x⊤y)^d computes a scalar product in the space of all products of d vector entries (monomials) of x and y [12, 30]. Note fur… [Table 1: Common kernel functions: Gaussian RBF (c ∈ ℝ), polynomial (d ∈ ℕ, θ ∈ ℝ), sigmoidal (κ, θ ∈ ℝ) and inverse multiquadric (c ∈ ℝ₊) kernel functions are among the most common ones. While RBF…]
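The identity described here, that k(x, y) = (x⊤y)^d equals a scalar product over degree-d monomials, can be checked numerically. A minimal sketch for d = 2 using the explicit monomial map from Eq. (5):

```python
import numpy as np

def phi(v):
    """Second-order monomial map from Eq. (5): R^2 -> R^3."""
    x1, x2 = v
    return np.array([x1 ** 2, np.sqrt(2) * x1 * x2, x2 ** 2])

def poly_kernel(x, y, d=2):
    """Polynomial kernel k(x, y) = (x^T y)^d."""
    return (x @ y) ** d

x = np.array([1.0, 2.0])
y = np.array([3.0, 0.5])

# The explicit feature-space scalar product equals the kernel value,
# computed without ever forming phi(x) or phi(y).
assert np.isclose(phi(x) @ phi(y), poly_kernel(x, y))
```

For this x and y both sides evaluate to (x⊤y)² = 4² = 16; the kernel side needs only the 2-dimensional scalar product, which is the point of the trick.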

1014 | The use of multiple measurements in taxonomic problems
- Fisher
- 1936
Citation Context: …and how big the variance of the data in this direction is (should be small). This can be achieved by maximizing the so-called Rayleigh coefficient of between- and within-class variance with respect to w [15, 16]. The slightly stronger assumptions have been fulfilled in several of our BCI experiments, e.g. in [17, 18]. When the optimization to obtain (regularized) Fisher's discriminant is formulated as a mathe…

755 | An introduction to variable and feature selection
- Guyon, Elisseeff
- 2003
Citation Context: …interest, are taken into account. Furthermore, and very important in practice, we can discard non-informative dimensions of the data and thus select the features of interest for classification (see e.g. [41]). Straightforward as this may appear, it is in fact a machine learning art of its own, since we must decide on features that don't overfit the training samples but rather generalize to yet unknown te…
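Discarding non-informative dimensions can be illustrated with a simple filter criterion. The toy setup below (one planted informative feature, ranking by absolute label correlation) is a hypothetical sketch, not a selection method from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)
K, n = 200, 10                      # K samples, n candidate features
y = rng.choice([-1, 1], size=K)     # class labels
X = rng.normal(size=(K, n))         # pure noise features...
X[:, 3] += y                        # ...except feature 3, which carries the label

# Filter-style criterion: rank features by absolute correlation with y.
scores = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(n)])
best = int(np.argmax(scores))       # index of the most informative feature
```

On this data only feature 3 correlates with the label, so the ranking recovers it; the caveat in the text still applies, since such scores must be computed on training data only to avoid overfitting the selection itself.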

518 | A tutorial on support vector regression
- Smola, Schölkopf
- 1998
Citation Context: …and inverse multiquadric (c ∈ ℝ₊) kernel functions are among the most common ones. While RBF and polynomial kernels are known to fulfill Mercer's condition, this is not strictly the case for sigmoidal kernels [31]. Further valid kernels proposed in the context of regularization networks are e.g. multiquadric or spline kernels [13, 32, 33]. [Table 1: Gaussian RBF k(x, y) = exp(−‖x − y‖²/c); polynomial (x⊤y + θ)^d; sigmoidal tanh(κ(x⊤y) + θ); inverse multiquadric 1/√(‖x − y‖² + c²).] Note furthermore that using a particular SV kernel corresponds to an implicit c…
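The four kernel functions from Table 1 can be written out directly; a sketch using the table's parameter names (c, d, θ, κ):

```python
import numpy as np

# The four common kernels from Table 1, written out explicitly.
def gaussian_rbf(x, y, c=1.0):
    return np.exp(-np.sum((x - y) ** 2) / c)

def polynomial(x, y, theta=0.0, d=2):
    return (x @ y + theta) ** d

def sigmoidal(x, y, kappa=1.0, theta=0.0):
    return np.tanh(kappa * (x @ y) + theta)

def inv_multiquadric(x, y, c=1.0):
    return 1.0 / np.sqrt(np.sum((x - y) ** 2) + c ** 2)

x, y = np.array([1.0, 0.0]), np.array([0.0, 1.0])
# RBF of a point with itself is always 1; inv. multiquadric peaks at 1/c.
k_self = gaussian_rbf(x, x)
```

As the snippet notes, RBF and polynomial kernels satisfy Mercer's condition for all listed parameter values, whereas the sigmoidal kernel does so only for some (κ, θ).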

496 | Fast learning in networks of locally-tuned processing units
- Moody, Darken
- 1989
Citation Context: …nds to a non-linear ellipsoidal decision boundary (left). From [2]. …or regression in F is to be found. This is also implicitly done for (one hidden layer) neural networks, radial basis networks (e.g. [22, 23, 24, 20]) or boosting algorithms [25], where the input data is mapped to some representation given by the hidden layer, the RBF bumps or the hypothesis space respectively. The so-called curse of dimensionality…

398 | An introduction to kernel-based learning algorithms
- Müller, Mika, et al.
- 2001
Citation Context: …impose the main load of the learning task on the 'learning machine', which also holds the potential of adapting to specific tasks and changing environments, given that suitable machine learning (e.g. [2]) and adaptive signal processing (e.g. [10]) algorithms are used. Short training times, however, imply the challenge that only few data samples are available for… The studies were partly supported by t…

389 | Convolution kernels on discrete structures
- Haussler
- 1999
Citation Context: …regularization operator (cf. [33, 34]). Table 1 lists some of the most widely used kernel functions. More sophisticated kernels (e.g. kernels generating splines or Fourier expansions) can be found in [35, 36, 33, 37, 38, 39]. The interesting point about kernel functions is that the scalar product can be implicitly computed in F, without explicitly using or even knowing the mapping Φ. So, kernels allow to compute scalar…

376 | Brain-computer interfaces for communication and control
- Wolpaw, Birbaumer, et al.

341 | Fisher Discriminant Analysis with Kernels
- Mika, Rätsch, Weston, et al.
- 1999
Citation Context: …construct a nonlinear version of a linear algorithm. Examples of such kernel-based learning machines are, among others, Support Vector Machines (SVMs) [12, 2], Kernel Fisher Discriminant (KFD) [40] or Kernel Principal Component Analysis (KPCA) [30]. 3.4. Discussion. To summarize: a small error on unseen data cannot be obtained by simply minimizing the training error; on the contrary, this will i…

292 | Theoretical foundations of the potential function method in pattern recognition learning
- Aizerman, Braverman, et al.
- 1964
Citation Context: …algorithm in this space. Fortunately, for certain feature spaces F and corresponding mappings Φ there is a highly effective trick for computing scalar products in feature spaces using kernel functions [27, 28, 29, 12]. Let us come back to the example from Eq. (5). Here, the computation of a scalar product between two feature space vectors can be readily reformulated in terms of a kernel function k: Φ(x)⊤Φ(y) = (…

285 | Regularization Algorithms for Learning That Are Equivalent to Multilayer Networks
- Poggio, Girosi
- 1990
Citation Context: …known to fulfill Mercer's condition, this is not strictly the case for sigmoidal kernels [31]. Further valid kernels proposed in the context of regularization networks are e.g. multiquadric or spline kernels [13, 32, 33]. [Table 1: Gaussian RBF k(x, y) = exp(−‖x − y‖²/c); polynomial (x⊤y + θ)^d; sigmoidal tanh(κ(x⊤y) + θ); inverse multiquadric 1/√(‖x − y‖² + c²).] Note furthermore that using a particular SV kernel corresponds to an implicit c…

263 | Soft margins for AdaBoost
- Rätsch, Onoda, et al.
- 2001
Citation Context: …and on the right the original decision line is plotted with dots. Illustrated is the noise sensitivity: only one strong noise/outlier pattern can spoil the whole estimation of the decision line. From [21]. 3.3. Beyond linear classifiers. Kernel-based learning has taken the step from linear to nonlinear classification in a particularly interesting and efficient manner: a linear algorithm is applied in…

208 | An equivalence between sparse approximation and support vector machines
- Girosi
- 1998
Citation Context: …[Table 1: polynomial (x⊤y + θ)^d; sigmoidal tanh(κ(x⊤y) + θ); inverse multiquadric 1/√(‖x − y‖² + c²).] Note furthermore that using a particular SV kernel corresponds to an implicit choice of a regularization operator (cf. [33, 34]). Table 1 lists some of the most widely used kernel functions. More sophisticated kernels (e.g. kernels generating splines or Fourier expansions) can be found in [35, 36, 33, 37, 38, 39]. The interes…

152 | The connection between regularization operators and support vector kernels
- Smola, Schölkopf, et al.
- 1998
Citation Context: …known to fulfill Mercer's condition, this is not strictly the case for sigmoidal kernels [31]. Further valid kernels proposed in the context of regularization networks are e.g. multiquadric or spline kernels [13, 32, 33]. [Table 1: Gaussian RBF k(x, y) = exp(−‖x − y‖²/c); polynomial (x⊤y + θ)^d; sigmoidal tanh(κ(x⊤y) + θ); inverse multiquadric 1/√(‖x − y‖² + c²).] Note furthermore that using a particular SV kernel corresponds to an implicit c…

138 | Optimal spatial filtering of single trial EEG during imagined hand movement
- Ramoser, Müller-Gerking, et al.
- 2000
Citation Context: …t the involved brain functions (frequency filtering, spatial filtering, …), or one may recruit projection techniques from the theory of supervised learning, e.g., a common spatial pattern analysis ([16, 43, 44]), or from unsupervised learning, e.g., independent component analysis ([45, 46, 47]). (2) A neurophysiological assessment of the selected features may lead to a better understanding of the involved br…

108 | Brain-computer interface technology: A review of the first international meeting
- Wolpaw, Birbaumer
- 2000
Citation Context: …I. Finally we provide a brief overview of the Berlin Brain-Computer Interface (BBCI). 1. INTRODUCTION. Brain-Computer Interfacing is an interesting, active and highly interdisciplinary research topic ([3, 4, 5, 6]) at the interface between medicine, psychology, neurology, rehabilitation engineering, man-machine interaction, machine learning and signal processing. A BCI could, e.g., allow a paralyzed patient to…

107 | Engineering support vector machine kernels that recognize translation initiation sites
- Zien, Rätsch, et al.
Citation Context: …regularization operator (cf. [33, 34]). Table 1 lists some of the most widely used kernel functions. More sophisticated kernels (e.g. kernels generating splines or Fourier expansions) can be found in [35, 36, 33, 37, 38, 39]. The interesting point about kernel functions is that the scalar product can be implicitly computed in F, without explicitly using or even knowing the mapping Φ. So, kernels allow to compute scalar…

80 | Pattern Classification: A Unified View of Statistical and Neural Approaches
- Schürmann
- 1996
Citation Context: …from the toy example in Figure 4: in two dimensions a rather complicated nonlinear decision surface is necessary to separate the classes, whereas in a feature space of second-order monomials (see e.g. [26]) Φ: ℝ² → ℝ³, (x₁, x₂) ↦ (z₁, z₂, z₃) := (x₁², √2·x₁x₂, x₂²) (5), all one needs for separation is a linear hyperplane. In this simple toy example, we can easily control both the statistical c…

79 | Classifying single trial EEG: Towards brain-computer interfacing
- Blankertz, Curio, et al.
- 2002
Citation Context: …ing the so-called Rayleigh coefficient of between- and within-class variance with respect to w [15, 16]. The slightly stronger assumptions have been fulfilled in several of our BCI experiments, e.g. in [17, 18]. When the optimization to obtain (regularized) Fisher's discriminant is formulated as a mathematical program, cf. [19, 2], it resembles the SVM: min_{w,b,ξ} ½‖w‖²₂ + (C/K)‖ξ‖²₂ subject to y_k(…
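The Rayleigh-coefficient view of Fisher's discriminant admits a closed-form solution in the classical (non-kernel) case; the sketch below uses a small ridge term as a stand-in for regularization and is not the exact mathematical program quoted above:

```python
import numpy as np

def fisher_direction(X_pos, X_neg, reg=1e-3):
    """Classical (non-kernel) Fisher discriminant: w maximizes the
    Rayleigh coefficient of between- to within-class variance, with the
    closed-form solution w ∝ (S_W + reg*I)^{-1} (m_pos - m_neg)."""
    m_pos, m_neg = X_pos.mean(axis=0), X_neg.mean(axis=0)
    # Within-class scatter estimated here as the sum of class covariances.
    S_w = np.cov(X_pos, rowvar=False) + np.cov(X_neg, rowvar=False)
    return np.linalg.solve(S_w + reg * np.eye(S_w.shape[0]), m_pos - m_neg)

rng = np.random.default_rng(2)
X_pos = rng.normal(loc=[+1.0, 0.0], size=(50, 2))
X_neg = rng.normal(loc=[-1.0, 0.0], size=(50, 2))
w = fisher_direction(X_pos, X_neg)

# Projecting onto w separates the class means along one dimension.
separation = (X_pos @ w).mean() - (X_neg @ w).mean()
```

The ridge term plays the same role as the C/K penalty in the quoted program: it keeps the solution stable when the within-class scatter is badly conditioned, as is typical for few, noisy EEG trials.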

78 | On a kernel-based method for pattern recognition, regression, approximation and operator inversion
- Smola, Schölkopf
- 1998
Citation Context: …1 otherwise (the so-called 0/1-loss). The same framework can be applied for regression problems, where y ∈ ℝ. Here, the most common loss function is the squared loss: l(f(x), y) = (f(x) − y)²; see [11] for a discussion of other loss functions. Unfortunately the risk cannot be minimized directly, since the underlying probability distribution P(x, y) is unknown. Therefore, we have to try to estimate a…
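The two loss functions mentioned in this snippet are easy to state in code; a minimal sketch:

```python
import numpy as np

def zero_one_loss(f_x, y):
    """0/1-loss for classification: 0 if the predicted sign agrees
    with the label y in {-1, +1}, and 1 otherwise."""
    return float(np.sign(f_x) != y)

def squared_loss(f_x, y):
    """Squared loss for regression: l(f(x), y) = (f(x) - y)^2."""
    return (f_x - y) ** 2

# A correct sign incurs no 0/1-loss, regardless of the margin size.
loss_correct = zero_one_loss(0.7, +1)    # 0.0
loss_wrong = zero_one_loss(-0.2, +1)     # 1.0
loss_sq = squared_loss(1.5, 1.0)         # 0.25
```

As the text notes, one would like to minimize the expected (risk) value of such losses under P(x, y), but since P is unknown only empirical averages over the training sample are available.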

78 | Priors, stabilizers and basis functions: From regularization to radial, tensor and additive splines
- Girosi, Jones, et al.
- 1993
Citation Context: …known to fulfill Mercer's condition, this is not strictly the case for sigmoidal kernels [31]. Further valid kernels proposed in the context of regularization networks are e.g. multiquadric or spline kernels [13, 32, 33]. [Table 1: Gaussian RBF k(x, y) = exp(−‖x − y‖²/c); polynomial (x⊤y + θ)^d; sigmoidal tanh(κ(x⊤y) + θ); inverse multiquadric 1/√(‖x − y‖² + c²).] Note furthermore that using a particular SV kernel corresponds to an implicit c…

76 | Event-related EEG/MEG synchronization and desynchronization: basic principles
- Pfurtscheller, Lopes da Silva
- 1999
Citation Context: …s in the scalp plots) for left (thin line) and right (thick line) hand motor imagery. The contralateral attenuation of the μ-rhythm during motor imagery is clearly observable. For details on ERD, see [54]. …the dimensionality of features is. The reason for the very bad performance of k-NN is that the underlying Euclidean metric is not appropriate for the bad signal-to-noise ratio found in EEG trials. Fo…

64 | A spelling device for the paralysed
- Birbaumer, Ghanayim, et al.
- 1999
Citation Context: …ng on the adaptability of the human brain to biofeedback, i.e., a subject learns the mental states required to be understood by the machines, an endeavour that can take months until it reliably works [8, 9]. The Berlin Brain-Computer Interface (BBCI) pursues another objective in this respect, i.e., to impose the main load of the learning task on the 'learning machine', which also holds the potential of…

58 | Boosting bit rates and error detection for the classification of fast-paced motor commands based on single-trial EEG analysis
- Blankertz, Dornhege, et al.
- 2003
Citation Context: …ing the so-called Rayleigh coefficient of between- and within-class variance with respect to w [15, 16]. The slightly stronger assumptions have been fulfilled in several of our BCI experiments, e.g. in [17, 18]. When the optimization to obtain (regularized) Fisher's discriminant is formulated as a mathematical program, cf. [19, 2], it resembles the SVM: min_{w,b,ξ} ½‖w‖²₂ + (C/K)‖ξ‖²₂ subject to y_k(…

58 | A mathematical programming approach to the kernel Fisher algorithm
- Mika, Rätsch, et al.
Citation Context: …r assumptions have been fulfilled in several of our BCI experiments, e.g. in [17, 18]. When the optimization to obtain (regularized) Fisher's discriminant is formulated as a mathematical program, cf. [19, 2], it resembles the SVM: min_{w,b,ξ} ½‖w‖²₂ + (C/K)‖ξ‖²₂ subject to y_k(w⊤x_k + b) = 1 − ξ_k for k = 1, …, K. 3.2. Some remarks about regularization and non-robust classifiers. Linear classifier…

50 | Linear and nonlinear methods for brain-computer interfaces
- Müller, Anderson, et al.
Citation Context: …um für Bildung und Forschung (BMBF), FKZ 01IBB02A and FKZ 01IBB02B, by the Deutsche Forschungsgemeinschaft (DFG), FOR 375/B1 and the PASCAL Network of Excellence, EU # 506778. This review is based on [1, 2]. …learning to characterize the individual brain states to be distinguished. In particular when dealing with few samples of data (trials of the training sessio…

49 | Brain-computer interface research at the Wadsworth center
- Wolpaw, McFarland, et al.
- 2000
Citation Context: …ng on the adaptability of the human brain to biofeedback, i.e., a subject learns the mental states required to be understood by the machines, an endeavour that can take months until it reliably works [8, 9]. The Berlin Brain-Computer Interface (BBCI) pursues another objective in this respect, i.e., to impose the main load of the learning task on the 'learning machine', which also holds the potential of…

40 | Brain-computer communication: unlocking the locked in
- Kübler, Kotchoubey, et al.
- 2001
Citation Context: …I. Finally we provide a brief overview of the Berlin Brain-Computer Interface (BBCI). 1. INTRODUCTION. Brain-Computer Interfacing is an interesting, active and highly interdisciplinary research topic ([3, 4, 5, 6]) at the interface between medicine, psychology, neurology, rehabilitation engineering, man-machine interaction, machine learning and signal processing. A BCI could, e.g., allow a paralyzed patient to…

34 | Support vector regression with ANOVA decomposition kernels
- Stitson, Gammerman, et al.
- 1997
Citation Context: …regularization operator (cf. [33, 34]). Table 1 lists some of the most widely used kernel functions. More sophisticated kernels (e.g. kernels generating splines or Fourier expansions) can be found in [35, 36, 33, 37, 38, 39]. The interesting point about kernel functions is that the scalar product can be implicitly computed in F, without explicitly using or even knowing the mapping Φ. So, kernels allow to compute scalar…

31 | Learning to control brain activity: A review of the production and control of EEG components for driving brain-computer interface (BCI) systems
- Curran, Stokes
- 2003
Citation Context: …I. Finally we provide a brief overview of the Berlin Brain-Computer Interface (BBCI). 1. INTRODUCTION. Brain-Computer Interfacing is an interesting, active and highly interdisciplinary research topic ([3, 4, 5, 6]) at the interface between medicine, psychology, neurology, rehabilitation engineering, man-machine interaction, machine learning and signal processing. A BCI could, e.g., allow a paralyzed patient to…

31 | Support vector channel selection for BCI
- Lal, Schröder, et al.
- 2004
Citation Context: …so been in use in the BBCI system. Note however that we will not be exhaustive in our exposition; for further references on feature selection in general see e.g. [41, 14] or in the context of BCI see [17, 42]. Suppose for each epoch of recorded brain signals one has a multi-dimensional vector x. Each dimension of that vector is called a feature, and the whole vector is called a feature vector. The samples a…

30 | Support Vector Learning
- Schölkopf
- 1997

28 | Neuroimage of voluntary movement: topography of the Bereitschaftspotential, a 64-channel DC current source density study
- Cui, Huter, et al.
Citation Context: …motor tasks, a negative readiness potential precedes the actual execution. Using multi-channel EEG recordings it has been demonstrated that several brain areas contribute to this negative shift (cf. [51, 52]). In unilateral finger or hand movements the negative shift is mainly focussed on the frontal lobe in the area of the corresponding motor cortex, i.e., contralateral to the performing hand. Based on…

23 | Linear neural networks
- Orr
- 2010
Citation Context: …results for linear learning machines, it can be even more devastating for nonlinear methods. A more formal way to control one's mistrust in the available training data is to use regularization (e.g. [13, 20]). Regularization helps to limit (a) the influence of outliers or strong noise (e.g. to avoid Fig. 3 middle), (b) the complexity of the classifier (e.g. to avoid Fig. 3 right) and (c) the raggedness of…
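The limiting effect of regularization on outliers and strong noise can be illustrated with ridge regression, i.e. least squares with an L2 penalty in closed form; the toy data with one planted outlier is a hypothetical setup:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(30, 5))
w_true = np.array([1.0, -1.0, 0.5, 0.0, 0.0])
y = X @ w_true + 0.1 * rng.normal(size=30)
y[0] += 20.0                      # one strong outlier spoiling the fit

def ridge(X, y, lam):
    """Least squares with an L2 penalty lam*||w||^2 (closed form):
    w = (X^T X + lam*I)^{-1} X^T y."""
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

w_plain = ridge(X, y, lam=0.0)    # ordinary least squares
w_reg = ridge(X, y, lam=10.0)     # regularized solution
# The penalty shrinks the weight vector, limiting the outlier's pull.
norm_plain = np.linalg.norm(w_plain)
norm_reg = np.linalg.norm(w_reg)
```

This corresponds to point (a) in the snippet: the penalized solution cannot chase the single corrupted target value as aggressively as the unregularized one.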

21 | Independent component analysis of non-invasively recorded cortical magnetic DC-fields in humans
- Wübbeler, Ziehe, et al.
- 2000
Citation Context: …e may recruit projection techniques from the theory of supervised learning, e.g., a common spatial pattern analysis ([16, 43, 44]), or from unsupervised learning, e.g., independent component analysis ([45, 46, 47]). (2) A neurophysiological assessment of the selected features may lead to a better understanding of the involved brain functions, and, as a consequence, to further improvements of the algorithms. (3) On…

19 | Negative cortical DC shifts preceding and accompanying simple and complex sequential movements
- Lang, Zilch, et al.
- 1989
Citation Context: …motor tasks, a negative readiness potential precedes the actual execution. Using multi-channel EEG recordings it has been demonstrated that several brain areas contribute to this negative shift (cf. [51, 52]). In unilateral finger or hand movements the negative shift is mainly focussed on the frontal lobe in the area of the corresponding motor cortex, i.e., contralateral to the performing hand. Based on…

13 | Blind source separation techniques for decomposing evoked brain signals
- Müller, Vigário, et al.
Citation Context: …e may recruit projection techniques from the theory of supervised learning, e.g., a common spatial pattern analysis ([16, 43, 44]), or from unsupervised learning, e.g., independent component analysis ([45, 46, 47]). (2) A neurophysiological assessment of the selected features may lead to a better understanding of the involved brain functions, and, as a consequence, to further improvements of the algorithms. (3) On…

12 | Theory of Reproducing Kernels and its Applications
- Saitoh
- 1988
Citation Context: …algorithm in this space. Fortunately, for certain feature spaces F and corresponding mappings Φ there is a highly effective trick for computing scalar products in feature spaces using kernel functions [27, 28, 29, 12]. Let us come back to the example from Eq. (5). Here, the computation of a scalar product between two feature space vectors can be readily reformulated in terms of a kernel function k: Φ(x)⊤Φ(y) = (…

8 | Boosting Bit Rates in Noninvasive EEG Single-Trial Classifications by Feature Combination and Multiclass Paradigms
- Dornhege, Blankertz, et al.
- 2004
Citation Context: …t the involved brain functions (frequency filtering, spatial filtering, …), or one may recruit projection techniques from the theory of supervised learning, e.g., a common spatial pattern analysis ([16, 43, 44]), or from unsupervised learning, e.g., independent component analysis ([45, 46, 47]). (2) A neurophysiological assessment of the selected features may lead to a better understanding of the involved br…

5 | The Berlin Brain-Computer Interface: Machine-learning based detection of user specific brain states
- Blankertz, Dornhege, et al.
- 2007
Citation Context: …ction research, the communication channel from a healthy human's brain to a computer has not yet been subject to intensive exploration; however, it has potential, e.g., to speed up reaction times, cf. [7], or to supply a better understanding of a human operator's mental states. Classical BCI technology has been mainly relying on the adaptability of the human brain to biofeedback, i.e., a subject learns…

3 | A fast algorithm for joint diagonalization with non-orthogonal transformations and its application to blind source separation
- Ziehe, Kawanabe, et al.
- 2004
Citation Context: …e may recruit projection techniques from the theory of supervised learning, e.g., a common spatial pattern analysis ([16, 43, 44]), or from unsupervised learning, e.g., independent component analysis ([45, 46, 47]). (2) A neurophysiological assessment of the selected features may lead to a better understanding of the involved brain functions, and, as a consequence, to further improvements of the algorithms. (3) On…

2 | Increase information transfer rates in BCI by CSP extension to multi-class
- Dornhege, Blankertz, et al.
Citation Context: …chines learn'. To this end, the machine learning and feature selection methods presented in the previous sections are applied to EEG data from selected BBCI paradigms: self-paced [17, 18] and imagined [49, 44, 50] experiments. 6.1. Self-paced Finger Tapping Experiments. In preparation of motor tasks, a negative readiness potential precedes the actual execution. Using multi-channel EEG recordings it has been dem…