Results 11–20 of 20
MEG Source Localization using an MLP with a Distributed Output Representation
Abstract

Cited by 6 (3 self)
We present a system that takes realistic magnetoencephalographic (MEG) signals and localizes a single dipole to reasonable accuracy in real time. At its heart is a multilayer perceptron (MLP) which takes the sensor measurements as inputs, uses one hidden layer, and generates as outputs the amplitudes of receptive fields holding a distributed representation of the dipole location. We trained this SoftMLP on dipolar sources with real brain noise and converted the network's output into an explicit Cartesian coordinate representation of the dipole location using two different decoding strategies. The proposed SoftMLPs are much more accurate than previous networks which output source locations in Cartesian coordinates. Hybrid SoftMLP-start-LM systems, in which the SoftMLP output initializes Levenberg-Marquardt, retained their accuracy of 0.28 cm while decreasing computation time from 36 ms to 30 ms. We apply the SoftMLP localizer to real MEG data separated by a blind source separation algorithm, and compare the SoftMLP dipole locations to those of a conventional system.
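The paper leaves its two decoding strategies unspecified; a minimal sketch of one plausible strategy is the amplitude-weighted centroid of the receptive-field centers. The centers and amplitudes below are illustrative, not taken from the paper.

```python
# Hypothetical decoding of a distributed output layer: each output unit is a
# Gaussian receptive field with a fixed 3-D center, and the dipole location is
# recovered as the amplitude-weighted centroid of those centers.

def decode_centroid(centers, amplitudes):
    """Amplitude-weighted centroid of receptive-field centers (x, y, z)."""
    total = sum(amplitudes)
    return tuple(
        sum(a * c[i] for a, c in zip(amplitudes, centers)) / total
        for i in range(3)
    )

# Two receptive fields on the x-axis; activity split 3:1 pulls the
# estimate toward the first center.
centers = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
amplitudes = [0.75, 0.25]
print(decode_centroid(centers, amplitudes))  # (0.5, 0.0, 0.0)
```

A weighted centroid degrades gracefully under noise, since small spurious amplitudes only perturb the estimate slightly; a winner-take-all readout would be the simpler alternative.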
A Very Fast Learning Method for Neural Networks Based On Sensitivity
 Journal of Machine Learning Research
, 2006
Abstract

Cited by 6 (2 self)
This paper introduces a learning method for two-layer feedforward neural networks based on sensitivity analysis, which uses a linear training algorithm for each of the two layers. First, random values are assigned to the outputs of the first layer; later, these initial values are updated based on sensitivity formulas, which use the weights in each of the layers; the process is repeated until convergence. Since these ...
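The idea of treating the first layer's outputs as free variables, so that each layer reduces to a linear problem, can be caricatured as follows. This is a simplified linear-activation sketch with a plain gradient nudge on the hidden outputs, not the paper's actual sensitivity formulas.

```python
import numpy as np

# Caricature: hidden-layer outputs Z start random, each weight layer is then a
# linear least-squares problem given Z, and Z is nudged down the gradient of
# the combined layer errors before re-solving. All sizes are illustrative.

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                  # inputs
Y = X @ np.array([[1.0], [-2.0], [0.5]])       # targets from a linear map

Z = rng.normal(size=(200, 5))                  # random initial hidden outputs

def layer_fit(Z):
    W1 = np.linalg.lstsq(X, Z, rcond=None)[0]  # input -> hidden, linear solve
    W2 = np.linalg.lstsq(Z, Y, rcond=None)[0]  # hidden -> output, linear solve
    return W1, W2

W1, W2 = layer_fit(Z)
err0 = np.linalg.norm(Z @ W2 - Y)
for _ in range(100):
    # Gradient of ||X W1 - Z||^2 + ||Z W2 - Y||^2 with respect to Z.
    grad = 2 * (Z - X @ W1) + 2 * (Z @ W2 - Y) @ W2.T
    Z -= 0.02 * grad
    W1, W2 = layer_fit(Z)
err1 = np.linalg.norm(Z @ W2 - Y)
print(err1 < err0)  # the output fit improves as Z is updated
```

The appeal of this family of methods is that both weight updates are closed-form solves rather than iterative nonlinear optimization.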
Fast Robust Subject-Independent Magnetoencephalographic Source Localization Using an Artificial Neural Network
 Human Brain Mapping
, 2005
Abstract

Cited by 1 (0 self)
We describe a system that localizes a single dipole to reasonable accuracy from noisy magnetoencephalographic (MEG) measurements in real time. At its core is a multilayer perceptron (MLP) trained to map sensor signals and head position to dipole location. Including head position overcomes the previous need to retrain the MLP for each subject and session. The training dataset was generated by mapping randomly chosen dipoles and head positions through an analytic model and adding noise from real MEG recordings. After training, a localization took 0.7 ms with an average error of 0.90 cm. A few iterations of a Levenberg-Marquardt routine using the MLP output as its initial guess took 15 ms and improved accuracy to 0.53 cm, which approaches the natural limit on accuracy imposed by noise. We applied these methods to localize single dipole sources from MEG components isolated by blind source separation and compared the estimated locations to those generated by standard manually assisted commercial software. Hum Brain Mapp 24:21–34, 2005. © 2004 Wiley-Liss, Inc.
On-Line Stochastic Functional Smoothing Optimization for Neural Network Training
, 1997
Abstract

Cited by 1 (1 self)
A set of new algorithms, based on an on-line implementation of a well-known global optimization strategy using stochastic functional smoothing, is proposed for training neural networks. These algorithms differ from other on-line global optimization approaches because they use not only first-order but also second-order (Hessian) gradient information, and therefore converge faster than first-order gradient descent search methods. A convergence and sensitivity analysis of the proposed method is provided. The on-line algorithms are compared with a second-order gradient method, momentum learning, and conjugate gradients to demonstrate their consistent global convergence, and with a conventional stochastic global optimization scheme to demonstrate their faster learning rate. Computer simulation results are presented to support the analysis. Keywords: stochastic functional smoothing, on-line algorithms, global convergence, mean square error, ...
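The core smoothing idea can be sketched in one dimension: instead of minimizing f directly, minimize its Gaussian-smoothed version F(x) = E[f(x + b·eps)], estimating the smoothed gradient by averaging f's gradient at perturbed points. The test function, smoothing width b, and step size below are illustrative assumptions, not the paper's algorithms.

```python
import math
import random

# f(x) = x**2 + 0.3*sin(20*x): a global parabola overlaid with many shallow
# local minima. Smoothing with b = 0.5 averages the ripples away, leaving
# (approximately) the gradient of the parabola, so descent reaches the
# global basin instead of sticking in a nearby ripple.

def f_grad(x):
    # d/dx of f(x) = x**2 + 0.3*sin(20*x)
    return 2 * x + 6 * math.cos(20 * x)

def smoothed_grad(x, b=0.5, n=200, rng=random.Random(0)):
    # Monte Carlo estimate of the gradient of E[f(x + b*eps)], eps ~ N(0, 1).
    return sum(f_grad(x + b * rng.gauss(0, 1)) for _ in range(n)) / n

x = 2.0
for _ in range(100):
    x -= 0.05 * smoothed_grad(x)
print(abs(x) < 0.2)  # settles near the global basin at x = 0
```

Plain gradient descent from the same start would be trapped by the first ripple whose local slope vanishes; the smoothed objective has no such traps at this width.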
A Convergence Analysis of Log-Linear Training
Abstract

Cited by 1 (1 self)
Log-linear models are widely used probability models for statistical pattern recognition. Typically, log-linear models are trained according to a convex criterion. In recent years, the interest in log-linear models has greatly increased. The optimization of log-linear model parameters is costly and therefore an important topic, in particular for large-scale applications. Different optimization algorithms have been evaluated empirically in many papers. In this work, we analyze the optimization problem analytically and show that the training of log-linear models can be highly ill-conditioned. We verify our findings on two handwriting tasks. By making use of our convergence analysis, we obtain good results on a large-scale continuous handwriting recognition task with a simple and generic approach.
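The ill-conditioning claim is easy to reproduce on a toy example: for binary logistic (log-linear) training, the Hessian is X^T S X with S = diag(p(1-p)), so its condition number inherits that of the feature Gram matrix. The nearly collinear features below are illustrative, not the paper's handwriting data.

```python
import numpy as np

# Two strongly correlated features make the log-loss Hessian nearly singular,
# which is exactly the ill-conditioning that slows first-order training.

rng = np.random.default_rng(0)
n = 500
a = rng.normal(size=n)
X = np.column_stack([a, a + 0.01 * rng.normal(size=n)])  # nearly collinear

w = np.zeros(2)
p = 1.0 / (1.0 + np.exp(-X @ w))          # predicted probabilities (all 0.5)
S = p * (1 - p)                           # per-example curvature weights
H = X.T @ (X * S[:, None])                # Hessian of the logistic log-loss

cond = np.linalg.cond(H)
print(cond > 1e3)  # huge condition number -> slow first-order convergence
```

Gradient descent on such a Hessian needs on the order of `cond` iterations to converge, which is why the choice of optimizer matters so much at scale.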
A new margin-based criterion for efficient gradient descent
, 2003
Abstract

Cited by 1 (1 self)
Abstract. During the last few decades, several papers were published about second-order optimization methods for gradient descent based learning algorithms. Unfortunately, these methods usually cost close to O(n³) time per iteration and O(n²) space, where n is the number of parameters to optimize, which is intractable for the large optimization problems usually found in real-life applications. Moreover, these methods are usually not easy to implement. Many enhancements have been proposed to overcome these problems, but most of them still cost O(n²) time per iteration. Instead of trying to solve a hard optimization problem using complex second-order tricks, we propose to modify the problem itself so as to optimize a simpler one, by simply changing the cost function used during training. Furthermore, we argue that analyzing the Hessian resulting from the choice of various cost functions is very informative and could help in the design of new machine learning algorithms. For instance, we propose in this paper a version of the Support Vector Machines criterion applied to multilayer perceptrons, which yields very good training and generalization performance in practice. Several empirical comparisons on two benchmark data sets are given to justify this approach.
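The flavor of an SVM-style criterion on a network output can be shown with the standard hinge loss: examples already classified with margin at least 1 contribute nothing to the gradient, so only hard examples drive training. The margin threshold of 1 and the toy scores are the usual convention, used here for illustration.

```python
# Hinge criterion on a (scalar) network output score, labels in {-1, +1}.

def hinge(score, label):
    return max(0.0, 1.0 - label * score)

def hinge_grad(score, label):
    # Derivative of the hinge loss with respect to the score.
    return -label if label * score < 1.0 else 0.0

# A confidently correct example is ignored; a marginal one still drives learning.
print(hinge(2.5, +1), hinge_grad(2.5, +1))   # 0.0 0.0
print(hinge(0.2, +1), hinge_grad(0.2, +1))   # 0.8 -1
```

Because well-classified points drop out of the gradient, the effective Hessian during training involves far fewer terms than a squared-error cost, which is one way a changed criterion can sidestep the conditioning problems the abstract mentions.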
Fast robust MEG source localization using MLPs
 Biomag 2002: 13th International Conference on Biomagnetism
, 2002
Abstract
Source localization from MEG data in real time requires algorithms which are robust, fully automatic, and very fast. We present two neural network systems which are able to localize a single dipole to reasonable accuracy within a fraction of a millisecond, even when the signals are contaminated by considerable noise. The first network is a multilayer perceptron (MLP) which takes the sensor measurements as inputs, uses two hidden layers, and outputs source location in Cartesian coordinates. After training with random dipolar sources contaminated by real noise, localization of a single dipole could be performed within 300 microseconds on an 800 MHz Athlon workstation, with an average localization error of 1.15 cm. To improve the accuracy to 0.28 cm, one can apply a few iterations of conventional Levenberg-Marquardt (LM) minimization using the MLP output as the initial guess. The combined method is about twenty times faster than multi-start LM localization with comparable accuracy. In a second network with only one hidden layer, the outputs were the amplitudes of 193 evenly distributed Gaussian functions holding a soft distributed representation of the dipole location. We trained this network on dipolar sources with real noise, and externally converted the network's output into an explicit Cartesian coordinate representation of the dipole location. This new network had an improved localization accuracy of 0.87 cm, while localization time was lengthened to about 800 microseconds.
Fast accurate MEG source localization using a multilayer perceptron
 Physics in Medicine and Biology
, 2002
Abstract
Iterative gradient methods like Levenberg-Marquardt (LM) are in widespread use for source localization from electroencephalographic (EEG) and magnetoencephalographic (MEG) signals. Unfortunately LM depends sensitively on the initial guess, necessitating repeated runs. This, combined with LM's high per-step cost, makes its computational burden quite high. To reduce this burden, we trained a multilayer perceptron (MLP) as a real-time localizer. We used an analytical model of quasi-static electromagnetic propagation through a spherical head to map randomly chosen dipoles to sensor activities according to the sensor geometry of a 4D Neuroimaging Neuromag-122 MEG system, and trained an MLP to invert this mapping in the absence of noise or in the presence of various sorts of noise such as white Gaussian noise, correlated noise, or real brain noise. The MLP structure was chosen to trade off computation and accuracy. This MLP was trained four times, once with each type of noise. We measured the effects of initial guesses on LM performance, which motivated a hybrid MLP-start-LM method, in which the trained MLP initializes LM. We also compared the localization performance of LM, MLPs, and hybrid MLP-start-LMs for realistic brain signals. Trained MLPs are much faster than other methods, while the hybrid MLP-start-LMs are faster and more accurate than fixed-4-start-LM. In particular, the hybrid MLP-start-LM initialized by an MLP trained with the real-brain-noise dataset is 60 times faster than and comparable in accuracy to random-20-start-LM, and this hybrid system (localization error: 0.28 cm, computation time: 36 ms) shows almost as good performance as optimal-1-start-LM (localization error: 0.23 cm, computation time: 22 ms), which initializes LM with the correct dipole location. MLPs trai...
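The hybrid scheme recurring in these abstracts — a cheap learned estimate polished by a few Levenberg-Marquardt steps — can be sketched on a toy nonlinear least-squares problem. The forward model g, the damping value, and the "MLP guess" (a hand-picked nearby point) are all illustrative stand-ins, not the MEG forward model.

```python
import numpy as np

def g(p):
    # Toy nonlinear forward model R^2 -> R^3, standing in for the
    # dipole-to-sensor mapping.
    x, y = p
    return np.array([x * y, x + y ** 2, np.sin(x)])

def jacobian(p):
    x, y = p
    return np.array([[y, x], [1.0, 2 * y], [np.cos(x), 0.0]])

target = g(np.array([0.5, -0.3]))   # "measurements" from a known source
p = np.array([0.6, -0.2])           # stand-in for the network's initial guess

lam = 1e-3                          # LM damping term
for _ in range(10):
    r = g(p) - target               # residual at the current estimate
    J = jacobian(p)
    # Damped Gauss-Newton (Levenberg-Marquardt) step.
    step = np.linalg.solve(J.T @ J + lam * np.eye(2), J.T @ r)
    p = p - step

print(np.allclose(p, [0.5, -0.3], atol=1e-4))  # True: converged to the source
```

Because the learned guess starts inside the correct basin of attraction, a handful of LM iterations suffice, which is the source of the speedups quoted above relative to multi-start LM.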
Handling Asynchronous or Missing Data . . .
, 1998
Abstract
An important issue with many sequential data analysis problems, such as those encountered in financial data sets, is that different variables are known at different frequencies, at different times (asynchronicity), or are sometimes missing. To address this issue we propose to use recurrent networks with feedback into the input units, based on two fundamental ideas. The first motivation is that the "filled-in" value of the missing variable may not only depend in complicated ways on the value of this variable in the past of the sequence but also on the current and past values of other variables. The second motivation is that, for the purpose of making predictions or taking decisions, it is not always necessary to fill in the best possible value of the missing variables. In fact, it is sufficient to fill in a value which helps the system make better predictions or decisions. The advantages of this approach are demonstrated through experiments on several tasks.
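The feedback-into-the-input idea can be reduced to a minimal sketch: when a variable is missing at time t, feed back the model's own previous estimate of it rather than a fixed imputation. The trivial one-step "predictor" below is an illustrative stand-in for the recurrent network.

```python
def run(series, predict):
    """Replace None entries with the model's previous one-step prediction."""
    filled, prev = [], 0.0
    for x in series:
        x = prev if x is None else x   # feedback into the input unit
        filled.append(x)
        prev = predict(x)              # prediction available for the next step
    return filled

# A toy predictor that expects the next value to match the current one.
print(run([1.0, 2.0, None, 4.0], lambda x: x))  # [1.0, 2.0, 2.0, 4.0]
```

In the paper's setting the fed-back value is whatever the trained network finds most useful for its downstream prediction, which need not be the statistically best reconstruction of the missing variable.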
An Investigation of the Gradient Descent Process in Neural Networks
, 1996
Abstract
Usually gradient descent is merely a way to find a minimum, to be abandoned if a more efficient technique is available. Here we investigate the detailed properties of the gradient descent process, and the related topics of how gradients can be computed, what the limitations on gradient descent are, and how the second-order information that governs the dynamics of gradient descent can be probed. To develop our intuitions, gradient descent is applied to a simple robot arm dynamics compensation problem, using backpropagation on a temporal-windows architecture. The results suggest that smooth filters can be easily learned, but that the deterministic gradient descent process can be slow and can exhibit oscillations. Algorithms to compute the gradient of recurrent networks are then surveyed in a general framework, leading to some unifications, a deeper understanding of recurrent networks, and some algorithmic extensions. By regarding deterministic gradient descent as a dynamic system we obtain results concerning its convergence, and a quantitative theory of its behavior ...
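The dynamic-system view mentioned above has a concrete form on a quadratic loss: gradient descent decouples along the Hessian's eigenvectors, and each mode contracts by a factor (1 - eta * lambda_i) per step. The eigenvalues 1.0 and 10.0 below are illustrative.

```python
# Error dynamics of gradient descent on a quadratic with Hessian
# eigenvalues lams, step size eta: e_i <- (1 - eta * lam_i) * e_i.

eta = 0.05
lams = [1.0, 10.0]
errs = [1.0, 1.0]                 # initial error along each eigen-direction
for _ in range(50):
    errs = [(1 - eta * lam) * e for lam, e in zip(lams, errs)]

# The stiff direction (lambda = 10) shrinks fast; the shallow one (lambda = 1)
# dominates the remaining error and sets the overall convergence rate.
print(errs[0] > errs[1])          # True: 0.95**50 >> 0.5**50
```

The same picture explains the oscillations reported in the abstract: any mode with eta * lambda > 1 has a negative contraction factor and flips sign every step, and with eta * lambda > 2 it diverges.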