Results 1–10 of 15
Gaussian Processes for Classification: Mean Field Algorithms
Neural Computation, 1999
Cited by 71 (13 self)

Abstract: We derive a mean field algorithm for binary classification with Gaussian processes which is based on the TAP approach originally proposed in the statistical physics of disordered systems. The theory also yields an approximate leave-one-out estimator for the generalization error which is computed with no extra computational cost. We show that from the TAP approach it is possible to derive both a simpler `naive' mean field theory and support vector machines (SVM) as limiting cases. For both mean field algorithms and support vector machines, simulation results for three small benchmark data sets are presented. They show (1) that one may get state-of-the-art performance by using the leave-one-out estimator for model selection, and (2) that the built-in leave-one-out estimators are extremely precise when compared to the exact leave-one-out estimate. The latter result is taken as strong support for the internal consistency of the mean field approach.
A least-squares approach to direct importance estimation
Journal of Machine Learning Research, 2009
Cited by 36 (24 self)

Abstract: We address the problem of estimating the ratio of two probability density functions, which is often referred to as the importance. The importance values can be used for various succeeding tasks such as covariate shift adaptation or outlier detection. In this paper, we propose a new importance estimation method that has a closed-form solution; the leave-one-out cross-validation score can also be computed analytically. Therefore, the proposed method is computationally highly efficient and simple to implement. We also elucidate theoretical properties of the proposed method, such as the convergence rate and approximation error bounds. Numerical experiments show that the proposed method is comparable to the best existing method in accuracy, while it is computationally more efficient than competing approaches.
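A closed-form least-squares density-ratio fit of the kind this abstract describes can be sketched as follows. This is a generic illustration, not the paper's exact estimator: the Gaussian basis, `width`, and `lam` are illustrative choices.

```python
import numpy as np

def lsif_fit(x_nu, x_de, centers, width=1.0, lam=1e-3):
    """Least-squares fit of the density ratio p_nu(x) / p_de(x).

    The ratio is modeled as a linear combination of Gaussian basis
    functions; minimizing the squared error under the denominator
    distribution gives the closed-form solution
    theta = (H + lam * I)^{-1} h.
    """
    def phi(x):
        # Gaussian kernel basis evaluated at samples x: shape (n, b).
        d = x[:, None, :] - centers[None, :, :]
        return np.exp(-np.sum(d**2, axis=2) / (2.0 * width**2))

    Phi_de = phi(x_de)                        # basis on denominator samples
    Phi_nu = phi(x_nu)                        # basis on numerator samples
    H = Phi_de.T @ Phi_de / len(x_de)         # (b, b) second-moment matrix
    h = Phi_nu.mean(axis=0)                   # (b,) first-moment vector
    theta = np.linalg.solve(H + lam * np.eye(len(centers)), h)
    return lambda x: phi(x) @ theta           # importance estimate w(x)
```

Because the solution is a single linear solve, the leave-one-out score can likewise be obtained analytically, which is the efficiency point the abstract makes.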
Sparse modelling using orthogonal forward regression with PRESS statistic and regularization
IEEE Trans. Systems, Man and Cybernetics, Part B, 2004
Cited by 28 (8 self)

Abstract: The paper introduces an efficient construction algorithm for obtaining sparse linear-in-the-weights regression models based on an approach of directly optimizing model generalization capability. This is achieved by utilizing the delete-1 cross-validation concept and the associated leave-one-out test error, also known as the predicted residual sums of squares (PRESS) statistic, without resorting to any other validation data set for model evaluation in the model construction process. Computational efficiency is ensured by using an orthogonal forward regression, but the algorithm incrementally minimizes the PRESS statistic instead of the usual sum of squared training errors. A local regularization method can naturally be incorporated into the model selection procedure to further enforce model sparsity. The proposed algorithm is fully automatic, and the user is not required to specify any criterion to terminate the model construction procedure. Comparisons with some existing state-of-the-art modeling methods are given, and several examples are included to demonstrate the ability of the proposed algorithm to effectively construct sparse models that generalize well.
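For an ordinary linear least-squares model, the PRESS statistic mentioned above can be computed without any refitting via the standard hat-matrix identity; the sketch below shows that identity in its plain form, not the paper's orthogonal-forward-regression formulation.

```python
import numpy as np

def press_statistic(X, y):
    """Leave-one-out (PRESS) error for a linear least-squares model.

    Uses the identity e_i^{(-i)} = e_i / (1 - h_ii), where h_ii are
    the diagonal entries of the hat matrix H = X (X^T X)^{-1} X^T,
    so no model is ever actually refitted.
    """
    XtX_inv = np.linalg.inv(X.T @ X)
    # Diagonal of the hat matrix, computed row by row.
    h_diag = np.einsum('ij,jk,ik->i', X, XtX_inv, X)
    beta = XtX_inv @ X.T @ y
    residuals = y - X @ beta
    loo_residuals = residuals / (1.0 - h_diag)
    return np.sum(loo_residuals**2)
```

The n separate refits of naive delete-1 cross-validation collapse into one fit plus an elementwise division, which is why PRESS is cheap enough to use as an incremental selection criterion.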
Adaptive Regularization in Neural Network Modeling
1997
Cited by 14 (2 self)

Abstract: In this paper we address the important problem of optimizing regularization parameters in neural network modeling. The suggested optimization scheme is an extended version of the recently presented algorithm [24]. The idea is to minimize an empirical estimate, like the cross-validation estimate, of the generalization error with respect to the regularization parameters. This is done by employing a simple iterative gradient descent scheme using virtually no additional programming overhead compared to standard training. Experiments with feedforward neural network models for time series prediction and classification tasks showed the viability and robustness of the algorithm. Moreover, we provide some simple theoretical examples in order to illustrate the potential and limitations of the proposed regularization framework. 1 Introduction Neural networks are flexible tools for time series processing and pattern recognition. By increasing the number of hidden neurons in a 2-layer architec...
Gaussian Process Classification and SVM: Mean Field Results and Leave-One-Out Estimator
Cited by 11 (0 self)

Abstract: In this chapter, we elaborate on the well-known relationship between Gaussian Processes (GP) and Support Vector Machines (SVM). Secondly, we present approximate solutions for two computational problems arising in GP and SVM. The first one is the calculation of the posterior mean for GP classifiers using a `naive' mean field approach. The second one is a leave-one-out estimator for the generalization error of SVM based on a linear response method. Simulation results on a benchmark dataset show similar performances for the GP mean field algorithm and the SVM algorithm. The approximate leave-one-out estimator is found to be in very good agreement with the exact leave-one-out error. 1 Introduction It is well-known that Gaussian Processes (GP) and Support Vector Machines (SVM) are closely related, see e.g. [7]. Both approaches are nonparametric. This means that they allow for infinitely many parameters to be tuned, but increasing with the amount of data, only a finite number of them are a...
Sparse kernel density construction using orthogonal forward regression with leave-one-out test score and local regularization
IEEE Trans. Systems, Man and Cybernetics, Part B, 2004
Cited by 9 (3 self)

Abstract: An automatic algorithm is derived for constructing kernel density estimates based on a regression approach that directly optimizes generalization capability. Computational efficiency of the density construction is ensured using an orthogonal forward regression, and the algorithm incrementally minimizes the leave-one-out test score. Local regularization is incorporated into the density construction process to further enforce sparsity. Examples are included to demonstrate the ability of the proposed algorithm to effectively construct a very sparse kernel density estimate with accuracy comparable to that of the full-sample Parzen window density estimate.
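For reference, the full-sample Parzen window estimate that the sparse construction is compared against can be sketched as below; the Gaussian kernel and the `width` smoothing parameter are illustrative choices, not the paper's specific settings.

```python
import numpy as np

def parzen_density(data, width):
    """Full-sample Parzen window density estimate with Gaussian kernels.

    Every training point contributes one kernel of bandwidth `width`;
    a sparse kernel density estimate would instead keep only a small
    weighted subset of these kernels.
    """
    n, d = data.shape
    norm = (2.0 * np.pi * width**2) ** (d / 2.0)  # Gaussian normalizer

    def pdf(x):
        # Evaluate the average of all n kernels at the query points x.
        diff = x[:, None, :] - data[None, :, :]
        k = np.exp(-np.sum(diff**2, axis=2) / (2.0 * width**2))
        return k.sum(axis=1) / (n * norm)

    return pdf
```

Evaluating this estimate costs O(n) kernel evaluations per query point, which is the cost the sparse construction is designed to reduce.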
Neural-network construction and selection in nonlinear modeling
IEEE Transactions on Neural Networks, 2003
Cited by 9 (1 self)

Abstract: In this paper, we study how statistical tools which are commonly used independently can advantageously be exploited together in order to improve neural network estimation and selection in nonlinear static modeling. The tools we consider are the analysis of the numerical conditioning of the neural network candidates, statistical hypothesis tests, and cross-validation. We present and analyze each of these tools in order to justify at what stage of a construction and selection procedure they can be most useful. On the basis of this analysis, we then propose a novel and systematic construction and selection procedure for neural modeling. We finally illustrate its efficiency through large-scale simulation experiments and real-world modeling problems.
A Probabilistic Neural Network Framework for Detection of Malignant Melanoma
Artificial Neural Networks in Cancer Diagnosis, Prognosis and Patient Management, 1999
Cited by 7 (0 self)

Abstract: Contents:
1 Introduction
1.1 Malignant melanoma
1.2 Evolution of malignant melanoma
1.3 Image acquisition techniques
1.3.1 Traditional imaging
1.3.2 Dermatoscopic imaging
1.4 Dermatoscopic features
2 Feature extraction in dermatoscopic images
2.1 Image acquisition
2.2 Image preprocessing
2.2.1 Median filtering ...
Cross-Validation with LULOO
Proceedings of 1996 International Conference on Neural Information Processing, 1996
Cited by 1 (0 self)

Abstract: The leave-one-out cross-validation scheme for generalization assessment of neural network models is computationally expensive due to replicated training sessions. Linear unlearning of examples has recently been suggested as an approach to approximate cross-validation. Here we briefly review the linear unlearning scheme, dubbed LULOO, and we illustrate it on a system identification example. Further, we address the possibility of extracting confidence information (error bars) from the LULOO ensemble. 1 Introduction Consider nonlinear regression in which the output y is regressed nonlinearly on the input vector x; this report concerns a neural network implementation, in which the output is predicted by ŷ = F(x; w), where F(·) denotes the nonlinear mapping of the neural net and w is the vector of network parameters. The conditional input-output distribution, i.e., the probability distribution of the output conditioned on a test input, is a basic objective for neural net modeling....
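The exact scheme that LULOO approximates retrains the model once per example; a minimal sketch of that baseline follows. The `fit` and `predict` callables here are hypothetical stand-ins for network training and the forward pass, which is exactly the n-fold retraining cost that linear unlearning avoids.

```python
import numpy as np

def loo_error(fit, predict, X, y):
    """Exact leave-one-out squared error: one retraining per example.

    `fit(X, y)` returns a trained model; `predict(model, X)` returns
    predictions. The loop performs n full trainings, which is what
    makes exact leave-one-out expensive for neural networks.
    """
    n = len(y)
    errs = np.empty(n)
    for i in range(n):
        mask = np.ones(n, dtype=bool)
        mask[i] = False                      # hold out example i
        model = fit(X[mask], y[mask])        # retrain without it
        errs[i] = (predict(model, X[i:i + 1])[0] - y[i]) ** 2
    return errs.mean()
```

LULOO replaces each retraining with a linearized "unlearning" update around the full-data solution, so the loop body becomes a cheap linear-algebra step instead of a full training session.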
Some Computational Complexity Aspects of Neural Network Training
1996
Cited by 1 (1 self)

Abstract: We address the problem of obtaining an estimate of the computational complexity of a training algorithm, as a function of the number of patterns and parameters. On these grounds, we compare two different training algorithms, namely leave-one-out cross-validation and the regularisation method. 1 Neural Network training Neural networks have been extensively used as nonlinear regression tools. We will here consider the case of a multilayer perceptron model, which provides a mapping from ℝ^I to ℝ^S. The mapping is parameterized by a vector w containing P parameters. The S-dimensional output is then estimated by a nonlinear function of the I-dimensional input, y = F_w(x). The parameter estimation procedure is referred to as training in the neural network literature, and we will stick to this name. It is usually performed by doing an iterative minimization of a cost function C(w). This cost function includes an additive cost representing the distance of the model prediction...