Results 1  10
of
30
Evaluation Of Gaussian Processes And Other Methods For NonLinear Regression
, 1996
"... This thesis develops two Bayesian learning methods relying on Gaussian processes and a rigorous statistical approach for evaluating such methods. In these experimental designs the sources of uncertainty in the estimated generalisation performances due to both variation in training and test sets are ..."
Abstract

Cited by 140 (16 self)
 Add to MetaCart
This thesis develops two Bayesian learning methods relying on Gaussian processes and a rigorous statistical approach for evaluating such methods. In these experimental designs the sources of uncertainty in the estimated generalisation performances due to both variation in training and test sets are accounted for. The framework allows for estimation of generalisation performance as well as statistical tests of significance for pairwise comparisons. Two experimental designs are recommended and supported by the DELVE software environment. Two new nonparametric Bayesian learning methods relying on Gaussian process priors over functions are developed. These priors are controlled by hyperparameters which set the characteristic length scale for each input dimension. In the simplest method, these parameters are fit from the data using optimization. In the second, fully Bayesian method, a Markov chain Monte Carlo technique is used to integrate over the hyperparameters. One advantage of these G...
Comparison of Approximate Methods for Handling Hyperparameters
 NEURAL COMPUTATION
"... I examine two approximate methods for computational implementation of Bayesian hierarchical models, that is, models which include unknown hyperparameters such as regularization constants and noise levels. In the 'evidence framework' the model parameters are integrated over, and the resulting evid ..."
Abstract

Cited by 68 (1 self)
 Add to MetaCart
I examine two approximate methods for computational implementation of Bayesian hierarchical models, that is, models which include unknown hyperparameters such as regularization constants and noise levels. In the 'evidence framework' the model parameters are integrated over, and the resulting evidence is maximized over the hyperparameters. The optimized
Constructive Algorithms for Structure Learning in Feedforward Neural Networks for Regression Problems
 IEEE Transactions on Neural Networks
, 1997
"... In this survey paper, we review the constructive algorithms for structure learning in feedforward neural networks for regression problems. The basic idea is to start with a small network, then add hidden units and weights incrementally until a satisfactory solution is found. By formulating the whole ..."
Abstract

Cited by 66 (2 self)
 Add to MetaCart
In this survey paper, we review the constructive algorithms for structure learning in feedforward neural networks for regression problems. The basic idea is to start with a small network, then add hidden units and weights incrementally until a satisfactory solution is found. By formulating the whole problem as a state space search, we first describe the general issues in constructive algorithms, with special emphasis on the search strategy. A taxonomy, based on the differences in the state transition mapping, the training algorithm and the network architecture, is then presented. Keywords Constructive algorithm, structure learning, state space search, dynamic node creation, projection pursuit regression, cascadecorrelation, resourceallocating network, group method of data handling. I. Introduction A. Problems with Fixed Size Networks I N recent years, many neural network models have been proposed for pattern classification, function approximation and regression problems. Among...
Regression Trees With Unbiased Variable Selection and Interaction Detection
 STATISTICA SINICA
, 2002
"... We propose an algorithm for regression tree construction called GUIDE. It is specifically designed to eliminate variable selection bias, a problem that can undermine the reliability of inferences from a tree structure. GUIDE controls bias by employing chisquare analysis of residuals and bootstrap c ..."
Abstract

Cited by 56 (14 self)
 Add to MetaCart
We propose an algorithm for regression tree construction called GUIDE. It is specifically designed to eliminate variable selection bias, a problem that can undermine the reliability of inferences from a tree structure. GUIDE controls bias by employing chisquare analysis of residuals and bootstrap calibration of significance probabilities. This approach allows fast computation speed, natural extension to data sets with categorical variables, and direct detection of local twovariable interactions. Previous algorithms are not unbiased and are insensitive to local interactions during split selection. The speed of GUIDE enables two further enhancements—complex modeling at the terminal nodes, such as polynomial or best simple linear models, and bagging. In an experiment with real data sets, the prediction mean square error of the piecewise constant GUIDE model is within ±20 % of that of CART�. Piecewise linear GUIDE models are more accurate; with bagging they can outperform the splinebased MARS � method.
Moderating the Outputs of Support Vector Machine Classifiers
 IEEE Transactions on Neural Networks
, 1999
"...  In this paper, we extend the use of moderated outputs to the support vector machine (SVM) by making use of a relationship between SVM and the evidence framework. The moderated output is more in line with the Bayesian idea that the posterior weight distribution should be taken into account upon pre ..."
Abstract

Cited by 42 (3 self)
 Add to MetaCart
 In this paper, we extend the use of moderated outputs to the support vector machine (SVM) by making use of a relationship between SVM and the evidence framework. The moderated output is more in line with the Bayesian idea that the posterior weight distribution should be taken into account upon prediction, and it also alleviates the usual tendency of assigning overly high condence to the estimated class memberships of the test patterns. Moreover, the moderated output derived here can be taken as an approximation to the posterior class probability. Hence, meaningful rejection thresholds can be assigned and outputs from several networks can be directly compared. Experimental results on both articial and realworld data are also discussed. KeywordsSupport vector machine, Evidence framework, Moderated output, Bayesian I. Introduction I N recent years, there has been a lot of interest in studying the support vector machine (SVM) [1], [2], [3], [4], [5], [6], [7]. SVM is based on the i...
Speciation as Automatic Categorical Modularization
, 1997
"... Realworld problems are often too difficult to be solved by a single monolithic system. Many natural and artificial systems use a modular approach to reduce the complexity of a set of subtasks while solving the overall problem satisfactorily. There are two distinct ways to do this. In functional mod ..."
Abstract

Cited by 41 (23 self)
 Add to MetaCart
Realworld problems are often too difficult to be solved by a single monolithic system. Many natural and artificial systems use a modular approach to reduce the complexity of a set of subtasks while solving the overall problem satisfactorily. There are two distinct ways to do this. In functional modularization, the components perform very different tasks, such as subroutines of a large software project. In categorical modularization, the components perform different versions of basically the same task, such as antibodies in the immune system. This second aspect is the more natural for acquiring strategies in games of conflict. An evolutionary learning system is presented which follows this second approach to automatically create a repertoire of specialist strategies for a gameplaying system. This relieves the human effort of deciding how to divide and specialize: species automatically form to deal with different highquality potential opponents, and a gating algorithm manages the re...
Prototype Selection for Composite Nearest Neighbor Classifiers
, 1997
"... Combining the predictions of a set of classifiers has been shown to be an effective way to create composite classifiers that are more accurate than any of the component classifiers. Increased accuracy has been shown in a variety of realworld applications, ranging from protein sequence identificatio ..."
Abstract

Cited by 26 (1 self)
 Add to MetaCart
Combining the predictions of a set of classifiers has been shown to be an effective way to create composite classifiers that are more accurate than any of the component classifiers. Increased accuracy has been shown in a variety of realworld applications, ranging from protein sequence identification to determining the fat content of ground meat. Despite such individual successes, the answers are not known to fundamental questions about classifier combination, such as "Can classifiers from any given model class be combined to create a composite classifier with higher accuracy?" or "Is it possible to increase the accuracy of a given classifier by combining its predictions with those of only a small number o...
Representation of functional data in neural networks
 NEUROCOMPUTIO 64 (2005) 183210
, 2005
"... Functional data analysis (FDA) is an extension of tradional data analysis tofunctiFA; data, for example spectra, temporalserira spatilF;;fifiPFA iFii gesturerecognijT/ data, etc. FunctiFA; data are rarely knowni practififi usually a regular oriFfiPxTLF sampliL i known. For thi reason, someprocessiF ..."
Abstract

Cited by 19 (10 self)
 Add to MetaCart
Functional data analysis (FDA) is an extension of tradional data analysis tofunctiFA; data, for example spectra, temporalserira spatilF;;fifiPFA iFii gesturerecognijT/ data, etc. FunctiFA; data are rarely knowni practififi usually a regular oriFfiPxTLF sampliL i known. For thi reason, someprocessiF i neededi order to benefit from the smooth character offunctiFA; datai theanalysi methods.Thi paper shows how to extend the radiTfiPPFA; functiP networks (RBFN) andmultij/FA; perceptron (MLP) models to functiPFA dataitaF.T i partifiIfiF when the latter are known throughliou ofifi/xfi.FA;xfi paiFi Varifi possi.FA;xfix for functi;FA processiA aredi;/jTPFA i;/jTP theprojectiA on smooth bases,functifi/I prictifi component analysit functitF; centerit andreductiFA and the use of diTjxxFA.jx operators. It i shown how toifi;LTPFA.j these functional
Bayesian Neural Networks for Classification: How Useful is the Evidence Framework?
, 1998
"... This paper presents an empirical assessment of the Bayesian evidence framework for neural networks using four synthetic and four realworld classification problems. We focus on three issues; model selection, automatic relevance determination (ARD) and the use of committees. Model selection using the ..."
Abstract

Cited by 19 (2 self)
 Add to MetaCart
This paper presents an empirical assessment of the Bayesian evidence framework for neural networks using four synthetic and four realworld classification problems. We focus on three issues; model selection, automatic relevance determination (ARD) and the use of committees. Model selection using the evidence criterion is only tenable if the number of training examples exceeds the number of network weights by a factor of five or ten. With this number of available examples, however, crossvalidation is a viable alternative. The ARD feature selection scheme is only useful in networks with many hidden units and for data sets containing many irrelevant variables. ARD is also useful as a hard feature selection method. Results on applying the evidence framework to the realworld data sets showed that committees of Bayesian networks achieved classification accuracies similar to the best alternative methods. Importantly, this was achievable with a minimum of human intervention. 1 Introduction ...