Results 1–10 of 95
A Study on Reduced Support Vector Machines
 IEEE TRANSACTIONS ON NEURAL NETWORKS
, 2003
Abstract

Cited by 37 (5 self)
Recently the Reduced Support Vector Machine (RSVM) was proposed as an alternative to the standard SVM. Motivated by the difficulty of handling large data sets with SVMs using nonlinear kernels, it preselects a subset of the data as support vectors and solves a smaller optimization problem. However, several issues concerning its practical use have not yet been fully discussed. For example, we do not know whether it possesses generalization ability comparable to the standard SVM. In addition, we would like to know how large a problem must be before RSVM outperforms SVM in training time. In this paper we show that the RSVM formulation is already in the form of a linear SVM and discuss four RSVM implementations. Experiments indicate that, in general, the test accuracy of RSVM is slightly lower than that of the standard SVM. In addition, for problems with up to tens of thousands of data points, if the percentage of support vectors is not high, existing SVM implementations are quite competitive in training time. Thus, from this empirical study, RSVM will be mainly useful either for larger problems or for those with many support vectors. The experiments in this paper also serve as comparisons of (1) different implementations of linear SVM, and (2) the standard SVM using linear and quadratic cost functions.
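The reduced formulation described above can be sketched in a few lines: preselect a random subset of the training data as candidate support vectors, map every point to its kernel values against that subset, and fit a linear classifier on those reduced features. The following is a minimal NumPy sketch of that idea, not the authors' implementation; a regularized least-squares classifier stands in for the linear SVM solver, and the kernel width `gamma`, subset size, and ridge strength `lam` are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(X, Z, gamma=1.0):
    # Pairwise RBF kernel values between rows of X and rows of Z.
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def fit_reduced(X, y, n_reduced=20, gamma=1.0, lam=1e-2, seed=0):
    # Preselect a random subset of the data as candidate support vectors,
    # then fit a linear model on the reduced kernel features K(x, subset).
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=n_reduced, replace=False)
    Xr = X[idx]
    K = rbf_kernel(X, Xr, gamma)
    t = 2.0 * y - 1.0  # labels in {-1, +1}
    w = np.linalg.solve(K.T @ K + lam * np.eye(n_reduced), K.T @ t)
    return Xr, w, gamma

def predict_reduced(model, X):
    Xr, w, gamma = model
    return (rbf_kernel(X, Xr, gamma) @ w > 0).astype(int)

# Toy usage: two well-separated Gaussian blobs.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(4, 1, (100, 2))])
y = np.array([0] * 100 + [1] * 100)
model = fit_reduced(X, y)
acc = (predict_reduced(model, X) == y).mean()
```

The key point of the abstract survives in the sketch: once the subset is fixed, the remaining problem is linear in the weights, which is why linear-SVM machinery applies.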
Making Logistic Regression A Core Data Mining Tool: A Practical Investigation of Accuracy, Speed, and Simplicity
, 2004
Abstract

Cited by 35 (0 self)
Binary classification is a core data mining task. For large datasets or real-time applications, desirable classifiers are accurate, fast, and need no parameter tuning. We present a simple implementation of logistic regression that meets these requirements. A combination of regularization, truncated Newton methods, and iteratively reweighted least squares makes it faster and more accurate than modern SVM implementations, and relatively insensitive to parameters. It is robust to linear dependencies and some scaling problems, making most data preprocessing unnecessary.
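The ingredients named above, ridge regularization plus iteratively reweighted least squares, fit in a short sketch. This is a hedged illustration of the general recipe, not the paper's implementation: it uses a plain Newton solve rather than a truncated one, and the regularization strength `lam` is an illustrative assumption.

```python
import numpy as np

def fit_logreg_irls(X, y, lam=1.0, n_iter=20):
    """Ridge-regularized logistic regression via IRLS (Newton's method)."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ w))              # predicted probabilities
        W = p * (1 - p)                               # IRLS weights (diagonal)
        grad = X.T @ (p - y) + lam * w                # gradient of penalized loss
        H = (X * W[:, None]).T @ X + lam * np.eye(d)  # penalized Hessian
        w -= np.linalg.solve(H, grad)                 # Newton step
    return w

# Toy usage: 1-D data with an intercept column and label noise.
rng = np.random.default_rng(0)
x = rng.normal(0, 1, 200)
y = (x + 0.3 * rng.normal(0, 1, 200) > 0).astype(float)
X = np.column_stack([np.ones_like(x), x])
w = fit_logreg_irls(X, y)
acc = (((X @ w) > 0).astype(float) == y).mean()
```

The ridge term keeps the Hessian well conditioned even with linearly dependent columns, which is the robustness property the abstract highlights.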
Direct estimation of nonrigid registration
 In British Machine Vision Conference
, 2004
Abstract

Cited by 32 (3 self)
Registering images of a deforming surface is a well-studied problem. Solutions include computing optic flow or estimating a parameterized motion model. In the case of optic flow it is necessary to include some regularization. We propose an approach based on representing the induced transformation between images using Radial Basis Functions (RBFs). The approach can be viewed as a direct, i.e. intensity-based, method or, equivalently, as a way of using RBFs as nonlinear regularizers on the optic flow field. The approach is demonstrated on several image sequences of deforming surfaces. It is shown that the computed registrations are sufficiently accurate to allow convincing augmentations of the images.
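The core idea of representing a deformation with RBFs can be sketched as follows: given control points and their displacements, solve for RBF weights so the interpolated displacement field reproduces them, then warp any point through that field. This is an illustrative NumPy sketch, not the paper's method; the Gaussian basis and width `sigma` are assumptions.

```python
import numpy as np

def fit_rbf_warp(ctrl, disp, sigma=1.0):
    """Fit RBF weights so the warp interpolates control-point displacements."""
    d2 = ((ctrl[:, None, :] - ctrl[None, :, :]) ** 2).sum(-1)
    Phi = np.exp(-d2 / (2 * sigma**2))
    # Small jitter keeps the interpolation matrix invertible.
    W = np.linalg.solve(Phi + 1e-8 * np.eye(len(ctrl)), disp)
    return W

def apply_rbf_warp(pts, ctrl, W, sigma=1.0):
    d2 = ((pts[:, None, :] - ctrl[None, :, :]) ** 2).sum(-1)
    Phi = np.exp(-d2 / (2 * sigma**2))
    return pts + Phi @ W  # displacement field evaluated at pts

# Toy usage: four corner control points, one of which moves right by 0.5.
ctrl = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
disp = np.array([[0.5, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]])
W = fit_rbf_warp(ctrl, disp)
warped = apply_rbf_warp(ctrl, ctrl, W)
```

In a direct method, the weights `W` would be optimized against image intensities rather than fixed correspondences, but the warp parameterization is the same.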
Bayesian Model Assessment and Comparison Using CrossValidation Predictive Densities
 Neural Computation
, 2002
Abstract

Cited by 28 (11 self)
In this work, we discuss practical methods for the assessment, comparison, and selection of complex hierarchical Bayesian models. A natural way to assess the goodness of a model is to estimate its future predictive capability by estimating expected utilities. Instead of just making a point estimate, it is important to obtain the distribution of the expected utility estimate, as it describes the uncertainty in the estimate. The distributions of the expected utility estimates can also be used to compare models, for example, by computing the probability of one model having a better expected utility than some other model. We propose an approach using cross-validation predictive densities to obtain expected utility estimates and the Bayesian bootstrap to obtain samples from their distributions. We also discuss the probabilistic assumptions made and the properties of two practical cross-validation methods, importance sampling and k-fold cross-validation. As illustrative examples, we use MLP neural networks and Gaussian processes (GP) with Markov chain Monte Carlo sampling in one toy problem and two challenging real-world problems.
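The two-stage recipe above, per-point utilities from cross-validation predictive densities, then the Bayesian bootstrap (Dirichlet-weighted averages) for the distribution of the expected utility, can be sketched with a deliberately simple Gaussian predictive model standing in for the MLP/GP models of the paper. The log predictive density as the utility and all model choices here are illustrative assumptions.

```python
import numpy as np

def cv_utilities(y, k=5):
    """Per-point log predictive densities from k-fold CV of a Gaussian model."""
    n = len(y)
    folds = np.array_split(np.arange(n), k)
    u = np.empty(n)
    for test in folds:
        train = np.setdiff1d(np.arange(n), test)
        mu, sd = y[train].mean(), y[train].std() + 1e-12
        u[test] = -0.5 * np.log(2 * np.pi * sd**2) - (y[test] - mu) ** 2 / (2 * sd**2)
    return u

def bayesian_bootstrap(u, n_draws=2000, seed=0):
    """Distribution of the expected utility: Dirichlet-weighted means of u."""
    rng = np.random.default_rng(seed)
    w = rng.dirichlet(np.ones(len(u)), size=n_draws)
    return w @ u

# Toy usage: the spread of the draws quantifies the estimate's uncertainty.
rng = np.random.default_rng(1)
y = rng.normal(0.0, 1.0, 200)
draws = bayesian_bootstrap(cv_utilities(y))
lo, hi = np.percentile(draws, [2.5, 97.5])
```

Comparing two models then amounts to computing the fraction of paired draws in which one model's expected utility exceeds the other's.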
An improved particle filter for target tracking in sensor system
 Sensors
, 2007
Abstract

Cited by 19 (10 self)
Abstract: Sensor systems are not always equipped with the ability to track targets. Sudden maneuvers of a target can have a great impact on the sensor system, increasing the miss rate and the rate of false target detection. The generic particle filter (PF) algorithm is well known for target tracking, but it cannot overcome the degeneracy of particles and the accumulation of estimation errors. In this paper, we propose an improved PF algorithm called PF-RBF. This algorithm uses a radial basis function network (RBFN) in the sampling step to dynamically construct the process model from observations and update the value of each particle. With the RBFN sampling step, PF-RBF can provide an accurate proposal distribution and maintain the convergence of the sensor system. Simulation results verify that PF-RBF performs better than the unscented Kalman filter (UKF), PF, and unscented particle filter (UPF) in both robustness and accuracy, whether the observation model used for the sensor system is linear or nonlinear. Moreover, an intrinsic property of PF-RBF is that, when the particle number exceeds a certain amount, its execution time is less than that of UPF. This makes PF-RBF a better candidate for sensor systems that need many particles for target tracking.
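The paper's contribution is the RBFN sampling step; for context, the generic bootstrap PF it improves upon is sketched below for a 1-D linear-Gaussian tracking model. The model, noise levels, and particle count are illustrative assumptions, and the multinomial resampling shown is the standard countermeasure to the particle degeneracy the abstract mentions.

```python
import numpy as np

def particle_filter(obs, n_particles=500, q=0.1, r=0.5, seed=0):
    """Bootstrap PF for x_t = x_{t-1} + N(0, q^2), y_t = x_t + N(0, r^2)."""
    rng = np.random.default_rng(seed)
    particles = rng.normal(0.0, 1.0, n_particles)
    estimates = []
    for y in obs:
        particles = particles + rng.normal(0.0, q, n_particles)  # propagate
        logw = -0.5 * ((y - particles) / r) ** 2                 # likelihood weights
        w = np.exp(logw - logw.max())
        w /= w.sum()
        estimates.append(np.sum(w * particles))                  # posterior mean
        # Multinomial resampling to fight weight degeneracy.
        particles = particles[rng.choice(n_particles, n_particles, p=w)]
    return np.array(estimates)

# Toy usage: track a slow random walk from noisy observations.
rng = np.random.default_rng(1)
truth = np.cumsum(rng.normal(0.0, 0.1, 100))
obs = truth + rng.normal(0.0, 0.5, 100)
est = particle_filter(obs)
rmse = np.sqrt(np.mean((est - truth) ** 2))
```

PF-RBF replaces the blind propagation line with samples drawn from an RBFN-constructed process model, which is what yields the better proposal distribution claimed above.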
Recent Advances in Radial Basis Function Networks
 Technical Report www.ed.ac.uk/~mjo/papers/recad.ps, Institute for Adaptive and Neural Computation
, 1999
Abstract

Cited by 15 (1 self)
In 1996 an Introduction to Radial Basis Function Networks was published on the web (www.anc.ed.ac.uk/mjo/papers/intro.ps) along with a package of Matlab functions (www.anc.ed.ac.uk/mjo/software/rbf.zip). The emphasis was on the linear character of RBF networks and two techniques borrowed from statistics: forward selection and ridge regression. This document (www.anc.ed.ac.uk/mjo/papers/recad.ps) is an update on developments between 1996 and 1999 and is associated with a second version of the Matlab package (www.anc.ed.ac.uk/mjo/software/rbf2.zip). Improvements have been made to the forward selection and ridge regression methods, and a new method, which is a cross between regression trees and RBF networks, has been developed.
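The linear character of RBF networks emphasized in the report is what makes ridge regression applicable: with fixed centres and widths, the output weights solve a penalized linear least-squares problem. This is a minimal NumPy sketch in that spirit (the report's own code is Matlab); the Gaussian basis, centre placement, width, and global ridge parameter are illustrative assumptions.

```python
import numpy as np

def rbf_design(X, centres, width):
    # Design matrix of Gaussian basis-function activations.
    d2 = (X[:, None] - centres[None, :]) ** 2
    return np.exp(-d2 / (2 * width**2))

def fit_rbf_ridge(X, y, centres, width=0.15, ridge=1e-3):
    """Linear-in-the-weights RBF fit with a global ridge penalty."""
    H = rbf_design(X, centres, width)
    return np.linalg.solve(H.T @ H + ridge * np.eye(len(centres)), H.T @ y)

# Toy usage: fit a noisy sine with 10 evenly spaced centres.
rng = np.random.default_rng(0)
X = np.linspace(0, 1, 100)
y = np.sin(2 * np.pi * X) + rng.normal(0, 0.1, 100)
centres = np.linspace(0, 1, 10)
w = fit_rbf_ridge(X, y, centres)
pred = rbf_design(X, centres, 0.15) @ w
rmse = np.sqrt(np.mean((pred - np.sin(2 * np.pi * X)) ** 2))
```

Forward selection, the report's other technique, would instead grow the centre set greedily, adding at each step the candidate centre that most reduces the residual.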
Sparse Representations for Medium Level Vision
 Lic. Thesis LiU-Tek-Lic-2001:06, Dept. EE, Linköping University, SE-581 83 Linköping, Sweden, February 2001. Thesis No. 869, ISBN
, 2001
Abstract

Cited by 15 (8 self)
Don’t confuse the moon with the finger that points at it. (Zen proverb) In this thesis a new type of representation for medium level vision operations is explored. We focus on representations that are sparse and monopolar. The word sparse signifies that information in the feature sets used is not necessarily present at all points; on the contrary, most features will be inactive. The word monopolar signifies that all features have the same sign, e.g. are either positive or zero. A zero feature value denotes “no information”, and for nonzero values the magnitude signifies the relevance. A sparse scale-space representation of local image structure (lines and edges) is developed. A method known as the channel representation is used to generate sparse representations, and its ability to deal with multiple hypotheses is described. It is also shown how these hypotheses can be extracted in a robust manner. The connection of soft histograms (i.e. histograms with overlapping bins) to the channel representation, as well as to the use of dithering in the relaxation of quantisation errors, is shown. The use of soft histograms for the estimation of unknown probability density functions (PDFs) and the estimation of image rotation is demonstrated. The advantage of using sparse, monopolar representations in associative learning is demonstrated. Finally we show how sparse, monopolar representations can be used to speed up and improve template matching.
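Channel encoding of a scalar can be sketched as follows: a bank of overlapping, nonnegative kernels placed at regular positions, so each value activates only a few neighbouring channels (sparse) and all activations are nonnegative (monopolar). The cos² kernel below is one common choice in the channel-representation literature; the channel spacing and count are illustrative assumptions, not the thesis's exact parameters.

```python
import numpy as np

def channel_encode(x, centres, h=1.0):
    """Encode scalar x into overlapping cos^2 channels (sparse, monopolar)."""
    d = np.abs(x - centres) / h
    # Each kernel is nonzero only within 1.5 channel spacings of its centre.
    return np.where(d < 1.5, np.cos(np.pi * d / 3) ** 2, 0.0)

# Toy usage: one scalar activates exactly three neighbouring channels.
centres = np.arange(10.0)  # channel centres at 0, 1, ..., 9
enc = channel_encode(4.3, centres)
```

Summing the encodings of many samples yields a soft histogram, i.e. a histogram with overlapping bins, which is exactly the connection the abstract draws.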
Theoretical and Experimental Evaluation of Subspace Information Criterion
, 2001
Abstract

Cited by 15 (11 self)
Recently, a new model selection criterion called the subspace information criterion (SIC) was proposed. SIC works well with small samples since it gives an unbiased estimate of the generalization error with finite samples. In this paper, we theoretically and experimentally evaluate the effectiveness of SIC in comparison with existing model selection techniques, including the traditional leave-one-out cross-validation (CV), Mallows's C_P, Akaike's information criterion (AIC), Sugiura's corrected AIC (cAIC), Schwarz's Bayesian information criterion (BIC), Rissanen's minimum description length criterion (MDL), and Vapnik's measure (VM). The theoretical evaluation includes a comparison of the generalization measure, the approximation method, and the restrictions on model candidates and learning methods. Experimentally, the performance of SIC in various situations is investigated. The simulations show that SIC outperforms existing techniques especially when the number of training examples is small and the noise variance is large. Keywords: supervised learning, generalization capability, model selection, subspace information criterion, small samples
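SIC itself needs machinery beyond this listing, but the classical competitors named above are easy to sketch. Below, polynomial order for Gaussian linear regression is selected by AIC and BIC using their standard formulas; the data and candidate orders are illustrative assumptions.

```python
import numpy as np

def fit_poly_rss(x, y, order):
    # Residual sum of squares of a least-squares polynomial fit.
    X = np.vander(x, order + 1)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return ((y - X @ coef) ** 2).sum()

def aic_bic(x, y, order):
    n, k = len(x), order + 1
    rss = fit_poly_rss(x, y, order)
    # Maximized Gaussian log-likelihood with sigma^2 = rss / n.
    ll = -0.5 * n * (np.log(2 * np.pi * rss / n) + 1)
    return 2 * k - 2 * ll, k * np.log(n) - 2 * ll

# Toy usage: the true model is quadratic; both criteria should not underfit.
rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 100)
y = 1.0 + 2.0 * x - 3.0 * x**2 + rng.normal(0, 0.2, 100)
scores = {m: aic_bic(x, y, m) for m in range(6)}
best_aic = min(scores, key=lambda m: scores[m][0])
best_bic = min(scores, key=lambda m: scores[m][1])
```

The criteria differ only in the complexity penalty (2k versus k log n), which is why BIC tends to pick smaller models as n grows; SIC instead targets an unbiased finite-sample estimate of the generalization error.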
A hybrid projection based and radial basis function architecture: Initial values and global optimization
, 2001
Abstract

Cited by 13 (6 self)
We introduce a mechanism for constructing and training a hybrid architecture of projection based units and radial basis functions. In particular, we introduce an optimization scheme which includes several steps and assures convergence to a useful solution. During network architecture construction and training, it is determined whether a unit should be removed or replaced. The resulting architecture often has a smaller number of units compared with competing architectures. A specific overfitting resulting from shrinkage of the RBF radii is addressed by introducing a penalty on small radii. Classification and regression results are demonstrated on various benchmark data sets and compared with several variants of RBF networks [?, ?]. A striking performance improvement is achieved on the vowel data set [?]. Keywords: projection units, RBF units, hybrid network architecture, SMLP, clustering, regularization.
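The penalty on small radii can be illustrated with a toy objective: an RBF model's error plus a term that grows as radii shrink. The inverse-square form used here is my illustrative choice, not necessarily the paper's exact penalty, and all parameters are assumptions.

```python
import numpy as np

def penalized_loss(X, y, centres, radii, weights, lam=0.01):
    """MSE of an RBF model plus a penalty that blows up for shrinking radii."""
    d2 = (X[:, None] - centres[None, :]) ** 2
    H = np.exp(-d2 / (2 * radii[None, :] ** 2))
    mse = np.mean((H @ weights - y) ** 2)
    penalty = lam * np.sum(1.0 / radii**2)  # discourages tiny radii
    return mse + penalty

# Toy usage: with identical weights, shrinking the radii raises the loss,
# so an optimizer cannot overfit by collapsing units onto single points.
x = np.linspace(0, 1, 50)
y = np.sin(2 * np.pi * x)
centres = np.linspace(0, 1, 5)
w = np.zeros(5)
wide = penalized_loss(x, y, centres, np.full(5, 0.3), w)
narrow = penalized_loss(x, y, centres, np.full(5, 0.01), w)
```

Without such a term, a unit can shrink its radius to fit a single noisy point exactly, which is the specific overfitting mode the abstract describes.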
Optimal design of regularization term and regularization parameter by subspace information criterion
 Neural Networks
, 2000
Abstract

Cited by 13 (6 self)
The problem of designing the regularization term and regularization parameter for linear regression models is discussed. Previously, we derived an approximation to the generalization error called the subspace information criterion (SIC), which is an unbiased estimator of the generalization error with finite samples under certain conditions. In this paper, we apply SIC to regularization learning and use it for (a) choosing the optimal regularization term and regularization parameter from given candidates, and (b) obtaining the closed form of the optimal regularization parameter for a fixed regularization term. The effectiveness of SIC is demonstrated through computer simulations with artificial and real data. Keywords: supervised learning, generalization error, linear regression, regularization learning, ridge regression, model selection, regularization parameter, subspace information criterion

Nomenclature:
f(x): learning target function
D: domain of f(x)
x_m: m-th sample point
y_m: m-th sample value
ε_m: m-th noise
(x_m, y_m): m-th training example
M: number of training examples
y: M-dimensional vector consisting of {y_m}, m = 1, ..., M
ε: M-dimensional vector consisting of {ε_m}, m = 1, ..., M
φ_p(x): p-th basis function
θ_p: p-th coefficient
μ: number of basis functions
J_G: generalization error
J_TE: training error
J_R: regularized training error
T: regularization matrix
α: regularization parameter
A: design matrix
X_{T,α}: regularization learning matrix
U: μ-dimensional matrix
θ: true parameter
θ̂_{T,α}: regularization estimate
θ̂_u: unbiased estimate
σ²: noise variance
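In this nomenclature, the regularization estimate has the standard generalized-ridge closed form θ̂_{T,α} = (AᵀA + αT)⁻¹Aᵀy, where A is the design matrix of basis-function values. The NumPy sketch below evaluates it on toy data with T = I (plain ridge regression); the data, α, and choice of T are illustrative assumptions.

```python
import numpy as np

def regularization_estimate(A, y, T, alpha):
    """Generalized ridge: theta_hat = (A^T A + alpha * T)^{-1} A^T y."""
    return np.linalg.solve(A.T @ A + alpha * T, A.T @ y)

# Toy usage: M training examples, mu basis functions, known true parameter.
rng = np.random.default_rng(0)
M, mu = 50, 3
A = rng.normal(0, 1, (M, mu))      # design matrix of basis-function values
theta_true = np.array([1.0, -2.0, 0.5])
y = A @ theta_true + rng.normal(0, 0.1, M)
T = np.eye(mu)                     # identity T reduces to ordinary ridge
theta_hat = regularization_estimate(A, y, T, alpha=0.1)
err = np.linalg.norm(theta_hat - theta_true)
```

Choosing T and α by SIC, as the paper proposes, amounts to evaluating the criterion over candidate (T, α) pairs and keeping the minimizer; for fixed T, the paper derives the optimal α in closed form.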