Results 1–9 of 9
Particle swarm model selection
JMLR, Special Topic on Model Selection, 2009
Abstract

Cited by 20 (7 self)
This paper proposes the application of particle swarm optimization (PSO) to the problem of full model selection (FMS) for classification tasks. FMS is defined as follows: given a pool of preprocessing methods, feature selection and learning algorithms, select the combination of these that obtains the lowest classification error for a given data set; the task also includes the selection of hyperparameters for the considered methods. This problem generates a vast search space to be explored, well suited for stochastic optimization techniques. FMS can be applied to any classification domain as it does not require domain knowledge. Different model types and a variety of algorithms can be considered under this formulation. Furthermore, competitive yet simple models can be obtained with FMS. We adopt PSO for the search because of its proven performance in different problems and because of its simplicity, since neither expensive computations nor complicated operations are needed. Interestingly, the way the search is guided allows PSO to avoid overfitting to some extent. Experimental results on benchmark data sets give evidence that the proposed approach is very effective, despite its simplicity. Furthermore, results obtained in the framework of a model selection challenge show the competitiveness of the models selected with PSO, compared to models selected with other techniques that focus on a single algorithm and that use domain knowledge.
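The search procedure the abstract describes can be sketched in a few lines. The sketch below is the standard inertia-weight PSO update, not necessarily the paper's exact configuration: the parameters `w`, `c1`, `c2` and the toy objective are illustrative assumptions. In FMS, each particle's position would encode the choice of preprocessing, feature selection and classifier plus their hyperparameters, and the objective would be an estimate of classification error rather than the toy function used here.

```python
import random

random.seed(0)  # deterministic demo

def pso_minimize(objective, dim, n_particles=20, iters=100,
                 w=0.7, c1=1.5, c2=1.5, bounds=(-5.0, 5.0)):
    """Standard inertia-weight PSO: each particle is pulled toward its
    personal best and the swarm's global best position."""
    lo, hi = bounds
    pos = [[random.uniform(lo, hi) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                   # personal bests
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # global best
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

# Toy stand-in for the model-selection error surface.
best, err = pso_minimize(lambda x: sum(v * v for v in x), dim=3)
```

Only function evaluations are needed, which is the simplicity argument the abstract makes: no gradients and no problem-specific operators.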
On overfitting in model selection and subsequent selection bias in performance evaluation
Journal of Machine Learning Research, 2010
Abstract

Cited by 20 (2 self)
Model selection strategies for machine learning algorithms typically involve the numerical optimisation of an appropriate model selection criterion, often based on an estimator of generalisation performance, such as k-fold cross-validation. The error of such an estimator can be broken down into bias and variance components. While unbiasedness is often cited as a beneficial quality of a model selection criterion, we demonstrate that a low variance is at least as important, as a non-negligible variance introduces the potential for overfitting in model selection as well as in training the model. While this observation is in hindsight perhaps rather obvious, the degradation in performance due to overfitting the model selection criterion can be surprisingly large, an observation that appears to have received little attention in the machine learning literature to date. In this paper, we show that the effects of this form of overfitting are often of comparable magnitude to differences in performance between learning algorithms, and thus cannot be ignored in empirical evaluation. Furthermore, we show that some common performance evaluation practices are susceptible to a form of selection bias as a result of this form of overfitting and hence are unreliable. We discuss methods to avoid overfitting in model selection and subsequent selection bias in performance evaluation, which we hope will be incorporated into best practice. While this study concentrates on cross-validation-based model selection, the findings are quite general and apply to any model selection practice involving the optimisation of a model selection criterion evaluated over a finite sample of data, including maximisation of the Bayesian evidence and optimisation of performance bounds.
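The variance-driven overfitting the abstract describes is easy to reproduce in a toy simulation (an illustrative sketch, not taken from the paper): when many candidate models with identical true accuracy are ranked by the same noisy validation estimate, the winner's score is biased upward, which is exactly the selection bias that makes "report the best validation score" unreliable.

```python
import random

random.seed(0)

def noisy_score(true_acc, n_val):
    """Validation accuracy estimated from n_val Bernoulli trials."""
    return sum(random.random() < true_acc for _ in range(n_val)) / n_val

# 50 candidate models with IDENTICAL true accuracy 0.5: selecting the
# best validation score overstates performance purely through the
# variance of the estimator -- no model is genuinely better.
true_acc, n_models, n_val, trials = 0.5, 50, 100, 200
optimism = 0.0
for _ in range(trials):
    scores = [noisy_score(true_acc, n_val) for _ in range(n_models)]
    optimism += max(scores) - true_acc
optimism /= trials
# optimism is clearly positive: the expected maximum of many noisy
# estimates exceeds the common true value.
```

Shrinking the variance (larger `n_val`) or evaluating the selected model on data not used for selection (nested evaluation) removes this bias.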
Counting People with Low-Level Features and Bayesian Regression
Abstract

Cited by 19 (2 self)
Abstract—An approach to the problem of estimating the size of inhomogeneous crowds, composed of pedestrians that travel in different directions, without using explicit object segmentation or tracking is proposed. Instead, the crowd is segmented into components of homogeneous motion, using the mixture of dynamic textures motion model. A set of holistic low-level features is extracted from each segmented region, and a function that maps features into estimates of the number of people per segment is learned with Bayesian regression. Two Bayesian regression models are examined. The first is a combination of Gaussian process regression (GPR) with a compound kernel, which accounts for both the global and local trends of the count mapping, but is limited by the real-valued outputs that do not match the discrete counts. We address this limitation with a second model, which is based on a Bayesian treatment of Poisson regression that introduces a prior distribution on the linear weights of the model. Since exact inference is analytically intractable, a closed-form approximation is derived that is computationally efficient and kernelizable, enabling the representation of nonlinear functions. An approximate marginal likelihood is also derived for kernel hyperparameter learning. The two regression-based crowd counting methods are evaluated on a large pedestrian dataset, containing very distinct camera views, pedestrian traffic, and outliers, such as bikes or skateboarders. Experimental results show that regression-based counts are accurate, regardless of the crowd size, outperforming the count estimates produced by state-of-the-art pedestrian detectors. Results on two hours of video demonstrate the efficiency and robustness of regression-based crowd size estimation over long periods of time. Index Terms—surveillance, crowd analysis, Bayesian regression, Gaussian processes, Poisson regression
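The first of the two models, GPR with a compound kernel, can be sketched as follows. The kernel below is an illustrative assumption in the spirit of the abstract: a linear term for the global trend of the count mapping plus an RBF term for local deviations, with made-up hyperparameters rather than the paper's exact kernel or features.

```python
import numpy as np

def compound_kernel(A, B, sigma_rbf=1.0, ell=1.0):
    """Hypothetical compound kernel: linear term (global trend)
    plus RBF term (local deviations)."""
    lin = A @ B.T
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return lin + sigma_rbf ** 2 * np.exp(-0.5 * d2 / ell ** 2)

def gpr_predict(X_train, y_train, X_test, noise=0.1):
    """GP posterior mean under the compound kernel."""
    K = compound_kernel(X_train, X_train) + noise ** 2 * np.eye(len(X_train))
    K_star = compound_kernel(X_test, X_train)
    return K_star @ np.linalg.solve(K, y_train)

# 1-D toy: the count grows roughly linearly with a segment feature,
# with a local wiggle that the RBF term absorbs.
X = np.linspace(0.0, 4.0, 20).reshape(-1, 1)
y = 5.0 * X.ravel() + np.sin(3.0 * X.ravel())
pred = gpr_predict(X, y, X)
```

The limitation the abstract notes is visible here: `pred` is real-valued, whereas true counts are non-negative integers, which motivates the Poisson model.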
Model Selection: Beyond the Bayesian/Frequentist Divide
Abstract

Cited by 14 (2 self)
The principle of parsimony, also known as “Ockham’s razor”, has inspired many theories of model selection. Yet such theories, all making arguments in favor of parsimony, are based on very different premises and have developed distinct methodologies to derive algorithms. We have organized challenges and edited a special issue of JMLR and several conference proceedings around the theme of model selection. In this editorial, we revisit the problem of avoiding overfitting in light of the latest results. We note the remarkable convergence, in some approaches, of theories as different as Bayesian theory, Minimum Description Length, the bias/variance tradeoff, Structural Risk Minimization, and regularization. We also present new and interesting examples of the complementarity of theories leading to hybrid algorithms, neither frequentist nor Bayesian, or perhaps both frequentist and Bayesian!
Agnostic learning versus prior knowledge in the design of kernel machines
In Proc. IJCNN'07, 2007
Abstract

Cited by 3 (3 self)
Abstract—The optimal model parameters of a kernel machine are typically given by the solution of a convex optimisation problem with a single global optimum. Obtaining the best possible performance is therefore largely a matter of the design of a good kernel for the problem at hand, exploiting any underlying structure, and optimisation of the regularisation and kernel parameters, i.e. model selection. Fortunately, analytic bounds on, or approximations to, the leave-one-out cross-validation error are often available, providing an efficient and generally reliable means to guide model selection. However, the degree to which the incorporation of prior knowledge improves performance over that which can be obtained using “standard” kernels with automated model selection (i.e. agnostic learning) is an open question. In this paper, we compare approaches using example solutions for all of the benchmark tasks on both tracks
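One example of the analytic leave-one-out criteria the abstract refers to is the closed-form LOO residual for kernel ridge regression (the classical PRESS identity, e_i = (y_i − ŷ_i)/(1 − H_ii), where H is the hat matrix). The sketch below is illustrative, not the paper's code; the RBF kernel and its parameters are assumptions.

```python
import numpy as np

def rbf(A, B, ell=1.0):
    """RBF (Gaussian) kernel between row-vector sets A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell ** 2)

def loo_residuals_krr(X, y, lam=0.1, ell=1.0):
    """Exact leave-one-out residuals for kernel ridge regression
    without refitting: e_i = (y_i - yhat_i) / (1 - H_ii), where
    H = K (K + lam I)^{-1} is the hat matrix."""
    K = rbf(X, X, ell)
    H = K @ np.linalg.inv(K + lam * np.eye(len(X)))
    resid = y - H @ y
    return resid / (1.0 - np.diag(H))

X = np.random.default_rng(0).uniform(0.0, 3.0, (30, 1))
y = np.sin(2.0 * X.ravel())
loo = loo_residuals_krr(X, y)
# Summing loo**2 gives the PRESS statistic, an efficient criterion
# for selecting lam and ell.
```

Because one matrix factorisation yields all n held-out residuals, model selection costs roughly the same as a single fit, which is what makes such criteria practical guides for kernel and regularisation parameters.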
Quantitative method for the assignment of hinge and shear mechanism in protein domain movements
2014
Management, presentation and interpretation of genome scans using GSCANDB
Bioinformatics, doi:10.1093/bioinformatics/btm123
Abstract
Motivation: Advances in high-throughput genotyping have made it possible to carry out genome-wide association studies using very high densities of genetic markers. This has led to the problem of the storage, management, quality control, presentation and interpretation of results. In order to achieve a successful outcome, it may be necessary to analyse the data in different ways and compare the results with genome annotations and other genome scans. Results: We created GSCANDB, a database for genome scan data, using a MySQL backend and a Perl-CGI web interface. It displays genome scans of multiple phenotypes analysed in different ways and projected onto genome annotations derived from EnsMart. The current version is optimized for analysis of mouse data, but is customizable to other species. Availability: Source code and example data are available under the GPL, in versions tailored to either human or mouse association studies, from
Bayesian Poisson Regression for Crowd Counting
Appears in IEEE Int’l Conf. on Computer Vision, Kyoto, 2009.
Abstract
Poisson regression models the noisy output of a counting function as a Poisson random variable, with a log-mean parameter that is a linear function of the input vector. In this work, we analyze Poisson regression in a Bayesian setting, by introducing a prior distribution on the weights of the linear function. Since exact inference is analytically intractable, we derive a closed-form approximation to the predictive distribution of the model. We show that the predictive distribution can be kernelized, enabling the representation of nonlinear log-mean functions. We also derive an approximate marginal likelihood that can be optimized to learn the hyperparameters of the kernel. We then relate the proposed approximate Bayesian Poisson regression to Gaussian processes. Finally, we present experimental results using Bayesian Poisson regression for crowd counting from low-level features.
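A minimal sketch of the underlying model: y_i ~ Poisson(exp(x_i · w)) with a Gaussian prior on w. The code below finds only the MAP weights by gradient ascent on the log-posterior; this is an illustrative simplification (with assumed learning rate, prior variance and synthetic data), whereas the paper derives a closed-form approximation to the full predictive distribution and a kernelized form.

```python
import numpy as np

def poisson_map(X, y, prior_var=10.0, lr=0.05, steps=2000):
    """MAP weights for Bayesian Poisson regression:
    y_i ~ Poisson(exp(x_i . w)), w ~ N(0, prior_var * I).
    Plain gradient ascent on the (sample-averaged) log-posterior."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        mu = np.exp(X @ w)                                # Poisson means
        grad = (X.T @ (y - mu) - w / prior_var) / len(y)  # d logpost / dw
        w += lr * grad
    return w

# Synthetic counts from a known log-linear model.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(200), rng.uniform(0.0, 1.0, 200)])
w_true = np.array([0.5, 1.5])
y = rng.poisson(np.exp(X @ w_true))
w_hat = poisson_map(X, y)
```

Unlike the GPR-based count mapping, the exponential link guarantees non-negative predicted means, matching the discrete, non-negative nature of counts.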
On Approximate Inference for Generalized Gaussian Process Models
"... (This manuscript was submitted for review on July 12, 2013) ..."