Wrappers for Feature Subset Selection
 AIJ SPECIAL ISSUE ON RELEVANCE
, 1997
Abstract

In the feature subset selection problem, a learning algorithm is faced with the problem of selecting a relevant subset of features upon which to focus its attention, while ignoring the rest. To achieve the best possible performance with a particular learning algorithm on a particular training set, a feature subset selection method should consider how the algorithm and the training set interact. We explore the relation between optimal feature subset selection and relevance. Our wrapper method searches for an optimal feature subset tailored to a particular algorithm and a domain. We study the strengths and weaknesses of the wrapper approach and show a series of improved designs. We compare the wrapper approach to induction without feature subset selection and to Relief, a filter approach to feature subset selection. Significant improvement in accuracy is achieved for some datasets for the two families of induction algorithms used: decision trees and Naive-Bayes.
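The wrapper idea described in this abstract can be sketched roughly as follows: each candidate feature subset is scored by running the learner itself under cross-validation, and a search procedure moves toward better-scoring subsets. The sketch below is illustrative only. It assumes a greedy forward search and a 1-nearest-neighbour learner as stand-ins (the abstract names decision trees and Naive-Bayes), and the names `knn1_accuracy`, `cv_score`, and `forward_select` are hypothetical.

```python
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], int]  # (feature vector, class label)


def knn1_accuracy(train: List[Example], test: List[Example], feats: List[int]) -> float:
    """Accuracy of a 1-nearest-neighbour learner that sees only `feats`.

    The learner is an assumption made for this sketch, not the paper's choice.
    """
    correct = 0
    for x, y in test:
        nearest = min(train, key=lambda t: sum((t[0][f] - x[f]) ** 2 for f in feats))
        correct += nearest[1] == y
    return correct / len(test)


def cv_score(data: List[Example], feats: List[int], k: int = 5) -> float:
    """k-fold cross-validated accuracy for one candidate feature subset."""
    folds = [data[i::k] for i in range(k)]
    scores = []
    for i in range(k):
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        scores.append(knn1_accuracy(train, folds[i], feats))
    return sum(scores) / k


def forward_select(data: List[Example], n_features: int, k: int = 5):
    """Greedy forward search: keep adding the feature that most improves the CV score."""
    selected: List[int] = []
    best = 0.0
    while True:
        candidates = [f for f in range(n_features) if f not in selected]
        scored = [(cv_score(data, selected + [f], k), f) for f in candidates]
        if not scored:
            break
        score, feature = max(scored)
        if score <= best:  # no strict improvement: stop searching
            break
        selected.append(feature)
        best = score
    return selected, best
```

Because the score comes from the learner's own cross-validated accuracy, the selected subset is tailored to that learner and dataset, which is the property the abstract contrasts with filter methods such as Relief.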
Irrelevant Features and the Subset Selection Problem
 MACHINE LEARNING: PROCEEDINGS OF THE ELEVENTH INTERNATIONAL
, 1994
Abstract

We address the problem of finding a subset of features that allows a supervised induction algorithm to induce small high-accuracy concepts. We examine notions of relevance and irrelevance, and show that the definitions used in the machine learning literature do not adequately partition the features into useful categories of relevance. We present definitions for irrelevance and for two degrees of relevance. These definitions improve our understanding of the behavior of previous subset selection algorithms, and help define the subset of features that should be sought. The features selected should depend not only on the features and the target concept, but also on the induction algorithm. We describe a method for feature subset selection using cross-validation that is applicable to any induction algorithm, and discuss experiments conducted with ID3 and C4.5 on artificial and real datasets.
Least Median of Squares Regression
 JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
, 1984
Locally weighted learning
 ARTIFICIAL INTELLIGENCE REVIEW
, 1997
Abstract

This paper surveys locally weighted learning, a form of lazy learning and memory-based learning, and focuses on locally weighted linear regression. The survey discusses distance functions, smoothing parameters, weighting functions, local model structures, regularization of the estimates and bias, assessing predictions, handling noisy data and outliers, improving the quality of predictions by tuning fit parameters, interference between old and new data, implementing locally weighted learning efficiently, and applications of locally weighted learning. A companion paper surveys how locally weighted learning can be used in robot learning and control.
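As a rough illustration of locally weighted linear regression, the sketch below fits a separate weighted least-squares line for each query point, with training points weighted by their distance to the query. It assumes one-dimensional inputs and a Gaussian weighting function; the function name `lwr_predict` and the `bandwidth` smoothing parameter are hypothetical choices for this example, not taken from the survey.

```python
import math
from typing import Sequence


def lwr_predict(xs: Sequence[float], ys: Sequence[float], query: float,
                bandwidth: float = 1.0) -> float:
    """Predict y at `query` from a line fitted with distance-based weights."""
    # Gaussian weighting function: nearby points count more (an assumed choice).
    w = [math.exp(-(((x - query) / bandwidth) ** 2) / 2) for x in xs]
    total = sum(w)
    # Weighted means of inputs and outputs.
    mx = sum(wi * x for wi, x in zip(w, xs)) / total
    my = sum(wi * y for wi, y in zip(w, ys)) / total
    # Weighted least-squares slope; fall back to a local constant if degenerate.
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my + slope * (query - mx)
```

No model is built ahead of time: the stored data are consulted at query time, which is what makes the method "lazy". A smaller `bandwidth` gives a more local, more flexible fit, while a larger one approaches ordinary global linear regression.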
Analysis of variance for gene expression microarray data
 Journal of Computational Biology
, 2000
Abstract

Spotted cDNA microarrays are emerging as a powerful and cost-effective tool for large-scale analysis of gene expression. Microarrays can be used to measure the relative quantities of specific mRNAs in two or more tissue samples for thousands of genes simultaneously. While the power of this technology has been recognized, many open questions remain about appropriate analysis of microarray data. One question is how to make valid estimates of the relative expression for genes that are not biased by ancillary sources of variation. Recognizing that there is inherent “noise” in microarray data, how does one estimate the error variation associated with an estimated change in expression, i.e., how does one construct the error bars? We demonstrate that ANOVA methods can be used to normalize microarray data and provide estimates of changes in gene expression that are corrected for potential confounding effects. This approach establishes a framework for the general analysis and interpretation of microarray data. Key words: Gene expression microarray, differential expression, analysis of variance, bootstrap.
Bayesian Model Averaging for Linear Regression Models
 Journal of the American Statistical Association
, 1997
Abstract

We consider the problem of accounting for model uncertainty in linear regression models. Conditioning on a single selected model ignores model uncertainty, and thus leads to the underestimation of uncertainty when making inferences about quantities of interest. A Bayesian solution to this problem involves averaging over all possible models (i.e., combinations of predictors) when making inferences about quantities of
Performance persistence
 Journal of Finance
, 1995
Abstract

Most optimization-based decision support systems are used repeatedly with only modest changes to input data from scenario to scenario. Unfortunately, optimization (mathematical programming) has a well-deserved reputation for amplifying small input changes into drastically different solutions. A previously optimal solution, or a slight variation of one, may still be nearly optimal in a new scenario and managerially preferable to a dramatically different solution that is mathematically optimal. Mathematical programming models can be stated and solved so that they exhibit varying degrees of persistence with respect to previous values of variables, constraints, or even exogenous considerations. We use case studies to highlight how modeling with persistence has improved managerial acceptance and describe how to incorporate persistence as an intrinsic feature of any optimization model.
Preliminary guidelines for empirical research in software engineering
 IEEE Transactions on Software Engineering
, 2002
A New Approach to Variable Selection in Least Squares Problems
, 1999
Abstract

The title Lasso has been suggested by Tibshirani [7] as a colourful name for a technique of variable selection which requires the minimization of a sum of squares subject to an l1 bound κ on the solution. This forces zero components in the minimizing solution for small values of κ. Thus this bound can function as a selection parameter. This paper makes two contributions to computational problems associated with implementing the Lasso: (1) a compact descent method for solving the constrained problem for a particular value of κ is formulated, and (2) a homotopy method, in which the constraint bound κ becomes the homotopy parameter, is developed to completely describe the possible selection regimes. Both algorithms have a finite termination property.
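To illustrate how an l1 restriction forces some coefficients to exactly zero, the sketch below solves the closely related penalized form, minimizing 0.5 * ||y - Xb||^2 + lam * ||b||_1, by cyclic coordinate descent. This is a swapped-in technique chosen for brevity, not the descent or homotopy method of the paper; the names `lasso_cd` and `lam` are hypothetical, and a larger `lam` plays the role of a smaller bound on the l1 norm of the solution.

```python
from typing import List, Sequence


def soft_threshold(z: float, t: float) -> float:
    """Shrink z toward zero by t, returning exactly 0.0 when |z| <= t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X: Sequence[Sequence[float]], y: Sequence[float], lam: float,
             n_iter: int = 200) -> List[float]:
    """Cyclic coordinate descent for 0.5*||y - Xb||^2 + lam*||b||_1.

    Assumes every column of X has at least one nonzero entry.
    """
    n, p = len(X), len(X[0])
    b = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of column j with the residual that excludes feature j.
            rho = sum(
                X[i][j] * (y[i] - sum(X[i][k] * b[k] for k in range(p) if k != j))
                for i in range(n)
            )
            z = sum(X[i][j] ** 2 for i in range(n))
            b[j] = soft_threshold(rho, lam) / z
    return b
```

The soft-thresholding step is where selection happens: a coefficient whose correlation with the residual stays below `lam` in absolute value is set to exactly zero, so it drops out of the model.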
Aid and growth regressions
 Journal of Development Economics
, 2001
Abstract

[CREDIT is based] in the School of Economics at the University of Nottingham. It aims to promote research in all aspects of economic development and international trade on both a long term and a short term basis. To this end, CREDIT organises seminar series on Development Economics, acts as a point for collaborative research with other UK and overseas institutions and publishes research papers on topics central to its interests. A list of CREDIT Research Papers is given on the final page of this publication.