Results 1 - 10 of 138
Irrelevant Features and the Subset Selection Problem
- MACHINE LEARNING: PROCEEDINGS OF THE ELEVENTH INTERNATIONAL
, 1994
Cited by 515 (22 self)
We address the problem of finding a subset of features that allows a supervised induction algorithm to induce small high-accuracy concepts. We examine notions of relevance and irrelevance, and show that the definitions used in the machine learning literature do not adequately partition the features into useful categories of relevance. We present definitions for irrelevance and for two degrees of relevance. These definitions improve our understanding of the behavior of previous subset selection algorithms, and help define the subset of features that should be sought. The features selected should depend not only on the features and the target concept, but also on the induction algorithm. We describe a method for feature subset selection using cross-validation that is applicable to any induction algorithm, and discuss experiments conducted with ID3 and C4.5 on artificial and real datasets.
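The cross-validation-based selection described in this abstract can be sketched as a wrapper loop around an inducer. This is a minimal illustration, not the authors' procedure: the abstract does not state the search strategy, so greedy forward selection is an assumption here, and a 1-nearest-neighbour classifier stands in for ID3 and C4.5. The names `induce`, `cv_accuracy`, and `forward_select` and the toy data are hypothetical.

```python
from collections import Counter


def induce(train, features):
    """Stand-in inducer (assumption): 1-NN on the chosen features,
    falling back to the majority class when no features are chosen."""
    majority = Counter(label for _, label in train).most_common(1)[0][0]

    def classify(x):
        if not features:
            return majority
        nearest = min(
            train,
            key=lambda row: sum((row[0][f] - x[f]) ** 2 for f in features),
        )
        return nearest[1]

    return classify


def cv_accuracy(data, features, k=5):
    """k-fold cross-validated accuracy of the inducer on one feature subset."""
    correct = 0
    for fold in range(k):
        test = data[fold::k]
        train = [row for i, row in enumerate(data) if i % k != fold]
        classify = induce(train, features)
        correct += sum(classify(x) == y for x, y in test)
    return correct / len(data)


def forward_select(data, n_features, k=5):
    """Greedy forward selection: add the feature that most improves the
    cross-validated accuracy, and stop when no feature improves it."""
    selected, best = [], cv_accuracy(data, [], k)
    improved = True
    while improved:
        improved = False
        for f in range(n_features):
            if f in selected:
                continue
            acc = cv_accuracy(data, selected + [f], k)
            if acc > best:
                best, best_f, improved = acc, f, True
        if improved:
            selected.append(best_f)
    return selected, best


# Toy data: feature 0 determines the label, feature 1 is irrelevant.
data = [((i % 2, (i * 7) % 5), i % 2) for i in range(40)]
selected, accuracy = forward_select(data, n_features=2)
print(selected, accuracy)
```

Because the subset is scored by running the inducer itself, the chosen features depend on the induction algorithm as well as on the data, which is the point the abstract makes.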
Using Statistical Testing in the Evaluation of Retrieval Experiments
, 1993
Cited by 149 (0 self)
The standard strategies for evaluation based on precision and recall are examined and their relative advantages and disadvantages are discussed. In particular, it is suggested that relevance feedback be evaluated from the perspective of the user. A number of different statistical tests are described for determining if differences in performance between retrieval methods are significant. These tests have often been ignored in the past because most are based on an assumption of normality which is not strictly valid for the standard performance measures. However, one can test this assumption using simple diagnostic plots, and if it is a poor approximation, there are a number of non-parametric alternatives.
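As one example of a non-parametric alternative, a paired sign test compares two retrieval methods query by query without assuming normality. The abstract does not say which tests the paper describes, so the choice of the sign test is an assumption, and the function name `sign_test` and the scores are hypothetical.

```python
from math import comb


def sign_test(scores_a, scores_b):
    """Two-sided exact sign test on paired per-query scores.

    Ties are dropped; under the null hypothesis each remaining query
    favours either method with probability 1/2.
    """
    wins_a = sum(a > b for a, b in zip(scores_a, scores_b))
    wins_b = sum(b > a for a, b in zip(scores_a, scores_b))
    n = wins_a + wins_b
    if n == 0:
        return 1.0
    k = min(wins_a, wins_b)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical average precision per query for two methods.
method_a = [0.42, 0.55, 0.31, 0.60, 0.48, 0.52, 0.39, 0.44, 0.58, 0.36]
method_b = [0.40, 0.50, 0.30, 0.55, 0.45, 0.50, 0.35, 0.40, 0.50, 0.38]
print(sign_test(method_a, method_b))
```

The test uses only the direction of each per-query difference, so it is insensitive to the skewed distributions that precision and recall often have, at the cost of ignoring the size of the differences.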
A Comparison of Prediction Accuracy, Complexity, and Training Time of Thirty-three Old and New Classification Algorithms
, 2000
Cited by 134 (6 self)
Twenty-two decision tree, nine statistical, and two neural network algorithms are compared on thirty-two datasets in terms of classification accuracy, training time, and (in the case of trees) number of leaves. Classification accuracy is measured by mean error rate and mean rank of error rate. Both criteria place a statistical, spline-based algorithm called Polyclass at the top, although it is not statistically significantly different from twenty other algorithms. Another statistical algorithm, logistic regression, is second with respect to the two accuracy criteria. The most accurate decision tree algorithm is Quest with linear splits, which ranks fourth and fifth, respectively. Although spline-based statistical algorithms tend to have good accuracy, they also require relatively long training times. Polyclass, for example, is third last in terms of median training time. It often requires hours of training compared to seconds for other algorithms. The Quest and logistic regression algor...
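The two accuracy criteria named in this abstract, mean error rate and mean rank of error rate, can be computed as in the sketch below. The error rates are made up for illustration, and giving tied algorithms the average of their ranks is an assumption; the abstract does not say how ties are handled.

```python
def mean_error_and_rank(errors):
    """errors maps algorithm name -> list of error rates, one per dataset.

    Returns algorithm -> (mean error rate, mean rank), where rank 1 is the
    lowest error on a dataset and ties share the average rank (assumption).
    """
    algos = list(errors)
    n_datasets = len(next(iter(errors.values())))
    rank_sums = {a: 0.0 for a in algos}
    for d in range(n_datasets):
        ordered = sorted(algos, key=lambda a: errors[a][d])
        i = 0
        while i < len(ordered):
            j = i
            # Extend j over the run of algorithms tied with ordered[i].
            while (j + 1 < len(ordered)
                   and errors[ordered[j + 1]][d] == errors[ordered[i]][d]):
                j += 1
            avg_rank = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
            for a in ordered[i:j + 1]:
                rank_sums[a] += avg_rank
            i = j + 1
    return {
        a: (sum(errors[a]) / n_datasets, rank_sums[a] / n_datasets)
        for a in algos
    }


# Hypothetical error rates of three algorithms on three datasets.
errors = {
    "A": [0.10, 0.20, 0.30],
    "B": [0.20, 0.10, 0.30],
    "C": [0.30, 0.30, 0.10],
}
summary = mean_error_and_rank(errors)
print(summary)
```

Mean rank is less sensitive than mean error rate to a single dataset on which every algorithm does badly, which is one reason to report both.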
Bayesian Model Averaging for Linear Regression Models
- Journal of the American Statistical Association
, 1997
Cited by 133 (12 self)
We consider the problem of accounting for model uncertainty in linear regression models. Conditioning on a single selected model ignores model uncertainty, and thus leads to the underestimation of uncertainty when making inferences about quantities of interest. A Bayesian solution to this problem involves averaging over all possible models (i.e., combinations of predictors) when making inferences about quantities of
A Meta-Analysis of Rates of Return to Agricultural R&D - Ex Pede Herculem?
Cited by 53 (6 self)
this report. Willis was a pioneer in the economic analysis of agricultural research and technical change, a teacher, and an inspiration---as well as a friend and a good bloke
Model Selection and Accounting for Model Uncertainty in Linear Regression Models
, 1993
Cited by 40 (6 self)
We consider the problems of variable selection and accounting for model uncertainty in linear regression models. Conditioning on a single selected model ignores model uncertainty, and thus leads to the underestimation of uncertainty when making inferences about quantities of interest. The complete Bayesian solution to this problem involves averaging over all possible models when making inferences about quantities of interest. This approach is often not practical. In this paper we offer two alternative approaches. First we describe a Bayesian model selection algorithm called "Occam's Window" which involves averaging over a reduced set of models. Second, we describe a Markov chain Monte Carlo approach which directly approximates the exact solution. Both these model averaging procedures provide better predictive performance than any single model which might reasonably have been selected. In the extreme case where there are many candidate predictors but there is no relationship between any of them and the response, standard variable selection procedures often choose some subset of variables that yields a high R² and a highly significant overall F value. We refer to this unfortunate phenomenon as "Freedman's Paradox" (Freedman, 1983). In this situation, Occam's Window usually indicates the null model as the only one to be considered, or else a small number of models including the null model, thus largely resolving the paradox.
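Averaging over a reduced set of models can be illustrated with a simple filter on posterior model probabilities. The abstract does not give the selection rule, so the rule used here (keep models whose posterior probability is within a factor `c` of the best model, then renormalise) and the value `c = 20` are assumptions, and the model names, probabilities, and predictions are hypothetical.

```python
def occams_window(posterior, c=20.0):
    """Keep models whose posterior probability is within a factor c of the
    most probable model (assumed rule), then renormalise the weights."""
    best = max(posterior.values())
    kept = {m: p for m, p in posterior.items() if p * c >= best}
    total = sum(kept.values())
    return {m: p / total for m, p in kept.items()}


def averaged_prediction(weights, predictions):
    """Model-averaged prediction: weight each model's prediction by its
    renormalised posterior probability."""
    return sum(weights[m] * predictions[m] for m in weights)


# Hypothetical posterior model probabilities and per-model predictions.
posterior = {"null": 0.60, "x1": 0.30, "x1+x2": 0.08, "x1+x2+x3": 0.02}
predictions = {"null": 1.0, "x1": 1.4, "x1+x2": 1.5, "x1+x2+x3": 2.0}

weights = occams_window(posterior)
print(weights)
print(averaged_prediction(weights, predictions))
```

In this toy case the null model keeps the largest weight, which mirrors the situation the abstract describes for data with no real relationship between predictors and response.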
Comparing Interactive Information Retrieval Systems Across Sites: The TREC-6 Interactive Track Matrix Experiment
, 1998
Cited by 35 (0 self)
This is a case study in the design and analysis of a 9-site TREC-6 experiment aimed at comparing the performance of 12 interactive information retrieval (IR) systems on a shared problem: a question-answering task, 6 statements of information need, and a collection of 210,158 articles from the Financial Times of London 1991-1994. The study discusses the application of experimental design principles and the use of a shared control IR system in addressing the problems of comparing experimental interactive IR systems across sites: isolating the effects of topics, human searchers, and other site-specific factors within an affordable design. The results confirm the dominance of the topic effect, show the searcher effect is almost as often absent as present, and indicate that for several sites the 2-factor interactions are negligible. An analysis of variance found the system effect to be significant, but a multiple comparisons test found no significant pairwise differences. 1 Introduction T...
Feature Subset Selection as Search with Probabilistic Estimates
- in AAAI Fall Symposium on Relevance
, 1994
Cited by 35 (8 self)
Irrelevant features and weakly relevant features may reduce the comprehensibility and accuracy of concepts induced by supervised learning algorithms. We formulate the search for a feature subset as an abstract search problem with probabilistic estimates. Searching a space using an evaluation function that is a random variable requires trading off accuracy of estimates for increased state exploration. We show how recent feature subset selection algorithms in the machine learning literature fit into this search problem as simple hill climbing approaches, and conduct a small experiment using a best-first search technique. 1 Introduction Practical algorithms in supervised machine learning degrade in performance (prediction accuracy) when faced with many features that are not necessary for predicting the desired output. An important question in the fields of machine learning, statistics, and pattern recognition is how to select a good subset of features. From a theoretical standpoint,...
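A best-first search over feature subsets can be sketched as follows. This is an illustration under stated assumptions, not the paper's experiment: states are expanded by adding one feature at a time, the search stops after `max_stale` expansions without improvement, and the evaluation function is a deterministic toy score, whereas the abstract treats the evaluation as a random variable. All names here are hypothetical.

```python
import heapq


def best_first_search(n_features, evaluate, max_stale=5):
    """Best-first search over feature subsets.

    evaluate(subset) returns an estimated accuracy (higher is better).
    The most promising open state is expanded next; the search stops after
    max_stale consecutive expansions that do not improve the best score.
    """
    start = frozenset()
    frontier = [(-evaluate(start), ())]  # min-heap keyed on negated score
    seen = {start}
    best_subset, best_score = None, float("-inf")
    stale = 0
    while frontier and stale < max_stale:
        neg_score, state = heapq.heappop(frontier)
        subset = frozenset(state)
        if -neg_score > best_score:
            best_subset, best_score, stale = subset, -neg_score, 0
        else:
            stale += 1
        for f in range(n_features):
            child = subset | {f}
            if child not in seen:
                seen.add(child)
                heapq.heappush(
                    frontier, (-evaluate(child), tuple(sorted(child)))
                )
    return best_subset, best_score


def toy_score(subset):
    """Toy evaluation (assumption): features 0 and 2 help, others hurt."""
    relevant = {0, 2}
    return 0.5 + 0.2 * len(subset & relevant) - 0.05 * len(subset - relevant)


best_subset, best_score = best_first_search(4, toy_score)
print(sorted(best_subset), best_score)
```

Unlike hill climbing, the search keeps unexpanded states on the frontier and can return to them when the current path stops improving.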
Useful Feature Subsets and Rough Set Reducts
, 1994
Cited by 23 (4 self)
In supervised classification learning, one attempts to induce a classifier that correctly predicts the label of novel instances. We demonstrate that by choosing a useful subset of features for the indiscernibility relation, an induction algorithm based on a simple decision table can have high prediction accuracy on artificial and real-world datasets. We show that useful feature subsets are not necessarily maximal independent sets (relative reducts) with respect to the label, and that, in practical situations, using a subset of the relative core features may lead to superior performance. 1 Introduction In supervised classification learning, one is given a training set containing labelled instances (examples). Each labelled instance contains a list of feature values (attribute values) and a discrete label value. The induction task is to build a classifier that will correctly predict the label of novel instances. Common classifiers are decision trees, neural networks, and nearest-neighbor...
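A decision table built on a chosen feature subset can be sketched as a lookup keyed on the values of those features: instances that agree on the subset are indiscernible and share one entry. The majority-label entry and the majority-class fallback for unseen keys are assumptions for illustration, and the function name and toy data are hypothetical.

```python
from collections import Counter, defaultdict


def build_decision_table(instances, features):
    """Group training instances that are indiscernible on `features` and
    store the majority label of each group. Unseen feature-value
    combinations fall back to the overall majority label (assumption)."""
    groups = defaultdict(Counter)
    overall = Counter()
    for values, label in instances:
        key = tuple(values[f] for f in features)
        groups[key][label] += 1
        overall[label] += 1
    default = overall.most_common(1)[0][0]
    table = {key: c.most_common(1)[0][0] for key, c in groups.items()}

    def classify(values):
        return table.get(tuple(values[f] for f in features), default)

    return classify


# Toy data: (outlook, windy, record id) -> label. The id is unique per row.
train = [
    (("sunny", "no", 1), "play"),
    (("sunny", "yes", 2), "stay"),
    (("rain", "no", 3), "play"),
    (("rain", "yes", 4), "stay"),
    (("sunny", "no", 5), "play"),
]
novel = ("rain", "yes", 99)

subset_table = build_decision_table(train, [0, 1])
full_table = build_decision_table(train, [0, 1, 2])
print(subset_table(novel), full_table(novel))
```

With all three features every training row is its own group, so the novel instance matches nothing and gets the default label; with the two-feature subset it matches an existing group and is labelled correctly.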
Quantitative analysis of cell motility and chemotaxis in Dictyostelium discoideum using an image processing system and a novel chemotaxis chamber providing stationary chemical gradients
- J. Cell Biol
, 1989
Cited by 20 (2 self)
An image processing system was programmed to automatically track and digitize the movement of amebae under phase-contrast microscopy. The amebae moved in a novel chemotaxis chamber designed to provide stable linear attractant gradients in a thin agarose gel. The gradients were established by pumping attractant and buffer solutions through semipermeable hollow fibers embedded in the agarose gel. Gradients were established within 30 min and shown to be stable for at least a further 90 min. By using this system it is possible to collect detailed data on the movement of large numbers of individual amebae in defined attractant gradients. We used the system to study motility and chemotaxis by a score of Dictyostelium discoideum wild-type and mutant strains, including

