Results 1 - 10 of 2,915,499
An introduction to variable and feature selection
 Journal of Machine Learning Research
, 2003
"... Variable and feature selection have become the focus of much research in areas of application for which datasets with tens or hundreds of thousands of variables are available. ..."
Cited by 1283 (16 self)
Very simple classification rules perform well on most commonly used datasets
 Machine Learning
, 1993
"... The classification rules induced by machine learning systems are judged by two criteria: their classification accuracy on an independent test set (henceforth "accuracy"), and their complexity. The relationship between these two criteria is, of course, of keen interest to the machin ..."
Cited by 542 (5 self)
to the machine learning community. There are in the literature some indications that very simple rules may achieve surprisingly high accuracy on many datasets. For example, Rendell occasionally remarks that many real world datasets have "few peaks (often just one)" and so are ...
Regression Shrinkage and Selection Via the Lasso
 Journal of the Royal Statistical Society, Series B
, 1994
"... We propose a new method for estimation in linear models. The "lasso" minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant. Because of the nature of this constraint it tends to produce some coefficients that are exactl ..."
Cited by 4055 (51 self)
that are exactly zero and hence gives interpretable models. Our simulation studies suggest that the lasso enjoys some of the favourable properties of both subset selection and ridge regression. It produces interpretable models like subset selection and exhibits the stability of ridge regression. There is also
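The constrained problem described in this abstract can be written out as follows. This is a sketch of the standard formulation, not text from the listing; the symbols are assumptions introduced here: $y_i$ are the responses, $x_{ij}$ the predictor values, $\alpha$ an intercept, $\beta_j$ the coefficients, and $t \ge 0$ the constant that bounds the sum of absolute coefficients.

```latex
\[
(\hat{\alpha}, \hat{\beta})
  = \arg\min_{\alpha,\,\beta}
    \sum_{i=1}^{N} \Bigl( y_i - \alpha - \sum_{j=1}^{p} \beta_j x_{ij} \Bigr)^{2}
\quad \text{subject to} \quad
\sum_{j=1}^{p} \lvert \beta_j \rvert \le t .
\]
```

For small enough $t$ the constraint region has corners on the coordinate axes, which is why some $\hat{\beta}_j$ can come out exactly zero.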
Wrappers for Feature Subset Selection
 AIJ SPECIAL ISSUE ON RELEVANCE
, 1997
"... In the feature subset selection problem, a learning algorithm is faced with the problem of selecting a relevant subset of features upon which to focus its attention, while ignoring the rest. To achieve the best possible performance with a particular learning algorithm on a particular training set, a ..."
Cited by 1522 (3 self)
the strengths and weaknesses of the wrapper approach and show a series of improved designs. We compare the wrapper approach to induction without feature subset selection and to Relief, a filter approach to feature subset selection. Significant improvement in accuracy is achieved for some datasets for the two
Irrelevant Features and the Subset Selection Problem
 MACHINE LEARNING: PROCEEDINGS OF THE ELEVENTH INTERNATIONAL
, 1994
"... We address the problem of finding a subset of features that allows a supervised induction algorithm to induce small high-accuracy concepts. We examine notions of relevance and irrelevance, and show that the definitions used in the machine learning literature do not adequately partition the features ..."
Cited by 741 (26 self)
not only on the features and the target concept, but also on the induction algorithm. We describe a method for feature subset selection using cross-validation that is applicable to any induction algorithm, and discuss experiments conducted with ID3 and C4.5 on artificial and real datasets.
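One way to read "feature subset selection using cross-validation that is applicable to any induction algorithm" is as a search over subsets scored by held-out accuracy. The sketch below is an illustration under assumptions, not the authors' procedure: the greedy forward search, the interleaved fold layout, and the nearest-centroid learner (a stand-in for ID3 or C4.5) are all hypothetical choices, as are the function names. It assumes k is no larger than the number of rows.

```python
from typing import Dict, List, Sequence, Tuple


def k_fold_indices(n: int, k: int) -> List[List[int]]:
    """Partition indices 0..n-1 into k interleaved folds (assumed layout)."""
    return [list(range(start, n, k)) for start in range(k)]


def centroid_accuracy(X: Sequence[Sequence[float]], y: Sequence[int],
                      feats: List[int], train: List[int],
                      test: List[int]) -> float:
    """Stand-in learner: nearest class centroid, using only `feats`."""
    sums: Dict[int, List[float]] = {}
    counts: Dict[int, int] = {}
    for i in train:
        acc = sums.setdefault(y[i], [0.0] * len(feats))
        for j, f in enumerate(feats):
            acc[j] += X[i][f]
        counts[y[i]] = counts.get(y[i], 0) + 1
    centroids = {c: [s / counts[c] for s in acc] for c, acc in sums.items()}

    def dist(i: int, c: int) -> float:
        # Squared Euclidean distance restricted to the chosen features.
        return sum((X[i][f] - centroids[c][j]) ** 2
                   for j, f in enumerate(feats))

    hits = sum(1 for i in test
               if min(centroids, key=lambda c: dist(i, c)) == y[i])
    return hits / len(test)


def cv_score(X, y, feats: List[int], k: int) -> float:
    """Mean held-out accuracy of the stand-in learner over k folds."""
    n = len(X)
    total = 0.0
    for fold in k_fold_indices(n, k):
        held_out = set(fold)
        train = [i for i in range(n) if i not in held_out]
        total += centroid_accuracy(X, y, feats, train, fold)
    return total / k


def forward_select(X, y, k: int = 5) -> Tuple[List[int], float]:
    """Greedily add the feature that most improves the CV score;
    stop as soon as no candidate gives a strict improvement."""
    remaining = list(range(len(X[0])))
    selected: List[int] = []
    best = 0.0
    while remaining:
        scored = [(cv_score(X, y, selected + [f], k), f) for f in remaining]
        top_score = max(s for s, _ in scored)
        if top_score <= best:
            break
        # Break ties in favour of the lowest feature index.
        top_feat = min(f for s, f in scored if s == top_score)
        selected.append(top_feat)
        remaining.remove(top_feat)
        best = top_score
    return selected, best
```

Because the learner is only called through `cv_score`, any other induction algorithm could be swapped in by replacing `centroid_accuracy`, which is the sense in which a wrapper of this kind is algorithm-agnostic.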
The particle swarm: Explosion, stability, and convergence in a multidimensional complex space
 IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION
"... The particle swarm is an algorithm for finding optimal regions of complex search spaces through interaction of individuals in a population of particles. Though the algorithm, which is based on a metaphor of social interaction, has been shown to perform well, researchers have not adequately explained ..."
Cited by 822 (10 self)
The particle swarm is an algorithm for finding optimal regions of complex search spaces through interaction of individuals in a population of particles. Though the algorithm, which is based on a metaphor of social interaction, has been shown to perform well, researchers have not adequately explained how it works. Further, traditional versions of the algorithm have had some dynamical properties that were not considered to be desirable, notably the particles’ velocities needed to be limited in order to control their trajectories. The present paper analyzes the particle’s trajectory as it moves in discrete time (the algebraic view), then progresses to the view of it in continuous time (the analytical view). A 5-dimensional depiction is developed, which completely describes the system. These analyses lead to a generalized model of the algorithm, containing a set of coefficients to control the system’s convergence tendencies. Some results of the particle swarm optimizer, implementing modifications derived from the analysis, suggest methods for altering the original algorithm in ways that eliminate problems and increase the optimization power of the particle swarm.
Verb Semantics And Lexical Selection
, 1994
"... ... structure. As Levin has addressed (Levin 1985), the decomposition of verbs is proposed for the purposes of accounting for systematic semantic-syntactic correspondences. This results in a series of problems for MT systems: inflexible verb sense definitions; difficulty in handling metaphor and new ..."
Cited by 520 (4 self)
and new usages; imprecise lexical selection and insufficient system coverage. It seems one approach is to apply probability methods and statistical models for some of these problems. However, the question remains: has PSR exhausted the potential of the knowledge-based approach? If not, are there any
A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection
 INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE
, 1995
"... We review accuracy estimation methods and compare the two most common methods: cross-validation and bootstrap. Recent experimental results on artificial data and theoretical results in restricted settings have shown that for selecting a good classifier from a set of classifiers (model selection), te ..."
Cited by 1248 (12 self)
), ten-fold cross-validation may be better than the more expensive leave-one-out cross-validation. We report on a large-scale experiment (over half a million runs of C4.5 and a Naive-Bayes algorithm) to estimate the effects of different parameters on these algorithms on real-world datasets. For cross
The Data Grid: Towards an Architecture for the Distributed Management and Analysis of Large Scientific Datasets
 JOURNAL OF NETWORK AND COMPUTER APPLICATIONS
, 1999
"... In an increasing number of scientific disciplines, large data collections are emerging as important community resources. In this paper, we introduce design principles for a data management architecture called the Data Grid. We describe two basic services that we believe are fundamental to the des ..."
Cited by 469 (42 self)
to the design of a data grid, namely, storage systems and metadata management. Next, we explain how these services can be used to develop higher-level services for replica management and replica selection. We conclude by describing our initial implementation of data grid functionality.