Results 1 - 10 of 20
A Probabilistic Approach to Feature Selection - A Filter Solution
Cited by 86 (11 self)
Feature selection can be defined as a problem of finding a minimum set of M relevant attributes that describes the dataset as well as the original N attributes do, where M ≤ N. After examining the problems with both the exhaustive and the heuristic approach to feature selection, this paper proposes a probabilistic approach. The theoretic analysis and the experimental study show that the proposed approach is simple to implement and guaranteed to find the optimal subset if resources permit. It is also fast in obtaining results and effective in selecting features that improve the performance of a learning algorithm. An on-site application involving huge datasets has been conducted independently; it demonstrates the effectiveness and scalability of the proposed algorithm. Various aspects and applications of this feature selection algorithm are also discussed. 1 Introduction The problem of feature selection can be defined as finding M relevant attributes among the N original attrib...
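The idea of a probabilistic filter can be illustrated with a minimal sketch: draw random feature subsets and keep the smallest one that still describes the data acceptably. The abstract does not state the evaluation criterion, so the inconsistency rate used here, along with the names `inconsistency_rate`, `random_subset_filter`, `max_tries`, and `threshold`, are assumptions made for illustration, not the paper's exact procedure.

```python
import random
from collections import Counter, defaultdict


def inconsistency_rate(rows, labels, subset):
    """Fraction of rows that disagree with the majority label among rows
    sharing the same values on the chosen features (assumed criterion)."""
    groups = defaultdict(Counter)
    for row, label in zip(rows, labels):
        key = tuple(row[i] for i in subset)
        groups[key][label] += 1
    inconsistent = sum(sum(c.values()) - max(c.values()) for c in groups.values())
    return inconsistent / len(rows)


def random_subset_filter(rows, labels, max_tries=1000, threshold=0.0, seed=0):
    """Sample random subsets; keep the smallest one whose inconsistency
    rate stays within the threshold. Starts from the full feature set."""
    rng = random.Random(seed)
    n = len(rows[0])
    best = list(range(n))
    for _ in range(max_tries):
        # Never sample a subset larger than the current best.
        size = rng.randint(1, len(best))
        candidate = sorted(rng.sample(range(n), size))
        if (len(candidate) < len(best)
                and inconsistency_rate(rows, labels, candidate) <= threshold):
            best = candidate
    return best
```

With more tries the search covers more of the subset space, which matches the abstract's remark that the optimum is found "if resources permit".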
Toward integrating feature selection algorithms for classification and clustering
- IEEE Transactions on Knowledge and Data Engineering, 2005
Cited by 71 (6 self)
This paper introduces concepts and algorithms of feature selection, surveys existing feature selection algorithms for classification and clustering, groups and compares different algorithms with a categorizing framework based on search strategies, evaluation criteria, and data mining tasks, reveals unattempted combinations, and provides guidelines in selecting feature selection algorithms. With the categorizing framework, we continue our efforts toward building an integrated system for intelligent feature selection. A unifying platform is proposed as an intermediate step. An illustrative example is presented to show how existing feature selection algorithms can be integrated into a meta algorithm that can take advantage of individual algorithms. An added advantage of doing so is to help a user employ a suitable algorithm without knowing details of each algorithm. Some real-world applications are included to demonstrate the use of feature selection in data mining. We conclude this work by identifying trends and challenges of feature selection research and development.
Complexity measures of supervised classification problems
- IEEE Transactions on Pattern Analysis and Machine Intelligence, 2002
Cited by 38 (3 self)
Abstract: We studied a number of measures that characterize the difficulty of a classification problem, focusing on the geometrical complexity of the class boundary. We compared a set of real-world problems to random labelings of points and found that real problems contain structures in this measurement space that are significantly different from the random sets. Distributions of problems in this space show that there exist at least two independent factors affecting a problem's difficulty. We suggest using this space to describe a classifier's domain of competence. This can guide static and dynamic selection of classifiers for specific problems as well as subproblems formed by confinement, projection, and transformations of the feature vectors. Index Terms: Classification, clustering, complexity, linear separability, mixture identifiability.
Paint selection
- ACM Transactions on Graphics
Cited by 30 (1 self)
Figure 1: Left three: the user makes a selection by painting the object of interest with a brush (black-white circle) on a 24.5 megapixel image. Instant feedback (selection boundary or image effect) can be provided to the user during mouse dragging. Rightmost: composition and effect (sepia tone). Note that the blue scribbles are invisible to the user. They are drawn in the paper for illustration only.
Abstract. In this paper, we present Paint Selection, a progressive painting-based tool for local selection in images. Paint Selection lets users progressively make a selection by roughly painting the object of interest with a brush. More importantly, Paint Selection is efficient enough that instant feedback can be provided to users as they drag the mouse. We demonstrate that high-quality selections can be quickly and effectively "painted" on a variety of multi-megapixel images.
Intrinsic Dimensionality Estimation with Optimally Topology Preserving Maps
- IEEE Transactions on Pattern Analysis and Machine Intelligence, 1997
Cited by 30 (3 self)
A new method for analyzing the intrinsic dimensionality (ID) of low dimensional manifolds in high dimensional feature spaces is presented. The basic idea is to first extract a low-dimensional representation that captures the intrinsic topological structure of the input data and then to analyze this representation, i.e. estimate the intrinsic dimensionality. More specifically, the representation we extract is an optimally topology preserving feature map (OTPM) which is an undirected parametrized graph with a pointer in the input space associated with each node. Estimation of the intrinsic dimensionality is based on local PCA of the pointers of the nodes in the OTPM and their direct neighbors. The method has a number of important advantages compared with previous approaches: First, it can be shown to have only linear time complexity w.r.t. the dimensionality of the input space, in contrast to conventional PCA based approaches which have cubic complexity and hence become computational imp...
Dimensionality Reduction for Unsupervised Data
- In Ninth IEEE International Conference on Tools with AI, ICTAI'97, 1997
Cited by 25 (2 self)
Dimensionality reduction is an important problem for efficient handling of large databases. Many feature selection methods can serve the purpose for supervised data, in which each record carries a class label. Little work has been done on dimensionality reduction for unsupervised data, in which class information is not available. Principal Component Analysis (PCA) is often used. However, PCA creates new features, or principal components, which are functions of the original features. It is difficult to obtain an intuitive understanding of the data using the new features only. In this paper we are concerned with the problem of determining and choosing the important original features for unsupervised data. Our method is based on the observation that removing an irrelevant feature from the feature set may not change the underlying concept of the data, but not so otherwise. We propose an entropy measure for ranking features, and conduct extensive experiments to show that our meth...
Parcel: Feature Subset Selection in Variable Cost Domains
- 1998
Cited by 20 (1 self)
The vast majority of classification systems are designed with a single set of features, and optimised to a single specified cost. However, in examples such as medical and financial risk modelling, costs are known to vary subsequent to system design. In this paper, we present a design method for feature selection in the presence of varying costs. Starting from the Wilcoxon nonparametric statistic for the performance of a classification system, we introduce a concept called the maximum realisable receiver operating characteristic (MRROC), and prove a related theorem. A novel criterion for feature selection, based on the area under the MRROC curve, is then introduced. This leads to a framework which we call Parcel. This has the flexibility to use different combinations of features at different operating points on the resulting MRROC curve. Empirical support for each stage in our approach is provided by experiments on real world problems, with Parcel achieving superior results.
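The link between the Wilcoxon statistic and the area under an ROC curve can be sketched briefly: the area equals the probability that a randomly chosen positive example scores higher than a randomly chosen negative one, with ties counted as one half. The function name `auc_wilcoxon` is a hypothetical choice, and the sketch computes only the ordinary area under the curve for a single classifier; it does not construct the MRROC described in the abstract.

```python
def auc_wilcoxon(pos_scores, neg_scores):
    """Area under the ROC curve computed as the Wilcoxon-Mann-Whitney
    statistic: the fraction of (positive, negative) pairs ranked
    correctly, counting ties as half a correct pair."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Usage: three of the four pairs are ordered correctly.
area = auc_wilcoxon([0.9, 0.4], [0.5, 0.1])  # 0.75
```

The pairwise loop is quadratic in the number of examples; a rank-based computation would be preferable for large datasets, but the direct form makes the definition easier to see.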
Feature Selection via Discretization
- IEEE Transactions on Knowledge and Data Engineering, 1997
Cited by 20 (1 self)
Discretization can turn numeric attributes into discrete ones. Feature selection can eliminate some irrelevant and/or redundant attributes. Chi2 is a simple and general algorithm that uses the χ² statistic to discretize numeric attributes repeatedly until some inconsistencies are found in the data. It achieves feature selection via discretization. It can handle mixed attributes, work with multiclass data, and remove irrelevant and redundant attributes. Keywords--- discretization, feature selection, pattern classification I. Introduction Feature selection can eliminate some irrelevant and/or redundant attributes. By using relevant features, classification algorithms can in general improve their predictive accuracy, shorten the learning period, and form simpler concepts. There are abundant feature selection algorithms. Some use methods like principal component analysis to compose a smaller number of new features [11,12]; some select a subset of the original attributes [1,5]. This paper consi...
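The statistic at the core of this kind of discretization can be sketched as the Pearson χ² value for two adjacent intervals, computed from their per-class counts. A low value suggests the class distribution is similar in both intervals, so they are candidates for merging. The function name `chi_square` and the handling of empty cells (skipped here) are assumptions for illustration; the abstract does not give these details, and the merging loop and stopping rule are not shown.

```python
def chi_square(interval_a, interval_b):
    """Pearson chi-square statistic for two adjacent intervals.

    interval_a and interval_b are lists of per-class example counts,
    in the same class order."""
    total = sum(interval_a) + sum(interval_b)
    chi2 = 0.0
    for counts in (interval_a, interval_b):
        row_total = sum(counts)
        for j, observed in enumerate(counts):
            col_total = interval_a[j] + interval_b[j]
            expected = row_total * col_total / total
            if expected == 0:
                # Simplification: an empty row or column contributes nothing.
                continue
            chi2 += (observed - expected) ** 2 / expected
    return chi2


# Usage: identical class distributions give 0; fully separated ones give a large value.
similar = chi_square([5, 5], [5, 5])      # 0.0
separated = chi_square([10, 0], [0, 10])  # 20.0
```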
X2R: A Fast Rule Generator
- In Proceedings of IEEE International Conference on Systems, Man and Cybernetics, 1995
Cited by 18 (4 self)
Although they can learn from raw data, many concept learning algorithms require that the training data contain only discrete data. However, real world problems contain, more often than not, both numeric and discrete data. So before these algorithms can be applied, data discretization (quantization) is needed. This paper introduces X2R, a simple and fast algorithm that can be applied to both numeric and discrete data, and generate rules from datasets like Season-Classification, Golf-Playing that contain continuous and/or discrete data. The empirical results demonstrate that X2R can effectively generate rules from the raw data and perform better than some of its peers in terms of the quality of rules and time complexities. 1 Introduction Concept learning is a task to learn some concepts from raw data. Real world problems normally contain both numeric and discrete data. Many concept learning algorithms can only handle discrete data. Before running these algorithms, discretization is nec...
An Evaluation on Feature Selection for Text Clustering
- In ICML, 2003
Cited by 17 (2 self)
Feature selection methods have been successfully applied to text categorization but seldom applied to text clustering due to the unavailability of class label information. In this paper, we first give empirical evidence that feature selection methods can improve the efficiency and performance of text clustering algorithms. Then we propose a new feature selection method called "Term Contribution (TC)" and perform a comparative study on a variety of feature selection methods for text clustering, including Document Frequency (DF), Term Strength (TS), Entropy-based (En), Information Gain (IG) and χ² statistic (CHI). Finally, we propose an "Iterative Feature Selection (IF)" method that addresses the label unavailability problem by using an effective supervised feature selection method to iteratively select features and perform clustering. Detailed experimental results on Web Directory data are provided in the paper.
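Of the methods compared, Document Frequency is the simplest unsupervised criterion and can be sketched in a few lines: count the documents each term appears in and keep the most frequent terms. The names `document_frequency` and `select_by_df`, and the choice of a fixed cutoff `k`, are assumptions for illustration; the abstract does not define Term Contribution, so it is not shown here.

```python
from collections import Counter


def document_frequency(docs):
    """Number of documents in which each term occurs at least once.
    Each document is a list of tokens."""
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # count each term once per document
    return df


def select_by_df(docs, k):
    """Keep the k terms with the highest document frequency,
    breaking ties alphabetically so the result is deterministic."""
    df = document_frequency(docs)
    ranked = sorted(df, key=lambda term: (-df[term], term))
    return ranked[:k]


# Usage: "a" occurs in three documents, "b" in two, "c" in one.
docs = [["a", "b", "a"], ["a", "c"], ["b", "a"]]
kept = select_by_df(docs, 2)  # ["a", "b"]
```

Because the criterion needs no class labels, it applies directly to clustering, which is the setting the abstract addresses.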

