Feature Subset Selection Using A Genetic Algorithm
, 1997
Cited by 183 (7 self)
Practical pattern classification and knowledge discovery problems require selection of a subset of attributes or features (from a much larger set) to represent the patterns to be classified. This is due to the fact that the performance of the classifier (usually induced by some learning algorithm) and the cost of classification are sensitive to the choice of the features used to construct the classifier. Exhaustive evaluation of possible feature subsets is usually infeasible in practice because of the large amount of computational effort required. Genetic algorithms, which belong to a class of randomized heuristic search techniques, offer an attractive approach to find near-optimal solutions to such optimization problems. This paper presents an approach to feature subset selection using a genetic algorithm. Some advantages of this approach include the ability to accommodate multiple criteria such as accuracy and cost of classification into the feature selection process and to find fe...
Correlation-based feature selection for machine learning
, 1998
Cited by 139 (3 self)
A central problem in machine learning is identifying a representative set of features from which to construct a classification model for a particular task. This thesis addresses the problem of feature selection for machine learning through a correlation based approach. The central hypothesis is that good feature sets contain features that are highly correlated with the class, yet uncorrelated with each other. A feature evaluation formula, based on ideas from test theory, provides an operational definition of this hypothesis. CFS (Correlation based Feature Selection) is an algorithm that couples this evaluation formula with an appropriate correlation measure and a heuristic search strategy. CFS was evaluated by experiments on artificial and natural datasets. Three machine learning algorithms were used: C4.5 (a decision tree learner), IB1 (an instance based learner), and naive Bayes. Experiments on artificial datasets showed that CFS quickly identifies and screens irrelevant, redundant, and noisy features, and identifies relevant features as long as their relevance does not strongly depend on other features. On natural domains, CFS typically eliminated well over half the features. In most cases, classification accuracy using the reduced feature set equaled or bettered accuracy using the complete feature set.
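The abstract above does not reproduce the evaluation formula itself. As a minimal sketch, the merit heuristic usually associated with CFS scores a subset of k features as k·r_cf / sqrt(k + k(k−1)·r_ff), where r_cf is the mean feature-class correlation and r_ff the mean feature-feature correlation. The function name and argument layout below are illustrative assumptions, not taken from the thesis.

```python
import math


def cfs_merit(class_corrs, mean_feature_corr):
    """Merit of a feature subset under the CFS-style heuristic.

    class_corrs:       feature-class correlation of each feature in the subset
    mean_feature_corr: average pairwise feature-feature correlation in the subset
    (Both inputs are assumed to be precomputed; how they are measured is
    outside the scope of this sketch.)
    """
    k = len(class_corrs)
    if k == 0:
        return 0.0
    mean_class_corr = sum(class_corrs) / k
    # High class correlation raises the merit; redundancy among
    # the features (large mean_feature_corr) lowers it.
    return k * mean_class_corr / math.sqrt(k + k * (k - 1) * mean_feature_corr)


# Two equally predictive features: independent ones score higher
# than fully redundant ones.
independent = cfs_merit([0.5, 0.5], 0.0)
redundant = cfs_merit([0.5, 0.5], 1.0)
```

With fully redundant features the merit falls back to that of a single feature, which matches the hypothesis that good subsets are correlated with the class yet uncorrelated with each other.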
Interval propagation to reason about sets: definition and implementation of a practical language
 CONSTRAINTS
, 1997
Cited by 102 (5 self)
Local consistency techniques have been introduced in logic programming in order to extend the application domain of logic programming languages. The existing languages based on these techniques consider arithmetic constraints applied to variables ranging over finite integer domains. This makes it difficult to model naturally and concisely, and to solve efficiently, a class of NP-complete combinatorial search problems dealing with sets. To overcome these problems, we propose a solution which consists in extending the notion of integer domains to that of set domains (sets of sets). We specify a set domain by an interval whose lower and upper bounds are known sets, ordered by set inclusion. We define the formal and practical framework of a new constraint logic programming language over set domains, called Conjunto. Conjunto comprises the usual set operation symbols (∪, ∩, \), and the set inclusion relation (⊆). Set expressions built using the operation symbols are interpreted as relations (s ∪ s1 = s2, ...). In addition, Conjunto provides us with a set of constraints called graduated constraints (e.g. the set cardinality) which map sets onto arithmetic terms. This allows us to handle optimization problems by applying a cost function to the quantifiable, i.e., arithmetic, terms which are associated to set terms. The constraint solving in Conjunto is based on local consistency techniques using interval reasoning which are extended to handle set constraints. The main contribution of this paper concerns the formal definition of the language and its design and implementation as a practical language.
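The idea of a set domain given by lower and upper bounds can be illustrated with a small sketch. The class and function names here are hypothetical and do not reflect Conjunto's actual syntax or solver; the code only shows bound narrowing for a single inclusion constraint, plus the cardinality range that a graduated constraint could reason over.

```python
class SetInterval:
    """Domain of a set variable: every set S with lower <= S <= upper."""

    def __init__(self, lower, upper):
        self.lower = frozenset(lower)
        self.upper = frozenset(upper)
        if not self.lower <= self.upper:
            # An empty domain means the constraints cannot be satisfied.
            raise ValueError("inconsistent domain: lower bound exceeds upper bound")

    def cardinality_bounds(self):
        """Range of possible cardinalities for the variable."""
        return len(self.lower), len(self.upper)


def propagate_subset(s, t):
    """Narrow the domains of s and t so that s is a subset of t.

    Elements that cannot be in t cannot be in s, and elements that
    must be in s must also be in t. Raises ValueError on failure.
    """
    new_s = SetInterval(s.lower, s.upper & t.upper)
    new_t = SetInterval(t.lower | s.lower, t.upper)
    return new_s, new_t


s = SetInterval({1}, {1, 2, 3})
t = SetInterval(set(), {1, 2})
s2, t2 = propagate_subset(s, t)
```

After propagation, 3 is removed from the upper bound of `s` and 1 is added to the lower bound of `t`; neither variable is fully determined, which is typical of local (bounds) consistency.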
Data Mining in Soft Computing Framework: A Survey
 IEEE Transactions on Neural Networks
, 2001
Cited by 61 (3 self)
The present article provides a survey of the available literature on data mining using soft computing. A categorization has been provided based on the different soft computing tools and their hybridizations used, the data mining function implemented, and the preference criterion selected by the model. The utility of the different soft computing methodologies is highlighted. Generally, fuzzy sets are suitable for handling the issues related to understandability of patterns, incomplete/noisy data, mixed media information and human interaction, and can provide approximate solutions faster. Neural networks are nonparametric, robust, and exhibit good learning and generalization capabilities in data-rich environments. Genetic algorithms provide efficient search algorithms to select a model, from mixed media data, based on some preference criterion/objective function. Rough sets are suitable for handling different types of uncertainty in data. Some challenges to data mining and the application of soft computing methodologies are indicated. An extensive bibliography is also included.
Rough Mereology: A New Paradigm For Approximate Reasoning
, 1996
Cited by 57 (25 self)
We are concerned with formal models of reasoning under uncertainty. Many approaches to this problem are known in the literature, e.g. Dempster-Shafer theory, Bayesian-based reasoning, belief networks, fuzzy logics etc. We propose rough mereology as a foundation for approximate reasoning about complex objects. Our notion of a complex object includes approximate proofs understood as schemes constructed to support our assertions about the world on the basis of our incomplete or uncertain knowledge. 1 Introduction We present a formal model of approximate reasoning about processes of synthesis of complex systems. First ideas of this approach have been presented in [15], [24], [25], [27], [28], [29], [30], [31]. Our research has been stimulated by the demand for solutions of the following groups of problems, estimated in [1] to be crucial for the progress in the area of automated design and manufacturing. These groups of problems are concerned with the treatment of: Group 1. Poorly defined...
Dynamic Reducts as a Tool for Extracting Laws from Decisions Tables
, 1994
Cited by 53 (13 self)
We apply rough set methods and boolean reasoning for knowledge discovery from decision tables. It is not always possible to extract general laws from experimental data by first computing all reducts [12] of a decision table and then deriving decision rules from these reducts. We investigate how information about changes in the reduct set during random sampling of a given decision table could be used to generate these laws. The reducts that remain stable in the process of decision table sampling are called dynamic reducts. Dynamic reducts define the set of attributes called the dynamic core. This is the set of attributes included in all dynamic reducts. The set of decision rules can be computed from the dynamic core or from the best dynamic reducts. We report the results of experiments with different data sets, e.g. market data, medical data, textures and handwritten digits. The results show that dynamic reducts can help to extract laws from decision tables. Key words: evol...
Current Approaches to Handling Imperfect Information in Data and Knowledge Bases
, 1996
Cited by 52 (1 self)
This paper surveys methods for representing and reasoning with imperfect information. It opens with an attempt to classify the different types of imperfection that may pervade data, and a discussion of the sources of such imperfections. The classification is then used as a framework for considering work that explicitly concerns the representation of imperfect information, and related work on how imperfect information may be used as a basis for reasoning. The work that is surveyed is drawn from both the field of databases and the field of artificial intelligence. Both of these areas have long been concerned with the problems caused by imperfect information, and this paper stresses the relationships between the approaches developed in each.
A Review of Rough Set Models
, 1997
Cited by 48 (16 self)
Since the introduction of rough set theory in the early eighties, considerable work has been done on the development and application of this new theory. The paper provides a review of the Pawlak rough set model and its extensions, with emphasis on the formulation, characterization, and interpretation of various rough set models.
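For readers unfamiliar with the Pawlak model mentioned above, a minimal sketch of its two basic operators may help. Objects with identical attribute values are indiscernible; the lower approximation of a target set collects the indiscernibility classes wholly contained in it, and the upper approximation collects the classes that overlap it. The function name and the toy data are assumptions made for illustration only.

```python
from collections import defaultdict


def approximations(objects, target):
    """Pawlak lower and upper approximations of `target`.

    objects: dict mapping object name -> tuple of attribute values
    target:  set of object names to approximate
    """
    # Group objects that share the same attribute values.
    classes = defaultdict(set)
    for name, description in objects.items():
        classes[description].add(name)

    lower, upper = set(), set()
    for block in classes.values():
        if block <= target:   # class certainly inside the target
            lower |= block
        if block & target:    # class possibly inside the target
            upper |= block
    return lower, upper


# Hypothetical table: "a" and "b" are indiscernible.
table = {"a": (0,), "b": (0,), "c": (1,), "d": (2,)}
lower, upper = approximations(table, {"a", "c"})
```

Here `lower` contains only "c", while `upper` contains "a", "b" and "c"; the difference between the two is the boundary region where membership cannot be decided from the available attributes.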
Conjunto: Constraint Logic Programming with Finite Set Domains
 Logic Programming - Proceedings of the 1994 International Symposium, pages 339-358, Massachusetts Institute of Technology
, 1994
Cited by 47 (1 self)
Combinatorial problems involving sets and relations are currently tackled by integer programming and expressed with vectors or matrices of 0-1 variables. This is efficient but inflexible and unnatural in problem formulation. Toward a natural programming of combinatorial problems based on sets, graphs or relations, we define a new CLP language with set constraints. This language, Conjunto, aims at combining the declarative aspect of Prolog with the efficiency of constraint solving techniques. We propose to constrain a set variable to range over finite set domains specified by lower and upper bounds for set inclusion. Conjunto is based on the inclusion and disjointness constraints applied to set expressions which comprise the union, intersection and difference symbols. The main contribution herein is the constraint handler which performs constraint propagation by applying consistency techniques over set constraints. 1 Introduction Various systems of set constraints have been define...
Perspectives of granular computing
 Proceedings of 2005 IEEE International Conference on Granular Computing
, 2005
Cited by 46 (10 self)
As an emerging field of study, granular computing has received much attention. Many models, frameworks, methods and techniques have been proposed and studied. It is perhaps time to seek a general and unified view so that fundamental issues can be examined and clarified. This paper examines granular computing from three perspectives. By viewing granular computing as a way of structured thinking, we focus on its philosophical foundations in modeling human perception of reality. By viewing granular computing as a method of structured problem solving, we examine its theoretical and methodological foundations in solving a wide range of real-world problems. By viewing granular computing as a paradigm of information processing, we turn our attention to its more concrete techniques. The three perspectives together offer a holistic view of granular computing.