Results 1-10 of 272
Feature Subset Selection Using A Genetic Algorithm
, 1997
Cited by 258 (7 self)
Practical pattern classification and knowledge discovery problems require selection of a subset of attributes or features (from a much larger set) to represent the patterns to be classified. This is due to the fact that the performance of the classifier (usually induced by some learning algorithm) and the cost of classification are sensitive to the choice of the features used to construct the classifier. Exhaustive evaluation of possible feature subsets is usually infeasible in practice because of the large amount of computational effort required. Genetic algorithms, which belong to a class of randomized heuristic search techniques, offer an attractive approach to find near-optimal solutions to such optimization problems. This paper presents an approach to feature subset selection using a genetic algorithm. Some advantages of this approach include the ability to accommodate multiple criteria such as accuracy and cost of classification into the feature selection process and to find fe...
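The search described in this abstract can be illustrated with a minimal sketch. It assumes a bit-mask encoding (one bit per feature), tournament selection, one-point crossover, bit-flip mutation and elitism. None of these choices is confirmed by the abstract, and `toy_fitness` is a hypothetical stand-in for "classifier accuracy minus classification cost", not the paper's evaluation function.

```python
import random


def toy_fitness(mask, informative, cost_weight=0.1):
    """Hypothetical fitness: reward informative features, charge a cost per feature.

    A real wrapper would train a classifier on the selected features instead.
    """
    hits = sum(1 for i in informative if mask[i])
    return hits - cost_weight * sum(mask)


def ga_select(n_features, fitness, pop_size=30, generations=40, p_mut=0.02, seed=0):
    """Search for a good feature mask with a simple generational GA (n_features >= 2)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    best = max(pop, key=fitness)

    def tournament(population):
        # Pick two distinct individuals at random and keep the fitter one.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        nxt = [best[:]]  # elitism: the best mask so far always survives
        while len(nxt) < pop_size:
            p1, p2 = tournament(pop), tournament(pop)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Flip each bit independently with probability p_mut.
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            nxt.append(child)
        pop = nxt
        best = max(pop, key=fitness)
    return best


if __name__ == "__main__":
    informative = {0, 3, 7}  # assumed ground truth for the toy problem
    mask = ga_select(12, lambda m: toy_fitness(m, informative))
    print("selected features:", [i for i, bit in enumerate(mask) if bit])
```

Folding accuracy and cost into one scalar is only one way to accommodate multiple criteria; the weighting (`cost_weight` here) is an assumption of the sketch.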
On the Approximability of Minimizing Nonzero Variables Or Unsatisfied Relations in Linear Systems
, 1997
Cited by 110 (3 self)
We investigate the computational complexity of two closely related classes of combinatorial optimization problems for linear systems which arise in various fields such as machine learning, operations research and pattern recognition. In the first class (Min ULR) one wishes, given a possibly infeasible system of linear relations, to find a solution that violates as few relations as possible while satisfying all the others. In the second class (Min RVLS) the linear system is supposed to be feasible and one looks for a solution with as few nonzero variables as possible. For both Min ULR and Min RVLS the four basic types of relational operators =, ≥, > and ≠ are considered. While Min RVLS with equations was known to be NP-hard in [27], we established in [2, 5] that Min ULR with equalities and inequalities are NP-hard even when restricted to homogeneous systems with bipolar coefficients. The latter problems have been shown hard to approximate in [8]. In this paper we determine strong bou...
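The two problem classes in this abstract can be written compactly as follows. The notation is an assumption made for illustration and is not taken from the paper: $A \in \mathbb{R}^{m \times n}$ with rows $a_i^{\top}$, $b \in \mathbb{R}^m$, and $\mathcal{R}$ standing for the relational operator under consideration.

```latex
% Min ULR: violate as few relations as possible
\min_{x \in \mathbb{R}^n} \;
  \bigl|\{\, i \in \{1,\dots,m\} : a_i^{\top} x \;\mathcal{R}\; b_i \text{ does not hold} \,\}\bigr|

% Min RVLS: the system is feasible; use as few nonzero variables as possible
\min_{x \in \mathbb{R}^n} \;
  \bigl|\{\, j \in \{1,\dots,n\} : x_j \neq 0 \,\}\bigr|
  \quad \text{subject to} \quad a_i^{\top} x \;\mathcal{R}\; b_i, \quad i = 1,\dots,m
```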
Coordinating Perceptually Grounded Categories through Language. A Case Study For Colour
Cited by 106 (17 self)
The paper proposes a number of models to examine through what mechanisms a population of autonomous agents could arrive at a repertoire of perceptually grounded categories that is sufficiently shared to allow successful communication. The models are inspired by the main approaches to human categorisation being discussed in the literature: nativism, empiricism, and culturalism. Colour is taken as a case study. Although the paper takes no stance on which position is to be accepted as final truth with respect to human categorisation and naming, it points to theoretical constraints that make each position more or less likely and contains clear suggestions on what the best engineering solution would be. Specifically, it argues that the collective choice of a shared repertoire must integrate multiple constraints, including constraints coming from communication.
k-Plane Clustering
 Journal of Global Optimization
, 2000
Cited by 74 (3 self)
A finite new algorithm is proposed for clustering m given points in n-dimensional real space into k clusters by generating k planes that constitute a local solution to the nonconvex problem of minimizing the sum of squares of the 2-norm distances between each point and a nearest plane. The key to the algorithm lies in a formulation that generates a plane in n-dimensional space that minimizes the sum of the squares of the 2-norm distances to each of m_1 given points in the space. The plane is generated by an eigenvector corresponding to a smallest eigenvalue of an n × n simple matrix derived from the m_1 points. The algorithm was tested on the publicly available Wisconsin Breast Prognosis Cancer database to generate well-separated patient survival curves. In contrast, the k-mean algorithm did not generate such well-separated survival curves.
1 Introduction There are many approaches to clustering such as statistical [2, 9, 6], machine learning [7, 8] and mathematical programming [15...
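The plane-fitting step in this abstract can be sketched as below. The sketch assumes that the "simple matrix" is the centered scatter matrix of the points, which the abstract does not spell out. The helper names (`fit_plane`, `nearest_plane`) are hypothetical, and the power iteration on the shifted matrix stands in for a library eigensolver so that the example stays dependency-free.

```python
import math


def fit_plane(points, iters=500):
    """Return (w, gamma) for the plane w.x = gamma minimizing squared distances.

    w approximates a unit eigenvector for a smallest eigenvalue of the
    centered scatter matrix B (an assumption of this sketch).
    """
    m, n = len(points), len(points[0])
    mean = [sum(p[j] for p in points) / m for j in range(n)]
    B = [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in points)
          for j in range(n)] for i in range(n)]
    c = sum(B[i][i] for i in range(n))  # trace(B) >= largest eigenvalue of B
    # Power iteration on c*I - B: its dominant eigenvector belongs to the
    # smallest eigenvalue of B. The start vector is deliberately asymmetric;
    # it fails only if it is exactly orthogonal to the target eigenvector.
    w = [1.0 / (j + 1) for j in range(n)]
    for _ in range(iters):
        v = [c * w[i] - sum(B[i][j] * w[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm == 0.0:  # degenerate case, e.g. all points identical
            break
        w = [x / norm for x in v]
    norm = math.sqrt(sum(x * x for x in w))
    w = [x / norm for x in w]
    gamma = sum(wi * mi for wi, mi in zip(w, mean))  # plane passes through the mean
    return w, gamma


def nearest_plane(x, planes):
    """Index of the plane (w, gamma) closest to point x; w is assumed unit length."""
    return min(range(len(planes)),
               key=lambda k: abs(sum(wi * xi for wi, xi in zip(planes[k][0], x))
                                 - planes[k][1]))


if __name__ == "__main__":
    pts = [(0, 1), (1, 3), (2, 5), (3, 7)]  # points on the line y = 2x + 1
    w, gamma = fit_plane(pts)
    print("normal:", w, "offset:", gamma)
```

A full clustering loop would presumably alternate `nearest_plane` (assignment) with `fit_plane` on each cluster (update) until the assignments stop changing; that loop is left out here.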
Mathematical Programming for Data Mining: Formulations and Challenges
 INFORMS Journal on Computing
, 1998
Cited by 59 (0 self)
This paper is intended to serve as an overview of a rapidly emerging research and applications area. In addition to providing a general overview, motivating the importance of data mining problems within the area of knowledge discovery in databases, our aim is to list some of the pressing research challenges, and outline opportunities for contributions by the optimization research communities. Towards these goals, we include formulations of the basic categories of data mining methods as optimization problems. We also provide examples of successful mathematical programming approaches to some data mining problems. Keywords: data analysis, data mining, mathematical programming methods, challenges for massive data sets, classification, clustering, prediction, optimization. To appear: INFORMS Journal on Computing, special issue on Data Mining, A. Basu and B. Golden (guest editors). Also appears as Mathematical Programming Technical Report 98-01, Computer Sciences Department, University of Wi...
Rule-based Evolutionary Online Learning Systems: Learning Bounds, Classification, and Prediction
, 2004
Cited by 52 (10 self)
Rule-based evolutionary online learning systems, often referred to as Michigan-style learning classifier systems (LCSs), were proposed nearly thirty years ago (Holland, 1976; Holland, 1977) and were originally called cognitive systems. LCSs combine the strength of reinforcement learning with the generalization capabilities of genetic algorithms, promising a flexible, online-generalizing, solely reinforcement-dependent learning system. However, despite several initial successful applications of LCSs and their interesting relations with animal learning and cognition, understanding of the systems remained somewhat obscured. Questions concerning learning complexity or convergence remained unanswered. Performance in different problem types, problem structures, concept spaces, and hypothesis spaces stayed nearly unpredictable. This thesis has the following three major objectives: (1) to establish a facetwise theory approach for LCSs that promotes system analysis, understanding, and design; (2) to analyze, evaluate, and enhance the XCS classifier system (Wilson, 1995) by means of the facetwise approach, establishing a fundamental XCS learning theory; (3) to identify both the major advantages of an LCS-based learning approach and the most promising potential application areas. Achieving these three objectives leads to a rigorous understanding
General fuzzy min-max neural network for clustering and classification
 IEEE Trans. Neural Netw.
, 2000
Cited by 44 (8 self)
This paper describes a general fuzzy min-max (GFMM) neural network which is a generalization and extension of the fuzzy min-max clustering and classification algorithms developed by Simpson. The GFMM method combines supervised and unsupervised learning within a single training algorithm. The fusion of clustering and classification resulted in an algorithm that can be used as pure clustering, pure classification, or hybrid clustering classification. This hybrid system exhibits an interesting property of finding decision boundaries between classes while clustering patterns that cannot be said to belong to any of the existing classes. Similarly to the original algorithms, hyperbox fuzzy sets are used as a representation of clusters and classes. Learning is usually completed in a few passes through the data and consists of placing and adjusting the hyperboxes in the pattern space, which is referred to as an expansion–contraction process. The classification results can be crisp or fuzzy. New data can be included without the need for retraining. While retaining all the interesting features of the original algorithms, a number of modifications to their definition have been made in order to accommodate fuzzy input patterns in the form of lower and upper bounds, combine supervised and unsupervised learning, and improve the effectiveness of operations. A detailed account of the GFMM neural network, its comparison with Simpson's fuzzy min-max neural networks, a set of examples, and an application to leakage detection and identification in water distribution systems are given. Index Terms: Classification, clustering, fuzzy systems, fuzzy min-max neural networks, pattern recognition.
Classification of EEG signals from four subjects during five mental tasks
 Proceedings of the Conference on Engineering Applications in Neural Networks (EANN’96)
, 1996
"... anderson, sijercic ..."
Pattern Recognition Techniques in Microarray Data Analysis: A Survey
 Annals of the New York Academy of Sciences, Techniques in Bioinformatics and Medical Informatics
, 2002
Cited by 33 (0 self)
Recent development of technologies (e.g. microarray technology) that are capable of producing massive amounts of genetic data has highlighted the need for new pattern recognition techniques that can mine and discover “biologically meaningful” knowledge in large data sets. Many researchers have begun an endeavor in this direction to devise such data-mining techniques. As such, there is a need for survey articles that periodically review and summarize the work that has been done in the area. This article presents one such survey. The first portion of the paper is meant to provide the basic biology (mostly for nonbiologists) that is required in such a project. This part is only meant to be a starting point for those experts in the technical fields who wish to embark on this new area of bioinformatics. The second portion of the paper is a survey of various data mining techniques that have been used in mining microarray data for biological knowledge and information (such as sequence information). This survey is not meant to be treated as complete in any form, as the area is currently one of the most active, and the body of research is very large. Furthermore, the applications of the techniques mentioned here are not meant to be taken as the most significant applications of the techniques, but simply as some examples among many.