GA-MINER: parallel data mining with hierarchical genetic algorithms - final report. EPCC-AIKMS-GA-MINER-Report 1.0 (1995)

by I W Flockhart, N J Radcliffe
Results 1 - 10 of 14

Knowledge-Independent Data Mining with Fine-Grained Parallel Evolutionary Algorithms

by Xavier Llora, Josep M. Garrell - In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO’2001), 2001
Cited by 16 (8 self)
This paper illustrates the application of evolutionary algorithms (EA) to data mining problems. The objectives are to demonstrate that EA can provide a competitive general purpose data mining scheme for classification tasks without constraining the knowledge representation, and that it can be achieved reducing the amount of time required using the inherent parallel processing nature of EA.

A Genetic Algorithm-Based Approach to Data Mining

by Ian W. Flockhart, Nicholas J. Radcliffe, 1996
Cited by 11 (0 self)
Most data mining systems to date have used variants of traditional machine-learning algorithms to tackle the task of directed knowledge discovery. This paper presents an approach which, as well as being useful for such directed data mining, can also be applied to the further tasks of undirected data mining and hypothesis refinement. This approach exploits parallel genetic algorithms as the search mechanism and seeks to evolve explicit "rules" for maximum comprehensibility. Example rules found in real commercial datasets are presented.

Rule Discovery with a Parallel Genetic Algorithm

by Dieferson L. A. Araujo, Heitor S. Lopes, Alex A. Freitas, 2000
Cited by 6 (0 self)
An important issue in data mining is scalability with respect to the size of the dataset being mined. In the paper we address this issue by presenting a parallel GA for rule discovery. This algorithm exploits both data parallelism, by distributing the data being mined across all available processors, and control parallelism, by distributing the population of individuals across all available processors.
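The data-parallel half of the idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the toy dataset, the candidate rule, and the function names are all hypothetical, and a thread pool stands in for the "available processors". Each worker counts rule matches on its own partition of the data, and the partial counts are summed to score the rule. Control parallelism (distributing the population of individuals) is not shown.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy dataset: (age, income, label) tuples.
DATA = [
    (25, 30_000, 0), (47, 82_000, 1), (35, 60_000, 1),
    (52, 71_000, 0), (29, 75_000, 1), (61, 90_000, 1),
    (33, 28_000, 0), (44, 55_000, 0),
]


def rule(age, income):
    """Hypothetical candidate rule: IF income >= 60000 THEN class 1."""
    return income >= 60_000


def partition(data, p):
    """Split the data into p disjoint chunks of roughly equal size."""
    return [data[i::p] for i in range(p)]


def local_counts(chunk):
    """Count covered and correctly covered tuples within one chunk."""
    covered = correct = 0
    for age, income, label in chunk:
        if rule(age, income):
            covered += 1
            correct += int(label == 1)
    return covered, correct


def rule_confidence(data, p):
    """Score the rule by combining partial counts from p workers."""
    chunks = partition(data, p)
    with ThreadPoolExecutor(max_workers=p) as pool:
        partials = list(pool.map(local_counts, chunks))
    covered = sum(c for c, _ in partials)
    correct = sum(k for _, k in partials)
    return correct / covered if covered else 0.0
```

Because the per-chunk counts are simply added, the score does not depend on how many partitions are used; only the work per worker shrinks as the number of partitions grows.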

Data Mining using Learning Classifier Systems

by Alwyn Barry, John Holmes, Xavier Llorà - In L. Bull (ed.), Applications of Learning Classifier Systems, 2004
Cited by 6 (2 self)
Abstract not found

MOLeCS: A MultiObjective Learning Classifier System

by Ester Bernadó-Mansilla, Xavier Llorà, Ivan Traus - Proceedings of the 2000 Conference on Genetic and Evolutionary Computation, 1, 2000
Cited by 6 (2 self)
Learning concept descriptions from data is a complex multiobjective task. The model induced by the learner should be accurate so that it can represent precisely the data instances, complete, which means it can be generalizable to new instances, and minimum, or easily readable. Learning Classifier Systems (LCSs) are a family of learners whose primary search mechanism is a genetic algorithm. Along the intense history of the field, the efforts of the community have been centered on the design of LCSs that solved these goals efficiently, resulting in the proposal of multiple systems. This paper revises the main LCS approaches and focuses on the analysis of the different mechanisms designed to fulfill the learning goals. Some of these mechanisms include implicit multiobjective learning mechanisms, while others use explicit multiobjective evolutionary algorithms. The paper analyses the advantages of using multiobjective evolutionary algorithms, especially in Pittsburgh LCSs, such as controlling the so-called bloat effect, and offering the human expert a set of concept description alternatives.

Evolution of Decision Trees

by Xavier Llora, Josep M. Garrell - In Proceedings of the 4th Catalan Conference on Artificial Intelligence, 2001
Cited by 4 (1 self)
This paper addresses the issue of the induction of orthogonal, oblique and multivariate decision trees. Algorithms proposed by other researchers use heuristics, usually based on the information gain concept, to induce decision trees greedily.

PKDD'98 Tutorial on Scalable, High-Performance Data Mining with Parallel Processing

by Alex A. Freitas - In Proceedings of the Principles and Practice of Knowledge Discovery in Databases (PKDD’98), 1998
Cited by 2 (0 self)
Contents
  1 Introduction
  2 Overview of 7 different approaches for speeding up data mining in large databases
  3 An overview of parallel processing for data mining
  4 Parallel rule induction
  5 Parallel Instance-Based Learning
  6 Parallel Genetic Algorithms
  7 Parallel Neural Networks
  8 Conclusions
Introduction.
  Problem: How to perform efficient data mining in very large databases.
  Natural solution: parallelism.
  Performance issues: any sequential data mining algorithm: O(N); parallelism reduces this lower bound to O(N/p) (N = No. of tuples, p = No. of processors).
  Cost-benefit issues: many data warehouses are already implemented on cost-effective parallel database servers.
2 Overview of 7 different approaches for speeding up data mining in large databases.
  Data-Oriented Approaches:
  (1) Sampling (reduces number of tuples)
  (2) Attribute selection (reduces number of attributes)
  (3) Discretization (reduces number of values of attributes, which in

Evolutionary Computation

by Alex Alves Freitas - In Handbook of Data Mining and Knowledge Discovery, 2002
Cited by 2 (0 self)
This chapter addresses the integration of knowledge discovery in databases (KDD) and evolutionary algorithms (EAs), particularly genetic algorithms and genetic programming. First we provide a brief overview of EAs. Then the remaining text is divided into three parts. Section 2 discusses the use of EAs for KDD. The emphasis is on the use of EAs in attribute selection and in the optimization of parameters for other kinds of KDD algorithms (such as decision trees and nearest neighbour algorithms). Section 3 discusses three research problems in the design of an EA for KDD, namely: how to discover comprehensible rules with genetic programming, how to discover surprising (interesting) rules, and how to scale up EAs with parallel processing. Finally, section 4 discusses what the added value of KDD is for EAs. This section includes the remark that generalization performance on a separate test set (unseen during training, or EA run) is a basic principle for evaluating the quality of discovered knowledge, and then suggests that this principle should be followed in other EA applications.

Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infrared Spectroscopic Imaging

by Xavier Llorà, Rohith Reddy, Brian Matesic, Rohit Bhargava
Cited by 2 (0 self)
Cancer diagnosis is essentially a human task. Almost universally, the process requires the extraction of tissue (biopsy) and examination of its microstructure by a human. To improve diagnoses based on limited and inconsistent morphologic knowledge, a new approach has recently been proposed that uses molecular spectroscopic imaging to utilize microscopic chemical composition for diagnoses. In contrast to visible imaging, the approach results in very large data sets as each pixel contains the entire molecular vibrational spectroscopy data from all chemical species. Here, we propose data handling and analysis strategies to allow computer-based diagnosis of human prostate cancer by applying a novel genetics-based machine learning technique (NAX). We apply this technique to demonstrate both fast learning and accurate classification that, additionally, scales well with parallelization. Preliminary results demonstrate that this approach can improve current clinical practice in diagnosing prostate cancer.

Intensional Encapsulations of Database Subsets via Genetic Programming

by Aybar C. Acar, Amihai Motro, 2005
Cited by 1 (0 self)
Finding intensional encapsulations of database subsets is the inverse of query evaluation. Whereas query evaluation transforms an intensional expression (the query) to its extension (a set of data values), intensional encapsulation assigns an intensional expression to a given set of data values. We describe a method for deriving intensional representations of subsets of records in large database tables. Our method is based on the paradigm of genetic programming. It is shown to achieve high accuracy and maintain compact expression size, while requiring cost that is acceptable to all applications but those that require instantaneous results. Intensional encapsulation has a broad range of applications including cooperative answering, information integration, security and data mining.