On Feature Selection and Effective Classifiers
, 1998
Cited by 10 (1 self)
In this paper, we develop and analyze four algorithms for feature selection in the context of rough set methodology. The initial state and the feasibility criterion of all these algorithms are the same. That is, they start with a given feature set and progressively remove features, while controlling the amount of degradation in classification quality. These algorithms, however, differ in the heuristics used for pruning the search space of features. Our experimental results confirm the expected relationship between the time complexity of these algorithms and the classification accuracy of the resulting upper classifiers. Our experiments demonstrate that a reduct of a given feature set can be found efficiently. Although we have adopted upper classifiers in our investigations, the algorithms presented can be used with any method of deriving a classifier where the quality of classification is a monotonically decreasing function of the size of the feature set. We compare the perform...
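The shared scheme in this abstract (start from the full feature set, remove features one at a time, and bound the loss in classification quality) can be illustrated with a minimal Python sketch. It is not one of the paper's four algorithms: the quality measure used here is an assumption (the fraction of rows whose equivalence class is decision-consistent, i.e. Pawlak's dependency degree, rather than the paper's upper-classifier quality), and the names `classification_quality`, `backward_eliminate`, and `max_drop` are hypothetical.

```python
from collections import defaultdict

def classification_quality(rows, features, decision):
    """Fraction of rows lying in a decision-consistent equivalence class.

    Assumption: this is Pawlak's dependency degree, used as a stand-in
    for whatever quality measure the paper actually applies.
    """
    blocks = defaultdict(list)
    for row in rows:
        key = tuple(row[f] for f in features)
        blocks[key].append(row[decision])
    consistent = sum(len(ds) for ds in blocks.values() if len(set(ds)) == 1)
    return consistent / len(rows)

def backward_eliminate(rows, features, decision, max_drop=0.0):
    """Greedily drop features while quality stays within max_drop of the start."""
    baseline = classification_quality(rows, features, decision)
    kept = list(features)
    for f in list(features):
        trial = [g for g in kept if g != f]
        if classification_quality(rows, trial, decision) >= baseline - max_drop:
            kept = trial  # removing f costs no more than the allowed degradation
    return kept

# Toy data (hypothetical): the decision "d" depends only on feature "b".
rows = [
    {"a": 0, "b": 0, "c": 0, "d": "no"},
    {"a": 0, "b": 1, "c": 0, "d": "yes"},
    {"a": 1, "b": 0, "c": 1, "d": "no"},
    {"a": 1, "b": 1, "c": 1, "d": "yes"},
]
selected = backward_eliminate(rows, ["a", "b", "c"], "d")
```

On this toy table the sketch keeps only `b`. The order in which features are tried is fixed here; the abstract indicates that the four algorithms differ precisely in the heuristics used for pruning this search.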
The Status of Research on Rough Sets for Knowledge Discovery in Databases
 In: Proceedings of the Second International Conference on Nonlinear Problems in Aviation and Aerospace (ICNPAA98), 1998
Cited by 6 (0 self)
Knowledge Discovery in Databases (KDD) has evolved into an important and active area of research because of theoretical challenges and practical applications associated with the problem of discovering (or extracting) interesting and previously unknown knowledge from very large real-world databases. Many aspects of KDD have been investigated in several related fields. The emphasis of ongoing research is to extend existing results to handle characteristics of real-world databases. In this article, we outline the fundamental issues of KDD as well as describe the current status of research on applying rough set theory to KDD. 1 Introduction In the last decade, we have seen an explosive growth in our capabilities to both collect and store data. In fact, it is estimated that the amount of information in the world doubles every 20 months. Our inability to interpret and digest large quantities of data has created a need for a new generation of tools and techniques. Consequently, the d...
Association Mining and Formal Concept Analysis
 In Proceedings Sixth International Workshop on Rough Sets, Data Mining and Granular Computing, Vol
, 1998
Cited by 5 (1 self)
In this paper, we develop a connection between association queries and formal concept analysis. An association query discovers dependencies among values of an attribute grouped by other, nonprimary attributes in a given relation. Formal concept analysis deals with formal mathematical tools and techniques to develop and analyze relationships between concepts and to develop concept structures. We show that dependencies found by an association query can be derived from a concept structure. Keywords: Association queries, formal concept analysis, dependency relations, concept structures.
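To make the notion of a concept structure concrete, here is a minimal Python sketch of standard formal concept analysis on a tiny context. It enumerates all formal concepts by brute force and checks one attribute dependency. The context data and the names `common_attributes`, `common_objects`, and `formal_concepts` are hypothetical, and the sketch does not reproduce the paper's own construction for association queries.

```python
from itertools import combinations

def common_attributes(context, objects):
    """Attributes shared by every object in `objects` (all attributes if empty)."""
    result = set().union(*context.values())
    for o in objects:
        result &= context[o]
    return result

def common_objects(context, attributes):
    """Objects that have every attribute in `attributes`."""
    return {o for o, attrs in context.items() if attributes <= attrs}

def formal_concepts(context):
    """All (extent, intent) pairs; brute force, so only for small contexts."""
    concepts = set()
    names = sorted(context)
    for r in range(len(names) + 1):
        for subset in combinations(names, r):
            intent = common_attributes(context, subset)
            extent = common_objects(context, intent)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts

# Hypothetical context: transactions (objects) and the items they contain.
context = {
    "t1": {"bread", "milk"},
    "t2": {"bread"},
    "t3": {"milk", "eggs"},
}
concepts = formal_concepts(context)

# A dependency "eggs implies milk" holds when every object with eggs has milk.
eggs_implies_milk = common_objects(context, {"eggs"}) <= common_objects(context, {"milk"})
```

Each concept pairs a maximal set of objects with the attributes they all share. The brute-force loop is exponential in the number of objects and is meant only to show the definitions at work.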
Towards a Linguistic Description of Dependencies in Data
Cited by 2 (2 self)
The problem of a linguistic description of dependencies in data by a set of rules Rk: “If X is Tk then Y is Sk” is
An Inductive Learning Algorithm for Production Rule Discovery
 http://www.ceng.metu.edu.tr/~tolun/courses/ila.ps Journal of Information Technology Vol.12 No.7 2006
Cited by 2 (0 self)
Data mining is the search for relationships and global patterns that exist in large databases. One of the main problems for data mining is that the number of possible relationships is very large, thus prohibiting the search for the correct ones by validating each of them. Hence we need intelligent data mining tools, as taken from the domain of machine learning. In this paper we present a new inductive machine learning algorithm called ILA. The system generates rules in canonical form from a set of examples. We also describe the application of ILA to a range of data sets with different numbers of attributes and classes. The results obtained show that ILA is more general and robust than most other algorithms for inductive learning. Most of the time, the worst case of ILA appears to be comparable to the best case of some well-known algorithms such as AQ and ID3, if not better. Keywords: Machine Learning, Induction, Knowledge Discovery, Inductive Learning, Symbolic Learning Algorithm. 1. Introd...
ILA2: An Inductive Learning Algorithm for Knowledge Discovery
Cited by 2 (2 self)
In this paper we describe the ILA2 rule induction algorithm, which is the improved version of a novel inductive learning algorithm, ILA. We first outline the basic algorithm ILA, and then present how the algorithm is improved using a new evaluation metric that handles uncertainty in the data. By using a new soft computing metric, users can reflect their preferences through a penalty factor to control the performance of the algorithm. ILA2 also has a faster pass criteria feature, not available in basic ILA, which reduces the processing time without sacrificing much accuracy. We experimentally show that the performance of ILA2 is comparable to that of well-known inductive learning algorithms, namely CN2, OC1, ID3 and C4.5. Keywords: Data Mining, Knowledge Discovery, Machine Learning, Inductive Learning, Rule Induction. 1. Introduction A knowledge discovery process involves extracting valid, previously unknown, potentially useful, and comprehensible patterns from large ...
Relationship between Product Based Loyalty and Clustering based on Supermarket Visit and Spending Patterns
Loyalty of customers to a supermarket can be measured in a variety of ways. If a customer tends to buy from certain categories of products, it is likely that the customer is loyal to the supermarket. Another indication of loyalty is based on the tendency of customers to visit the supermarket over a number of weeks. Regular visitors and spenders are more likely to be loyal to the supermarket. Neither of these two criteria can provide a complete picture of customers' loyalty. The decision regarding the loyalty of a customer will have to take into account the visiting pattern as well as the categories of products purchased. This paper describes results of experiments that attempted to identify customer loyalty using these two sets of criteria separately. The experiments were based on transactional data obtained from a supermarket data collection program. Comparisons of results from these parallel sets of experiments were useful in fine-tuning both schemes for estimating the degree of loyalty of a customer. The project also provides useful insights for the development of more sophisticated measures for studying customer loyalty. It is hoped that the understanding of loyal customers will be helpful in identifying better marketing strategies.
Data Mining: Trends and Issues
this paper addresses the database size problem. The methods are illustrated with a running example from a database of car test results. The second paper, by Lingras and Yao, seeks to remove some limitations of the basic rough set model, which is based on the concept of an equivalence relation. The authors show that when the type of accessibility relation used in the rough set model is more general, it is possible to derive rules for classification queries from incomplete databases. One generalization is called the nonsymmetric rough set model; the other is called the nontransitive rough set model. The generated rules are based on plausibility functions proposed by Shafer. In the paper by Choubey et al., the problem of deriving rules for a classification query is investigated. The classifier given by the basic method of Pawlak is termed the lower classifier. This is generalized to yield upper and elementary set classifiers. Four algorithms for feature selection are proposed and experimentally compared, in the context of upper classifiers. The work addresses the problem of database size via feature selection heuristics and the problem of noisy environments by the adoption of the upper classifier. Their results suggest that, compared to the lower classifier, an upper classifier has some important features that make it suitable for data mining applications. In particular, it is shown that the upper classifier can be summarized at a desired level of abstraction by using extended decision tables. The use of extended decision tables is important for updating decision rules incrementally, when the database is dynamic. The fourth paper, by Wu, presents a heuristic, attribute-based program, called HCV (Version 2.0), for handling a classification query. It is based on the extension matri...
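The distinction between lower and upper classifiers rests on the lower and upper approximations of the basic rough set model. As a minimal Python sketch of those standard definitions under an equivalence relation: the lower approximation collects equivalence classes wholly inside the target set, and the upper approximation collects classes that intersect it. How Choubey et al. build classifiers from these sets is not stated in this abstract, and the function name `approximations` and the toy car data are hypothetical.

```python
from collections import defaultdict

def approximations(rows, features, target):
    """Return (lower, upper) approximations of `target`.

    rows: dict mapping a row id to a dict of feature values.
    target: set of row ids belonging to the concept of interest.
    Rows with equal values on `features` are indiscernible.
    """
    blocks = defaultdict(set)
    for rid, row in rows.items():
        blocks[tuple(row[f] for f in features)].add(rid)
    lower, upper = set(), set()
    for block in blocks.values():
        if block <= target:   # class lies entirely inside the target
            lower |= block
        if block & target:    # class overlaps the target
            upper |= block
    return lower, upper

# Hypothetical car test data: cars 1 and 3 passed the test.
cars = {
    1: {"size": "small", "fuel": "gas"},
    2: {"size": "small", "fuel": "gas"},
    3: {"size": "large", "fuel": "diesel"},
    4: {"size": "large", "fuel": "gas"},
}
passed = {1, 3}
lower, upper = approximations(cars, ["size", "fuel"], passed)
```

Cars 1 and 2 are indiscernible but differ in outcome, so they fall outside the lower approximation while remaining in the upper one. That boundary region is where noisy or inconsistent data ends up, which is consistent with the summary's remark that the upper classifier is adopted for noisy environments.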
ILA2: An Inductive Learning Algorithm over uncertain data
In this paper we describe the ILA2 rule induction algorithm from the machine learning domain. ILA2 is the improved version of a novel inductive learning algorithm, namely ILA. We first describe the basic algorithm ILA, then present how the algorithm was improved. We also compare ILA2 to a range of induction algorithms, including ILA. According to the empirical comparisons, ILA2 appears to be comparable to the CN2 and C4.5 algorithms in terms of the accuracy and size of the output classifiers. Keywords: Data Mining, Knowledge Discovery, Machine Learning, Inductive Learning, Rule Induction. 1. Introduction A data mining process involves extracting valid, previously unknown, potentially useful, and comprehensible patterns from large databases. As described in [fayyad96, simoudis96], this process is typically made up of selection and sampling, preprocessing and...
Application of Formal Concept Analysis to Association Queries (TÜBİTAK Second-Stage Project Proposal)
Formal concept analysis (FCA) has been a topic of interest to researchers for the last ten years. The mathematical notion of a concept is originally rooted in formal logic. As an empirical notion, however, the concept has been defined and has evolved in many disciplines on the basis of intuitive approaches. Over time, this has opened a gap between theoretical and practical applications. One important reason for this can be stated as follows. FCA models simple (or atomic) concepts within a given context using a lattice structure, but it falls short in defining compound concepts and in modeling the relationships among them. In this project, we will extend Wille's FCA framework to cover compound and general concepts in addition to simple concepts, and we will study imprecise relationships between concepts. The theoretically interesting aspect of this work can be explained as follows. The context space is a lattice built according to the relation between objects and attributes...