On Feature Selection and Effective Classifiers
, 1998
Cited by 10 (1 self)
In this paper, we develop and analyze four algorithms for feature selection in the context of rough set methodology. All four algorithms share the same initial state and feasibility criterion: they start with a given feature set and progressively remove features while controlling the amount of degradation in classification quality. They differ in the heuristics used for pruning the search space of features. Our experimental results confirm the expected relationship between the time complexity of these algorithms and the classification accuracy of the resulting upper classifiers. Our experiments demonstrate that a `-reduct of a given feature set can be found efficiently. Although we have adopted upper classifiers in our investigations, the algorithms presented can be used with any method of deriving a classifier where the quality of classification is a monotonically decreasing function of the size of the feature set. We compare the perform...
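The abstract above describes backward elimination: start from the full feature set and drop features while classification quality stays acceptable. The sketch below illustrates that general idea only. It assumes quality is the rough-set dependency degree (the fraction of rows whose equivalence class carries a single label) and uses a plain greedy pass. The four pruning heuristics and the upper classifiers of the paper are not reproduced, and the names `quality`, `backward_eliminate`, and `tolerance` are hypothetical.

```python
from collections import defaultdict

def quality(rows, labels, features):
    """Fraction of rows whose equivalence class (on `features`) has one label."""
    seen_labels = defaultdict(set)
    counts = defaultdict(int)
    for row, label in zip(rows, labels):
        key = tuple(row[f] for f in features)
        seen_labels[key].add(label)
        counts[key] += 1
    consistent = sum(n for key, n in counts.items() if len(seen_labels[key]) == 1)
    return consistent / len(rows)

def backward_eliminate(rows, labels, features, tolerance=0.0):
    """Greedily drop features while quality stays within `tolerance`
    of the quality obtained with the full feature set (an assumed criterion)."""
    selected = list(features)
    floor = quality(rows, labels, selected) - tolerance
    changed = True
    while changed:
        changed = False
        for f in list(selected):
            trial = [g for g in selected if g != f]
            if quality(rows, labels, trial) >= floor:
                selected = trial   # feature f is dispensable; restart the scan
                changed = True
                break
    return selected

# Toy decision table (made up for illustration): the label depends only on "c".
rows = [
    {"a": 0, "b": 0, "c": 1},
    {"a": 0, "b": 1, "c": 1},
    {"a": 1, "b": 0, "c": 0},
    {"a": 1, "b": 1, "c": 0},
]
labels = ["no", "no", "yes", "yes"]
reduced = backward_eliminate(rows, labels, ["a", "b", "c"])
```

On this toy table the scan removes `a`, then `b`, and stops at `["c"]`, because removing `c` would leave all rows in one inconsistent class. A different scan order can yield a different reduced set, which is where pruning heuristics of the kind the paper compares would come in.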
The Status of Research on Rough Sets for Knowledge Discovery in Databases
- In: Proceedings of the Second International Conference on Nonlinear Problems in Aviation and Aerospace (ICNPAA98)
, 1998
Cited by 6 (0 self)
Knowledge Discovery in Databases (KDD) has evolved into an important and active area of research because of theoretical challenges and practical applications associated with the problem of discovering (or extracting) interesting and previously unknown knowledge from very large real-world databases. Many aspects of KDD have been investigated in several related fields. The emphasis of ongoing research is to extend existing results to handle characteristics of real-world databases. In this article, we outline the fundamental issues of KDD as well as describe the current status of research on applying rough set theory to KDD.
1 Introduction
In the last decade, we have seen an explosive growth in our capabilities to both collect and store data. In fact, it is estimated that the amount of information in the world doubles every 20 months. Our inability to interpret and digest large quantities of data has created a need for a new generation of tools and techniques. Consequently, the d...
Association Mining and Formal Concept Analysis
, 1998
Cited by 4 (0 self)
In this paper, we develop a connection between association queries and formal concept analysis. An association query discovers dependencies among values of an attribute grouped by other, non-primary attributes in a given relation. Formal concept analysis provides formal mathematical tools and techniques to develop and analyze relationships between concepts and to develop concept structures. We show that dependencies found by an association query can be derived from a concept structure.
Keywords: Association queries, formal concept analysis, dependency relations, concept structures.
1 Introduction
An association query discovers dependencies among values of an attribute grouped by some other attributes in a given relation. A specific case of discovering associations concerns the analysis of market-basket data (or, simply, a basket relation); solving the market-basket problem helps a retail store learn about its customers' purc...
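To make the link between a concept structure and association-style dependencies concrete, here is a small sketch using the standard derivation operators of formal concept analysis on a basket relation. It is an illustration under assumptions, not the construction used in the paper: the function names (`extent`, `intent`, `concepts`, `confidence`), the brute-force enumeration, and the toy baskets are all hypothetical.

```python
from itertools import combinations

def extent(context, attrs):
    """Objects that have every attribute in `attrs`."""
    return frozenset(o for o, has in context.items() if attrs <= has)

def intent(context, objs):
    """Attributes shared by every object in `objs`."""
    all_attrs = frozenset().union(*context.values())
    return frozenset(a for a in all_attrs if all(a in context[o] for o in objs))

def concepts(context):
    """All (extent, intent) pairs, found by closing every attribute subset.
    Exponential in the number of attributes; fine only for toy data."""
    all_attrs = sorted(frozenset().union(*context.values()))
    found = set()
    for r in range(len(all_attrs) + 1):
        for subset in combinations(all_attrs, r):
            objs = extent(context, frozenset(subset))
            found.add((objs, intent(context, objs)))
    return found

def confidence(context, lhs, rhs):
    """Confidence of the dependency lhs -> rhs, read off two extents.
    Assumes at least one object has all attributes in `lhs`."""
    covered = extent(context, frozenset(lhs))
    both = extent(context, frozenset(lhs) | frozenset(rhs))
    return len(both) / len(covered)

# Toy basket relation (made up): transaction id -> items bought.
baskets = {
    "t1": {"bread", "milk"},
    "t2": {"bread", "butter"},
    "t3": {"bread", "milk", "butter"},
    "t4": {"milk"},
}
lattice = concepts(baskets)
butter_implies_bread = confidence(baskets, {"butter"}, {"bread"})
```

In this toy relation the concept with extent {t2, t3} has intent {bread, butter}, so closing {butter} already yields bread, and the dependency butter -> bread holds with confidence 1. Dependencies that hold only partially, such as milk -> bread, show up as a ratio of two extent sizes.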
TOWARDS A LINGUISTIC DESCRIPTION OF DEPENDENCIES IN DATA
Cited by 2 (2 self)
The problem of a linguistic description of dependencies in data by a set of rules Rk: “If X is Tk then Y is Sk ” is
An Inductive Learning Algorithm for Production Rule Discovery
- http://www.ceng.metu.edu.tr/~tolun/courses/ila.ps Journal of Information Technology Vol.12 No.7 2006
Cited by 2 (0 self)
Data mining is the search for relationships and global patterns that exist in large databases. One of the main problems for data mining is that the number of possible relationships is very large, which makes it infeasible to find the correct ones by validating each of them. Hence we need intelligent data mining tools, as taken from the domain of machine learning. In this paper we present a new inductive machine learning algorithm called ILA. The system generates rules in canonical form from a set of examples. We also describe the application of ILA to a range of data sets with different numbers of attributes and classes. The results obtained show that ILA is more general and robust than most other algorithms for inductive learning. Most of the time, the worst case of ILA appears to be comparable to the best case of some well-known algorithms such as AQ and ID3, if not better.
Keywords: Machine Learning, Induction, Knowledge Discovery, Inductive Learning, Symbolic Learning Algorithm.
1. Introd...
Relationship between Product Based Loyalty and Clustering based on Supermarket Visit and Spending Patterns
Loyalty of customers to a supermarket can be measured in a variety of ways. If a customer tends to buy from certain categories of products, it is likely that the customer is loyal to the supermarket. Another indication of loyalty is based on the tendency of customers to visit the supermarket over a number of weeks. Regular visitors and spenders are more likely to be loyal to the supermarket. Neither of these two criteria can provide a complete picture of customers' loyalty. The decision regarding the loyalty of a customer will have to take into account the visiting pattern as well as the categories of products purchased. This paper describes results of experiments that attempted to identify customer loyalty using these two sets of criteria separately. The experiments were based on transactional data obtained from a supermarket data collection program. Comparisons of results from these parallel sets of experiments were useful in fine-tuning both schemes for estimating the degree of loyalty of a customer. The project also provides useful insights for the development of more sophisticated measures for studying customer loyalty. It is hoped that the understanding of loyal customers will be helpful in identifying better marketing strategies. 1.
Data Mining: Trends and Issues
this paper addresses the database size problem. The methods are illustrated with a running example from a database of car test results. The second paper, by Lingras and Yao, seeks to remove some limitations of the basic rough set model, which is based on the concept of an equivalence relation. The authors show that when the type of accessibility relation used in the rough set model is more general, it is possible to derive rules for classification queries from incomplete databases. One generalization is called the non-symmetric rough set model; the other is called the non-transitive rough set model. The generated rules are based on plausibility functions proposed by Shafer. In the paper by Choubey et al., the problem of deriving rules for a classification query is investigated. The classifier given by the basic method of Pawlak is termed the lower classifier. This is generalized to yield upper and elementary set classifiers. Four algorithms for feature selection are proposed and experimentally compared, in the context of upper classifiers. The work addresses the problem of database size via feature selection heuristics and the problem of noisy environments by the adoption of the upper classifier. Their results suggest that, compared to the lower classifier, an upper classifier has some important features that make it suitable for data mining applications. In particular, it is shown that the upper classifier can be summarized at a desired level of abstraction by using extended decision tables. The use of extended decision tables is important for updating decision rules incrementally, when the database is dynamic. The fourth paper, by Wu, presents a heuristic, attribute-based program, called HCV (Version 2.0), for handling a classification query. It is based on the extension matri...
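The lower and upper classifiers mentioned above build on Pawlak's lower and upper approximations under an equivalence relation. The sketch below computes those two approximations for a target class in a small decision table. It is a textbook-style illustration under assumptions: the function name `approximations`, the attribute names, and the toy table are hypothetical, and how Choubey et al. turn the approximations into classifiers is not reproduced here.

```python
from collections import defaultdict

def approximations(rows, labels, features, target):
    """Return (lower, upper) approximations of the class `target` as sets of
    row indices, using indiscernibility on `features` as the equivalence relation."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[f] for f in features)].append(i)
    lower, upper = set(), set()
    for members in blocks.values():
        hits = [i for i in members if labels[i] == target]
        if hits:
            upper.update(members)          # block overlaps the target class
            if len(hits) == len(members):
                lower.update(members)      # block lies entirely inside it
    return lower, upper

# Toy table (made up): rows 0 and 1 are indiscernible but labelled differently.
tests = [
    {"fuel": "low", "size": "small"},
    {"fuel": "low", "size": "small"},
    {"fuel": "high", "size": "large"},
    {"fuel": "low", "size": "large"},
]
outcomes = ["pass", "fail", "fail", "pass"]
lower_pass, upper_pass = approximations(tests, outcomes, ["fuel", "size"], "pass")
```

Here row 3 is certainly a "pass" (lower approximation), while rows 0 and 1 are only possibly a "pass" (they appear in the upper approximation but not the lower), which is the kind of inconsistent data for which an upper-approximation-based classifier still produces a rule.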
ILA-2: An Inductive Learning Algorithm over uncertain data
ABSTRACT AND CONCLUSION NEEDS TO BE RE-WRITTEN. ESPECIALLY WE SHOULD EMPHASIZE OUR CONTRIBUTION AND ORIGINALITY OF THE WORK IN CONCLUSION.
In this paper we describe the ILA-2 rule induction algorithm from the machine learning domain. ILA-2 is the improved version of a novel inductive learning algorithm, namely ILA. We first describe the basic algorithm ILA, then present how the algorithm was improved. We also compare ILA-2 to a range of induction algorithms, including ILA. According to the empirical comparisons, ILA-2 appears to be comparable to the CN2 and C4.5 algorithms in terms of the accuracy and size of the output classifiers.
Keywords: Data Mining, Knowledge Discovery, Machine Learning, Inductive Learning, Rule Induction.
1. Introduction
A data-mining process involves extracting valid, previously unknown, potentially useful, and comprehensible patterns from large databases. As described in [fayyad96, simoudis96], this process is typically made up of selection and sampling, preprocessing and...
Application of Formal Concept Analysis to Association Queries (TÜBİTAK Second-Stage Project Proposal)
Formal concept analysis (FCA) has attracted the interest of researchers for the past ten years. The mathematical notion of a concept is originally rooted in formal logic. However, as an empirical notion, the concept has been defined and has evolved in many disciplines on the basis of intuitive approaches. Over time, this has opened a gap between theory and practical applications. One important reason for this can be stated as follows. FCA models simple (or atomic) concepts within a given context using a graph lattice structure, but falls short in defining compound concepts and in modeling the relationships among them. In this project, we will extend Wille's FCA framework to cover compound and general concepts in addition to simple concepts, and we will examine inexact relationships between concepts. The theoretically interesting aspect of this work can be explained as follows. The context space [is] a graph lattice formed according to the relation between objects and attributes...
ILA-2: An Inductive Learning Algorithm for Knowledge Discovery
In this paper we describe the ILA-2 rule induction algorithm, which is the improved version of a novel inductive learning algorithm, ILA. We first outline the basic algorithm ILA, and then present how the algorithm is improved using a new evaluation metric that handles uncertainty in the data. By using a new soft computing metric, users can reflect their preferences through a penalty factor to control the performance of the algorithm. ILA-2 also has a faster pass criteria feature, not available in basic ILA, which reduces the processing time without sacrificing much accuracy. We experimentally show that the performance of ILA-2 is comparable to that of well-known inductive learning algorithms, namely CN2, OC1, ID3 and C4.5.
Keywords: Data Mining, Knowledge Discovery, Machine Learning, Inductive Learning, Rule Induction.
1. Introduction
A knowledge discovery process involves extracting valid, previously unknown, potentially useful, and comprehensible patterns from large ...

