A tree projection algorithm for generation of frequent itemsets
- Journal of Parallel and Distributed Computing, 2000
Cited by 123 (0 self)
In this paper we propose algorithms for generation of frequent itemsets by successive construction of the nodes of a lexicographic tree of itemsets. We discuss different strategies in generation and traversal of the lexicographic tree such as breadth-first search, depth-first search or a combination of the two. These techniques provide different trade-offs in terms of the I/O, memory and computational time requirements. We use the hierarchical structure of the lexicographic tree to successively project transactions at each node of the lexicographic tree, and use matrix counting on this reduced set of transactions for finding frequent itemsets. We tested our algorithm on both real and synthetic data. We provide an implementation of the tree projection method which is up to one order of magnitude faster than other recent techniques in the literature. The algorithm has a well-structured data access pattern which provides data locality and reuse of data for multiple levels of the cache. We also discuss methods for parallelization of the ...
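As a rough illustration of the idea in this abstract, the following sketch walks a lexicographic tree of itemsets depth-first and passes each child node only the transactions (and trailing items) that are still relevant to it. It is a minimal sketch under stated assumptions, not the paper's implementation: the function name `frequent_itemsets`, the use of an absolute count for `min_support`, and the plain `Counter` in place of the paper's matrix counting are all illustrative choices.

```python
from collections import Counter


def frequent_itemsets(transactions, min_support):
    """Depth-first walk of a lexicographic itemset tree with projection.

    min_support is an absolute transaction count (an assumption made for
    this sketch). Returns a dict mapping each frequent itemset, as a
    sorted tuple, to its support count.
    """
    results = {}

    def expand(prefix, projected):
        # Count candidate extensions inside the projected transactions only.
        counts = Counter(item for t in projected for item in t)
        for item in sorted(counts):
            if counts[item] < min_support:
                continue
            node = prefix + (item,)
            results[node] = counts[item]
            # Project: keep transactions containing `item`, restricted to
            # the items that follow it in lexicographic order.
            child = [tuple(x for x in t if x > item)
                     for t in projected if item in t]
            expand(node, [t for t in child if t])

    expand((), [tuple(sorted(set(t))) for t in transactions])
    return results
```

Because each recursive call sees a smaller set of shorter transactions, the counting work shrinks as the walk moves deeper into the tree, which is the effect the projection step is meant to achieve.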
A New Framework For Itemset Generation
- In: PODS 98, Symposium on Principles of Database Systems, 1998
Cited by 53 (3 self)
The problem of finding association rules in a large database of sales transactions has been widely studied in the literature. We discuss some of the weaknesses of the large itemset method for association rule generation. A different method for evaluating and finding itemsets referred to as strongly collective itemsets is proposed. The concepts of "support" of an itemset and correlation of the items within an itemset are related, though not quite the same. This criterion stresses the importance of the actual correlation of the items with one another rather than the absolute support. Previously proposed methods to provide correlated itemsets are not necessarily applicable to very large databases. We provide an algorithm which provides very good computational efficiency, while maintaining statistical robustness. The fact that this algorithm relies on relative measures rather than absolute measures such as support also implies that the method can be applied to find association rules in datasets in which items may appear in a sizeable percentage of the transactions (dense datasets), datasets in which the items have varying density, or even negative association rules.
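The distinction this abstract draws between absolute support and correlation can be illustrated with a small sketch. Note that it uses lift (observed support divided by the support expected if the items were independent), a standard relative measure, and not the paper's strongly collective criterion, which the abstract does not define. The function names and the example baskets are hypothetical.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    wanted = set(items)
    return sum(1 for t in transactions if wanted <= set(t)) / len(transactions)


def lift(transactions, items):
    """Observed support divided by the support expected under independence."""
    expected = 1.0
    for item in items:
        expected *= support(transactions, [item])
    observed = support(transactions, items)
    return observed / expected if expected > 0 else float("nan")


# Hypothetical baskets: bread and milk are common, caviar and blini are rare.
baskets = (
    [["bread", "milk"]] * 5
    + [["bread"]] * 2
    + [["milk"]] * 2
    + [["caviar", "blini"]]
)
```

In this toy data the bread and milk pair has high support but a lift close to 1, while the caviar and blini pair has low support but a much larger lift, so a support threshold alone would favor the less informative pair.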
Online Generation of Association Rules
- IBM Research Division, T.J. Watson Research, 1998
Cited by 47 (4 self)
We have a large database consisting of sales transactions. We investigate the problem of online mining of association rules in this large database. We show how to preprocess the data effectively in order to make it suitable for repeated online queries. The preprocessing algorithm takes into account the storage space available. We store the preprocessed data in such a way that online processing may be done by applying a graph theoretic search algorithm whose complexity is proportional to the size of the output. This results in an online algorithm which is practically instantaneous in terms of response time. The algorithm also supports techniques for quickly discovering association rules from large itemsets. The algorithm is capable of finding rules with specific items in the antecedent or consequent. These association rules are presented in a compact form, eliminating redundancy. We believe that the elimination of redundancy in online generation of association rules from large itemsets is interesting in its own right.
Finding localized associations in market basket data
- Knowledge and Data Engineering, 2002
Cited by 25 (0 self)
In this paper, we discuss a technique for discovering localized associations in segments of the data using clustering. Often the aggregate behavior of a data set may be very different from localized segments. In such cases, it is desirable to design algorithms which are effective in discovering localized associations, because they expose a customer pattern which is more specific than the aggregate behavior. This information may be very useful for target marketing. We present empirical results which show that the method is indeed able to find a significantly larger number of associations than what can be discovered by analysis of the aggregate data.
A New Method for Similarity Indexing of Market Basket Data
- In Proc. 1999 ACM SIGMOD Int. Conf. on Management of Data, 1999
Cited by 19 (0 self)
In recent years, many data mining methods have been proposed for finding useful and structured information from market basket data. The association rule model was recently proposed in order to discover useful patterns and dependencies in such data. This paper discusses a method for indexing market basket data efficiently for similarity search. The technique is likely to be very useful in applications which utilize the similarity in customer buying behavior in order to make peer recommendations. We propose an index called the signature table, which is very flexible in supporting a wide range of similarity functions. The construction of the index structure is independent of the similarity function, which can be specified at query time. The resulting similarity search algorithm shows excellent scalability with increasing memory availability and database size.
Mining Large Itemsets for Association Rules
- Bulletin of the IEEE Computer Society Technical Committee on Data Engineering, 1998
Cited by 11 (0 self)
This paper provides a survey of the itemset method for association rule generation. The paper discusses past research on the topic and also studies the relevance and importance of the itemset method in generating association rules. We discuss a number of variations of the association rule problem which have been proposed in the literature and their practical applications. Some inherent weaknesses of the large itemset method for association rule generation have been explored. We also discuss some other formulations of associations which can be viable alternatives to the traditional association rule generation method. 1 Introduction Association rules find the relationships between the different items in a database of sales transactions. Such rules track the buying patterns in consumer behavior, e.g., finding how the presence of one item in the transaction affects the presence of another and so forth. The problem of association rule generation has recently gained considerable prominence in ...
Data Mining Techniques for Personalization
- Data Engineering Bulletin, 2000
Cited by 4 (1 self)
This paper discusses an overview of data mining techniques for personalization. It discusses some of the standard techniques which are used in order to adapt and increase the ability of the system to tailor itself to specific user behavior. We discuss several such techniques such as collaborative filtering, content based methods, and content based collaborative filtering methods. We examine the specific applicability of these techniques to various scenarios and the broad advantages of each in specific situations.
Design and Evaluation of Visualization Support to Facilitate Association Rules Modeling
Cited by 1 (0 self)
Association rules mining is a popular data mining modeling tool. It discovers interesting associations or correlation relationships among a large set of data items, showing attribute values that occur frequently together in a given dataset. Despite their great potential benefit, current association rules modeling tools are far from optimal. This article studies how visualization techniques can be applied to facilitate the association rules modeling process, particularly what visualization elements should be incorporated and how they can be displayed. Original designs for visualization of rules, integration of data and rule visualizations, and visualization of rule derivation process for supporting interactive visual association rules modeling are proposed in this research. Experimental results indicated that, compared to an automatic association rules modeling process, the proposed interactive visual association rules modeling can significantly improve the effectiveness of modeling, enhance understanding of the applied algorithm, and bring users greater satisfaction with the task. The proposed integration of data and rule visualizations can significantly facilitate understanding rules compared to their nonintegrated counterpart.
Finding Profile Association Rules
- 1998
Cited by 1 (0 self)
We have a large database consisting of user profile information together with behavioral patterns. We introduce the concept of profile association rules, which discusses the problem of relating consumer buying behavior to behavioral information. We investigate the problem of online mining of profile association rules in this large database. We show how to use multidimensional indexing structures in order to actually perform the mining. The use of multidimensional indexing structures to perform profile mining provides considerable advantages in terms of the ability to perform very generic range based online queries.
An Efficient Algorithm for Mining Maximal Frequent Item Sets
- 2008
Problem Statement: The mining of frequent patterns is a basic problem in data mining applications, and the algorithms used to generate these patterns must perform efficiently. The objective was to propose an effective algorithm which generates frequent patterns in less time. Approach: We proposed an algorithm which is based on a hashing technique and combines a vertical tidset representation of the database with effective pruning mechanisms. It removes all non-maximal frequent itemsets to obtain the exact set of maximal frequent itemsets (MFI) directly. It works efficiently when the number of itemsets and tidsets is large. Results: The performance of our algorithm was compared with the recently developed MAFIA algorithm, and the results show that our algorithm gives better performance. Conclusions: The proposed algorithm performs effectively and generates frequent patterns faster.
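To make the vertical tidset representation mentioned in this abstract concrete, the sketch below stores, for each item, the set of ids of the transactions containing it, computes the support of an itemset by intersecting those sets, and then keeps only the maximal frequent itemsets. It is a minimal sketch under stated assumptions and not the paper's algorithm: there is no hashing, the only pruning is the support threshold, and non-maximal itemsets are discarded by a simple filter at the end instead of being avoided during the search. The function name and the absolute `min_support` count are illustrative.

```python
def maximal_frequent_itemsets(transactions, min_support):
    """Return the maximal frequent itemsets as a set of frozensets.

    min_support is an absolute transaction count (an assumption made for
    this sketch).
    """
    # Vertical layout: item -> set of ids of the transactions containing it.
    tidsets = {}
    for tid, transaction in enumerate(transactions):
        for item in transaction:
            tidsets.setdefault(item, set()).add(tid)

    frequent = []

    def extend(prefix, prefix_tids, candidates):
        for i, item in enumerate(candidates):
            # Support of prefix + item is the size of the intersection.
            tids = prefix_tids & tidsets[item]
            if len(tids) >= min_support:
                itemset = prefix + (item,)
                frequent.append(frozenset(itemset))
                extend(itemset, tids, candidates[i + 1:])

    items = sorted(i for i, tids in tidsets.items() if len(tids) >= min_support)
    extend((), set(range(len(transactions))), items)

    # Keep only the itemsets that have no frequent proper superset.
    return {s for s in frequent if not any(s < other for other in frequent)}
```

The final filter compares every pair of frequent itemsets, which is fine for a small example but is exactly the cost that direct maximal-itemset miners aim to avoid.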

