## Calculating a New Data Mining Algorithm for Market Basket Analysis (2000)

Citations: 6 (1 self)

### BibTeX

@MISC{Hu00calculatinga,

author = {Zhenjiang Hu and Wei-ngan Chin and Masato Takeichi},

title = {Calculating a New Data Mining Algorithm for Market Basket Analysis},

year = {2000}

}

### Abstract

The general goal of data mining is to extract interesting correlated information from large

### Citations

3080 | UCI Repository of Machine Learning Databases - Blake, Merz - 1998 |
Citation Context ...in Haskell by representing sets using lists. The input sample data was extracted from Richard Forsyth's zoological database, which is available in the UCI Repository of Machine Learning Databases [BM98]. It contains 17 objects (corresponding to 17 boolean attributes in the database) and 101 transactions (corresponding to 101 instances). We set the threshold to be 20 (20% of frequency), and did exper... |

2897 | Fast Algorithms for Mining Association Rules - Agrawal - 1994 |

2640 | Mining association rules between sets of items in large databases - Agrawal, Imielinski, et al. - 1993 |
Citation Context ...tly enough - exceeding a given threshold. A more concrete explanation of the problem can be found in Section 2. The most well-known classical algorithm for finding frequent sets is the Apriori algorithm [AIS93] (from which many improved versions have been proposed), which relies on the property that a set can be frequent only if all of its subsets are frequent. This algorithm builds a tree of fre... |

1354 | Introduction to Functional Programming - Bird - 1998 |

744 | The Art of Computer Programming, Volume 3: Sorting and Searching - Knuth - 1973 |
Citation Context ...We can preprocess the database to fit our algorithm by transposing it through a single pass. This preprocessing can be done in an efficient way even for a huge database saved in external storage (see [Knu97]). In fact, as such preprocessing need only be done once for a given transaction database, we can easily amortize its costs over many data mining runs for the discovery of interesting information/rule... |

520 | Dynamic itemset counting and implication rules for market basket data - Brin, Motwani, et al. |
Citation Context ...-- The database that records all transactions is likely to be very large, so it is often beneficial for as much information as possible to be discovered from each pass, so as to reduce the total number of passes [BMUT97]. -- In each pass, we hope that counting can be done efficiently and fewer candidates are generated for later checking. This has led to the studies of different pruning algorithms as in [Toi96, LK98]. Two... |

341 | New algorithms for fast discovery of association rules - Zaki, Parthasarathy, et al. |

227 | Introduction to Functional Programming using Haskell - Bird |
Citation Context ... the filter-map property (that is commonly used in program derivation, e.g. [Bir84]): (p⊳) ∘ ((x:)∗) = ((x:)∗) ∘ ((p ∘ (x:))⊳) (1) [Footnote 1: We assume that the readers are familiar with the Haskell language [Bir98] in this paper. In addition, we say that our Haskell programs are “pseudo” in the sense that they include some additional notations for sets.] and the filter-pipeline property: (p⊳) ∘ (q⊳) = (λx.(p x ... |
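
The two filter laws named in this context can be tried out on small inputs. The Haskell sketch below assumes that ⊳ is read as `filter` and ∗ as `map`, which is how the notation is described elsewhere in the citing paper; the names `filterMapL`, `filterMapR`, `pipelineL` and `pipelineR` are hypothetical and exist only for this illustration. The right-hand side of the pipeline law is the standard conjunction form, not something recovered from the truncated excerpt.

```haskell
-- Both sides of the filter-map property, reading ⊳ as filter and ∗ as map
-- (an assumption about the notation, not the paper's own code).
filterMapL, filterMapR :: ([a] -> Bool) -> a -> [[a]] -> [[a]]
filterMapL p x = filter p . map (x :)
filterMapR p x = map (x :) . filter (p . (x :))

-- Both sides of the filter-pipeline property: two successive filters
-- collapse into one filter over the conjunction of the predicates.
pipelineL, pipelineR :: (a -> Bool) -> (a -> Bool) -> [a] -> [a]
pipelineL p q = filter p . filter q
pipelineR p q = filter (\x -> p x && q x)
```

Loading this into GHCi and comparing `filterMapL p x xss` with `filterMapR p x xss` for a few predicates is a quick sanity check; it does not replace the equational proof.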

201 | A short cut to deforestation - Gill, Launchbury, et al. - 1993 |

121 | Parallel and Distributed Association Mining: A Survey - Zaki - 1999 |
Citation Context ...ransaction-id-lists of the lexicographically first two (k − 1)-length subsets that share a common prefix to reduce the search space. However, their approach still needs more than three database scans [Zak99]. Comparatively, our algorithm adopts a simpler strategy to reduce the search space by organizing the intermediate frequent sets in a tree with specific constraints on sibling nodes. As a nice conseque... |

106 | A new algorithm for discovering the maximum frequent itemset - Lin, Kedem |

93 | Shortcut deforestation in calculational form - Takano, Meijer - 1995 |

92 | The promotion and accumulation strategies in transformational programming - Bird - 1984 |
Citation Context ...ion, we use the shortened notation: # to denote the function length, and p⊳ to denote filter p. The filter operator enjoys the filter-element-map property (that is commonly used in program derivation, e.g. [Bir84]): (p⊳) ∘ ((x:)∗) = ((x:)∗) ∘ ((p ∘ (x:))⊳) and the filter-pipeline property: (p⊳) ∘ (q⊳) = (λx.(p x ∧ q x))⊳. In addition, xs `isSublist` ys is true if xs is a sublist of ys, and false othe... |

92 | Multiple Uses of Frequent Sets and Condensed Representations - Mannila, Toivonen - 1996 |
Citation Context ... Such a study needs to take account of both the distribution and the size of the data sample. Instead, we use a simple experiment to compare our algorithm with an existing improved Apriori algorithm [MT96], one of the best algorithms used in the data mining community. We start by considering the case of a small database which fits in memory. We tested three algorithms in Haskell: our initial ... |

71 | Fast Sequential and Parallel Algorithms for Association Rule Mining: A Comparison - Mueller - 1995 |
Citation Context ... if #vss ≥ least then {[]} else {}; fs (o : os) vss least = fs os vss least ∪ (o :)∗ (fs os ((o ∈) ⊳ vss) least). The benefits of our fusion optimization can be compared to the technique of pass-bundling [Mue95], which is used to eliminate some unnecessary candidates that end up infrequent in the partitioned parallel association rule algorithm. Compared to that in [Mue95], our study is more formal and general... |
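
The recursive definition quoted in this context can be transcribed into plain Haskell almost symbol for symbol. The sketch below is an illustration under stated assumptions: sets are modelled as lists, ∪ becomes `++` (the two halves are disjoint when `os` has no duplicates), and the left-hand side of the base case, which the excerpt truncates, is assumed to be `fs [] vss least`.

```haskell
-- Frequent sets over the items `os`, given the transactions `vss`
-- and the support threshold `least`. Sets are modelled as lists.
fs :: Eq a => [a] -> [[a]] -> Int -> [[a]]
fs [] vss least                -- assumed base case: no items left to choose
  | length vss >= least = [[]] -- enough transactions remain, so the set built so far is frequent
  | otherwise           = []
fs (o : os) vss least =
  fs os vss least                                       -- frequent sets that omit o
    ++ map (o :) (fs os (filter (o `elem`) vss) least)  -- frequent sets that contain o
```

Each recursive call that keeps `o` narrows `vss` to the transactions containing `o`, so the support count at the base case is just the length of the remaining transaction list.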

69 | Tabulation Techniques for Recursive Programs - Bird - 1980 |

61 | Safe fusion of functional expressions - Chin - 1992 |
Citation Context ...n. Specifically, we will derive an efficient program for finding frequent sets from the specification fs os vss least = (fsp vss least) ⊳ (subs os) by using the known calculation techniques of fusion [Chi92], generalization (accumulation) [Bir84, HIT99], base-case filter promotion [Chi90], and tabulation [Bir80, CH95]. 3.1 Fusion: Fusion is used to merge two passes (from nested recursive calls) into a s... |
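
The specification in this context, which filters the subsets of `os` with the predicate `fsp vss least`, can be written as a naive executable program. This is a minimal sketch, not the authors' code: the body of `fsp` is an assumption (a set is taken to be frequent when at least `least` transactions contain all of its items), `subs` is assumed to enumerate all sublists, and sets are modelled as lists.

```haskell
import Data.List (subsequences)

-- All subsets of the item list, represented as sublists (assumed meaning of subs).
subs :: [a] -> [[a]]
subs = subsequences

-- Assumed frequency predicate: ys is frequent when at least `least`
-- transactions in vss contain every item of ys.
fsp :: Eq a => [[a]] -> Int -> [a] -> Bool
fsp vss least ys = length (filter (\vs -> all (`elem` vs) ys) vss) >= least

-- The specification: generate every subset, then keep the frequent ones.
fs :: Eq a => [a] -> [[a]] -> Int -> [[a]]
fs os vss least = filter (fsp vss least) (subs os)
```

This version enumerates all 2^n subsets and rescans the transactions for each one, which is the inefficiency that the fusion step is meant to remove.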

52 | New parallel algorithms for fast discovery of association rules - Zaki, Parthasarathy, et al. - 1997 |

39 | Theories for Algorithm Calculation - Jeuring - 1993 |
Citation Context ...ed program calculation [Bir89, BdM96], as opposed to simply program derivation. Many attempts have been made to apply program calculation to the derivation of various kinds of efficient programs [Jeu93], and to the construction of optimization passes of compilers [GLJ93, TM95]. However, people are still expecting more convincing and practical applications where program calculation can give a better... |

37 | Discovery of Frequent Patterns in Large Data Collections - Toivonen - 1996 |
Citation Context ...th respect to the initial straightforward specification, because the whole derivation is done in a semantics-preserving manner. In contrast, the correctness of existing algorithms, well summarized in [Toi96], is often proved in an ad-hoc manner. Simplicity: Our derived algorithm is surprisingly simple, compared to the existing algorithms which pass over the database many times and use complicated and cos... |

35 | Automatic methods for program transformation - Chin - 1990 |
Citation Context ...m the specification fs os vss least = (fsp vss least) ⊳ (subs os) by using the known calculation techniques of fusion [Chi92], generalization (accumulation) [Bir84, HIT99], base-case filter promotion [Chi90], and tabulation [Bir80, CH95]. 3.1 Fusion: Fusion is used to merge two passes (from nested recursive calls) into a single one, by eliminating the intermediate data structure passed between the two ... |

34 | Tupling Calculation Eliminates Multiple Data Traversals - Hu, Iwasaki, et al. - 1997 |

33 | Parallelization in calculational forms - Hu, Takeichi, et al. - 1998 |
Citation Context ...ets in P_{i+1} could be merged with frequent sets computed in P_i. Note that this parallel algorithm can be obtained directly from the sequential program tab in Figure 1 by parallelization calculation [HTC98], which is omitted here. Practical Issues: The derived algorithm can be used practically to win over the existing algorithms. To be able to compare our results more convincingly with those in data mini... |

33 | A Calculational Fusion System HYLO - Onoue, Hu, et al. - 1997 |

17 | Calculating accumulations - Hu, Iwasaki, et al. - 1999 |

9 | A transformational method for dynamic-sized tabulation - Chin, Hagiya - 1995 |

5 | A compiler for HDC - Herrmann, Lengauer, et al. - 1999 |

4 | Program transformation in calculational form - Takano, Hu, et al. - 1998 |
Citation Context ...n and tabulation. These calculation techniques are quite well-known in the functional programming community. This work is a continuation of our effort to apply calculational transformation techniques [THT98] to the development of efficient programs [OHIT97, HITT97, HTC98]. Our previous work put emphasis on mechanical implementation of the transformation techniques, while this paper shows that this calcul... |