## Mining optimized gain rules for numeric attributes (1999)

### Other Repositories/Bibliography

Venue: Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

Citations: 11 (0 self)

### BibTeX

@INPROCEEDINGS{Brin99miningoptimized,

author = {Sergey Brin and Rajeev Rastogi and Kyuseok Shim},

title = {Mining optimized gain rules for numeric attributes},

booktitle = {Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining},

year = {1999},

pages = {135--144},

publisher = {ACM Press}

}

### Abstract

Association rules are useful for determining correlations between attributes of a relation and have applications in marketing, financial and retail sectors. Furthermore, optimized association rules are an effective way to focus on the most interesting characteristics involving certain attributes. Optimized association rules are permitted to contain uninstantiated attributes and the problem is to determine instantiations such that either the support, confidence or gain of the rule is maximized. In this paper, we generalize the optimized gain association rule problem by permitting rules to contain disjunctions over uninstantiated numeric attributes. Our generalized association rules enable us to extract more useful information about seasonal and local patterns involving the uninstantiated attribute. For rules containing a single numeric attribute, we present an algorithm with linear complexity for computing optimized gain rules. Furthermore, we propose a bucketing technique that can result in a significant reduction in input size by coalescing contiguous values without sacrificing optimality. We also present an approximation algorithm based on dynamic programming for two numeric attributes. Using recent results on binary space partitioning trees, we show that the approximations are within a constant factor of the optimal optimized gain rules. Our experimental results with synthetic data sets for a single numeric attribute demonstrate that our algorithm scales up linearly with the attribute's domain size as well as the number of disjunctions. In addition, we show that applying our optimized rule framework to a population survey real-life data set enables us to discover interesting underlying correlations among the attributes.
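To make the single-attribute problem in the abstract concrete, the sketch below picks at most k disjoint intervals of a numeric attribute so that their total gain is maximal. It is not the linear-time algorithm of the paper; it is a plain O(n·k) dynamic program written for illustration. It assumes the usual definition of gain for a value as (tuples satisfying the whole rule) minus (minimum confidence) times (tuples satisfying the rule's left-hand side), and the names `value_gains`, `max_gain_k_intervals`, `hits`, `totals`, and `min_conf` are hypothetical.

```python
from typing import List, Sequence


def value_gains(hits: Sequence[float], totals: Sequence[float],
                min_conf: float) -> List[float]:
    """Per-value gain, assuming gain = hits - min_conf * totals.

    hits[i]   : tuples with attribute value i that satisfy the whole rule
    totals[i] : tuples with attribute value i that satisfy the left-hand side
    """
    return [h - min_conf * t for h, t in zip(hits, totals)]


def max_gain_k_intervals(gains: Sequence[float], k: int) -> float:
    """Maximum total gain of at most k disjoint intervals (simple O(n*k) DP)."""
    neg_inf = float("-inf")
    # closed[j]: best total so far using at most j intervals
    # open_[j] : best total using at most j intervals, the last one
    #            ending exactly at the value just processed
    closed = [0.0] * (k + 1)
    open_ = [neg_inf] * (k + 1)
    for g in gains:
        new_closed = closed[:]
        new_open = [neg_inf] * (k + 1)
        for j in range(1, k + 1):
            # Either extend the current interval or start a new one
            # after the best solution with at most j - 1 intervals.
            new_open[j] = max(open_[j], closed[j - 1]) + g
            new_closed[j] = max(closed[j], new_open[j])
        closed, open_ = new_closed, new_open
    return closed[k]


if __name__ == "__main__":
    gains = value_gains([8, 1, 9, 3, 7], [10, 10, 10, 10, 10], 0.5)
    print(gains)
    print(max_gain_k_intervals(gains, 1), max_gain_k_intervals(gains, 2))
```

Allowing more disjunctions (larger k) can only keep or raise the achievable gain, which is the motivation the abstract gives for generalizing from a single interval to several.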

### Citations

2665 | Fast Algorithms for Mining Association Rules - Agrawal, Srikant - 1994 |

453 | Mining generalized association rules - Agrawal, Srikant - 1995 |

376 | Discovery of Multiple-Level Association Rules from Large Databases - Han, Fu - 1995 |

373 | An Efficient Algorithm for Mining Association Rules in Large Databases - Savasere, Omiecinski, et al. - 1995 |

349 | Mining quantitative association rules in large relational tables - Srikant, Agrawal - 1996 |

210 | Mining Association Rules between Sets of Items - Agrawal, Imielinski, et al. - 1993 |

202 | An effective hash based algorithm for mining association rules - Park, Chen, et al. - 1995 |

137 | Optimal histograms with quality guarantees - Jagadish, Koudas, et al. - 1998 |

125 | Mining the most interesting rules - Bayardo, Agrawal - 1999 |

Citation Context: ...the attribute domain -- in contrast, the histogram construction algorithm of [JKM+98] has time complexity that is quadratic in the number of distinct values of the attribute under consideration. In [BA99], the authors propose a general framework for optimized rule mining, which can be used to express our optimized gain problem as a special case. However, the generality precludes the development of eff...

117 | Data mining using two-dimensional optimized association rules: scheme, algorithms, and visualization - Fukuda, Morimoto, et al. - 1996 |

115 | Clustering association rules - Lent, Swami, et al. - 1997 |

Citation Context: .... This made the problem tractable for the one attribute case. Schemes for clustering quantitative association rules with two uninstantiated numeric attributes in the left hand side are presented in [LSW97]. For a given support and confidence, the authors present a clustering algorithm to generate a set of non-overlapping rectangles, such that every point in each rectangle has the required confidence an...

114 | Mining association rules between sets of items in large databases - Agrawal, Imieliński, Swami - 1993 |

Citation Context: ...lations among the attributes. Keywords: Association rules, support, confidence, gain, dynamic programming, region bucketing, binary space partitioning. 1 Introduction. Association rules, introduced in [AIS93], provide a useful mechanism for discovering correlations among the underlying data and have applications in marketing, financial and retail sectors. In its most general form, an association rule can ...

82 | Mining Optimized Association Rules for Numeric Attributes - Fukuda, Morimoto, et al. - 1999 |

Citation Context: ... of the calls that originated from NY are to France. 1.1 Optimized Association Rules. The optimized association rules problem, motivated by applications in marketing and advertising, was introduced in [FMMT96a]. An association rule R has the form $(A_1 \in [l_1, u_1]) \wedge C_1 \rightarrow C_2$, where $A_1$ is a numeric attribute, $l_1$ and $u_1$ are uninstantiated variables, and $C_1$ and $C_2$ contain only instantiated conditions (...

49 | Efficient algorithms for discovering association rules - Mannila, Toivonen, Verkamo - 1994 |

38 | On Approximating Rectangle Tiling and Packing - Khanna, Muthukrishnan, et al. - 1998 |

Citation Context: ...are two uninstantiated numeric attributes. In this case, we need to compute a set of k non-overlapping rectangles in two-dimensional space whose gain is maximum. Unfortunately, this problem is NP-hard [KMP98]. In the following subsection, we describe a dynamic programming algorithm with polynomial time complexity that computes approximations to optimized sets. procedure optGain2D((i, j), (p, q), k) beg...

22 | Mining optimized support rules for numeric attributes - Rastogi, Shim |

Citation Context: ...stantiated numeric attributes. Thus, unlike [FMMT96a] and [FMMT96b], that only compute a single optimal region, our generalized rules enable up to k optimal regions to be computed. Furthermore, unlike [RS99], in which we only addressed the optimized support problem, in this paper, we focus on the optimized gain problem and consider both the one and two attribute cases. In addition, for rules containing a...

18 | On the Optimal Binary Plane Partition for Sets of Isothetic Rectangles - Amore, Franciosa - 1992 |

Citation Context: ...lt in order to show that in the general case, the approximate optimized gain set computed by procedure optGain2D is within a factor of 1/4 of the optimized gain set. The proof also uses a result from [AF92], in which it is shown that for any set of rectangles in a plane, there exists a binary space partitioning (that is, a recursive partitioning) of the plane such that each rectangle is cut into at most...

12 | Mining Optimized Association Rules for Categorical and Numeric - Rastogi, Shim - 1998 |

Citation Context: ... presented a dynamic programming algorithm for computing the optimized support rule and whose complexity is $O(n^2 k)$, where n is the number of values in the domain of the uninstantiated attribute. In [RS98], we considered a different formulation of the optimized support problem which we showed to be NP-hard even for the case of one uninstantiated attribute. The optimized support problem described in [RS...

6 | Constraint-based rule mining in large, dense databases - Bayardo, Agrawal, Gunopulos |

Citation Context: ...ed gain problem as a special case. However, the generality precludes the development of efficient algorithms for computing optimized rules. Specifically, the authors use a variant of Dense-Miner from [BAG97] which essentially relies on enumerating optimized rules in order to explore the search space. Since there are an exponential number of optimized rules, the authors propose pruning strategies to reduc...

5 | Efficient algorithms for discovering association rules - Mannila, Toivonen, Verkamo - 1994 |

3 | Optimal Histograms with Quality Guarantees - Sevcik, Suel - 1998 |

1 | Fast algorithms for mining association rules - Agrawal, Srikant - 1994 |

1 | Mining optimized association rules for numeric attributes - Tokuyama - 1996 |
