## Ranking the Interestingness of Summaries from Data Mining Systems (1999)

Venue: | Proceedings of the 12th Annual Florida Artificial Intelligence Research Symposium (FLAIRS'99) |

Citations: | 6 - 3 self |

### BibTeX

@INPROCEEDINGS{Hilderman99rankingthe,

author = {Robert J. Hilderman and Howard J. Hamilton and Brock Barber},

title = {Ranking the Interestingness of Summaries from Data Mining Systems},

booktitle = {Proceedings of the 12th Annual Florida Artificial Intelligence Research Symposium (FLAIRS'99)},

year = {1999},

pages = {100--106}

}

### Abstract

We study data mining where the task is description by summarization, the representation language is generalized relations, the evaluation criteria are based on heuristic measures of interestingness, and the search method is the Multi-Attribute Generalization algorithm for domain generalization graphs. We present and empirically compare four heuristics for ranking the interestingness of generalized relations (or summaries). The heuristics are based on common measures of the diversity of a population: statistical variance, the Simpson index, and the Shannon index. All four measures rank less complex summaries (i.e., those with few tuples and/or few non-ANY attributes) as most interesting. Highly ranked summaries provide a reasonable starting point for further analysis of discovered knowledge.
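As a rough illustration of the kind of diversity measures the abstract names, the sketch below computes a variance-based score, a Shannon-index score (average and total), and a Simpson-index score from the tuple counts of a summary. The formulas are textbook forms of these indices and are an assumption here; the paper's exact definitions and normalizations are not given on this page and may differ. The function names, the sample summaries, and the choice to rank by descending variance are all hypothetical.

```python
import math


def proportions(counts):
    """Convert raw tuple counts into proportions that sum to 1."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive value")
    return [c / total for c in counts]


def variance_measure(counts):
    """Sample variance of the proportions about the uniform value 1/m.

    Assumed form; not necessarily the paper's exact definition.
    """
    p = proportions(counts)
    m = len(p)
    if m < 2:
        return 0.0
    q = 1.0 / m
    return sum((pi - q) ** 2 for pi in p) / (m - 1)


def shannon_average(counts):
    """Shannon index: average information content per tuple, in bits."""
    p = proportions(counts)
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)


def shannon_total(counts):
    """Total information content: number of tuples times the average."""
    return len(counts) * shannon_average(counts)


def simpson_index(counts):
    """Simpson index: sum of squared proportions (concentration)."""
    return sum(pi ** 2 for pi in proportions(counts))


# Hypothetical summaries, each given as a list of tuple counts.
summaries = {
    "A": [50, 25, 25],
    "B": [10, 10, 10, 10],
    "C": [97, 1, 1, 1],
}

# Illustrative ranking only: highest variance first (an assumed ordering).
ranked = sorted(
    summaries,
    key=lambda name: variance_measure(summaries[name]),
    reverse=True,
)

for name in ranked:
    counts = summaries[name]
    print(
        name,
        round(variance_measure(counts), 4),
        round(shannon_average(counts), 4),
        round(shannon_total(counts), 4),
        round(simpson_index(counts), 4),
    )
```

A uniform summary such as `B` scores zero on the variance-based measure and attains the maximum Shannon value for its size, while a summary dominated by one tuple, such as `C`, scores high on both variance and the Simpson index.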

### Citations

6134 | The Mathematical Theory of Communication
- Shannon, Weaver
- 1949
Citation Context ... which is the most common measure of variability used in statistics [22]. The I_avg and I_tot measures, based upon a relative entropy measure (also known as the Shannon index) from information theory =-=[23; 27]-=-, measure the average information content in a single tuple in a summary and the total information content in a summary, respectively. The I_con measure, a variance-like measure based upon the Simpson... |

2460 | Mining association rules between sets of items in large databases
- Agrawal, Imielinski, et al.
- 1993
Citation Context ... of interestingness measures are proposed in [18] that evaluate the coverage and certainty of a set of discovered implication rules that have previously been identified as potentially interesting. In =-=[1]-=-, transaction support, confidence, and syntactic constraints are proposed to construct rules from databases containing binary-valued attributes. A measure is proposed in [10] which determines the inte... |

366 | Knowledge Discovery in Databases: An Overview
- Frawley, Matheus, et al.
- 1991
Citation Context ...ransformation, data mining to identify interesting patterns, interpretation and evaluation, and application [7]. The goal is to identify valid, previously unknown, potentially useful patterns in data =-=[7; 9]-=-. The data mining step requires the choice of four items: a data mining task (such as prediction, description, or anomaly detection), a representation language for patterns, evaluation criteria for pa... |

169 | Discovery, analysis, and presentation of strong rules
- Piatetsky-Shapiro
Citation Context ...interestingness of summaries. Techniques for determining the interestingness of discovered knowledge have previously received some attention in the literature. A rule-interest function is proposed in =-=[20]-=- which prunes uninteresting implication rules based upon a statistical correlation threshold. In [2], two interestingness functions are proposed. The first function measures the difference between the... |

128 | From Data Mining to Knowledge Discovery in
- Fayyad, Piatetsky-Shapiro, et al.
- 1996
Citation Context ...ases includes these steps: data selection, cleaning and other preprocessing, reduction and transformation, data mining to identify interesting patterns, interpretation and evaluation, and application =-=[7]-=-. The goal is to identify valid, previously unknown, potentially useful patterns in data [7; 9]. The data mining step requires the choice of four items: a data mining task (such as prediction, descrip... |

91 | Knowledge discovery in textual databases
- Feldman, Dagan
- 1995
Citation Context ... specified values in a pair of attributes and the proportion that would be expected if the values were statistically independent. A measure from information theory, called KL-distance, is proposed in =-=[8]-=- which measures the distance of the actual distribution of terms in text files from that of the expected distribution. KL-distance is also proposed in [12] for measuring the distance between the actua... |

78 | Rule induction using information theory
- Smyth, Goodman
- 1990
Citation Context ...f the expected distribution. KL-distance is also proposed in [12] for measuring the distance between the actual distribution of tuples in a summary to that of a uniform distribution of the tuples. In =-=[25]-=-, another measure from information theory is proposed which measures the average information content of a probabilistic rule. In [19], deviations are proposed which compare the difference between meas... |

64 | Measurement of diversity
- Simpson
- 1949
Citation Context ...re the average information content in a single tuple in a summary and the total information content in a summary, respectively. The I_con measure, a variance-like measure based upon the Simpson index =-=[24]-=-, measures the extent to which the counts are distributed over the tuples in a summary, rather than being concentrated in any single one of them. The tuples in a summary are unique, and therefore, can... |

45 | Selecting among Rules Induced from a Hurricane Database
- Major, Mangano
- 1993
Citation Context ...sed that measure the potential for knowledge discovery based upon the complexity of concept hierarchies associated with attributes in a database. A variety of interestingness measures are proposed in =-=[18]-=- that evaluate the coverage and certainty of a set of discovered implication rules that have previously been identified as potentially interesting. In [1], transaction support, confidence, and syntact... |

30 | On objective measures of rule surprisingness
- Freitas
- 1998
Citation Context ...otentially interesting. In [1], transaction support, confidence, and syntactic constraints are proposed to construct rules from databases containing binary-valued attributes. A measure is proposed in =-=[10]-=- which determines the interestingness (called surprise there) of discovered knowledge via the explicit detection of occurrences of Simpson's paradox. Finally, an excellent survey of informationtheoret... |

22 | Selecting and reporting what is interesting: The KEFIR application to healthcare data
- Matheus, Piatetsky-Shapiro, et al.
Citation Context ...s in a summary to that of a uniform distribution of the tuples. In [25], another measure from information theory is proposed which measures the average information content of a probabilistic rule. In =-=[19]-=-, deviations are proposed which compare the difference between measured values and some previously known or normative values. In [11], two interestingness measures are proposed that measure the potent... |

18 | Identifying relevant databases for multidatabase mining
- Liu, Lu, et al.
- 1998
Citation Context ...lable in the public domain) and the Customer Database (a confidential database supplied by an industrial partner). The NSERC Research Awards Database, frequently used in previous data mining research =-=[3; 4; 11; 17]-=-, consists of 10,000 tuples in six tables describing a total of 22 attributes. The Customer Database, also frequently used in previous data mining research [6; 14; 16], consists of 8,000,000 tuples in... |

16 | Efficient attribute-oriented algorithms for knowledge discovery from large databases
- Carter, Hamilton
Citation Context ... requires the creation of a generalized relation (or summary) where specific attribute values in a relation are replaced with more general concepts according to user-defined concept hierarchies (CHs) =-=[5]-=-. If the original relation is the result of a database query, the generalized relation is a summary of these results, where, for example, names of particular laundry soaps might be replaced by the gen... |

15 | Parallel Knowledge Discovery Using Domain Generalization Graphs
- Hilderman, Hamilton, et al.
- 1997
Citation Context ...imited in their ability to efficiently generate summaries when multiple CHs were associated with an attribute. To resolve this problem, we previously introduced new serial and parallel AOG algorithms =-=[12; 16]-=- and a data structure called a domain generalization graph (DGG) [12; 13; 16; 21]. A DGG for an attribute is a directed graph where each node represents a domain of values created by partitioning the ... |

14 | Mining market basket data using share measures and characterized itemsets
- Hilderman, Carter, et al.
- 1998
Citation Context ...previous data mining research [3; 4; 11; 17], consists of 10,000 tuples in six tables describing a total of 22 attributes. The Customer Database, also frequently used in previous data mining research =-=[6; 14; 16]-=-, consists of 8,000,000 tuples in 22 tables describing a total of 56 attributes. The largest table contains over 3,300,000 tuples representing account activity for over 500,000 customer accounts and o... |

14 | Heuristics for ranking the interestingness of discovered knowledge
- Hilderman, Hamilton
- 1999
Citation Context ...ons, the evaluation criteria are based on heuristic measures of interestingness, and the method for searching is the Multi-Attribute Generalization algorithm [12] for domain generalization graphs. In =-=[15]-=-, we proposed four heuristics, based upon information theory and statistics, for ranking the interestingness of summaries generated from a database. Preliminary results suggested that the order in whi... |

13 | Attribute Focusing: Machine-Assisted Knowledge Discovery Applied to
- Bhandari
- 1993
Citation Context ...have previously received some attention in the literature. A rule-interest function is proposed in [20] which prunes uninteresting implication rules based upon a statistical correlation threshold. In =-=[2]-=-, two interestingness functions are proposed. The first function measures the difference between the number of tuples containing an attribute value and the number that would be expected if the values ... |

13 | Attribute-Oriented Induction Using Domain Generalization Graphs
- Hamilton, Hilderman, et al.
- 1996
Citation Context ...imited in their ability to efficiently generate summaries when multiple CHs were associated with an attribute. To resolve this problem, we previously introduced new serial and parallel AOG algorithms =-=[12; 16]-=- and a data structure called a domain generalization graph (DGG) [12; 13; 16; 21]. A DGG for an attribute is a directed graph where each node represents a domain of values created by partitioning the ... |

11 | Performance Evaluation of Attribute-Oriented Algorithms for Knowledge Discovery from Databases
- Carter, Hamilton
- 1995
Citation Context ...lable in the public domain) and the Customer Database (a confidential database supplied by an industrial partner). The NSERC Research Awards Database, frequently used in previous data mining research =-=[3; 4; 11; 17]-=-, consists of 10,000 tuples in six tables describing a total of 22 attributes. The Customer Database, also frequently used in previous data mining research [6; 14; 16], consists of 8,000,000 tuples in... |

11 | Introduction to probability and statistics for scientists and engineers
- Rosenkrantz
- 1997
Citation Context ...e application in several areas of the physical, social, management, and computer sciences. The I_var measure is based upon variance, which is the most common measure of variability used in statistics =-=[22]-=-. The I_avg and I_tot measures, based upon a relative entropy measure (also known as the Shannon index) from information theory [23; 27], measure the average information content in a single tuple in a... |

8 | Fast incremental generalization and regeneralization for knowledge discovery from databases
- 1995
Citation Context ...lable in the public domain) and the Customer Database (a confidential database supplied by an industrial partner). The NSERC Research Awards Database, frequently used in previous data mining research =-=[3; 4; 11; 17]-=-, consists of 10,000 tuples in six tables describing a total of 22 attributes. The Customer Database, also frequently used in previous data mining research [6; 14; 16], consists of 8,000,000 tuples in... |

8 | On information-theoretic measures of attribute importance
- Yao, Wong, et al.
- 1999
Citation Context ...red knowledge via the explicit detection of occurrences of Simpson's paradox. Finally, an excellent survey of information-theoretic measures for evaluating the importance of attributes is described in =-=[26]-=-. Although our measures were developed and utilized for ranking the interestingness of generalized relations as described earlier in this section, they are more generally applicable to other problem d... |

7 | Share-based measures for itemsets
- Carter, Hamilton, et al.
- 1997
Citation Context ... Experimental Results. In this section, we present experimental results which contrast the various interestingness measures. All summaries in our experiments were generated using DB-Discover =-=[5; 6]-=-, a software tool which uses AOG for KDD. DB-Discover was run on a Silicon Graphics Challenge M, with twelve 150 MHz MIPS R4400 CPUs, using Oracle Release 7.3 for database management. Description of D... |

6 | Theory and
- Young, Hamer
- 1994
Citation Context ... which is the most common measure of variability used in statistics [22]. The I_avg and I_tot measures, based upon a relative entropy measure (also known as the Shannon index) from information theory =-=[23; 27]-=-, measure the average information content in a single tuple in a summary and the total information content in a summary, respectively. The I_con measure, a variance-like measure based upon the Simpson... |

4 | Measuring the potential for knowledge discovery in databases with DBLearn
- Hamilton, Fudger
- 1995
Citation Context ...asures the average information content of a probabilistic rule. In [19], deviations are proposed which compare the difference between measured values and some previously known or normative values. In =-=[11]-=-, two interestingness measures are proposed that measure the potential for knowledge discovery based upon the complexity of concept hierarchies associated with attributes in a database. A variety of i... |

3 | Generalization lattices
- Hamilton, Hilderman, et al.
- 1998
Citation Context ...Hs were associated with an attribute. To resolve this problem, we previously introduced new serial and parallel AOG algorithms [12; 16] and a data structure called a domain generalization graph (DGG) =-=[12; 13; 16; 21]-=-. A DGG for an attribute is a directed graph where each node represents a domain of values created by partitioning the original domain for the attribute, and each edge represents a generalization rela... |

2 | Temporal generalization with domain generalization graphs
- Randall, Hamilton, et al.
Citation Context ...Hs were associated with an attribute. To resolve this problem, we previously introduced new serial and parallel AOG algorithms [12; 16] and a data structure called a domain generalization graph (DGG) =-=[12; 13; 16; 21]-=-. A DGG for an attribute is a directed graph where each node represents a domain of values created by partitioning the original domain for the attribute, and each edge represents a generalization rela... |