## High-order pattern discovery from discrete-valued data (1997)

Venue: | Issue 6, Nov.-Dec. 1997, pages 877–893 |

Citations: | 11 - 3 self |

### BibTeX

@ARTICLE{Wong97high-orderpattern,

author = {Andrew K. C. Wong and Yang Wang},

title = {High-order pattern discovery from discrete-valued data},

journal = {Issue: 6, Nov.-Dec. 1997, Pages: 877--893},

year = {1997},

pages = {877--893}

}

### Abstract

To uncover qualitative and quantitative patterns in a data set is a challenging task for research in the area of machine learning and data analysis. Due to the complexity of real-world data, high-order (polythetic) patterns or event associations, in addition to first-order class-dependent relationships, have to be acquired. Once the patterns of different orders are found, they should be represented in a form appropriate for further analysis and interpretation. In this paper, we propose a novel method to discover qualitative and quantitative patterns (or event associations) inherent in a data set. It uses the adjusted residual analysis in statistics to test the significance of the occurrence of a pattern candidate against its expectation. To avoid exhaustive search of all possible combinations of primary events, techniques of eliminating the impossible pattern candidates are developed. The detected patterns of different orders are then represented in an attributed hypergraph which is lucid for pattern interpretation and analysis. Test results on artificial and real-world data are discussed toward the end of the paper.

Index Terms: Adjusted residual, attributed hypergraph, data analysis, database mining, machine learning, pattern discovery, pattern representation.
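As a rough illustration of the significance test the abstract describes, the sketch below computes a standardized residual for one pattern candidate and compares it with the 1.96 threshold (the 5% level of a standard normal). It is not the authors' implementation: the function names are hypothetical, the expected count assumes the primary events are independent, and the further variance adjustment that turns a standardized residual into an adjusted residual is omitted. The sample figures (1,000 samples, three primary events with probability 0.5 each, 250 observed occurrences) follow the worked example quoted in the citation context for [19] on this page.

```python
import math


def expected_count(marginal_probs, n_samples):
    """Expected occurrences of a compound event, assuming (hypothetically)
    that its primary events occur independently of one another."""
    return n_samples * math.prod(marginal_probs)


def standardized_residual(observed, expected):
    """Deviation of the observed count from its expectation, scaled by
    the square root of the expectation."""
    return (observed - expected) / math.sqrt(expected)


def is_significant(z, threshold=1.96):
    """Two-sided check against the standard normal 5% critical value."""
    return abs(z) > threshold


# Figures from the worked example: three primary events, each with
# probability 0.5, in a data set of 1,000 samples; 250 observed occurrences.
expected = expected_count([0.5, 0.5, 0.5], 1000)
z = standardized_residual(250, expected)
print(expected, round(z, 2), is_significant(z))
```

With these figures the expected count is 125 and the residual is about 11.18, well above 1.96, so the candidate would be treated as a significant third-order pattern.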

### Citations

679 | Knowledge Acquisition via Incremental Conceptual Clustering
- Fisher
- 1987
Citation Context: ... original version of ID3 [29], DISCON [24], and RUMMAGE [11], are monothetic in that they detect only relationships between two attributes. Some other systems, such as AQ [25], Cluster/2 [26], COBWEB [12], ITRULE [31], and some Bayesian-based systems, are polythetic in that they consider the conjunction of more than two attributes. Due to the nature of most real world data, monothetic relations are in...

114 | An interval classifier for database mining applications
- Agrawal, Ghosh, et al.
- 1992
Citation Context: ...been intensively studied recently. Although the accuracy of neural networks in learning is very promising comparing with the symbolic approaches, it is considered not suitable for data mining purpose [1]. In addition to the learning speed problem, the knowledge generated by neural networks is not explicitly represented in the forms of conceptual patterns understandable by humans [21]. To overcome the...

106 | Hypergraphs: Combinatorics of Finite Sets
- Berge
- 1989
Citation Context: ...ture algorithms can be adopted for the implementation of various operations, which help retrieve and/or reorganize the patterns encoded in an AHG. First, let us give a formal definition of hypergraph [2]. DEFINITION 7. Let $Y = \{y_1, y_2, \ldots, y_n\}$ be a finite set. A hypergraph on $Y$ is a family $H = (E_1, E_2, \ldots, E_m)$ of subsets of $Y$ such that 1) $E_i \neq \emptyset$ $(i = 1, 2, \ldots, m)$ and 2) $\bigcup_{i=1}^{m} E_i = Y$. The elements $y_1, y_2, \ldots,$ ...

86 | An empirical comparison of ID3 and backpropagation
- Fisher, McKusick
- 1989
Citation Context: .... There are efficient systems available for detecting only the monothetic patterns, such as [4]. There are also systems for detecting polythetic patterns but, with them, exhaustive search is required [14]. Many well-known systems, including the original version of ID3 [29], DISCON [24], and RUMMAGE [11], are monothetic in that they detect only relationships between two attributes. Some other systems, ...

56 | Class-Dependent Discretization for Inductive Learning from Continuous and Mixed Mode Data
- Ching, Wong, et al.
- 1995
Citation Context: ...m discrete data sets. It should be noted, however, that the method can also handle other types (continuous or mixed-mode) of data using an appropriate discretization scheme, such as those reported in [5], [37]. A current limitation of the present method (as well as many other methods) is that structural patterns, like those in the “rectangles” problem [32], still cannot be discovered. This limitation...

40 | Constructor: a system for the induction of probabilistic models
- Fung, Crawford
Citation Context: ...nipulated according to certain probabilistic rules. Typical work is related to probabilistic network technology [28]. For example, to automatically induce a probabilistic model from data, CONSTRUCTOR [16] was proposed. It induces discrete Markov networks of arbitrary topology from data. These networks contain a quantitative (i.e., probabilistic) characterization and a qualitative (i.e., structural) de...

33 | Information discovery through hierarchical maximum entropy discretization and synthesis
- Chiu, Wong, et al.
- 1991
Citation Context: ... (4) The standardized residuals have the property that $\sum_{x_j^s} z_{x_j^s}^2$ (the summation of every $z_{x_j^s}^2$ in an $|s|$-way contingency table) is distributed as $\chi^2$ with $\prod_{i \in s} (m_i - 1)$ degrees of freedom [8], [19], [20]. Also, as $z_{x_j^s}$ is the square root of $\chi^2$, it has an asymptotic normal distribution with a mean of approximately zero and a variance of approximately one. Hence, if the absolute value ...

20 | Conceptual clustering, learning from examples, and inferences
- Fisher
- 1987
Citation Context: ...rules are exploited until evidence confirms that they are idiosyncratic. Optimistic learning has advantages in incremental learning [14], [15]. It is polythetic and able to learn from data with noise [13]. Because of the nature of optimistic learning, the concept tree generated by COBWEB might be very large and post-pruning techniques have to be applied. For deterministic pattern discovery, such as th...

16 | An architecture for probabilistic concept-based information retrieval
- Fung, Crawford, et al.
- 1990
Citation Context: ...ed by other nodes outside the boundary. CONSTRUCTOR is reported to work well when tested with training sets generated from probabilistic models and with real data in information retrieval application [17]. When going to high-order cases, the contingency table introduces a heavy computation load. Furthermore, since it is a variable-oriented method, it would not discover dependency among events. It is w...

12 | Synthesizing knowledge: A cluster analysis approach using event-covering
- Chiu, Wong
- 1986
Citation Context: ... probabilities, the category utility of the class is computed. The class that maximizes category utility after adding the new object is chosen as the class for that object and distributions are updated. COBWEB is an “optimistic” learning strategy... [Footnote 1: This argument does not apply to some of the early work [35], [7] by the author.]

9 | Information-theoretic rule induction
- Goodman, Smyth
Citation Context: ...es [33]. Smyth and Goodman [31] argue that (if-then) rules provide a much more flexible representation than tree structures, especially from the viewpoint of expert systems. Thus, ITRULE was proposed [18], [31] to obtain rules directly from a data set. ITRULE recursively selects one attribute to be the right-hand result of a candidate rule and then searches the combinations of the propositions of othe...

5 | A Hierarchical Conceptual Clustering Algorithm
- Fisher
- 1984
Citation Context: ...re are also systems for detecting polythetic patterns but, with them, exhaustive search is required [14]. Many well-known systems, including the original version of ID3 [29], DISCON [24], and RUMMAGE [11], are monothetic in that they detect only relationships between two attributes. Some other systems, such as AQ [25], Cluster/2 [26], COBWEB [12], ITRULE [31], and some Bayesian-based systems, are poly...

4 | APACS: A system for automated pattern analysis and classification
- Chan, Wong
- 1990
Citation Context: ...aries. The second problem concerns the detection of polythetic patterns without relying on exhaustive search. There are efficient systems available for detecting only the monothetic patterns, such as [4]. There are also systems for detecting polythetic patterns but, with them, exhaustive search is required [14]. Many well-known systems, including the original version of ID3 [29], DISCON [24], and RUM...

4 | The Analysis of Residuals
- Haberman
Citation Context: ...ary event of this database occurs 500 times. The number of expected occurrences of the three-compound event [A = T, B = T, C = F] is 0.5 × 0.5 × 0.5 × 1,000 = 125. The standardized residual [3], [4], [19], [20] is used to test the significance of the occurrence of this compound event. Suppose that the actual occurrence of this compound event is 250. The standardized residual is 11.18, larger than 1.96...

3 | Statistical guidance in symbolic learning
- Fisher
- 1990
Citation Context: ... updated. COBWEB is an “optimistic” learning strategy in that rules are exploited until evidence confirms that they are idiosyncratic. Optimistic learning has advantages in incremental learning [14], [15]. It is polythetic and able to learn from data with noise [13]. Because of the nature of optimistic learning, the concept tree generated by COBWEB might be very large and post-pruning techniques have ...

1 | Induction Learning in the Presence of Uncertainty
- Chan
- 1989
Citation Context: ...ariable-oriented method, it would not discover dependency among events. It is worth pointing out that systems dealing with event-based dependencies are more efficient than variable based dependencies [3], [6], [30], [36]. 3 DEFINITIONS AND NOTATIONS Consider that we have a data set D containing M samples. Every sample is described in terms of N attributes, each of which can assume values in a correspon...

1 | Pattern Analysis Using Event-Covering
- Chiu
- 1986
Citation Context: ...le-oriented method, it would not discover dependency among events. It is worth pointing out that systems dealing with event-based dependencies are more efficient than variable based dependencies [3], [6], [30], [36]. 3 DEFINITIONS AND NOTATIONS Consider that we have a data set D containing M samples. Every sample is described in terms of N attributes, each of which can assume values in a correspondin...