Results 1 - 5 of 5
Mining Frequent Patterns with Counting Inference
SIGKDD Explorations
, 2000
"... In this paper, we propose the algorithm PASCAL, which introduces a novel optimization of the well-known algorithm Apriori. This optimiza ..."
Abstract

Cited by 112 (9 self)
In this paper, we propose the algorithm PASCAL, which introduces a novel optimization of the well-known algorithm Apriori. This optimization is based on a new strategy called pattern counting inference that relies on the concept of key patterns. We show that the support of frequent non-key patterns can be inferred from frequent key patterns without accessing the database. Experiments comparing PASCAL to the three algorithms Apriori, Close, and Max-Miner show that PASCAL is among the most efficient algorithms for mining frequent patterns.
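The counting-inference idea named in this entry's title can be illustrated with a minimal sketch. It is not the PASCAL implementation; the toy transaction database and the function names (`support`, `is_key`, `inferred_support`) are assumptions made for illustration. A pattern is treated as a key pattern when every immediate subset has strictly larger support; for a pattern known to be non-key, the support equals the minimum support of its immediate subsets, so no database scan is needed.

```python
def support(pattern, transactions):
    """Count the transactions that contain every item of `pattern`."""
    return sum(1 for t in transactions if pattern <= t)

def is_key(pattern, transactions):
    """True if every immediate subset has strictly larger support."""
    s = support(pattern, transactions)
    return all(support(pattern - {i}, transactions) > s for i in pattern)

def inferred_support(pattern, transactions):
    """Support of a NON-key pattern, taken as the minimum support of its
    immediate subsets. In a level-wise miner these subset supports are
    already known; they are recomputed here only to keep the sketch short."""
    return min(support(pattern - {i}, transactions) for i in pattern)

# Hypothetical toy database: item b never occurs without item a.
db = [frozenset("abc"), frozenset("ab"), frozenset("ac"),
      frozenset("a"), frozenset("c")]

ab = frozenset("ab")
abc = frozenset("abc")

# ab has the same support as b, so ab is not a key pattern,
# and neither is any superset of ab, such as abc.
ab_is_key = is_key(ab, db)
abc_inferred = inferred_support(abc, db)
abc_counted = support(abc, db)
```

In this toy database the inferred support of abc agrees with the support obtained by scanning the transactions, which is the saving that counting inference aims for.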
Discovering significant patterns
, 2007
"... Pattern discovery techniques, such as association rule discovery, explore large search spaces of potential patterns to find those that satisfy some user-specified constraints. Due to the large number of patterns considered, they suffer from an extreme risk of type-1 error, that is, of finding patter ..."
Abstract

Cited by 58 (4 self)
Pattern discovery techniques, such as association rule discovery, explore large search spaces of potential patterns to find those that satisfy some user-specified constraints. Due to the large number of patterns considered, they suffer from an extreme risk of type-1 error, that is, of finding patterns that appear due to chance alone to satisfy the constraints on the sample data. This paper proposes techniques to overcome this problem by applying well-established statistical practices. These allow the user to enforce a strict upper limit on the risk of experimentwise error. Empirical studies demonstrate that standard pattern discovery techniques can discover numerous spurious patterns when applied to random data and, when applied to real-world data, result in large numbers of patterns that are rejected when subjected to sound statistical evaluation. They also reveal that a number of pragmatic choices about how such tests are performed can greatly affect their power.
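One well-established way to bound experimentwise error is the Bonferroni correction, sketched below. The abstract does not say which correction the paper uses, so this choice, the function name `bonferroni_filter`, and the example p-values are all assumptions for illustration. Each of the m tested patterns is accepted only if its p-value is at most alpha / m, which keeps the probability of any false discovery at or below alpha.

```python
def bonferroni_filter(p_values, alpha=0.05):
    """Return the indices of patterns that stay significant after a
    Bonferroni correction over all tested patterns."""
    m = len(p_values)
    if m == 0:
        return []
    threshold = alpha / m  # per-test significance level
    return [i for i, p in enumerate(p_values) if p <= threshold]

# Hypothetical p-values for five discovered patterns.
p_values = [0.0001, 0.008, 0.02, 0.04, 0.3]
significant = bonferroni_filter(p_values)
```

With five tests the per-test threshold is 0.01, so only the first two patterns survive, even though four of the five would pass an uncorrected 0.05 test.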
Watermill: An Optimized Fingerprinting System for Databases, http://watermill.sourceforge.net
, 2007
"... This paper presents a watermarking/fingerprinting system for relational databases. It features a built-in declarative language to specify usability constraints that watermarked data sets must comply with. For a subset of these constraints, namely, weight-independent constraints, we propose ..."
Abstract

Cited by 10 (5 self)
This paper presents a watermarking/fingerprinting system for relational databases. It features a built-in declarative language to specify usability constraints that watermarked data sets must comply with. For a subset of these constraints, namely, weight-independent constraints, we propose a novel watermarking strategy that consists of translating them into an integer linear program. We show two watermarking strategies: an exhaustive one based on integer linear programming constraint solving and a scalable pairing heuristic. Fingerprinting applications, for which several distinct watermarks need to be computed, benefit from the reduced computation time of our method that precomputes the watermarks only once. Moreover, we show that our method enables practical collusion-secure fingerprinting since the precomputed watermarks are based on binary alterations located at exactly the same positions. The paper includes an in-depth analysis of false-hit and false-miss occurrence probabilities for the detection algorithm. Experiments performed on our open source software WATERMILL assess the watermark robustness against common attacks and show that our method outperforms the existing ones concerning the watermark embedding speed. Index Terms: Fingerprinting, relational databases, linear optimization, query optimization.
Self-Sufficient Itemsets: An Approach to Screening Potentially Interesting Associations Between Items
"... Self-sufficient itemsets are those whose frequency cannot be explained solely by the frequency of either their subsets or of their supersets. We argue that itemsets that are not self-sufficient will often be of little interest to the data analyst, as their frequency should be expected once that of the ..."
Abstract

Cited by 9 (0 self)
Self-sufficient itemsets are those whose frequency cannot be explained solely by the frequency of either their subsets or of their supersets. We argue that itemsets that are not self-sufficient will often be of little interest to the data analyst, as their frequency should be expected once that of the itemsets on which their frequency depends is known. We present statistical tests for statistically sound discovery of self-sufficient itemsets, and computational techniques that allow those tests to be applied as a post-processing step for any itemset discovery algorithm. We also present a measure for assessing the degree of potential interest in an itemset that complements these statistical measures.
Statistically sound exploratory rule discovery
, 2004
"... Association rule discovery and other exploratory rule discovery techniques explore large search spaces of potential rules to find those that appear interesting by some user-selected criterion of interestingness. Due to the large number of rules considered, they suffer from an extreme risk of type-1 ..."
Abstract

Cited by 4 (2 self)
Association rule discovery and other exploratory rule discovery techniques explore large search spaces of potential rules to find those that appear interesting by some user-selected criterion of interestingness. Due to the large number of rules considered, they suffer from an extreme risk of type-1 error, that is, of finding rules that appear due to chance alone to satisfy the interestingness criteria on the sample data. This paper proposes a technique to overcome this problem by using holdout data for statistical evaluation. Experiments demonstrate that standard exploratory rule discovery can result in large numbers of rules that are rejected when subjected to statistical evaluation on holdout data. They also reveal that modification of the rule discovery process to anticipate subsequent statistical evaluation can increase the number of rules that satisfy an interestingness criterion that are accepted by statistical evaluation on holdout data.
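Holdout evaluation can be sketched as follows: rules found on exploratory data are re-tested on separate holdout data, with the significance level divided by the number of candidate rules. This is a simplified illustration, not the paper's procedure. The one-sided binomial test, the function names (`binomial_tail`, `holdout_accepts`), and the toy data are assumptions, and the sketch assumes a non-empty holdout set.

```python
from math import comb

def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def holdout_accepts(rules, holdout, alpha=0.05):
    """Keep the rules (antecedent, consequent) whose consequent occurs more
    often among holdout transactions covered by the antecedent than its
    overall holdout frequency would explain by chance."""
    if not rules:
        return []
    threshold = alpha / len(rules)  # Bonferroni over the candidate rules only
    accepted = []
    for antecedent, consequent in rules:
        covered = [t for t in holdout if antecedent <= t]
        hits = sum(1 for t in covered if consequent <= t)
        base_rate = sum(1 for t in holdout if consequent <= t) / len(holdout)
        p_value = binomial_tail(hits, len(covered), base_rate)
        if p_value <= threshold:
            accepted.append((antecedent, consequent))
    return accepted

# Hypothetical holdout data: b always accompanies a and never accompanies c.
holdout = [frozenset("ab")] * 20 + [frozenset("c")] * 20
candidates = [(frozenset("a"), frozenset("b")),
              (frozenset("c"), frozenset("b"))]
accepted = holdout_accepts(candidates, holdout)
```

Because the correction counts only the rules passed to the holdout stage rather than the whole search space, each surviving candidate is tested at a less stringent level than it would be under a correction over every rule considered during the search.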