Results 1 
4 of
4
Detecting group differences: Mining contrast sets
 Data Mining and Knowledge Discovery
, 2001
"... A fundamental task in data analysis is understanding the differences between several contrasting groups. These groups can represent different classes of objects, such as male or female students, or the same group over time, e.g. freshman students in 1993 through 1998. We present the problem of mini ..."
Abstract

Cited by 108 (3 self)
 Add to MetaCart
A fundamental task in data analysis is understanding the differences between several contrasting groups. These groups can represent different classes of objects, such as male or female students, or the same group over time, e.g. freshman students in 1993 through 1998. We present the problem of mining contrast sets: conjunctions of attributes and values that differ meaningfully in their distribution across groups. We provide a search algorithm for mining contrast sets with pruning rules that drastically reduce the computational complexity. Once the contrast sets are found, we postprocess the results to present a subset that are surprising to the user given what we have already shown. We explicitly control the probability of Type I error (false positives) and guarantee a maximum error rate for the entire analysis by using Bonferroni corrections.
Detecting Change in Categorical Data: Mining Contrast Sets
 In Proceedings of the Fifth International Conference on Knowledge Discovery and Data Mining
, 1999
"... A fundamental task in data analysis is understanding the differences between several contrasting groups. These groups can represent different classes of objects, such as male or female students, or the same group over time, e.g. freshman students in 1993 versus 1998. We present the problem of mining ..."
Abstract

Cited by 91 (5 self)
 Add to MetaCart
A fundamental task in data analysis is understanding the differences between several contrasting groups. These groups can represent different classes of objects, such as male or female students, or the same group over time, e.g. freshman students in 1993 versus 1998. We present the problem of mining contrastsets: conjunctions of attributes and values that differ meaningfully in their distribution across groups. We provide an algorithm for mining contrastsets as well as several pruning rules to reduce the computational complexity. Once the deviations are found, we postprocess the results to present a subset that are surprising to the user given what we have already shown. We explicitly control the probability of Type I error (false positives) and guarantee a maximum error rate for the entire analysis by using Bonferroni corrections. 1 Introduction A common question in exploratory research is: "How do several contrasting groups differ?" Learning about group differences is a central ...
The Strength of Statistical Evidence for Composite Hypotheses: Inference to the Best Explanation
, 2010
"... A general function to quantify the weight of evidence in a sample of data for one hypothesis over another is derived from the law of likelihood and from a statistical formalization of inference to the best explanation. For a fixed parameter of interest, the resulting weight of evidence that favors o ..."
Abstract

Cited by 19 (12 self)
 Add to MetaCart
A general function to quantify the weight of evidence in a sample of data for one hypothesis over another is derived from the law of likelihood and from a statistical formalization of inference to the best explanation. For a fixed parameter of interest, the resulting weight of evidence that favors one composite hypothesis over another is the likelihood ratio using the parameter value consistent with each hypothesis that maximizes the likelihood function over the parameter of interest. Since the weight of evidence is generally only known up to a nuisance parameter, it is approximated by replacing the likelihood function with a reduced likelihood function on the interest parameter space. Unlike the Bayes factor and unlike the pvalue under interpretations that extend its scope, the weight of evidence is coherent in the sense that it cannot support a hypothesis over any hypothesis that it entails. Further, when comparing the hypothesis that the parameter lies outside a nontrivial interval to the hypothesis that it lies within the interval, the proposed method of weighing evidence almost always asymptotically favors the correct hypothesis
Genetic Algorithm for the Improved Discovery of DNA Regulatory Elements
, 2007
"... 1.2 Transcription and Translation.................... 6 ..."