Results 1–10 of 43
Beyond Market Baskets: Generalizing Association Rules To Dependence Rules
, 1998
"... One of the more wellstudied problems in data mining is the search for association rules in market basket data. Association rules are intended to identify patterns of the type: “A customer purchasing item A often also purchases item B. Motivated partly by the goal of generalizing beyond market bask ..."
Abstract

Cited by 489 (7 self)
One of the more well-studied problems in data mining is the search for association rules in market basket data. Association rules are intended to identify patterns of the type: “A customer purchasing item A often also purchases item B.” Motivated partly by the goal of generalizing beyond market basket data and partly by the goal of ironing out some problems in the definition of association rules, we develop the notion of dependence rules that identify statistical dependence in both the presence and absence of items in itemsets. We propose measuring significance of dependence via the chi-squared test for independence from classical statistics. This leads to a measure that is upward-closed in the itemset lattice, enabling us to reduce the mining problem to the search for a border between dependent and independent itemsets in the lattice. We develop pruning strategies based on the closure property and thereby devise an efficient algorithm for discovering dependence rules. We demonstrate our algorithm’s effectiveness by testing it on census data, text data (wherein we seek term dependence), and synthetic data.
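The core test is easy to sketch: tabulate the 2x2 presence/absence counts for two items and compare the chi-squared statistic against a critical value. A minimal illustration only, not the paper's border-search algorithm; the toy baskets and the 3.84 cutoff (5% level, 1 degree of freedom) are illustrative choices.

```python
# Chi-squared test for dependence between two items, counting both
# presence and absence of each item, as in dependence rules.
def chi_squared_2x2(transactions, a, b):
    n = len(transactions)
    # Observed counts for the four presence/absence cells.
    obs = {(1, 1): 0, (1, 0): 0, (0, 1): 0, (0, 0): 0}
    for t in transactions:
        obs[(int(a in t), int(b in t))] += 1
    pa = (obs[(1, 1)] + obs[(1, 0)]) / n   # P(A present)
    pb = (obs[(1, 1)] + obs[(0, 1)]) / n   # P(B present)
    stat = 0.0
    # Assumes each item is present in some baskets and absent in others,
    # so no expected cell count is zero.
    for ia in (0, 1):
        for ib in (0, 1):
            exp = n * (pa if ia else 1 - pa) * (pb if ib else 1 - pb)
            stat += (obs[(ia, ib)] - exp) ** 2 / exp
    return stat

baskets = ([{"beer", "chips"}] * 30 + [{"beer"}] * 5
           + [{"chips"}] * 5 + [{"milk"}] * 60)
stat = chi_squared_2x2(baskets, "beer", "chips")
print(stat > 3.84)  # exceeds the 5% critical value for 1 d.f. -> dependent
```

Note the test also fires when one item's *absence* predicts the other, which plain support/confidence association rules miss.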
Information Theoretic Measures for Clusterings Comparison: Is a Correction for Chance Necessary?
"... Information theoretic based measures form a fundamental class of similarity measures for comparing clusterings, beside the class of paircounting based and setmatching based measures. In this paper, we discuss the necessity of correction for chance for information theoretic based measures for clust ..."
Abstract

Cited by 34 (1 self)
Information-theoretic measures form a fundamental class of similarity measures for comparing clusterings, beside the classes of pair-counting and set-matching measures. In this paper, we discuss the necessity of correction for chance in information-theoretic measures for clustering comparison. We observe that the baseline for such measures, i.e. the average value between random partitions of a data set, does not take on a constant value, and tends to show larger variation when the ratio between the number of data points and the number of clusters is small. A similar effect appears in some non-information-theoretic measures such as the well-known Rand index. Assuming a hypergeometric model of randomness, we derive the analytical formula for the expected mutual information between a pair of clusterings, and then propose adjusted versions of several popular information-theoretic measures. Examples are given to demonstrate the need for and usefulness of the adjusted measures.
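The non-constant, positive baseline is easy to observe empirically: average the mutual information between pairs of independent random labelings. A minimal simulation sketch; n, k, and the trial count are illustrative choices, and the paper's actual contribution is the analytical expected-MI formula that this simulation only motivates.

```python
import math
import random
from collections import Counter

def mutual_information(u, v):
    """Empirical mutual information (in nats) between two labelings."""
    n = len(u)
    cu, cv, cuv = Counter(u), Counter(v), Counter(zip(u, v))
    return sum((nab / n) * math.log(nab * n / (cu[a] * cv[b]))
               for (a, b), nab in cuv.items())

# MI between independent random labelings averages well above zero,
# which is why a chance-corrected (adjusted) measure must subtract the
# expected MI under the null model.
random.seed(0)
n, k, trials = 30, 5, 200
avg = sum(
    mutual_information([random.randrange(k) for _ in range(n)],
                       [random.randrange(k) for _ in range(n)])
    for _ in range(trials)
) / trials
print(avg > 0.1)  # the baseline is clearly positive, not zero
```

With fewer points per cluster (small n/k), this baseline grows, matching the abstract's observation.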
Quantifying and visualizing attribute interactions: An approach based on entropy
 http://arxiv.org/abs/cs.AI/0308002 v3
, 2004
"... Interactions are patterns between several attributes in data that cannot be inferred from any subset of these attributes. While mutual information is a wellestablished approach to evaluating the interactions between two attributes, we surveyed its generalizations as to quantify interactions between ..."
Abstract

Cited by 25 (4 self)
Interactions are patterns between several attributes in data that cannot be inferred from any subset of these attributes. While mutual information is a well-established approach to evaluating the interaction between two attributes, we survey its generalizations for quantifying interactions among several attributes. We have chosen McGill’s interaction information, which has been independently rediscovered a number of times under various names in various disciplines, because of its many intuitively appealing properties. We apply interaction information to visually present the most important interactions in the data. Visualization of interactions has provided insight into the structure of data on a number of domains, identifying redundant attributes and opportunities for constructing new features, discovering unexpected regularities, and helping during the construction of predictive models; we illustrate the methods on numerous examples. A machine learning method that disregards interactions may fall into two traps: myopia, caused by learning algorithms that assume independence in spite of interactions, and fragmentation, which arises from assuming an interaction in spite of independence.
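Three-way interaction information can be written as an alternating sum of joint entropies. A sketch with the classic XOR example, where every pair of attributes looks independent yet the triple carries one full bit; note that sign conventions differ across the literature, and this follows the convention where positive values mean synergy.

```python
import math
from collections import Counter

def H(samples):
    """Shannon entropy (bits) of an empirical distribution."""
    n = len(samples)
    return -sum(c / n * math.log2(c / n) for c in Counter(samples).values())

# I(X;Y;Z) = I(X;Y|Z) - I(X;Y), expanded into joint entropies.
def interaction_info(xs, ys, zs):
    return (H(list(zip(xs, ys))) + H(list(zip(xs, zs))) + H(list(zip(ys, zs)))
            - H(xs) - H(ys) - H(zs)
            - H(list(zip(xs, ys, zs))))

# XOR: each pair of attributes is independent, yet the three together
# are fully determined -- a pure three-way interaction of 1 bit that no
# pairwise mutual information can detect.
xs = [0, 0, 1, 1]
ys = [0, 1, 0, 1]
zs = [x ^ y for x, y in zip(xs, ys)]
print(interaction_info(xs, ys, zs))  # 1.0
```

This is exactly the kind of pattern the abstract says cannot be inferred from any attribute subset.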
Efficient estimation in the bivariate normal copula model: normal margins are least favorable
 Bernoulli
, 1997
"... Consider semiparametric bivariate copula models in which the family of copula functions is parametrized by a Euclidean parameter of interest and in which the two unknown marginal distributions are the (infinite dimensional) nuisance parameters. The efficient score for can be characterized in terms ..."
Abstract

Cited by 22 (2 self)
Consider semiparametric bivariate copula models in which the family of copula functions is parametrized by a Euclidean parameter θ of interest and in which the two unknown marginal distributions are the (infinite-dimensional) nuisance parameters. The efficient score for θ can be characterized in terms of the solutions of two coupled Sturm-Liouville equations. In case the family of copula functions corresponds to the normal distributions with mean 0, variance 1, and correlation θ, the solution of these equations is given, and we thereby show that the van der Waerden normal scores rank correlation coefficient is asymptotically efficient. We also show that the bivariate normal model with equal variances constitutes the least favorable parametric submodel. Finally, we discuss the interpretation of |θ| in the normal copula model as the maximum (monotone) correlation coefficient.
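The van der Waerden statistic mentioned here is simple to compute: replace each observation by the standard normal quantile of rank/(n+1), then take the Pearson correlation of the scores. A sketch with made-up data; ties are ignored for brevity, and this illustrates only the statistic, not the paper's efficiency proof.

```python
from statistics import NormalDist

def pearson(a, b):
    """Plain Pearson correlation coefficient."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def normal_scores_correlation(xs, ys):
    """Van der Waerden normal-scores rank correlation (no ties assumed)."""
    n = len(xs)
    nd = NormalDist()
    def scores(v):
        rank = {x: r for r, x in enumerate(sorted(v), start=1)}
        return [nd.inv_cdf(rank[x] / (n + 1)) for x in v]
    return pearson(scores(xs), scores(ys))

xs = [0.2, 1.5, -0.7, 2.1, 0.9, -1.3, 0.4, 1.1]
ys = [0.5, 1.9, -0.2, 2.5, 1.2, -0.9, 0.1, 1.6]
print(normal_scores_correlation(xs, ys) > 0.9)
```

Because it depends on the data only through ranks, the statistic is invariant to the unknown marginal transformations, which is what makes it a natural candidate in the copula model.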
The uniqueness of a good optimum for K-means
 In ICML
, 2006
"... If we have found a ”good ” clustering C of a data set, can we prove that C is not far from the (unknown) best clustering Copt of these data? Perhaps surprisingly, the answer to this question is sometimes yes. When “goodness ” is measured by the distortion of Kmeans clustering, this paper proves spe ..."
Abstract

Cited by 20 (3 self)
If we have found a “good” clustering C of a data set, can we prove that C is not far from the (unknown) best clustering C_opt of these data? Perhaps surprisingly, the answer to this question is sometimes yes. When “goodness” is measured by the distortion of K-means clustering, this paper proves spectral bounds on the distance d(C, C_opt). The bounds exist when the data admit a low-distortion clustering.
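The distortion used as the "goodness" measure is just the sum of squared distances from each point to its cluster's centroid. A toy 1-D sketch of the quantity being bounded, not of the paper's spectral argument:

```python
# K-means distortion of a clustering: sum of squared distances from
# each point to the centroid of its assigned cluster.
def distortion(points, clusters):
    total = 0.0
    for cluster in clusters:
        pts = [points[i] for i in cluster]
        c = sum(pts) / len(pts)          # centroid of this cluster
        total += sum((p - c) ** 2 for p in pts)
    return total

points = [0.0, 0.1, 0.2, 9.8, 9.9, 10.0]
good = [[0, 1, 2], [3, 4, 5]]            # matches the two tight groups
bad = [[0, 1, 3], [2, 4, 5]]             # splits a group across clusters
print(distortion(points, good) < distortion(points, bad))  # True
```

When the data admit a clustering of very low distortion, as here, any clustering achieving near-optimal distortion must essentially agree with it, which is the intuition the paper makes quantitative.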
Gibbs sampling, exponential families and orthogonal polynomials
 Statistical Science
, 2008
"... Abstract. We give families of examples where sharp rates of convergence to stationarity of the widely used Gibbs sampler are available. The examples involve standard exponential families and their conjugate priors. In each case, the transition operator is explicitly diagonalizable with classical ort ..."
Abstract

Cited by 19 (6 self)
We give families of examples where sharp rates of convergence to stationarity of the widely used Gibbs sampler are available. The examples involve standard exponential families and their conjugate priors. In each case, the transition operator is explicitly diagonalizable with classical orthogonal polynomials as eigenfunctions. Key words and phrases: Gibbs sampler, running time analyses, exponential families, conjugate priors, location families, orthogonal polynomials, singular value decomposition.
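A minimal instance of the setting is the Beta-Binomial conjugate pair, which yields a two-component Gibbs sampler: draw x given θ, then θ given x. The parameter values below are illustrative, and this sketch only runs the chain; the paper's point is that such transition operators diagonalize explicitly in orthogonal polynomials, giving sharp convergence rates.

```python
import random

# Two-component Gibbs sampler for the Beta-Binomial conjugate pair:
#   x | theta ~ Binomial(m, theta)
#   theta | x ~ Beta(x + a, m - x + b)
# The x-marginal of the stationary distribution is Beta-Binomial(m, a, b).
random.seed(1)
m, a, b = 10, 2.0, 2.0
theta, xs = 0.5, []
for step in range(5000):
    x = sum(random.random() < theta for _ in range(m))   # draw x | theta
    theta = random.betavariate(x + a, m - x + b)         # draw theta | x
    xs.append(x)

mean_x = sum(xs) / len(xs)
print(abs(mean_x - m * a / (a + b)) < 0.5)  # near E[x] = m*a/(a+b) = 5
```

Empirically the chain mixes in a handful of steps here; the eigen-decomposition in the paper explains exactly how fast, rather than relying on such Monte Carlo checks.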
Regularized spectral learning
 Proceedings of the Artificial Intelligence and Statistics Workshop (AISTATS 05)
, 2005
"... Spectral clustering is a technique for finding groups in data consisting of similarities Sij between pairs of points. We approach the problem of learning the similarity as a function of other observed features, in order to optimize spectral clustering results on future data. This paper formulates a ..."
Abstract

Cited by 9 (3 self)
Spectral clustering is a technique for finding groups in data consisting of similarities S_ij between pairs of points. We approach the problem of learning the similarity as a function of other observed features, in order to optimize spectral clustering results on future data. This paper formulates a new objective for learning in spectral clustering that balances a clustering accuracy term, the gap, and a stability term, the eigengap, with the latter in the role of a regularizer. We derive an algorithm to optimize this objective, and semi-automatic methods to choose the optimal regularization. Preliminary experiments confirm the validity of the approach.
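The eigengap used as the stability term is the gap between consecutive eigenvalues of the spectral operator; for k well-separated clusters, the gap after the k-th eigenvalue is wide. A pure-Python sketch using power iteration with deflation on an illustrative 4-point similarity matrix (the paper works with the spectrum of the normalized similarity operator; a raw symmetric similarity matrix suffices to show the gap).

```python
# Top eigenvalues of a symmetric matrix via power iteration with
# deflation, in pure Python, to exhibit the eigengap.
def top_eigenvalues(W, k, iters=500):
    n = len(W)
    A = [row[:] for row in W]
    vals = []
    for _ in range(k):
        # Fixed non-degenerate start vector (must not be orthogonal to
        # the target eigenvector for this toy example).
        v = [float(i + 1) for i in range(n)]
        for _ in range(iters):
            w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = sum(x * x for x in w) ** 0.5
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue estimate.
        lam = sum(v[i] * sum(A[i][j] * v[j] for j in range(n))
                  for i in range(n))
        vals.append(lam)
        for i in range(n):           # deflate: A -= lam * v v^T
            for j in range(n):
                A[i][j] -= lam * v[i] * v[j]
    return vals

# Two tight 2-point clusters with weak cross-similarity.
W = [[1.0, 0.9, 0.1, 0.1],
     [0.9, 1.0, 0.1, 0.1],
     [0.1, 0.1, 1.0, 0.9],
     [0.1, 0.1, 0.9, 1.0]]
l1, l2, l3 = top_eigenvalues(W, 3)
print(l2 - l3 > 1.0)  # wide eigengap after the 2nd eigenvalue
```

Blurring the cluster structure (raising the off-block similarities) shrinks this gap, which is why it works as a stability regularizer when learning the similarity.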
An Efficient Rigorous Approach for Identifying Statistically Significant Frequent Itemsets
"... As advances in technology allow for the collection, storage, and analysis of vast amounts of data, the task of screening and assessing the significance of discovered patterns is becoming a major challenge in data mining applications. In this work, we address significance in the context of frequent i ..."
Abstract

Cited by 6 (0 self)
As advances in technology allow for the collection, storage, and analysis of vast amounts of data, the task of screening and assessing the significance of discovered patterns is becoming a major challenge in data mining applications. In this work, we address significance in the context of frequent itemset mining. Specifically, we develop a novel methodology to identify a meaningful support threshold s* for a dataset, such that the number of itemsets with support at least s* represents a substantial deviation from what would be expected in a random dataset with the same number of transactions and the same individual item frequencies. These itemsets can then be flagged as statistically significant with a small false discovery rate. Our methodology hinges on a Poisson approximation to the ...
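The null model is easy to sketch: under independence, a random transaction contains itemset I with probability p equal to the product of the individual item frequencies, so I's support is Binomial(n, p), well approximated by Poisson(np) when p is small. A minimal significance check for a single itemset; the frequencies and cutoff are illustrative choices, not the paper's threshold s* or its FDR machinery.

```python
import math

def poisson_tail(lam, s):
    """P(X >= s) for X ~ Poisson(lam)."""
    cdf = sum(math.exp(-lam) * lam ** k / math.factorial(k)
              for k in range(s))
    return 1.0 - cdf

n = 10000                        # transactions
freqs = [0.05, 0.04, 0.03]       # individual item frequencies
p = math.prod(freqs)             # null probability the itemset occurs
lam = n * p                      # expected support under independence
observed_support = 12
pvalue = poisson_tail(lam, observed_support)
print(pvalue < 0.001)            # far above the null expectation of 0.6
```

The paper's harder problem is doing this jointly over exponentially many candidate itemsets while controlling the false discovery rate, which is where the choice of a single global threshold s* comes in.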
Background subtraction on distributions
 In ECCV
, 2008
"... Abstract. Environmental monitoring applications present a challenge to current background subtraction algorithms that analyze the temporal variability of pixel intensities, due to the complex texture and motion of the scene. They also present a challenge to segmentation algorithms that compare inten ..."
Abstract

Cited by 6 (2 self)
Environmental monitoring applications present a challenge to current background subtraction algorithms that analyze the temporal variability of pixel intensities, due to the complex texture and motion of the scene. They also present a challenge to segmentation algorithms that compare intensity or color distributions between the foreground and the background in each image independently, because objects of interest such as animals have adapted to blend in. We have therefore developed a background modeling and subtraction scheme that analyzes the temporal variation of intensity or color distributions, instead of looking at either the temporal variation of point statistics or the spatial variation of region statistics in isolation. Distributional signatures are less sensitive to movements of the textured background, and at the same time they are more robust than individual pixel statistics in detecting foreground objects. They also enable slow background update, which is crucial in monitoring applications where processing power comes at a premium, and where foreground objects, when present, may move less than the background and therefore disappear into it when a fast update scheme is used. Our approach compares favorably with the state of the art both in generic low-level detection metrics and in application-dependent criteria.
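The idea can be sketched per region: maintain a slowly updated intensity histogram as the background model and flag a region as foreground when the current frame's histogram diverges from it. The Bhattacharyya distance, bin count, update rate, and threshold below are illustrative choices, not necessarily the paper's.

```python
import math

BINS, ALPHA, THRESH = 8, 0.05, 0.3   # illustrative parameters

def histogram(pixels):
    """Normalized intensity histogram over 8-bit pixel values."""
    h = [0.0] * BINS
    for p in pixels:
        h[min(p * BINS // 256, BINS - 1)] += 1
    total = sum(h)
    return [c / total for c in h]

def bhattacharyya_distance(h1, h2):
    """One common distance between two normalized histograms."""
    bc = sum(math.sqrt(a * b) for a, b in zip(h1, h2))
    return math.sqrt(max(0.0, 1.0 - bc))

# Textured background region: pixels move around but keep the same
# intensity distribution, so the distributional signature is stable.
background = histogram([100 + (i % 20) for i in range(400)])
frame_bg = histogram([100 + ((i * 7) % 20) for i in range(400)])
frame_fg = histogram([220 + (i % 10) for i in range(400)])  # bright object

print(bhattacharyya_distance(background, frame_bg) < THRESH)  # background
print(bhattacharyya_distance(background, frame_fg) > THRESH)  # foreground

# Slow background update: blend the new histogram in with weight ALPHA,
# so a briefly static foreground object does not get absorbed.
background = [(1 - ALPHA) * b + ALPHA * f
              for b, f in zip(background, frame_bg)]
```

Note how the shuffled background pixels (frame_bg) yield the same histogram as the model even though every individual pixel changed, which is the robustness to textured motion the abstract describes.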