## Parameter-free spatial data mining using MDL (2005)

### Download Links

- [www.cs.cmu.edu]
- [www.bitquill.net]
- [www.cin.ufpe.br]
- [www.cis.temple.edu]
- [knight.cis.temple.edu]

### Other Repositories/Bibliography

- DBLP

Venue: | In 5th International Conference on Data Mining (ICDM) |

Citations: | 7 - 1 self |

### BibTeX

@INPROCEEDINGS{Papadimitriou05parameter-freespatial,
  author = {Spiros Papadimitriou and Aristides Gionis and Panayiotis Tsaparas and Risto A. Väisänen and Heikki Mannila and Christos Faloutsos},
  title = {Parameter-free spatial data mining using MDL},
  booktitle = {In 5th International Conference on Data Mining (ICDM)},
  year = {2005}
}

### Abstract

Consider spatial data consisting of a set of binary features taking values over a collection of spatial extents (grid cells). We propose a method that simultaneously finds spatial correlation and feature co-occurrence patterns, without any parameters. In particular, we employ the Minimum Description Length (MDL) principle coupled with a natural way of compressing regions. This defines what “good” means: a feature co-occurrence pattern is good if it helps us better compress the set of locations for these features. Conversely, a spatial correlation is good if it helps us better compress the set of features in the corresponding region. Our approach scales to large datasets, in both the number of locations and the number of features. We evaluate our method on both real and synthetic datasets.
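To make the compression criterion in the abstract concrete, here is a minimal sketch in Python. It is not the paper's algorithm: it ignores the spatial (quadtree-based) region encoding and the cost of describing the grouping itself. It only uses the `n * H(h/n)` data cost quoted in the citation contexts below, plus an assumed `log2(n + 1)` bits for transmitting the count of ones. The function names `entropy_bits`, `block_cost`, and `grouping_cost` are hypothetical.

```python
import math


def entropy_bits(p):
    """Shannon entropy H(p) of a Bernoulli(p) source, in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def block_cost(bits):
    """Approximate bits needed to encode a list of 0/1 values.

    Data cost: n * H(ones / n).
    Model cost (assumption): log2(n + 1) bits to state the count of ones.
    """
    n = len(bits)
    if n == 0:
        return 0.0
    ones = sum(bits)
    return n * entropy_bits(ones / n) + math.log2(n + 1)


def grouping_cost(rows, groups):
    """Total cost when each group of cells (rows) is encoded as one block."""
    total = 0.0
    for group in groups:
        flat = [b for i in group for b in rows[i]]
        total += block_cost(flat)
    return total


# Four grid cells (rows) by four binary features (columns).
rows = [
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

one_group = grouping_cost(rows, [[0, 1, 2, 3]])
two_groups = grouping_cost(rows, [[0, 1], [2, 3]])
print(f"one group:  {one_group:.2f} bits")
print(f"two groups: {two_groups:.2f} bits")
```

In this toy case, grouping the two all-ones cells apart from the two all-zeros cells yields homogeneous blocks that cost fewer bits in total, so under this simplified cost the two-group description is preferred. This is the sense in which a grouping is "good" when it compresses better.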

### Citations

1881 | Data Mining: Concepts and Techniques
- Han, Kamber
- 2001
Citation Context ...3] and G-means [10]). Besides these, there are many other recent clustering algorithms that use an altogether different approach, e.g., CURE [9], BIRCH [31], Chameleon [16] and DENCLUE [14] (see also [11]). The LIMBO algorithm [2] uses a related, information theoretic approach for clustering categorical data. The problem of finding spatially coherent groupings is related to image segmentation; see, e.... |

1145 | Mining Frequent Patterns without Candidate Generation
- Han, Pei, Yin
- 2000
Citation Context ... for classic data mining tasks (i.e., clustering, anomaly detection, classification) based on standard compression tools. Frequent itemset mining brought a revolution [1] with a lot of follow-up work [11, 12]. These techniques have also been extended for mining spatial collocation patterns [20, 27, 32, 15]. However, all these approaches require the user to specify a support and/or other parameters (e.g., ... |

1097 | On spectral clustering: Analysis and an algorithm
- Ng, Jordan, et al.
- 2001
Citation Context ...l deal only with spatial correlations and cannot be directly used for simultaneously discovering feature co-occurrences. Prevailing graph partitioning methods are METIS [17] and spectral partitioning [22]. Related is also the work on conjunctive clustering [21] and community detection [25]. However, these techniques also require some user-specified parameters and, more importantly, do not deal with sp... |

863 | The Elements of
- Hastie, Tibshirani, et al.
- 2009
Citation Context ....2 Algorithms Finding a global optimum of the total codelength is computationally very expensive. Therefore, we take the usual course of employing a greedy local search (as in, e.g., standard k-means [13] or in [4]). At each step we make a local move that always reduces the objective function L(D). The search for cell and feature groups is done in two levels: • INNER level (Figure 4): We assume that t... |

769 | Fast algorithms for mining association rules in large databases
- Agrawal, Srikant
- 1994
Citation Context ...8] propose parameter-free methods for classic data mining tasks (i.e., clustering, anomaly detection, classification) based on standard compression tools. Frequent itemset mining brought a revolution [1] with a lot of follow-up work [11, 12]. These techniques have also been extended for mining spatial collocation patterns [20, 27, 32, 15]. However, all these approaches require the user to specify a s... |

566 | CURE: an efficient clustering algorithm for large databases
- Guha, Rastogi, et al.
- 1998
Citation Context ... or determining k based on some criterion (e.g., X-means [23] and G-means [10]). Besides these, there are many other recent clustering algorithms that use an altogether different approach, e.g., CURE [9], BIRCH [31], Chameleon [16] and DENCLUE [14] (see also [11]). The LIMBO algorithm [2] uses a related, information theoretic approach for clustering categorical data. The problem of finding spatially ... |

436 | BIRCH: an efficient data clustering method for very large databases
- Zhang, Ramakrishnan, et al.
- 1996
Citation Context ...ning k based on some criterion (e.g., X-means [23] and G-means [10]). Besides these, there are many other recent clustering algorithms that use an altogether different approach, e.g., CURE [9], BIRCH [31], Chameleon [16] and DENCLUE [14] (see also [11]). The LIMBO algorithm [2] uses a related, information theoretic approach for clustering categorical data. The problem of finding spatially coherent gro... |

304 | Concept decomposition for large sparse text data using clustering
- Dhillon, Modha
- 2001
Citation Context ...imilarity. The most popular approach is k-means (see, e.g., [13]). There are several interesting variants, which aim at improving clustering quality (e.g., k-harmonic means [30] and spherical k-means [7]) or determining k based on some criterion (e.g., X-means [23] and G-means [10]). Besides these, there are many other recent clustering algorithms that use an altogether different approach, e.g., CURE... |

271 | X-means: Extending K-means with Efficient Estimation of the Number of Clusters
- Pelleg, Moore
- 2000
Citation Context ...13]). There are several interesting variants, which aim at improving clustering quality (e.g., k-harmonic means [30] and spherical k-means [7]) or determining k based on some criterion (e.g., X-means [23] and G-means [10]). Besides these, there are many other recent clustering algorithms that use an altogether different approach, e.g., CURE [9], BIRCH [31], Chameleon [16] and DENCLUE [14] (see also [1... |

250 | Information-theoretic coclustering
- Dhillon, Mallela, et al.
Citation Context ...ering [21] and community detection [25]. However, these techniques also require some user-specified parameters and, more importantly, do not deal with spatial data. Information theoretic coclustering [6] is related, but focuses on lossy compression of contingency tables, with distortion implicitly specified by providing the number of row and column clusters. In contrast, we employ MDL and a lossless ... |

207 | An efficient approach to clustering large multimedia databases with noise
- Hinneburg, Keim
Citation Context ....g., X-means [23] and G-means [10]). Besides these, there are many other recent clustering algorithms that use an altogether different approach, e.g., CURE [9], BIRCH [31], Chameleon [16] and DENCLUE [14] (see also [11]). The LIMBO algorithm [2] uses a related, information theoretic approach for clustering categorical data. The problem of finding spatially coherent groupings is related to image segmen... |

183 | A probabilistic framework for semi-supervised clustering
- Basu, Bilenko, et al.
Citation Context ... data. The problem of finding spatially coherent groupings is related to image segmentation; see, e.g., [29]. Other more general models and techniques that could be adapted to this problem are, e.g., [3, 19, 24]. However, all deal only with spatial correlations and cannot be directly used for simultaneously discovering feature co-occurrences. Prevailing graph partitioning methods are METIS [17] and spectral ... |

158 | Approximation algorithms for classification problems with pairwise relationships: Metric labeling and markov random fields
- Kleinberg, Tardos
- 1999
Citation Context ... data. The problem of finding spatially coherent groupings is related to image segmentation; see, e.g., [29]. Other more general models and techniques that could be adapted to this problem are, e.g., [3, 19, 24]. However, all deal only with spatial correlations and cannot be directly used for simultaneously discovering feature co-occurrences. Prevailing graph partitioning methods are METIS [17] and spectral ... |

149 | Multilevel algorithms for multi-constraint graph partitioning
- Karypis, Kumar
- 1998
Citation Context ... e.g., [3, 19, 24]. However, all deal only with spatial correlations and cannot be directly used for simultaneously discovering feature co-occurrences. Prevailing graph partitioning methods are METIS [17] and spectral partitioning [22]. Related is also the work on conjunctive clustering [21] and community detection [25]. However, these techniques also require some user-specified parameters and, more i... |

141 | Chameleon: Hierarchical clustering using dynamic modeling
- Karypis, Han, et al.
- 1999
Citation Context ...some criterion (e.g., X-means [23] and G-means [10]). Besides these, there are many other recent clustering algorithms that use an altogether different approach, e.g., CURE [9], BIRCH [31], Chameleon [16] and DENCLUE [14] (see also [11]). The LIMBO algorithm [2] uses a related, information theoretic approach for clustering categorical data. The problem of finding spatially coherent groupings is relate... |

133 | Some Generalized Order-Disorder Transformations
- Potts
- 1952
Citation Context ... data. The problem of finding spatially coherent groupings is related to image segmentation; see, e.g., [29]. Other more general models and techniques that could be adapted to this problem are, e.g., [3, 19, 24]. However, all deal only with spatial correlations and cannot be directly used for simultaneously discovering feature co-occurrences. Prevailing graph partitioning methods are METIS [17] and spectral ... |

118 | Towards parameter-free data mining
- Keogh, Lonardi, et al.
- 2004
Citation Context ...for binary matrices which also incorporates spatial information. The more recent work on cross-associations [4] is also parameter-free, but it cannot handle spatial information. Finally, Keogh et al. [18] propose parameter-free methods for classic data mining tasks (i.e., clustering, anomaly detection, classification) based on standard compression tools. Frequent itemset mining brought a revolution [1... |

85 | Learning the K in K-means
- Hamerly, Elkan
- 2003
Citation Context ...everal interesting variants, which aim at improving clustering quality (e.g., k-harmonic means [30] and spherical k-means [7]) or determining k based on some criterion (e.g., X-means [23] and G-means [10]). Besides these, there are many other recent clustering algorithms that use an altogether different approach, e.g., CURE [9], BIRCH [31], Chameleon [16] and DENCLUE [14] (see also [11]). The LIMBO al... |

65 | Fully automatic cross-associations
- Chakrabarti, Papadimitriou, et al.
- 2004
Citation Context ...ms Finding a global optimum of the total codelength is computationally very expensive. Therefore, we take the usual course of employing a greedy local search (as in, e.g., standard k-means [13] or in [4]). At each step we make a local move that always reduces the objective function L(D). The search for cell and feature groups is done in two levels: • INNER level (Figure 4): We assume that the number ... |

58 | A tutorial introduction to the minimum description length principle
- Grunwald
- 2004
Citation Context ...urposes. 2.1 Minimum description length (MDL) In this section we give a brief overview of a practical formulation of the minimum description length (MDL) principle. For further information see, e.g., [5, 8]. Intuitively, the main idea behind MDL is the following: Let us assume that we have a family M of models with varying degrees of complexity. More complex models M ∈ M involve more parameters but, giv... |

23 | Mining confident co-location rules without a support threshold
- Huang, Xiong, et al.
- 2003
Citation Context ...on standard compression tools. Frequent itemset mining brought a revolution [1] with a lot of follow-up work [11, 12]. These techniques have also been extended for mining spatial collocation patterns [20, 27, 32, 15]. However, all these approaches require the user to specify a support and/or other parameters (e.g., significance, confidence, etc). 7 Conclusion We propose a method to automatically discover spatial ... |

23 | Fast mining of spatial collocations
- Zhang, Mamoulis, et al.
- 2004
Citation Context ...on standard compression tools. Frequent itemset mining brought a revolution [1] with a lot of follow-up work [11, 12]. These techniques have also been extended for mining spatial collocation patterns [20, 27, 32, 15]. However, all these approaches require the user to specify a support and/or other parameters (e.g., significance, confidence, etc). 7 Conclusion We propose a method to automatically discover spatial ... |

21 | Scalable Clustering of Categorical Data
- Andritsos, Tsaparas, et al.
- 2004
Citation Context ...es these, there are many other recent clustering algorithms that use an altogether different approach, e.g., CURE [9], BIRCH [31], Chameleon [16] and DENCLUE [14] (see also [11]). The LIMBO algorithm [2] uses a related, information theoretic approach for clustering categorical data. The problem of finding spatially coherent groupings is related to image segmentation; see, e.g., [29]. Other more gener... |

16 | Spatially Coherent Clustering with Graph Cuts
- Zabih, Kolmogorov
- 2004
Citation Context ...lustrate the intuition) and real. We implemented our algorithms in Matlab 6.5. In order to evaluate the spatial coherence of the cell groups, we plot the spatial extents of each group (e.g., see also [29]). In each case we compare against non-spatial bi-grouping (as presented in Section 3.2). This non-spatial approach produces cell groups of quality similar to or better than, e.g., straight k-means (w... |

13 | Variable block-size image coding, in
- Vaisey, Gersho
- 1992
Citation Context ...hat can be used to efficiently index contiguous regions of variable size in a grid. It has been used successfully in image coding and has the benefit of small overhead and very efficient construction [28]. Figure 1 shows a simple example. Each internal node in a quadtree corresponds to a partitioning of a rectangular region into four quadrants. The leaf nodes of a quadtree represent rectangular groups... |

10 | On finding large conjunctive clusters
- Mishra, Ron, et al.
- 2003
Citation Context ...tly used for simultaneously discovering feature co-occurrences. Prevailing graph partitioning methods are METIS [17] and spectral partitioning [22]. Related is also the work on conjunctive clustering [21] and community detection [25]. However, these techniques also require some user-specified parameters and, more importantly, do not deal with spatial data. Information theoretic coclustering [6] is rel... |

9 | An approach to relate the web communities through bipartite graphs
- Reddy, Kitsuregawa
- 2001
Citation Context ...iscovering feature co-occurrences. Prevailing graph partitioning methods are METIS [17] and spectral partitioning [22]. Related is also the work on conjunctive clustering [21] and community detection [25]. However, these techniques also require some user-specified parameters and, more importantly, do not deal with spatial data. Information theoretic coclustering [6] is related, but focuses on lossy co... |

9 | Arithmetic coding
- Rissanen, Langdon
- 1979
Citation Context ...,d(2),...,d(n)] of n coin tosses. A simple model $M^{(1)}$ might consist of specifying the number $h$ of heads. Given this model $M^{(1)} \equiv \{h/n\}$, we can encode the dataset $D$ using $L(D \mid M^{(1)}) := nH(h/n)$ bits [26], where $H(\cdot)$ is the Shannon entropy function. However, in order to be fair, we should also include the number $L(M^{(1)})$ of bits to transmit the fraction $h/n$, which can be done using $\log^\star n$ bits for t... |

5 | K-harmonic means - a spatial clustering algorithm with boosting
- Zhang, Hsu, et al.
- 2000
Citation Context ...ome notion of distance or similarity. The most popular approach is k-means (see, e.g., [13]). There are several interesting variants, which aim at improving clustering quality (e.g., k-harmonic means [30] and spherical k-means [7]) or determining k based on some criterion (e.g., X-means [23] and G-means [10]). Besides these, there are many other recent clustering algorithms that use an altogether diff... |

3 | Rule discovery and probabilistic modeling for onomastic data
- Leino, Mannila, et al.
- 2003
Citation Context ...on standard compression tools. Frequent itemset mining brought a revolution [1] with a lot of follow-up work [11, 12]. These techniques have also been extended for mining spatial collocation patterns [20, 27, 32, 15]. However, all these approaches require the user to specify a support and/or other parameters (e.g., significance, confidence, etc). 7 Conclusion We propose a method to automatically discover spatial ... |

2 | Evaluating attraction in spatial point patterns with an application in the field of cultural history
- Salmenkivi
- 2004