## Convex Group Clustering of Large Geo-referenced Data Sets (1999)

Venue: | In Abstracts for the Eleventh Canadian Conference on Computational Geometry (CCCG'99) |

Citations: | 1 - 0 self |

@INPROCEEDINGS{Estivill-Castro99convexgroup,

author = {Vladimir Estivill-Castro},

title = {Convex Group Clustering of Large Geo-referenced Data Sets},

booktitle = {In Abstracts for the Eleventh Canadian Conference on Computational Geometry (CCCG'99)},

year = {1999},

pages = {http://www.cs.ubc.ca}

}

Clustering partitions a data set $S = \{s_1, \ldots, s_n\} \subset \Re^m$ into groups of nearby points. Distance-based clustering uses optimisation criteria for defining the quality of the partition. Formulations using representatives (means or medians of groups) have received much more attention than minimisation of the total within group distance (TWGD). However, this non-representative approach has attractive properties while remaining distance-based. While representative approaches produce partitions with non-overlapping clusters, TWGD does not. We investigate the restriction of TWGD to producing convex-hull disjoint groups and show that this problem is NP-complete in the Euclidean case as soon as $m \geq 2$. Nevertheless, we provide efficient algorithms for solving it approximately.

Keywords: clustering, optimisation, computational geometry, problem complexity, data mining in spatial databases.

1 Introduction

Clustering is a fundamental task in data analysis since it identifies groups in heterog...
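To make the TWGD criterion named in the abstract concrete, the following minimal Python sketch evaluates it for a given partition. It assumes TWGD is the sum of Euclidean distances over all unordered pairs of points that share a group; the abstract does not spell out the exact definition, so this reading, the function name `total_within_group_distance`, and the sample points are illustrative assumptions and not the author's algorithm.

```python
from itertools import combinations
from math import dist  # Euclidean distance, available in Python 3.8+


def total_within_group_distance(groups):
    """Sum Euclidean distances over all unordered pairs of points in the same group.

    Assumed reading of TWGD; `groups` is a list of groups, each a list of
    coordinate tuples of equal dimension m.
    """
    return sum(
        dist(p, q)
        for group in groups
        for p, q in combinations(group, 2)
    )


# Hypothetical partition of five points in the plane (m = 2) into two groups.
groups = [
    [(0.0, 0.0), (3.0, 4.0)],
    [(10.0, 10.0), (10.0, 11.0), (10.0, 12.0)],
]
twgd = total_within_group_distance(groups)
```

Under this reading no group representative (mean or median) is ever computed: the objective depends only on pairwise distances inside each group, which is the sense in which the approach is non-representative.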

