Cluster Stability and the Use of Noise in Interpretation of Clustering (2001)

by George S. Davidson , Brian N. Wylie, et al.
Citations:18 - 8 self

BibTeX

@MISC{Davidson01clusterstability,
    author = {George S. Davidson and Brian N. Wylie and others},
    title = {Cluster Stability and the Use of Noise in Interpretation of Clustering},
    year = {2001}
}

Abstract

A clustering and ordination algorithm suitable for mining extremely large databases, including those produced by microarray expression studies, is described and analyzed for stability. Data from a yeast cell cycle experiment with 6000 genes and 18 experimental measurements per gene are used to test the algorithm under practical conditions. Ordination, the process of assigning each database object an (X, Y) coordinate, is shown to be stable with respect to random starting conditions and to minor perturbations in the starting similarity estimates. Careful analysis of how clusters typically co-locate, contrasted with their occasional large displacements under different starting conditions, is shown to be useful in interpreting the data. This extra stability information is lost when only a single cluster is reported, which is currently the accepted practice. However, it is believed that the approaches presented here should become a standard part of best practice in analyzing computer clustering of large data collections.
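
The two stability checks named in the abstract (random restarts and small perturbations of the similarity estimates) can be illustrated with a minimal sketch. The abstract does not specify the authors' ordination algorithm, so the code below substitutes a toy stress-minimizing layout; it is not the method from the paper. The data are synthetic (three phase-shifted periodic profiles with 18 measurements each, not the yeast data set), and every function name, parameter, and the neighbor-overlap score are assumptions made for illustration.

```python
import numpy as np


def synthetic_similarity(n_per_group=15, n_measurements=18, noise=0.2, seed=0):
    """Correlation matrix for synthetic profiles in three phase-shifted groups."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_measurements)
    phases = [0.0, 2 * np.pi / 3, 4 * np.pi / 3]
    profiles = [np.sin(2 * np.pi * 2 * t / n_measurements + p) for p in phases]
    data = np.vstack([
        profile + rng.normal(scale=noise, size=(n_per_group, n_measurements))
        for profile in profiles
    ])
    return np.corrcoef(data)


def perturb(sim, scale, seed):
    """Add small symmetric Gaussian noise to the similarity estimates."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(scale=scale, size=sim.shape)
    noisy = sim + (noise + noise.T) / 2
    np.fill_diagonal(noisy, 1.0)
    return np.clip(noisy, -1.0, 1.0)


def layout(sim, seed, steps=500, lr=0.2):
    """Toy 2-D ordination (an assumption, not the paper's algorithm):
    gradient descent pulling layout distances toward 1 - similarity."""
    rng = np.random.default_rng(seed)
    target = 1.0 - sim
    pos = rng.normal(size=(sim.shape[0], 2))  # random starting condition
    for _ in range(steps):
        diff = pos[:, None, :] - pos[None, :, :]
        dist = np.linalg.norm(diff, axis=2)
        np.fill_diagonal(dist, 1.0)           # diagonal terms contribute zero
        dist = np.maximum(dist, 1e-9)         # guard against division by zero
        pull = ((dist - target) / dist)[:, :, None] * diff
        pos -= lr * pull.mean(axis=1)
    return pos


def knn(pos, k):
    """Indices of the k nearest neighbors of every point in a layout."""
    d = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)
    return np.argsort(d, axis=1)[:, :k]


def neighbor_overlap(pos_a, pos_b, k=10):
    """Per-object fraction of k nearest neighbors shared by two layouts.
    Neighbor sets are used because layouts may differ by rotation or shift."""
    na, nb = knn(pos_a, k), knn(pos_b, k)
    return np.array([len(set(a) & set(b)) / k for a, b in zip(na, nb)])


if __name__ == "__main__":
    sim = synthetic_similarity()
    base = layout(sim, seed=1)
    restart = layout(sim, seed=2)                              # new random start
    noisy = layout(perturb(sim, scale=0.05, seed=3), seed=1)   # perturbed input
    print("mean overlap, random restart:", neighbor_overlap(base, restart).mean())
    print("mean overlap, perturbed similarities:", neighbor_overlap(base, noisy).mean())
    print("least stable objects:", np.argsort(neighbor_overlap(base, restart))[:5])
```

Objects whose neighbor sets change sharply between runs play the role of the "occasional large displacements" mentioned in the abstract; a single run would not reveal them.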
