The history of the cluster heat map
 The American Statistician
, 2009
Abstract

Cited by 16 (0 self)
The cluster heat map is an ingenious display that simultaneously reveals row and column hierarchical cluster structure in a data matrix. It consists of a rectangular tiling with each tile shaded on a color scale to represent the value of the corresponding element of the data matrix. The rows (columns) of the tiling are ordered such that similar rows (columns) are near each other. On the vertical and horizontal margins of the tiling there are hierarchical cluster trees. This cluster heat map is a synthesis of several different graphic displays developed by statisticians over more than a century. We locate the earliest sources of this display in late 19th century publications. And we trace a diverse 20th century statistical literature that provided a foundation for this most widely used of all bioinformatics displays.
Dissimilarity Plots: A Visual Exploration Tool for Partitional Clustering
, 2009
Abstract

Cited by 1 (1 self)
For hierarchical clustering, dendrograms provide convenient and powerful visualization. Although many visualization methods have been suggested for partitional clustering, their usefulness deteriorates quickly with increasing dimensionality of the data and/or they fail to represent structure between and within clusters simultaneously. In this paper we extend (dissimilarity) matrix shading with several reordering steps based on seriation. Both methods, matrix shading and seriation, have been well known for a long time. However, only recent algorithmic improvements allow seriation to be used for larger problems. Furthermore, seriation is used in a novel stepwise process (within each cluster and between clusters) which leads to a visualization technique that is independent of the dimensionality of the data. A big advantage is that it presents the structure between clusters and the microstructure within clusters in one concise plot. This not only allows for judging cluster quality but also makes misspecification of the number of clusters apparent. We give a detailed discussion of the construction of dissimilarity plots and demonstrate their usefulness with several examples.
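The stepwise reordering described in this abstract (between clusters, then within each cluster) can be illustrated with a rough sketch. The code below is not the paper's seriation procedure: it substitutes a simple greedy heuristic for both steps, and the function names `avg` and `dissimilarity_plot_order` are hypothetical. It takes a symmetric dissimilarity matrix and cluster labels, and returns an object order that could be used to rearrange the matrix before shading.

```python
def avg(values):
    """Mean of an iterable, or 0.0 if it is empty."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0


def dissimilarity_plot_order(d, labels):
    """Return an object order for a cluster-wise reordered dissimilarity matrix.

    Assumption: greedy heuristics stand in for a real seriation algorithm.
    d      -- symmetric dissimilarity matrix as a list of lists
    labels -- cluster label of each object, in matrix order
    """
    # Group object indices by cluster label.
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    def between(a, b):
        # Average dissimilarity between the members of clusters a and b.
        return avg(d[i][j] for i in clusters[a] for j in clusters[b])

    # Between-cluster step: build a chain by repeatedly appending the
    # remaining cluster that is closest to the last one placed.
    remaining = sorted(clusters)
    chain = [remaining.pop(0)]
    while remaining:
        nearest = min(remaining, key=lambda c: between(chain[-1], c))
        remaining.remove(nearest)
        chain.append(nearest)

    # Within-cluster step: place the most central members first, measured
    # by average dissimilarity to the other members of the same cluster.
    order = []
    for c in chain:
        members = clusters[c]
        order.extend(
            sorted(members, key=lambda i: avg(d[i][j] for j in members if j != i))
        )
    return order


# Five points on a line; cluster "c" lies between clusters "a" and "b".
points = [0, 1, 20, 21, 10]
labels = ["a", "a", "b", "b", "c"]
d = [[abs(p - q) for q in points] for p in points]
order = dissimilarity_plot_order(d, labels)
reordered = [[d[i][j] for j in order] for i in order]
```

In the reordered matrix, objects of the same cluster occupy contiguous blocks along the diagonal, and neighboring blocks belong to clusters that are close to each other, which is the layout that matrix shading then makes visible.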
Seriation in the Presence of Errors: A Factor 16 Approximation Algorithm for l∞-Fitting Robinson Structures to Distances
 ALGORITHMICA
, 2007
Abstract
The classical seriation problem consists in finding a permutation of the rows and the columns of the distance (or, more generally, dissimilarity) matrix d on a finite set X so that small values should be concentrated around the main diagonal as close as possible, whereas large values should fall as far from it as possible. This goal is best achieved by considering the Robinson property: a distance dR on X is Robinsonian if its matrix can be symmetrically permuted so that its elements do not decrease when moving away from the main diagonal along any row or column. If the distance d fails to satisfy the Robinson property, then we are led to the problem of finding a reordering of d which is as close as possible to a Robinsonian distance. In this paper, we present a factor 16 approximation algorithm for the following NP-hard fitting problem: given a finite set X and a dissimilarity d on X, we wish to find a Robinsonian dissimilarity dR on X minimizing the l∞-error ‖d − dR‖∞ = max_{x,y∈X} |d(x,y) − dR(x,y)| between d and dR.
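To make the definitions in this abstract concrete, here is a minimal sketch of the Robinson property and the l∞-error. It is not the paper's approximation algorithm: the Robinsonian test is a brute-force search over all permutations, usable only for very small matrices, and the function names are hypothetical.

```python
from itertools import permutations


def is_robinson(m):
    """True if, in the given order, entries never decrease when moving
    away from the main diagonal along any row. For a symmetric matrix
    the column condition follows from the row condition."""
    n = len(m)
    for i in range(n):
        # Moving right from the diagonal.
        for j in range(i, n - 1):
            if m[i][j] > m[i][j + 1]:
                return False
        # Moving left from the diagonal.
        for j in range(i, 0, -1):
            if m[i][j] > m[i][j - 1]:
                return False
    return True


def permute(m, order):
    """Apply the same permutation to the rows and columns of m."""
    return [[m[a][b] for b in order] for a in order]


def is_robinsonian(m):
    """Brute force: does some symmetric permutation make m Robinson?
    Exponential in the matrix size, so only for tiny examples."""
    return any(is_robinson(permute(m, p)) for p in permutations(range(len(m))))


def linf_error(d, d_r):
    """Largest absolute entrywise difference between d and d_r."""
    n = len(d)
    return max(abs(d[i][j] - d_r[i][j]) for i in range(n) for j in range(n))


# Points at 0, 1, 3 on a line give a Robinson matrix in their natural order.
line = [[0, 1, 3], [1, 0, 2], [3, 2, 0]]
scrambled = permute(line, [1, 0, 2])

# Four points on a cycle: neighbors at distance 1, opposite points at 2.
cycle = [[0, 1, 2, 1], [1, 0, 1, 2], [2, 1, 0, 1], [1, 2, 1, 0]]
```

Here `line` satisfies the property as given, `scrambled` does not but is still Robinsonian because a permutation restores the order, and `cycle` is not Robinsonian under any ordering. For dissimilarities like `cycle`, the fitting problem asks for the Robinsonian dR with the smallest `linf_error(d, d_r)`.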
History Corner The History of the Cluster Heat Map
Abstract
The cluster heat map is an ingenious display that simultaneously reveals row and column hierarchical cluster structure in a data matrix. It consists of a rectangular tiling, with each tile shaded on a color scale to represent the value of the corresponding element of the data matrix. The rows (columns) of the tiling are ordered such that similar rows (columns) are near each other. On the vertical and horizontal margins of the tiling are hierarchical cluster trees. This cluster heat map is a synthesis of several different graphic displays developed by statisticians over more than a century. We locate the earliest sources of this display in late 19th century publications, and trace a diverse 20th century statistical literature that provided a foundation for this most widely used of all bioinformatics displays. KEY WORDS: Cluster analysis; Heatmap; Microarray; Visualization.