Model-Based Clustering and Data Transformations for Gene Expression Data (2001)

by Adrian E. Raftery, Ka Yee Yeung, Chris Fraley, Alejandro Murua, Walter L. Ruzzo
Bioinformatics

Abstract:

Clustering is a useful exploratory technique for the analysis of gene expression data. Many different heuristic clustering algorithms have been proposed in this context. Clustering algorithms based on probability models offer a principled alternative to heuristic algorithms. In particular, model-based clustering assumes that the data is generated by a finite mixture of underlying probability distributions such as multivariate normal distributions. This Gaussian mixture model has been shown to be a powerful tool for many applications. In addition, the issues of selecting a "good" clustering method and determining the "correct" number of clusters are reduced to model selection problems in the probability framework.
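
As a rough illustration of the idea in the abstract, the sketch below fits finite Gaussian mixtures by EM and picks the number of clusters by comparing models with BIC (Schwarz, 1978). It is a minimal sketch under stated assumptions, not the authors' implementation: it handles one-dimensional data with unequal variances, whereas the abstract refers to multivariate normal distributions; it uses the sign convention BIC = 2 log L - p log n, so larger is better; and the names fit_mixture, bic, and select_k are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def mixture_pdf(x, weights, means, variances):
    """Density of the mixture at x (sum of weighted component densities)."""
    return sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances))


def fit_mixture(data, k, n_iter=200, min_var=1e-6):
    """Fit a k-component 1-D Gaussian mixture by EM.

    Returns ((weights, means, variances), log_likelihood).
    """
    ordered = sorted(data)
    n = len(ordered)
    # Deterministic start (an assumption of this sketch): means at evenly
    # spaced quantiles, equal weights, and the overall variance for each component.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    overall_mean = sum(data) / n
    overall_var = max(sum((x - overall_mean) ** 2 for x in data) / n, min_var)
    variances = [overall_var] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: posterior probability that each point belongs to each component.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300  # guard against numerical underflow
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means, and variances from the posteriors.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, min_var)  # floor keeps components from collapsing
    loglik = sum(math.log(mixture_pdf(x, weights, means, variances) or 1e-300) for x in data)
    return (weights, means, variances), loglik


def bic(loglik, k, n):
    """BIC = 2 log L - p log n, with p = 3k - 1 free parameters (larger is better)."""
    n_params = (k - 1) + k + k  # mixing weights, means, variances
    return 2.0 * loglik - n_params * math.log(n)


def select_k(data, max_k):
    """Return the number of components in 1..max_k with the largest BIC."""
    scores = {k: bic(fit_mixture(data, k)[1], k, len(data)) for k in range(1, max_k + 1)}
    return max(scores, key=scores.get)


if __name__ == "__main__":
    # Synthetic data (not from the paper): two well-separated groups.
    rng = random.Random(0)
    data = [rng.gauss(0.0, 1.0) for _ in range(200)] + [rng.gauss(8.0, 1.0) for _ in range(200)]
    print("number of clusters chosen by BIC:", select_k(data, 4))
```

Framing the choice of cluster count as a comparison of BIC scores is what turns it into a model selection problem; the same comparison could in principle also range over different covariance structures, which this one-dimensional sketch does not attempt.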

Citations

971 Estimating the dimension of a model – Schwarz - 1978
614 Human behavior and the principle of least-effort – Zipf - 1949
506 Bayes factors – Kass, Raftery - 1995
320 Mixture models: inference and applications to clustering – McLachlan, Basford - 1988
293 Interpreting Patterns of Gene Expression with Self-Organizing Maps: Methods and Application to Hematopoietic Differentiation – Tamayo, Slonim, et al. - 1999
266 Systematic determination of genetic network architecture. Nature Genet. – Tavazoie, Hughes, et al. - 1999
177 Comparing partitions – Hubert, Arabie - 1985
142 Objective criteria for the evaluation of clustering methods – Rand - 1971
105 Estimating the number of clusters in a dataset via the Gap statistic – Tibshirani, Walther, et al. - 2000
57 Validating clustering for gene expression data – Yeung, Haynor - 2001
36 Measures of multivariate skewness and kurtosis with applications – Mardia - 1970
36 A study of the comparability of external criteria for hierarchical cluster analysis – Milligan, Cooper - 1986
32 Array of hope – Lander - 1999
30 MIPS: a database for protein sequences and complete genomes – Mewes, Heumann, et al. - 1999
28 An empirical study of principal component analysis for clustering gene expression data – Yeung, Ruzzo - 2001
23 Applied Multivariate Data Analysis – Jobson - 1991
20 Comparative hybridization of an array of 21,500 ovarian cDNAs for the discovery of genes overexpressed in ovarian carcinomas – Schummer, Ng, et al. - 1999
6 Model based document classification and clustering. Manuscript in preparation – Murua, Tantrum, et al. - 2001
4 Speed group microarray page: Hints and prejudices. http://statwww.berkeley.edu/users/terry/zarray/Html/hintsindex.html – Speed - 2000