Clustering Gene Expression Patterns
1999
Cited by 273 (10 self)
Recent advances in biotechnology allow researchers to measure expression levels for thousands of genes simultaneously, across different conditions and over time. Analysis of data produced by such experiments offers potential insight into gene function and regulatory mechanisms. A key step in the analysis of gene expression data is the detection of groups of genes that manifest similar expression patterns. The corresponding algorithmic problem is to cluster multi-condition gene expression patterns. In this paper we describe a novel clustering algorithm that was developed for analysis of gene expression data. We define an appropriate stochastic error model on the input, and prove that under the conditions of the model, the algorithm recovers the cluster structure with high probability. The running time of the algorithm on an n-gene dataset is $O(n^2 (\log n)^c)$. We also present a practical heuristic based on the same algorithmic ideas. The heuristic was implemented and its p...
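As a rough illustration of what "similar expression patterns" can mean in practice, the sketch below scores every pair of genes by the Pearson correlation of their multi-condition expression vectors. The choice of Pearson correlation, the function names, and the toy data are assumptions made for this example; the abstract does not say which similarity measure the paper uses.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors.

    Assumes both vectors have non-zero variance (otherwise this divides by zero).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)

def pairwise_similarity(patterns):
    """Map each unordered gene pair (g, h), g < h, to its correlation."""
    genes = sorted(patterns)
    return {
        (g, h): pearson(patterns[g], patterns[h])
        for i, g in enumerate(genes)
        for h in genes[i + 1:]
    }

# Hypothetical expression levels for three genes across four conditions.
toy_patterns = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],   # rises with geneA
    "geneC": [4.0, 3.0, 2.0, 1.0],   # falls as geneA rises
}
similarities = pairwise_similarity(toy_patterns)
```

A clustering procedure would then take a pairwise similarity table of this kind as its input; here geneA and geneB score close to 1 and geneA and geneC close to -1.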
An Algorithm for Clustering cDNAs for Gene Expression Analysis
In RECOMB99: Proceedings of the Third Annual International Conference on Computational Molecular Biology, 1999
Cited by 35 (4 self)
We have developed a novel algorithm for cluster analysis that is based on graph theoretic techniques. A similarity graph is defined and clusters in that graph correspond to highly connected subgraphs. A polynomial algorithm to compute them efficiently is presented. Our algorithm produces a clustering with some provably good properties. The application that motivated this study was gene expression analysis, where a collection of cDNAs must be clustered based on their oligonucleotide fingerprints. The algorithm has been tested intensively on simulated libraries and was shown to outperform extant methods. It demonstrated robustness to high noise levels. In a blind test on real cDNA fingerprint data the algorithm obtained very good results. Utilizing the results of the algorithm would have saved over 70% of the cDNA sequencing cost on that data set. 1 Introduction Cluster analysis seeks grouping of data elements into subsets, so that elements in the same subset are in some sense more cl...
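The idea of treating clusters as highly connected subgraphs of a similarity graph can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that "highly connected" means the minimum edge cut of a subgraph on n vertices exceeds n/2 (the abstract does not state the threshold), it uses the Stoer-Wagner method as one possible way to find a minimum cut, and the names `min_cut` and `hcs` are hypothetical.

```python
from collections import defaultdict

def min_cut(nodes, edges):
    """Stoer-Wagner global minimum cut on an undirected multigraph.

    Returns (cut_size, vertices_on_one_side). `nodes` needs at least two entries.
    """
    active = list(nodes)
    w = {u: defaultdict(int) for u in active}   # w[u][v] = edges between super-nodes
    for u, v in edges:
        w[u][v] += 1
        w[v][u] += 1
    members = {u: {u} for u in active}          # original vertices inside each super-node
    best_size, best_side = float("inf"), set()
    while len(active) > 1:
        start = active[0]
        # attach[v] = total edge weight between v and the growing set A
        attach = {v: w[start][v] for v in active if v != start}
        prev, last = None, start
        while attach:
            nxt = max(attach, key=attach.get)   # most tightly connected vertex
            cut_of_phase = attach.pop(nxt)
            prev, last = last, nxt
            for v in attach:
                attach[v] += w[nxt][v]
        if cut_of_phase < best_size:
            best_size, best_side = cut_of_phase, set(members[last])
        # Merge the last-added super-node into the one added before it.
        members[prev] |= members[last]
        for v in active:
            if v not in (prev, last):
                w[prev][v] += w[last][v]
                w[v][prev] += w[last][v]
        active.remove(last)
    return best_size, best_side

def hcs(nodes, edges):
    """Recursively split along minimum cuts until each part is highly connected."""
    nodes = set(nodes)
    if len(nodes) < 2:
        return [nodes] if nodes else []         # singletons are returned as-is
    edges = [(u, v) for u, v in edges if u in nodes and v in nodes]
    size, side = min_cut(sorted(nodes), edges)
    if size > len(nodes) / 2:                   # assumed "highly connected" test
        return [nodes]
    return hcs(side, edges) + hcs(nodes - side, edges)

# Toy similarity graph: two 4-cliques joined by a single edge.
clique_a = [(i, j) for i in range(4) for j in range(i + 1, 4)]
clique_b = [(i, j) for i in range(4, 8) for j in range(i + 1, 8)]
clusters = hcs(range(8), clique_a + clique_b + [(3, 4)])
print(sorted(sorted(c) for c in clusters))
```

On this toy graph the single bridging edge is the minimum cut, so the recursion separates the two cliques, and each clique then passes the connectivity test and is reported as a cluster.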
Universal DNA tag systems: A combinatorial design scheme
J. Comput. Biol., 2000
Cited by 30 (1 self)
Custom-designed DNA arrays offer the possibility of simultaneously monitoring thousands of hybridization reactions. These arrays show great potential for many medical and scientific applications such as polymorphism analysis and genotyping. Relatively high costs are associated with the need to specifically design and synthesize problem-specific arrays. Recently, an alternative approach was suggested that utilizes fixed, universal arrays. This approach presents an interesting design problem: the arrays should contain as many probes as possible, while minimizing experimental error caused by cross-hybridization. We use a simple thermodynamic model to cast this design problem in a formal mathematical framework. Employing new combinatorial ideas, we derive an efficient construction for the design problem, and prove that our construction is near-optimal.
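To make the cross-hybridization constraint more concrete, the sketch below scores DNA strings with the common 2-4 (Wallace) rule of thumb, which estimates duplex stability as 2 per A/T base and 4 per G/C base. The abstract does not name its thermodynamic model, so the use of this rule, the idea of scoring the heaviest substring shared by two tags, and the function names are all assumptions made for illustration.

```python
BASE_WEIGHT = {"A": 2, "T": 2, "C": 4, "G": 4}  # 2-4 rule weights (assumed model)

def wallace_weight(seq):
    """Sum of 2-4 rule weights over a DNA string; raises KeyError on other letters."""
    return sum(BASE_WEIGHT[base] for base in seq.upper())

def max_shared_weight(tag_a, tag_b):
    """Largest 2-4 rule weight of any substring common to both tags.

    Brute force over all substrings of tag_a, which is fine for short tags.
    """
    tag_a, tag_b = tag_a.upper(), tag_b.upper()
    best = 0
    for i in range(len(tag_a)):
        for j in range(i + 1, len(tag_a) + 1):
            piece = tag_a[i:j]
            if piece in tag_b:
                best = max(best, wallace_weight(piece))
    return best

def compatible(tag_a, tag_b, threshold):
    """Hypothetical design check: shared stability must stay below `threshold`."""
    return max_shared_weight(tag_a, tag_b) < threshold

# The two tags below share the substring "GTAC" (weight 4 + 2 + 2 + 4).
shared = max_shared_weight("ACGTAC", "TTGTACC")
```

Under these assumptions, a tag set would be acceptable when every pair of tags passes `compatible` for the chosen threshold, while each tag on its own is heavy enough to bind its intended probe.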

