Using Bayesian networks to analyze expression data
- Journal of Computational Biology
, 2000
Cited by 526 (16 self)
DNA hybridization arrays simultaneously measure the expression level for thousands of genes. These measurements provide a “snapshot” of transcription levels within the cell. A major challenge in computational biology is to uncover, from such measurements, gene/protein interactions and key biological features of cellular systems. In this paper, we propose a new framework for discovering interactions between genes based on multiple expression measurements. This framework builds on the use of Bayesian networks for representing statistical dependencies. A Bayesian network is a graph-based model of joint multivariate probability distributions that captures properties of conditional independence between variables. Such models are attractive for their ability to describe complex stochastic processes and because they provide a clear methodology for learning from (noisy) observations. We start by showing how Bayesian networks can describe interactions between genes. We then describe a method for recovering gene interactions from microarray data using tools for learning Bayesian networks. Finally, we demonstrate this method on the S. cerevisiae cell-cycle measurements of Spellman et al. (1998). Key words: gene expression, microarrays, Bayesian methods.
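To make the phrase "graph-based model of joint multivariate probability distributions" concrete, the following is a minimal sketch, not the paper's method. It assumes a hypothetical three-gene network (A regulates B and C) with binary expression states; the gene names and all probabilities are invented for illustration. The joint distribution factorizes into one local conditional per gene, and B and C come out conditionally independent given A.

```python
from itertools import product

# Hypothetical network: A -> B and A -> C (names and numbers are illustrative only).
parents = {"A": (), "B": ("A",), "C": ("A",)}

# P(gene = 1 | parent values), keyed by the tuple of parent states.
p_on = {
    "A": {(): 0.3},
    "B": {(0,): 0.1, (1,): 0.8},
    "C": {(0,): 0.2, (1,): 0.7},
}


def joint(assignment):
    """Joint probability as the product of each gene's local conditional."""
    prob = 1.0
    for gene, pa in parents.items():
        pa_vals = tuple(assignment[p] for p in pa)
        p1 = p_on[gene][pa_vals]
        prob *= p1 if assignment[gene] == 1 else 1.0 - p1
    return prob


def marginal(query):
    """Sum the joint over every full assignment consistent with `query`."""
    genes = list(parents)
    total = 0.0
    for values in product((0, 1), repeat=len(genes)):
        full = dict(zip(genes, values))
        if all(full[g] == v for g, v in query.items()):
            total += joint(full)
    return total


# P(B=1, C=1 | A=1) equals P(B=1 | A=1) * P(C=1 | A=1) in this network.
p_bc_given_a = marginal({"A": 1, "B": 1, "C": 1}) / marginal({"A": 1})
```

Brute-force enumeration is only workable for a handful of variables; it is used here because it keeps the factorization visible.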
Tissue Classification with Gene Expression Profiles
- Journal of Computational Biology
, 2000
Cited by 143 (9 self)
Constantly improving gene expression profiling technologies are expected to provide understanding and insight into cancer-related cellular processes. Gene expression data is also expected to significantly aid in the development of efficient cancer diagnosis and classification platforms. In this work we examine two sets of gene expression data measured across sets of tumor and normal clinical samples. One set consists of 2,000 genes, measured in 62 epithelial colon samples [1]. The second consists of 100,000 clones, measured in 32 ovarian samples (unpublished, extension of data set described in [26]). We examine the use of scoring methods, measuring separation of tumors from normals using individual gene expression levels. These are then coupled with high-dimensional classification methods to assess the classification power of complete expression profiles. We present results of performing leave-one-out cross-validation (LOOCV) experiments on the two data sets, employing SVM [8], AdaB...
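The leave-one-out cross-validation procedure named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: a simple nearest-centroid classifier stands in for the classifiers the abstract lists, and the expression profiles and labels are invented toy data.

```python
def nearest_centroid_predict(train_x, train_y, sample):
    """Assign `sample` to the class whose mean profile is closest (squared Euclidean)."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((s - c) ** 2 for s, c in zip(sample, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loocv_accuracy(xs, ys, predict):
    """Hold out each sample once, train on the rest, return the fraction predicted correctly."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)


# Made-up profiles (two genes per sample), purely for illustration.
profiles = [[1.0, 0.2], [0.9, 0.1], [1.1, 0.3], [0.1, 1.0], [0.2, 0.9], [0.0, 1.1]]
labels = ["tumor", "tumor", "tumor", "normal", "normal", "normal"]
accuracy = loocv_accuracy(profiles, labels, nearest_centroid_predict)
```

Because `loocv_accuracy` takes the predictor as an argument, any classifier with the same three-argument shape could be substituted without changing the evaluation loop.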
A Hierarchical Unsupervised Growing Neural Network for Clustering Gene Expression Patterns
, 2001
Cited by 98 (8 self)
Motivation: We describe a new approach to the analysis of gene expression data coming from DNA array experiments, using an unsupervised neural network. DNA array technologies allow monitoring thousands of genes rapidly and efficiently. One of the interests of these studies is the search for correlated gene expression patterns, and this is usually achieved by clustering them. The Self-Organising Tree Algorithm (SOTA) (Dopazo, J. and Carazo, J.M. (1997) J. Mol. Evol., 44, 226-233) is a neural network that grows adopting the topology of a binary tree. The result of the algorithm is a hierarchical cluster obtained with the accuracy and robustness of a neural network. Results: SOTA clustering confers several advantages over classical hierarchical clustering methods. SOTA is a divisive method: the clustering process is performed from top to bottom, i.e. the highest hierarchical levels are resolved before going to the details of the lowest levels. The growing can be stopped at the desired hierarchical level. Moreover, a criterion to stop the growing of the tree, based on the approximate distribution of probability obtained by randomisation of the original data set, is provided. By means of this criterion, a statistical support for the definition of clusters is proposed. In addition, obtaining average gene expression patterns is a built-in feature of the algorithm. Different neurons defining the different hierarchical levels represent the averages of the gene expression patterns contained in the clusters. Since SOTA runtimes are approximately linear with the number of items to be classified, it is especially suitable for dealing with huge amounts of data. The method proposed is very general and applies to any data provided that they can be coded as a series of numbers and t...
CLICK and EXPANDER: a system for clustering and visualizing gene expression data
- Bioinformatics
, 2003
Cited by 42 (6 self)
Motivation: Microarrays have become a central tool in biological research. Their applications range from functional annotation to tissue classification and genetic network inference. A key step in the analysis of gene expression data is the identification of groups of genes that manifest similar expression patterns. This translates to the algorithmic problem of clustering genes based on their expression patterns. Results: We present a novel clustering algorithm, called CLICK, and its applications to gene expression analysis. The algorithm utilizes graph-theoretic and statistical techniques to identify tight groups (kernels) of highly similar elements, which are likely to belong to the same true cluster. Several heuristic procedures are then used to expand the kernels into the full clusters. We report on the application of CLICK to a variety of gene expression data sets. In all those applications it outperformed extant algorithms according to several common figures of merit. We also point out that CLICK can be successfully used for the identification of common regulatory motifs in the upstream regions of co-regulated genes. Furthermore, we demonstrate how CLICK can be used to accurately classify tissue samples into disease types, based on their expression profiles. Finally, we present a new Java-based graphical tool, called EXPANDER, for gene expression analysis and visualization, which incorporates CLICK and several other popular clustering algorithms.
Context-Specific Bayesian Clustering for Gene Expression Data
, 2002
Cited by 41 (5 self)
The recent growth in genomic data and measurements of genome-wide expression patterns allows us to apply computational tools to examine gene regulation by transcription factors.
Analysis of Gene Expression Microarrays for Phenotype Classification
- Proc. Int. Conf. Intell. Syst. Mol. Biol
, 2000
Cited by 37 (4 self)
Several microarray technologies that monitor the level of expression of a large number of genes have recently emerged. Given DNA-microarray data for a set of cells characterized by a given phenotype and for a set of control cells, an important problem is to identify "patterns" of gene expression that can be used to predict cell phenotype. The potential number of such patterns is exponential in the number of genes. In this paper, we propose a solution to this problem based on a supervised learning algorithm, which differs substantially from previous schemes. It couples a complex, non-linear similarity metric, which maximizes the probability of discovering discriminative gene expression patterns, and a pattern discovery algorithm called SPLASH. The latter discovers efficiently and deterministically all statistically significant gene expression patterns in the phenotype set. Statistical significance is evaluated based on the probability of a pattern to occur by chance in ...
Computational Methods for the Identification of Differential and Coordinated Gene Expression
- Human Molecular Genetics
, 1999
Cited by 34 (0 self)
In this article, I review the theoretical and computational approaches used to: (i) identify genes differentially expressed (across cell types, developmental stages, pathological conditions, etc.); (ii) identify genes expressed in a coordinated manner across a set of conditions; and (iii) delineate clusters of genes sharing coherent expression features, eventually defining global biological pathways
Accuracy and calibration of commercial oligonucleotide and custom cDNA microarrays
, 2002
Cited by 26 (1 self)
We compared the accuracy of microarray measurements obtained with oligonucleotide arrays (GeneChip, Affymetrix) with a laboratory-developed cDNA array by assaying test RNA samples from an experiment using a paradigm known to regulate many genes measured on both arrays. We selected 47 genes represented on both arrays, including both known regulated and unregulated transcripts, and established reference relative expression measurements for these genes in the test RNA samples using quantitative reverse transcriptase real-time PCR (QRTPCR) assays. The validity of the reproducible (average coefficient of variation = 11.8%) QRTPCR measurements was established through application of a new mathematical model. The performance of both array platforms in identifying regulated and non-regulated genes was identical. With either platform, 16 of 17 definitely regulated genes were correctly identified, and no definitely unregulated transcript was falsely identified as regulated. Accuracy of the fold-change measurements obtained with each platform was assessed by determining measurement bias. Both platforms consistently underestimate the relative changes in mRNA expression between experimental and control samples. The bias observed with cDNA arrays was predictable for fold-changes <250-fold by QRTPCR and could be corrected by the calibration function $F_c = F_{a(\mathrm{cDNA})}^{q}$, where $F_{a(\mathrm{cDNA})}$ is the microarray-determined fold-change comparing experimental with control samples, $q$ is the correction factor and $F_c$ is the calibrated value. The bias observed with the commercial oligonucleotide arrays was less predictable and calibration was unfeasible. Following calibration, fold-change measurements generated by custom cDNA arrays were more accurate than those obtained by commercial oligonucleotide ar...
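The calibration step described in this abstract can be illustrated with a short sketch. It assumes the calibration function is a power law, with the correction factor q applied as an exponent to the array-measured fold-change; the abstract states only that q is a correction factor, so that placement is an assumption. The function name and the value of q used below are hypothetical and are not taken from the paper.

```python
def calibrate_fold_change(f_array, q):
    """Return the calibrated fold-change F_c = F_a ** q (assumed power-law form).

    f_array: fold-change measured on the cDNA array (experimental vs. control).
    q:       correction factor; q > 1 inflates fold-changes above 1, which
             counteracts a systematic underestimate by the array.
    """
    if f_array <= 0:
        raise ValueError("fold-change must be positive")
    return f_array ** q


# Illustrative only: q = 1.5 is a made-up value.
calibrated = calibrate_fold_change(4.0, 1.5)
```

Under this form a fold-change of exactly 1 (no regulation) is left unchanged for any q, so calibration cannot turn an unregulated gene into a regulated one.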
Estimating coarse gene network structure from large-scale gene perturbation data
- Genome Res
, 2002

