Results 1 - 10 of 14
Genetic Network Inference: From Co-Expression Clustering To Reverse Engineering
, 2000
Cited by 156 (0 self)
Motivation: Advances in molecular biological, analytical and computational technologies are enabling us to systematically investigate the complex molecular processes underlying biological systems. In particular, using high-throughput gene expression assays, we are able to measure the output of the gene regulatory network. We aim here to review data mining and modeling approaches for conceptualizing and unraveling the functional relationships implicit in these datasets. Clustering of co-expression profiles allows us to infer shared regulatory inputs and functional pathways. We discuss various aspects of clustering, ranging from distance measures to clustering algorithms and multiple-cluster memberships. More advanced analysis aims to infer causal connections between genes directly, i.e. who is regulating whom and how. We discuss several approaches to the problem of reverse engineering of genetic networks, from discrete Boolean networks to continuous linear and non-linear models. We conclude that the combination of predictive modeling with systematic experimental verification will be required to gain a deeper insight into living organisms, therapeutic targeting and bioengineering.
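As a rough illustration of the kind of distance measure such a review covers, the sketch below computes a correlation-based distance between two expression profiles. The function name `correlation_distance` and the choice of 1 minus the Pearson correlation are assumptions made for illustration; the abstract does not say which measures the paper recommends.

```python
from math import sqrt


def correlation_distance(x, y):
    """Return 1 - Pearson correlation of two equal-length expression profiles.

    Illustrative choice of distance, not taken from the paper:
    0 means perfectly co-expressed, 2 means perfectly anti-correlated.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return 1.0 - cov / (norm_x * norm_y)
```

A distance of this kind ignores the absolute expression level and compares only the shape of the profiles, which is one reason correlation-based measures are often considered for co-expression clustering.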
'Gene shaving' as a method for identifying distinct sets of genes with similar expression patterns
, 2000
Cited by 88 (4 self)
Background: Large gene expression studies, such as those conducted using DNA arrays, often provide millions of different pieces of data. To address the problem of analyzing such data, we describe a statistical method, which we have called 'gene shaving'. The method identifies subsets of genes with coherent expression patterns and large variation across conditions. Gene shaving differs from hierarchical clustering and other widely used methods for analyzing gene expression studies in that genes may belong to more than one cluster, and the clustering may be supervised by an outcome measure. The technique can be 'unsupervised', that is, the genes and samples are treated as unlabeled, or partially or fully supervised by using known properties of the genes or samples to assist in finding meaningful groupings.
A Gibbs Sampling Method to Detect Over-Represented Motifs in the Upstream Regions of Co-Expressed Genes
, 2002
Cited by 53 (7 self)
Microarray experiments can reveal important information about transcriptional regulation.
Analysis of Gene Expression Data with Pathway Scores
, 2000
Cited by 31 (1 self)
We present a new approach for the evaluation of gene expression data. The basic idea is to generate biologically possible pathways and to score them with respect to gene expression measurements. We suggest sample scoring functions for different problem specifications. The significance of the scores for the investigated pathways is assessed by comparison to a number of scores for random pathways. We show that simple scoring functions can assign statistically significant scores to biologically relevant pathways. This suggests that the combination of appropriate scoring functions with the systematic generation of pathways can be used in order to select the most interesting pathways based on gene expression measurements.
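The idea of judging a pathway score against scores of random pathways can be sketched as an empirical p-value. Everything named here is hypothetical: `pathway_score` uses a simple mean as a stand-in scoring function, and "random pathways" are approximated by random gene sets of the same size, whereas the paper generates biologically possible pathways.

```python
import random
from statistics import mean


def pathway_score(expr, genes):
    """Stand-in scoring function (assumption): mean expression value of the genes."""
    return mean(expr[g] for g in genes)


def empirical_p_value(expr, pathway, n_random=1000, seed=0):
    """Fraction of random gene sets scoring at least as high as the pathway.

    expr    : dict mapping gene name -> expression measurement
    pathway : list of gene names forming the pathway under investigation
    """
    rng = random.Random(seed)
    observed = pathway_score(expr, pathway)
    all_genes = sorted(expr)
    hits = 0
    for _ in range(n_random):
        random_set = rng.sample(all_genes, len(pathway))
        if pathway_score(expr, random_set) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_random + 1)
```

A small p-value suggests the pathway scores higher than would be expected from arbitrary gene sets; the choice of null model (here, uniform random gene sets) strongly affects that conclusion.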
Adaptive Quality-Based Clustering of Gene Expression Profiles
- Bioinformatics
, 2002
Cited by 18 (6 self)
Motivation: Microarray experiments generate a considerable amount of data, which, when analysed properly, helps us gain a huge amount of biologically relevant information about the global cellular behaviour. Clustering (grouping genes with similar expression profiles) is one of the first steps in data analysis of high-throughput expression measurements. A number of clustering algorithms have proved useful in making sense of such data. These classical algorithms, though useful, suffer from several drawbacks (e.g., they require the predefinition of arbitrary parameters like the number of clusters; they force every gene into a cluster despite a low correlation with other cluster members). In the following we describe a novel adaptive quality-based clustering algorithm that tackles some of these drawbacks.
Genetic Network Models: A Comparative Study
, 2001
Cited by 11 (4 self)
Currently, the need arises for tools capable of unraveling the functionality of genes based on the analysis of microarray measurements. Modeling genetic interactions by means of genetic network models provides a methodology to infer functional relationships between genes. Although a wide variety of different models have been introduced so far, it remains, in general, unclear what the strengths and weaknesses of each of these approaches are and where these models overlap and differ. This paper compares different genetic modeling approaches that attempt to extract the gene regulation matrix from expression data. A taxonomy of continuous genetic network models is proposed and the following important characteristics are suggested and employed to compare the models: (1) inferential power; (2) predictive power; (3) robustness; (4) consistency; (5) stability and (6) computational cost. Where possible, synthetic time series data are employed to investigate some of these properties. The comparison shows that although genetic network modeling might provide valuable information regarding genetic interactions, current models show disappointing results on simple artificial problems. For now, the simplest models are favored because they generalize better, but more complex models will probably prevail once their bias is more thoroughly understood and their variance is better controlled.
An algorithm to analyze stability of gene-expression patterns
, 2002
Cited by 6 (1 self)
Many problems in the field of computational biology consist of the analysis of so-called gene-expression data. The successful application of approximation and optimization techniques, dynamical systems and algorithms, together with the utilization of the underlying combinatorial structures, leads to a better understanding in that field. For the concrete example of gene-expression data we extend an algorithm which exploits discrete information. This information lies in the extremal points of polyhedra, which grow step by step until a possible stopping point. We study gene-expression data in time, mathematically model it by a time-continuous system, and time-discretize this system. With our algorithm we compute the regions of stability and instability. We give a motivating introduction from genetics, present biological and mathematical interpretations of (in)stability, point out structural frontiers and give an outlook on future research.
Bioinformatics
, 2003
Selection of significant genes via expression patterns is an important problem in microarray experiments. Owing to the small sample size and the large number of variables (genes), the selection process can be unstable. This paper proposes a hierarchical Bayesian model for gene (variable) selection. We employ latent variables to specialize the model to a regression setting and use a Bayesian mixture prior to perform the variable selection. We control the size of the model by assigning a prior distribution over the dimension (number of significant genes) of the model. The posterior distributions of the parameters are not in explicit form, and we need to use a combination of truncated sampling and Markov Chain Monte Carlo (MCMC) based computation techniques to simulate the parameters from the posteriors. The Bayesian model is flexible enough to identify significant genes as well as to perform future predictions. The method is applied to cancer classification via cDNA microarrays where the genes BRCA1 and BRCA2 are associated with a hereditary disposition to breast cancer, and the method is used to identify a set of significant genes. The method is also applied successfully to the leukemia data.
A Gene Network Inference Method from Continuous-Value Gene Expression Data of Wild-Type and Mutants
In this paper we introduce a new method for inferring a gene regulatory network from steady-state gene expression data. Our method determines a regulatory structure consistent with an observed set of steady-state expression profiles, each generated from the wild-type and single deletion mutants of the target network. Our method derives the regulatory relationships in the network using a graph theoretic approach. The advantage of our method is that it can deal with continuous values of steady-state data, while most of the methods proposed in the past use a Boolean network model with binary data. Performance of our method is evaluated on simulated networks while varying the size of the networks, the indegree of each gene, and the data characteristics (continuous-value/binary), and is compared with that of the predictor method proposed by Ideker et al. As a result, we show the superiority of using continuous values over binary values, and that the performance of our method is much better than that of the predictor method. Keywords: inference of a gene regulatory network, steady-state gene expression profiles, graph theory, continuous value
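The starting point for working with wild-type and single-deletion profiles can be sketched as follows. This is not the authors' graph theoretic algorithm: the function `influence_edges`, the fixed `threshold`, and the sign convention are all assumptions made for illustration. It also cannot separate direct from indirect regulation, which is the part a graph theoretic step would presumably address.

```python
def influence_edges(wild_type, mutants, threshold=0.5):
    """Collect (deleted_gene, affected_gene, sign) triples from steady-state data.

    wild_type : dict gene -> steady-state expression in the wild-type
    mutants   : dict deleted gene -> dict gene -> steady-state expression
                in that single-deletion mutant
    threshold : hypothetical cut-off on the absolute expression change

    sign is '+' if expression dropped after the deletion (the deleted gene
    looks like an activator) and '-' if it rose (looks like a repressor).
    """
    edges = set()
    for deleted, profile in mutants.items():
        for gene, level in profile.items():
            if gene == deleted:
                continue  # the deleted gene's own level carries no information
            change = level - wild_type[gene]
            if abs(change) > threshold:
                edges.add((deleted, gene, "+" if change < 0 else "-"))
    return edges
```

Because the comparison uses continuous expression changes rather than binarized on/off states, the threshold is the only discretization step, and how to set it is left open here.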
Learning Regulatory Networks from Sparsely Sampled
, 2002
We present a probabilistic modeling approach to learning gene transcriptional regulation networks from time series gene expression data that is appropriate for the sparsely and irregularly sampled time series datasets currently available. We use a clustering algorithm based on statistical splines to estimate continuous probabilistic models for clusters of genes with similar time expression profiles and for individual genes. Using the learned models, we present a novel mutual information score for causal edges between pairs of clusters and between pairs of genes corresponding to a given time lag. This score computes dependency between expression values as continuous quantities rather than discretizing them. We present empirical results on time series data for the yeast cell cycle, using randomization trials to determine statistically significant candidate network edges and the Chow-Liu graph learning algorithm to learn the network structure, to obtain a dynamic model of cell cycle regulation. Biological validation of the inferred network suggests that our method can learn a meaningful, higher-level view of regulatory networks from sparse time series data.

