Results 1  10
of
259
Modeling and simulation of genetic regulatory systems: A literature review
 Journal of Computational Biology
, 2002
"... In order to understand the functioning of organisms on the molecular level, we need to know which genes are expressed, when and where in the organism, and to which extent. The regulation of gene expression is achieved through genetic regulatory systems structured by networks of interactions between ..."
Abstract

Cited by 426 (9 self)
 Add to MetaCart
In order to understand the functioning of organisms on the molecular level, we need to know which genes are expressed, when and where in the organism, and to which extent. The regulation of gene expression is achieved through genetic regulatory systems structured by networks of interactions between DNA, RNA, proteins, and small molecules. As most genetic regulatory networks of interest involve many components connected through interlocking positive and negative feedback loops, an intuitive understanding of their dynamics is hard to obtain. As a consequence, formal methods and computer tools for the modeling and simulation of genetic regulatory networks will be indispensable. This paper reviews formalisms that have been employed in mathematical biology and bioinformatics to describe genetic regulatory systems, in particular directed graphs, Bayesian networks, Boolean networks and their generalizations, ordinary and partial differential equations, qualitative differential equations, stochastic equations, and rulebased formalisms. In addition, the paper discusses how these formalisms have been used in the simulation of the behavior of actual regulatory systems. Key words: genetic regulatory networks, mathematical modeling, simulation, computational biology.
Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation
, 2002
"... There are many sources of systematic variation in cDNA microarray experiments which affect the measured gene expression levels (e.g. differences in labeling efficiency between the two fluorescent dyes). The term normalization refers to the process of removing such variation. A constant adjustment is ..."
Abstract

Cited by 365 (7 self)
 Add to MetaCart
There are many sources of systematic variation in cDNA microarray experiments which affect the measured gene expression levels (e.g. differences in labeling efficiency between the two fluorescent dyes). The term normalization refers to the process of removing such variation. A constant adjustment is often used to force the distribution of the intensity log ratios to have a median of zero for each slide. However, such global normalization approaches are not adequate in situations where dye biases can depend on spot overall intensity and/or spatial location within the array. This article proposes normalization methods that are based on robust local regression and account for intensity and spatial dependence in dye biases for different types of cDNA microarray experiments. The selection of appropriate controls for normalization is discussed and a novel set of controls (microarray sample pool, MSP) is introduced to aid in intensitydependent normalization. Lastly, to allow for comparisons of expression levels across slides, a robust method based on maximum likelihood estimation is proposed to adjust for scale differences among slides.
Biclustering algorithms for biological data analysis: a survey
 IEEE/ACM Transactions on Computational Biology and Bioinformatics
, 2004
"... Abstract—A large number of clustering approaches have been proposed for the analysis of gene expression data obtained from microarray experiments. However, the results from the application of standard clustering methods to genes are limited. This limitation is imposed by the existence of a number of ..."
Abstract

Cited by 278 (9 self)
 Add to MetaCart
Abstract—A large number of clustering approaches have been proposed for the analysis of gene expression data obtained from microarray experiments. However, the results from the application of standard clustering methods to genes are limited. This limitation is imposed by the existence of a number of experimental conditions where the activity of genes is uncorrelated. A similar limitation exists when clustering of conditions is performed. For this reason, a number of algorithms that perform simultaneous clustering on the row and column dimensions of the data matrix has been proposed. The goal is to find submatrices, that is, subgroups of genes and subgroups of conditions, where the genes exhibit highly correlated activities for every condition. In this paper, we refer to this class of algorithms as biclustering. Biclustering is also referred in the literature as coclustering and direct clustering, among others names, and has also been used in fields such as information retrieval and data mining. In this comprehensive survey, we analyze a large number of existing approaches to biclustering, and classify them in accordance with the type of biclusters they can find, the patterns of biclusters that are discovered, the methods used to perform the search, the approaches used to evaluate the solution, and the target applications. Index Terms—Biclustering, simultaneous clustering, coclustering, subspace clustering, bidimensional clustering, direct clustering, block clustering, twoway clustering, twomode clustering, twosided clustering, microarray data analysis, biological data analysis, gene expression data. 1
Discovering Statistically Significant Biclusters in Gene Expression Data
 In Proceedings of ISMB 2002
, 2002
"... In gene expression data, a bicluster is a subset of the genes exhibiting consistent patterns over a subset of the conditions. We propose a new method to detect significant biclusters in large expression datasets. Our approach is graph theoretic coupled with statistical modelling of the data. Under p ..."
Abstract

Cited by 191 (4 self)
 Add to MetaCart
In gene expression data, a bicluster is a subset of the genes exhibiting consistent patterns over a subset of the conditions. We propose a new method to detect significant biclusters in large expression datasets. Our approach is graph theoretic coupled with statistical modelling of the data. Under plausible assumptions, our algorithm is polynomial and is guaranteed to find the most significant biclusters. We tested our method on a collection of yeast expression profiles and on a human cancer dataset. Cross validation results show high specificity in assigning function to genes based on their biclusters, and we are able to annotate in this way 196 uncharacterized yeast genes. We also demonstrate how the biclusters lead to detecting new concrete biological associations. In cancer data we are able to detect and relate finer tissue types than was previously possible. We also show that the method outperforms the biclustering algorithm of Cheng and Church (2000).
Inferring Subnetworks from Perturbed Expression Profiles
, 2001
"... Genomewide expression profiles of genetic mutants provide a wide variety of measurements of cellular responses to perturbations. Typical analysis of such data identifies genes affected by perturbation and uses clustering to group genes of similar function. In this paper we discover a finer structur ..."
Abstract

Cited by 159 (12 self)
 Add to MetaCart
Genomewide expression profiles of genetic mutants provide a wide variety of measurements of cellular responses to perturbations. Typical analysis of such data identifies genes affected by perturbation and uses clustering to group genes of similar function. In this paper we discover a finer structure of interactions between genes, such as causality, mediation, activation, and inhibition by using a Bayesian network framework. We extend this framework to correctly handle perturbations, and to identify significant subnetworks of interacting genes. We apply this method to expression data of S. cerevisiae mutants and uncover a variety of structured metabolic, signaling and regulatory pathways. Contact: danab@cs.huji.ac.il
Model selection through sparse maximum likelihood estimation
 Journal of Machine Learning Research
, 2008
"... We consider the problem of estimating the parameters of a Gaussian or binary distribution in such a way that the resulting undirected graphical model is sparse. Our approach is to solve a maximum likelihood problem with an added ℓ1norm penalty term. The problem as formulated is convex but the memor ..."
Abstract

Cited by 157 (1 self)
 Add to MetaCart
We consider the problem of estimating the parameters of a Gaussian or binary distribution in such a way that the resulting undirected graphical model is sparse. Our approach is to solve a maximum likelihood problem with an added ℓ1norm penalty term. The problem as formulated is convex but the memory requirements and complexity of existing interior point methods are prohibitive for problems with more than tens of nodes. We present two new algorithms for solving problems with at least a thousand nodes in the Gaussian case. Our first algorithm uses block coordinate descent, and can be interpreted as recursive ℓ1norm penalized regression. Our second algorithm, based on Nesterov’s first order method, yields a complexity estimate with a better dependence on problem size than existing interior point methods. Using a log determinant relaxation of the log partition function (Wainwright and Jordan, 2006), we show that these same algorithms can be used to solve an approximate sparse maximum likelihood problem for the binary case. We test our algorithms on synthetic data, as well as on gene expression and senate voting records data.
Relating WholeGenome Expression Data with ProteinProtein Interactions
, 2002
"... this paper is the interactions occurring within specific complexes. These were obtained from the MIPS complexes catalog (Fellenberg et al. 2000), which represents a carefully annotated, comprehensive data set of protein complexes culled from the scientific literature. In addition, we looked at other ..."
Abstract

Cited by 154 (14 self)
 Add to MetaCart
this paper is the interactions occurring within specific complexes. These were obtained from the MIPS complexes catalog (Fellenberg et al. 2000), which represents a carefully annotated, comprehensive data set of protein complexes culled from the scientific literature. In addition, we looked at other types of proteinprotein interactions from large "aggregated" data sets collecting many heterogeneous pairwise interactions. We collected these from the MIPS catalogs of physical and genetic interactions (Fellenberg et al. 2000), databases of interacting proteins (DIP and BIND) (Bader and Hogue 2000; Xenarios 2000), and a comprehensive collection of yeast twohybrid experiments (Cagney et al. 2000; lto et al. 2000; Schwikowski et al. 2000; Uetz et al. 2000; Uetz and Hughes 2000; lto et al. 2001). These interactions are subdivided into groups based on their method of discovery. They include physical interactions (e.g., collected through coimmunoprecipitation and copurification), genetic interactions (e.g., determined through genetic means such as synthetic lethality or suppression experiments), and yeast twohybrid pairs
Assessing Gene Significance from cDNA Microarray Expression Data via Mixed Models
, 2001
"... The determination of a list of differentially expressed genes is a basic objective in many cDNA microarray experiments. We present a statistical approach that allows direct control over the percentage of false positives in such a list and, under certain reasonable assumptions, improves on existing m ..."
Abstract

Cited by 109 (3 self)
 Add to MetaCart
The determination of a list of differentially expressed genes is a basic objective in many cDNA microarray experiments. We present a statistical approach that allows direct control over the percentage of false positives in such a list and, under certain reasonable assumptions, improves on existing methods with respect to the percentage of false negatives. The method accommodates a wide variety of experimental designs and can simultaneously assess significant differences between multiple types of biological samples. Two interconnected mixed linear models are central to the method and provide a flexible means to properly account for variability both across and within genes. The mixed model also provides a convenient framework for evaluating the statistical power of any particular experimental design and thus enables a researcher to a priori select an appropriate number of replicates. We also suggest some basic graphics for visualizing lists of significant genes. Analyses of published experiments studying human cancer and yeast cells illustrate the results.
Analyzing time series gene expression data
 Bioinformatics
, 2004
"... doi:10.1093/bioinformatics/bth283 ..."
ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context
 BMC Bioinformatics
, 2006
"... * To whom correspondence should be addressed ..."