Results 1  10
of
63
From Boolean to Probabilistic Boolean Networks as Models of Genetic Regulatory Networks
 Proc. IEEE
, 2002
"... Mathematical and computational modeling of genetic regulatory networks promises to uncover the fundamental principles governing biological systems in an integrarive and holistic manner. It also paves the way toward the development of systematic approaches for effective therapeutic intervention in di ..."
Abstract

Cited by 107 (20 self)
 Add to MetaCart
(Show Context)
Mathematical and computational modeling of genetic regulatory networks promises to uncover the fundamental principles governing biological systems in an integrarive and holistic manner. It also paves the way toward the development of systematic approaches for effective therapeutic intervention in disease. The central theme in this paper is the Boolean formalism as a building block for modeling complex, largescale, and dynamical networks of genetic interactions. We discuss the goals of modeling genetic networks as well as the data requirements. The Boolean formalism is justified from several points of view. We then introduce Boolean networks and discuss their relationships to nonlinear digital filters. The role of Boolean networks in understanding cell differentiation and cellular functional states is discussed. The inference of Boolean networks from real gene expression data is considered from the viewpoints of computational learning theory and nonlinear signal processing, touching on computational complexity of learning and robustness. Then, a discussion of the need to handle uncertainty in a probabilistic framework is presented, leading to an introduction of probabilistic Boolean networks and their relationships to Markov chains. Methods for quantifying the influence of genes on other genes are presented. The general question of the potential effect of individual genes on the global dynamical network behavior is considered using stochastic perturbation analysis. This discussion then leads into the problem of target identification for therapeutic intervention via the development of several computational tools based on firstpassage times in Markov chains. Examples from biology are presented throughout the paper. 1
Experimental Uncertainty Estimation and Statistics for Data Having Interval Uncertainty
 11733, SAND20070939. hal00839639, version 1  28 Jun 2013
"... Sandia is a multiprogram laboratory operated by Sandia Corporation, ..."
Abstract

Cited by 39 (20 self)
 Add to MetaCart
(Show Context)
Sandia is a multiprogram laboratory operated by Sandia Corporation,
On learning gene regulatory networks under the Boolean network model
 Machine Learning
, 2003
"... Boolean networks are a popular model class for capturing the interactions of genes and global dynamical behavior of genetic regulatory networks. Recently, a significant amount of attention has been focused on the inference or identification of the model structure from gene expression data. We consi ..."
Abstract

Cited by 33 (2 self)
 Add to MetaCart
Boolean networks are a popular model class for capturing the interactions of genes and global dynamical behavior of genetic regulatory networks. Recently, a significant amount of attention has been focused on the inference or identification of the model structure from gene expression data. We consider the Consistency as well as BestFit Extension problems in the context of inferring the networks from data. The latter approach is especially useful in situations when gene expression measurements are noisy and may lead to inconsistent observations. We propose simple efficient algorithms that can be used to answer the Consistency Problem and find one or all consistent Boolean networks relative to the given examples. The same method is extended to learning gene regulatory networks under the BestFit Extension paradigm. We also introduce a simple and fast way of finding all Boolean networks having limited error size in the BestFit Extension Problem setting. We apply the inference methods to a real gene expression data set and present the results for a selected set of genes.
Gene Clustering Based on Clusterwide Mutual Information
 of the school of engineering. Michel Verleysen was born in 1965 in Belgium. He received the M.S. and Ph.D. degrees in electrical engineering from the Université catholique de Louvain (Belgium) in 1987 and 1992, respectively. He was an Invited Professor at
, 2004
"... Cluster analysis of genewide expression data from DNA microarray hybridization studies has proved to be a useful tool for identifying biologically relevant groupings of genes and constructing gene regulatory networks. The motivation for considering mutual information is its capacity to measure a ge ..."
Abstract

Cited by 17 (4 self)
 Add to MetaCart
(Show Context)
Cluster analysis of genewide expression data from DNA microarray hybridization studies has proved to be a useful tool for identifying biologically relevant groupings of genes and constructing gene regulatory networks. The motivation for considering mutual information is its capacity to measure a general dependence among gene random variables. We propose a novel clustering strategy based on minimizing mutual information among gene clusters. Simulated annealing is employed to solve the optimization problem. Bootstrap techniques are employed to get more accurate estimates of mutual information when the data sample size is small. Moreover, we propose to combine the mutual information criterion and traditional distance criteria such as the Euclidean distance and the fuzzy membership metric in designing the clustering algorithm. The performances of the new clustering methods are compared with those of some existing methods, using both synthesized data and experimental data. It is seen that the clustering algorithm based on a combined metric of mutual information and fuzzy membership achieves the best performance. The supplemental material is available at www.gspsnap.tamu.edu/gspweb/zxb/glioma_zxb.
Biclustering genefeature matrices for statistically significant dense patterns
 in: IEEE Computer Society Bioinformatics Conf
, 2004
"... Biclustering is an important problem that arises in diverse applications, including analysis of gene expression and drug interaction data. The problem can be formalized in various ways through different interpretation of data and associated optimization functions. We focus on the problem of finding ..."
Abstract

Cited by 12 (1 self)
 Add to MetaCart
(Show Context)
Biclustering is an important problem that arises in diverse applications, including analysis of gene expression and drug interaction data. The problem can be formalized in various ways through different interpretation of data and associated optimization functions. We focus on the problem of finding unusually dense patterns in binary (01) matrices. This formulation is appropriate for analyzing experimental datasets that come from not only binary quantization of gene expression data, but also more comprehensive datasets such as genefeature matrices that include functions of coded proteins and motifs in the coding sequence. We formalize the notion of an “unusually ” dense submatrix to evaluate the interestingness of a pattern in terms of statistical significance based on the assumption of a uniform memoryless source. We then simplify it to assess statistical significance of discovered patterns. Using statistical significance as an objective function, we formulate the problem as one of finding significant dense submatrices of a large sparse matrix. Adopting a simple iterative heuristic along with randomized initialization techniques, we derive fast algorithms for discovering binary biclusters. We conduct experiments on a binary genefeature matrix and a quantized breast tumor gene expression matrix. Our experimental results show that the proposed method quickly discovers all interesting patterns in these datasets. 1.
Algorithms for Finding Small Attractors in Boolean Networks
, 2007
"... A Boolean network is a model used to study the interactions between different genes in genetic regulatory networks. In this paper, we present several algorithms using gene ordering and feedback vertex sets to identify singleton attractors and small attractors in Boolean networks. We analyze the aver ..."
Abstract

Cited by 10 (6 self)
 Add to MetaCart
A Boolean network is a model used to study the interactions between different genes in genetic regulatory networks. In this paper, we present several algorithms using gene ordering and feedback vertex sets to identify singleton attractors and small attractors in Boolean networks. We analyze the average case time complexities of some of the proposed algorithms. For instance, it is shown that the outdegreebased ordering algorithm for finding singleton attractors works in O(1.19 n)timeforK = 2, which is much faster than the naive O(2 n) time algorithm, where n is the number of genes and K is the maximum indegree. We performed extensive computational experiments on these algorithms, which resulted in good agreement with theoretical results. In contrast, we give a simple and complete proof for showing that finding an attractor with the shortest period is NPhard.
Which is better for cDNAmicroarraybased classification: ratios or direct intensities
, 2004
"... Motivation: There are two general methods for making geneexpression microarrays: one is to hybridize a single test set of labeled targets to the probe, and measure the backgroundsubtracted intensity at each probe site; the other is to hybridize both a test and a reference set of differentially label ..."
Abstract

Cited by 8 (3 self)
 Add to MetaCart
(Show Context)
Motivation: There are two general methods for making geneexpression microarrays: one is to hybridize a single test set of labeled targets to the probe, and measure the backgroundsubtracted intensity at each probe site; the other is to hybridize both a test and a reference set of differentially labeled targets to a single detector array, and measure the ratio of the backgroundsubtracted intensities at each probe site. Which method is better depends on the variability in the cell system and the random factors resulting from the microarray technology. It also depends on the purpose for which the microarray is being used. Classification is a fundamental application and it is the one considered here.
Statistical methods for microarray assays
 J Appl Genet
, 2002
"... Abstract: The paper shortly reviews statistical methods used in the area of DNA microarray studies. All stages of the experiment are taken into account: planning, data collection, data preprocessing, analysis and validation. Among the methods of data analysis, the algorithms for estimating different ..."
Abstract

Cited by 7 (0 self)
 Add to MetaCart
Abstract: The paper shortly reviews statistical methods used in the area of DNA microarray studies. All stages of the experiment are taken into account: planning, data collection, data preprocessing, analysis and validation. Among the methods of data analysis, the algorithms for estimating differential expression, multivariate approaches, clustering methods, as well as classification and discrimination are reviewed. The need is stressed for routine statistical data processing protocols and for the search of links of microarray data analysis with quantitative genetic models. Key words: data analysis, data collection, DNA microarrays, planning experiments, statistical methods, validation.
M.: Supplement to “Subnetwork state functions define dysregulated subnetworks
"... Abstract. Emerging research demonstrates the potential of proteinprotein interaction (PPI) networks in uncovering the mechanistic bases of cancers, through identification of interacting proteins that are coordinately dysregulated in tumorigenic and metastatic samples. When used as features for class ..."
Abstract

Cited by 7 (3 self)
 Add to MetaCart
(Show Context)
Abstract. Emerging research demonstrates the potential of proteinprotein interaction (PPI) networks in uncovering the mechanistic bases of cancers, through identification of interacting proteins that are coordinately dysregulated in tumorigenic and metastatic samples. When used as features for classification, such coordinately dysregulated subnetworks improve diagnosis and prognosis of cancer considerably over singlegene markers. However, existing methods formulate coordination between multiple genes through additive representation of their expression profiles and utilize greedy heuristics to identify dysregulated subnetworks, which may not be well suited to the potentially combinatorial nature of coordinate dysregulation. Here, we propose a combinatorial formulation of coordinate dysregulation and decompose the resulting objective function to cast the problem as one of identifying subnetwork state functions that are indicative of phenotype. Based on this formulation, we show that coordinate dysregulation of larger subnetworks can be bounded using simple statistics on smaller subnetworks. We then use these bounds to devise an efficient algorithm, Crane, that can search the subnetwork space more effectively than simple greedy algorithms. Comprehensive crossclassification experiments show that subnetworks identified by Crane significantly outperform those identified by greedy algorithms in predicting metastasis of colorectal cancer (CRC). 1
Fast Algorithms for Computing Statistics under Interval Uncertainty, with Applications to Computer Science and to Electrical and Computer Engineering
, 2007
"... Computing statistics is important. In many engineering applications, we are interested in computing statistics. For example, in environmental analysis, we observe a pollution level x(t) in a lake at different moments of time t, and we would like to estimate standard statistical characteristics such ..."
Abstract

Cited by 6 (3 self)
 Add to MetaCart
(Show Context)
Computing statistics is important. In many engineering applications, we are interested in computing statistics. For example, in environmental analysis, we observe a pollution level x(t) in a lake at different moments of time t, and we would like to estimate standard statistical characteristics such as mean, variance, autocorrelation, correlation with other measurements. For each of these characteristics C, there is an expression C(x1,..., xn) that enables us to provide an estimate for C based on the observed values x1,..., xn. For example: a reasonable statistic for estimating the mean value of a probability distribution is the population average E(x1,..., xn) = 1 n · (x1 +... + xn); a reasonable statistic for estimating the variance V is the population variance V (x1,..., xn) = 1 n · n∑