Finding Optimal Gene Networks Using Biological Constraints
Genome Informatics, 2003. Cited by 9 (3 self)
The accurate estimation of gene networks from gene expression measurements is a major challenge in the field of Bioinformatics. Since the problem of estimating gene networks is NP-hard and exhibits a search space of super-exponential size, researchers are using heuristic algorithms for this task. However, little can be said about the accuracy of heuristic estimations. In order to overcome this problem, we present a general approach to reduce the search space to a biologically meaningful subspace and to find optimal solutions within the subspace in linear time. We show the effectiveness of this approach in application to yeast and Bacillus subtilis data.
Applying dynamic bayesian networks to perturbed gene expression data
BMC Bioinformatics, 2006. Cited by 7 (0 self)
Motivation: A central goal of molecular biology is to understand the regulatory mechanisms of gene transcription and protein synthesis. Because their solid basis in statistics allows them to deal with the stochastic aspects of gene expression and with noisy measurements in a natural way, Bayesian networks appear attractive for inferring the structure of gene interactions from microarray experiment data. However, the basic formalism has some disadvantages, e.g. it is sometimes hard to distinguish between the origin and the object of an interaction. Two kinds of microarray experiments yield data particularly rich in information regarding the direction of interactions: time series and perturbation experiments. In order to handle them correctly, the basic formalism must be modified. For example, dynamic Bayesian networks apply to time series microarray data.
Results: We extend the framework of dynamic Bayesian networks in order to handle perturbations. A new discretization method, specialized for datasets from time series perturbation experiments, is also introduced. We compare networks inferred from realistic simulation data by our method and by dynamic Bayesian network learning techniques. We conclude that application of our method substantially improves inference.
1 Introduction
As most genetic regulatory systems involve many components connected through complex networks of interactions, formal methods and computer tools for modeling and simulation are needed. Therefore, various formalisms have been proposed to describe genetic regulatory systems, including Boolean networks and their generalizations, ordinary and partial differential equations, stochastic equations and Bayesian networks (see [4] for a review).
While differential and stochastic equations describe the biophysical processes at a very refined level of detail and prove useful in simulations of well-studied systems, Bayesian networks appear attractive for inferring the regulatory network structure from gene expression data. The reason is that their learning techniques have a solid basis in statistics, which allows them to deal with the stochastic aspects of gene expression and with noisy measurements in a natural way.
Utilizing evolutionary information and gene expression data for estimating gene networks with Bayesian network models
2005. Cited by 2 (0 self)
Since microarray gene expression data do not contain sufficient information for estimating accurate gene networks, other biological information has been considered to improve the estimated networks. Recent studies have revealed that highly conserved proteins that exhibit similar expression patterns in different organisms have almost the same function in each organism. Such conserved proteins are also known to play similar roles in terms of the regulation of genes. Therefore, this evolutionary information can be used to refine regulatory relationships among genes, which are estimated from gene expression data. We propose a statistical method for estimating gene networks from gene expression data by utilizing evolutionarily conserved relationships between genes. Our method simultaneously estimates two gene networks of two distinct organisms, with a Bayesian network model utilizing the evolutionary information so that gene expression data of one organism helps to estimate the gene network of the other. We show the effectiveness of the method through the analysis of Saccharomyces cerevisiae and Homo sapiens cell cycle gene expression data. Our method was successful in estimating gene networks that capture many known relationships as well as several unknown relationships which are likely to be novel. Supplementary information is available at
Finding Optimal Bayesian Network Given a Super-Structure
Cited by 2 (0 self)
Classical approaches used to learn Bayesian network structure from data have disadvantages in terms of complexity and lower accuracy of their results. However, a recent empirical study has shown that a hybrid algorithm noticeably improves accuracy and speed: it learns a skeleton with an independency test (IT) approach and constrains the directed acyclic graphs (DAGs) considered during the search-and-score phase. Subsequently, we theorize the structural constraint by introducing the concept of a super-structure S, which is an undirected graph that restricts the search to networks whose skeleton is a subgraph of S. We develop a super-structure constrained optimal search (COS): its time complexity is upper bounded by $O(\gamma_m^n)$, where $\gamma_m < 2$ depends on the maximal degree m of S. Empirically, complexity depends on the average degree $\tilde{m}$, and sparse structures allow larger graphs to be calculated. Our algorithm is faster than an optimal search by several orders of magnitude and even finds more accurate results when given a sound super-structure. Practically, S can be approximated by IT approaches; the significance level of the tests controls its sparseness, making it possible to control the trade-off between speed and accuracy. For incomplete super-structures, a greedily post-processed version (COS+) still significantly outperforms other heuristic searches.
Keywords: Bayesian networks, structure learning, optimal search, super-structure, connected subset
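To make the super-structure constraint concrete, the following is a minimal sketch, not the authors' COS algorithm: a plain exact dynamic program over node subsets in which each node may only take parents among its neighbours in S. It does not reproduce the optimizations behind the complexity bound quoted above. The names `optimal_network`, `neighbors`, `local_score` and the toy scores are assumptions made for illustration only.

```python
from itertools import combinations


def optimal_network(n, neighbors, local_score):
    """Best-scoring DAG on nodes 0..n-1 whose skeleton is a subgraph of S.

    neighbors[v]            -- set of nodes adjacent to v in the super-structure S
    local_score(v, parents) -- decomposable score of v given a frozenset of
                               parents (higher is better); hypothetical callback
    """
    full = (1 << n) - 1

    def best_parents(v, allowed_mask):
        # Candidate parents: neighbours of v in S that lie in the allowed subset.
        cands = [u for u in neighbors[v] if (allowed_mask >> u) & 1]
        best = (local_score(v, frozenset()), frozenset())
        for k in range(1, len(cands) + 1):
            for ps in combinations(cands, k):
                s = local_score(v, frozenset(ps))
                if s > best[0]:
                    best = (s, frozenset(ps))
        return best

    # table[mask] = (score, sink, parents of sink) for the best network on `mask`.
    table = {0: (0.0, None, None)}
    for mask in range(1, full + 1):
        top = None
        for v in range(n):
            if not (mask >> v) & 1:
                continue
            rest = mask & ~(1 << v)          # v acts as a sink of the subnetwork
            s, ps = best_parents(v, rest)
            total = table[rest][0] + s
            if top is None or total > top[0]:
                top = (total, v, ps)
        table[mask] = top

    # Backtrack: peel off sinks to recover every node's parent set.
    parents, mask = {}, full
    while mask:
        _, v, ps = table[mask]
        parents[v] = ps
        mask &= ~(1 << v)
    return table[full][0], parents


# Toy example: S is the chain 0 - 1 - 2, so the edge 0 -> 2 is never considered
# even though the (made-up) score table rates it highly.
toy_neighbors = {0: {1}, 1: {0, 2}, 2: {1}}
toy_scores = {(1, frozenset({0})): 2.0,
              (2, frozenset({1})): 3.0,
              (2, frozenset({0})): 100.0}


def toy_score(v, parents):
    if not parents:
        return 0.0
    return toy_scores.get((v, parents), -1.0)


score, parents = optimal_network(3, toy_neighbors, toy_score)
print(score, parents)
```

Because a node is only ever given parents from the nodes removed after it, the result is acyclic by construction; the sparser S is, the fewer parent sets each call to `best_parents` has to score.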
Increasing Feasibility of Optimal Gene Network Estimation
Genome Informatics, 2004. Cited by 1 (0 self)
Disentangling networks of regulation of gene expression is a major challenge in the field of computational biology. Harvesting the information contained in microarray data sets is a promising approach towards this challenge. We propose an algorithm for the optimal estimation of Bayesian networks from microarray data, which reduces the CPU time and memory consumption of previous algorithms. We prove that the space complexity can be reduced from O(n ) to O(2 ), and that the expected calculation time can be reduced from O(n ) to O(n ), where n is the number of genes. We make intrinsic use of a limitation of the maximal number of regulators of each gene, which has biological as well as statistical justifications. The improvements are significant for some applications in research.
Methods to Accelerate the Learning of Bayesian Network Structures
Cited by 1 (0 self)
Bayesian networks have become a standard technique in the representation of uncertain knowledge. This paper proposes methods that can accelerate the learning of a Bayesian network structure from a data set. These methods are applicable when learning an equivalence class of Bayesian network structures whilst using a score and search strategy. They work by constraining the number of validity tests that need to be done and by caching the results of validity tests. The results of experiments show that the methods improve the performance of algorithms that search through the space of equivalence classes multiple times and that operate on wide data sets. The experiments were performed by sampling data from six standard Bayesian networks and running an ant colony optimization algorithm designed to learn a Bayesian network equivalence class.
Enumeration of Likely Gene Networks and Network Motif Extraction for Large Gene Networks
2003
Introduction
The reliable estimation of gene networks from gene expression measurements is a major challenge in the field of Bioinformatics. Recently, an algorithm for the optimal estimation of small gene networks within the Bayesian network framework was found [3]. This algorithm was further extended to allow the enumeration of all optimal networks and also suboptimal networks in the order of their likelihood [2]. In this work, we show how this result can be applied to the enumeration of likely gene networks for a large number of genes. Enumerating a number of the most likely gene network models, instead of focusing on the single most likely network model, makes it possible to evaluate the reliability of the estimations. If we can find a partial network that is common to most of the likely network models, we can expect this part to be the most reliable part. We denote such common parts as gene network motifs.
2 Method
Let us start with defining a class of subsets of the set of acyclic dir
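The idea of a partial network shared by most of the likely models can be illustrated with a small sketch. It assumes the enumerated networks are already available as sets of directed edges and simply keeps the edges that occur in at least a chosen fraction of them; this unweighted edge count is a simplification, not the method of the paper, and `common_edges`, `min_fraction` and the gene names are hypothetical.

```python
from collections import Counter


def common_edges(networks, min_fraction=0.8):
    """Return the directed edges present in at least `min_fraction` of the networks.

    networks -- non-empty list of networks, each a set of (regulator, target) pairs
    """
    if not networks:
        raise ValueError("at least one network is required")
    # Count in how many networks each edge appears.
    counts = Counter(edge for net in networks for edge in set(net))
    threshold = min_fraction * len(networks)
    return {edge for edge, c in counts.items() if c >= threshold}


# Three hypothetical high-scoring networks over genes a, b, c.
likely_networks = [
    {("a", "b"), ("b", "c")},
    {("a", "b"), ("c", "b")},
    {("a", "b"), ("b", "c")},
]
print(common_edges(likely_networks, min_fraction=0.6))
print(common_edges(likely_networks, min_fraction=1.0))
```

Raising `min_fraction` trades coverage for reliability: with 1.0 only edges shared by every enumerated network survive.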
Inference Submitted by Carey Pridgeon to the University of Exeter
This thesis is available for Library use on the understanding that it is copyright material and that no quotation from the thesis may be published without proper acknowledgement. I certify that all material in this thesis which is not my own work has been identified and that no material has previously been submitted and approved for the award of a degree by this or any other University.
In this thesis we investigate the use of the Evolutionary Algorithm (EA) in application to problems in molecular biology. We examine two topics of current interest in the field of bioinformatics: Gene Expression Regulatory Network (GRN) reconstruction and core promoter recognition, spending most of this thesis on the latter problem. When exploring gene network reconstruction we investigate the problem in terms of reconstructing large-scale networks. We apply two forms of evolutionary algorithm and evaluate their effect on two classes of time series data: static (where the expression levels do not change over a time series) and dynamic (where expression levels are changing as a result of perturbation). We discover some of the properties of large-scale GRN models, and reveal some issues regarding variation among GRNs

