Results 1 - 8 of 8
A space–time tradeoff for permutation problems
- In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms (SODA), 2010
Abstract - Cited by 4 (2 self)
Many combinatorial problems, such as the traveling salesman, feedback arc set, cutwidth, and treewidth problem, can be formulated as finding a feasible permutation of n elements. Typically, such problems can be solved by dynamic programming in time and space $O^*(2^n)$, by divide and conquer in time $O^*(4^n)$ and polynomial space, or by a combination of the two in time $O^*(4^n 2^{-s})$ and space $O^*(2^s)$ for $s = n, n/2, n/4, \ldots$. Here, we show that one can improve the tradeoff to time $O^*(T^n)$ and space $O^*(S^n)$ with $TS < 4$ at any $\sqrt{2} < S < 2$. The idea is to find a small family of "thin" partial orders on the n elements such that every linear order is an extension of one member of the family. Our construction is optimal within a natural class of partial order families.
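As a rough illustration of the dynamic-programming baseline mentioned in this abstract (time and space within a polynomial factor of 2^n), the Python sketch below solves one permutation problem, weighted feedback arc set, by a recurrence over subsets. The function names, the weight-matrix input format, and the choice of problem are assumptions made for illustration only; this is not the paper's space-time tradeoff construction.

```python
from itertools import permutations


def min_feedback_arc_set(w):
    """Minimum total weight of backward arcs over all linear orders.

    w[a][b] is the weight of arc a -> b (0 if absent). An arc is
    "backward" when its tail is placed after its head in the order.
    Uses one table entry per subset, hence about 2^n space.
    """
    n = len(w)
    full = (1 << n) - 1
    best = [0] * (1 << n)  # best[s]: optimal cost of ordering the elements of s
    for s in range(1, full + 1):
        candidates = []
        for v in range(n):
            if s >> v & 1:
                rest = s & ~(1 << v)
                # v is placed last within s, so arcs v -> u (u in rest) point backward
                back = sum(w[v][u] for u in range(n) if rest >> u & 1)
                candidates.append(best[rest] + back)
        best[s] = min(candidates)
    return best[full]


def brute_force(w):
    """Reference answer by trying all n! orders (only for tiny inputs)."""
    n = len(w)
    return min(
        sum(w[p[j]][p[i]] for i in range(n) for j in range(i + 1, n))
        for p in permutations(range(n))
    )
```

The table `best` is what makes the memory requirement exponential; trading part of that table for extra running time is the theme of the abstract.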
Utilizing evolutionary information and gene expression data for estimating gene networks with Bayesian network models
2005
Abstract - Cited by 2 (0 self)
Since microarray gene expression data do not contain sufficient information for estimating accurate gene networks, other biological information has been considered to improve the estimated networks. Recent studies have revealed that highly conserved proteins that exhibit similar expression patterns in different organisms have almost the same function in each organism. Such conserved proteins are also known to play similar roles in terms of the regulation of genes. Therefore, this evolutionary information can be used to refine regulatory relationships among genes, which are estimated from gene expression data. We propose a statistical method for estimating gene networks from gene expression data by utilizing evolutionarily conserved relationships between genes. Our method simultaneously estimates two gene networks of two distinct organisms, with a Bayesian network model utilizing the evolutionary information so that gene expression data of one organism helps to estimate the gene network of the other. We show the effectiveness of the method through an analysis of Saccharomyces cerevisiae and Homo sapiens cell cycle gene expression data. Our method was successful in estimating gene networks that capture many known relationships as well as several unknown relationships which are likely to be novel. Supplementary information is available at
Finding Optimal Bayesian Network Given a Super-Structure
Abstract - Cited by 2 (0 self)
Classical approaches to learning Bayesian network structure from data have disadvantages in terms of complexity and lower accuracy of their results. However, a recent empirical study has shown that a hybrid algorithm significantly improves accuracy and speed: it learns a skeleton with an independency test (IT) approach and constrains the directed acyclic graphs (DAGs) considered during the search-and-score phase. Subsequently, we theorize the structural constraint by introducing the concept of a super-structure S, which is an undirected graph that restricts the search to networks whose skeleton is a subgraph of S. We develop a super-structure constrained optimal search (COS): its time complexity is upper bounded by $O(\gamma_m^n)$, where $\gamma_m < 2$ depends on the maximal degree m of S. Empirically, complexity depends on the average degree $\tilde{m}$, and sparse structures allow larger graphs to be calculated. Our algorithm is faster than an optimal search by several orders of magnitude and even finds more accurate results when given a sound super-structure. In practice, S can be approximated by IT approaches; the significance level of the tests controls its sparseness, making it possible to control the trade-off between speed and accuracy. For incomplete super-structures, a greedily post-processed version (COS+) still significantly outperforms other heuristic searches. Keywords: Bayesian networks, structure learning, optimal search, super-structure, connected subset
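To make the super-structure constraint concrete, the following Python sketch shows one way it could prune the search space: a node's parents may only be drawn from its neighbours in the undirected graph S. The adjacency-dict representation, the function names, and the optional `max_parents` cap are assumptions made for illustration; this is not the COS algorithm itself.

```python
from itertools import combinations


def candidate_parent_sets(superstructure, v, max_parents=None):
    """List every parent set of node v allowed by the super-structure.

    superstructure maps each node to the set of its neighbours in S
    (assumed symmetric). Only subsets of those neighbours are returned.
    """
    neighbours = sorted(superstructure[v])
    limit = len(neighbours)
    if max_parents is not None:
        limit = min(limit, max_parents)
    return [ps for k in range(limit + 1) for ps in combinations(neighbours, k)]


def respects_superstructure(superstructure, arcs):
    """True if the skeleton of the arc set is a subgraph of S."""
    return all(
        u in superstructure[v] and v in superstructure[u] for u, v in arcs
    )
```

A node of degree d in S has at most 2^d candidate parent sets here, instead of 2^(n-1) without the constraint, which is why sparse super-structures allow larger networks to be handled.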
Increasing Feasibility of Optimal Gene Network Estimation
- Genome Informatics, 2004
Abstract - Cited by 1 (0 self)
Disentangling networks of regulation of gene expression is a major challenge in the field of computational biology. Harvesting the information contained in microarray data sets is a promising approach towards this challenge. We propose an algorithm for the optimal estimation of Bayesian networks from microarray data, which reduces the CPU time and memory consumption of previous algorithms. We prove that the space complexity can be reduced from O(n ) to O(2 ), and that the expected calculation time can be reduced from O(n ) to O(n ), where n is the number of genes. We make intrinsic use of a limitation of the maximal number of regulators of each gene, which has biological as well as statistical justifications. The improvements are significant for some applications in research.
Methods to Accelerate the Learning of Bayesian Network Structures
Abstract - Cited by 1 (0 self)
Bayesian networks have become a standard technique in the representation of uncertain knowledge. This paper proposes methods that can accelerate the learning of a Bayesian network structure from a data set. These methods are applicable when learning an equivalence class of Bayesian network structures whilst using a score-and-search strategy. They work by constraining the number of validity tests that need to be done and by caching the results of validity tests. The results of experiments show that the methods improve the performance of algorithms that search through the space of equivalence classes multiple times and that operate on wide data sets. The experiments were performed by sampling data from six standard Bayesian networks and running an ant colony optimization algorithm designed to learn a Bayesian network equivalence class.
Enumeration of Likely Gene Networks and Network Motif Extraction for Large Gene Networks
2003
Abstract
Introduction. The reliable estimation of gene networks from gene expression measurements is a major challenge in the field of bioinformatics. Recently, an algorithm for the optimal estimation of small gene networks within the Bayesian network framework was found [3]. This algorithm was further extended to allow the enumeration of all optimal networks and also suboptimal networks in the order of their likelihood [2]. In this work, we show how this result can be applied to the enumeration of likely gene networks for a large number of genes. Enumerating a number of the most likely gene network models, instead of focusing only on the single most likely network model, makes it possible to evaluate the reliability of the estimations. If we can find a partial network that is common to most of the likely network models, we can expect this part to be the most reliable part. We denote such common parts as gene network motifs. 2 Method. Let us start by defining a class of subsets of the set of acyclic dir
On Bayesian Networks and Partial Orders
Abstract
The essence of the Bayesian network model is largely embodied by its structural component: a directed acyclic graph (DAG). A DAG encodes assertions of conditional independence, whereas the remaining model parameters specify the actual local conditional distributions. The DAG plays an important role, for instance, in causal discovery and inference, where the DAG is interpreted as a representation of direct causes; that is, an arc uv from node u to node v asserts that the events associated with u are direct causes of the events associated with v. Given a DAG, it is often trivial to fit the parameters of the local conditional distributions to a data set. In contrast, learning the (postulated, true) DAG from data is very challenging both statistically and computationally. We next address some of the challenges in more detail. Consider a DAG (N, A) with node set $N = \{1, 2, \ldots, n\}$ and arc set $A \subseteq N \times N$. We identify the DAG with its arc set and write $A_v$ for the parents of node v, that is, $A_v = \{u : uv \in A\}$. Each node v is associated with m random variables $x_{v1}, \ldots, x_{vm}$, which are understood as the vth row of an $n \times m$ matrix X. Commonly made assumptions of independence (or exchangeability) [1] imply that the rows of X are conditionally independent given each row's parents. In terms of probability (density), $p(X \mid A) = \prod_{v \in N} p(X_v \mid X_{A_v}, A)$. Typically, we assume that X is observed and the task is to "learn" A, that is,
Bayesian structure discovery in Bayesian networks with less space
Abstract
Current exact algorithms for score-based structure discovery in Bayesian networks on n nodes run in time and space within a polynomial factor of $2^n$. For practical use, the space requirement is the bottleneck, which motivates trading space against time. Here, previous results on finding an optimal network structure in less space are extended in two directions. First, we consider the problem of computing the posterior probability of a given arc set. Second, we operate with the general partial order framework and its specialization to bucket orders, introduced recently for related permutation problems. The main technical contribution is the development of a fast algorithm for a novel zeta transform variant, which may be of independent interest.
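The abstract does not describe its zeta transform variant, so the Python sketch below shows only the textbook fast zeta transform over all subsets of an n-element set, as background. The function name and the bitmask-indexed list representation are assumptions made for illustration.

```python
def zeta_transform(f, n):
    """Return g with g[S] = sum of f[T] over all subsets T of S.

    Subsets of {0, ..., n-1} are encoded as bitmasks, so f has length 2^n.
    Runs in O(n * 2^n) additions instead of the naive O(3^n).
    """
    g = list(f)
    for i in range(n):
        bit = 1 << i
        for s in range(1 << n):
            if s & bit:
                # add the contribution of the subsets that omit element i
                g[s] += g[s ^ bit]
    return g
```

For example, with n = 2 and f = [1, 2, 3, 4], the entry for the full set is the sum of all four values. Note that this standard version stores all 2^n values at once, which is exactly the space bottleneck the abstract is concerned with.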

