Results 1–10 of 26
Maximum Likelihood Haplotyping for General Pedigrees
2004
Cited by 34 (2 self)
Haplotype data is valuable in mapping disease-susceptibility genes in the study of Mendelian and complex diseases. We present algorithms for inferring a most likely haplotype configuration for general pedigrees, implemented in the newest version of the genetic linkage analysis system SUPERLINK. In SUPERLINK, genetic linkage analysis problems are represented internally using Bayesian networks. The use of Bayesian networks enables efficient maximum likelihood haplotyping for more complex pedigrees than was previously possible. Furthermore, to support efficient haplotyping for larger pedigrees, we have also incorporated a novel algorithm for determining a better elimination order for the variables of the Bayesian network. The presented optimization algorithm also improves likelihood computations. We present experimental results for the new algorithms on a variety of real and semi-artificial data sets, and use our software to evaluate MCMC approximations for haplotyping.
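The "most likely configuration" task the abstract describes is, in Bayesian-network terms, an MPE query. A minimal sketch, on an invented two-variable toy network solved by brute-force enumeration (real systems such as the one above use Bayesian-network inference, not enumeration; all CPT numbers are made up):

```python
# MPE (most probable explanation) on a toy two-variable network,
# found by brute force; CPTs and variable names are invented.
P_G = {0: 0.8, 1: 0.2}                        # founder genotype prior
P_H_G = {(0, 0): 0.95, (0, 1): 0.05,          # P(child genotype | parent)
         (1, 0): 0.3,  (1, 1): 0.7}
P_E_H = {0: 0.1, 1: 0.9}                      # P(observed trait | child)

# Score every full assignment and keep the maximizer.
best = max(
    ((g, h) for g in (0, 1) for h in (0, 1)),
    key=lambda gh: P_G[gh[0]] * P_H_G[gh] * P_E_H[gh[1]],
)
print(best)   # (1, 1)
```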
New advances in inference by recursive conditioning
 In Proceedings of the 19th Conference on Uncertainty in Artificial Intelligence
Cited by 21 (1 self)
Recursive Conditioning (RC) was introduced recently as an any-space algorithm for inference in Bayesian networks which can trade time for space by varying the size of its cache at the increment needed to store a floating point number. Under full caching, RC has an asymptotic time and space complexity comparable to mainstream algorithms based on variable elimination and clustering (exponential in the network treewidth and linear in its size). We show two main results about RC in this paper. First, we show that its actual space requirements under full caching are much more modest than those needed by mainstream methods, and study the implications of this finding. Second, we show that RC can effectively deal with determinism in Bayesian networks by employing standard logical techniques, such as unit resolution, allowing a significant reduction in its time requirements in certain cases. We illustrate our results using a number of benchmark networks, including the very challenging ones that arise in genetic linkage analysis.
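The core idea, condition on a variable, recurse on the now-independent components, and cache subproblem results, can be sketched on a toy chain network. This is an illustrative invention (network, domains, and names are made up), not the RC implementation discussed above:

```python
# Sketch of recursive conditioning on a toy chain A -> B -> C (all binary).
# Factors are (scope, table) pairs; all numbers are invented.
factors = [
    (("A",), {(0,): 0.6, (1,): 0.4}),                                    # P(A)
    (("A", "B"), {(0, 0): 0.7, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.8}),  # P(B|A)
    (("B", "C"), {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.4, (1, 1): 0.6}),  # P(C|B)
]
DOMAIN = (0, 1)
cache = {}   # "full caching": every solved subproblem is stored

def components(ids, unassigned):
    """Group factors that are connected through a shared unassigned variable."""
    pool, comps = list(ids), []
    while pool:
        comp, grown = [pool.pop()], True
        while grown:
            grown = False
            shared = {v for i in comp for v in factors[i][0] if v in unassigned}
            for j in list(pool):
                if shared & set(factors[j][0]):
                    pool.remove(j)
                    comp.append(j)
                    grown = True
        comps.append(tuple(sorted(comp)))
    return comps

def rc(ids, ctx):
    """Condition on one variable at a time, recursing independently on
    disconnected components, with caching of subproblem values."""
    scopes = [factors[i][0] for i in ids]
    unassigned = sorted({v for s in scopes for v in s} - set(ctx))
    if not unassigned:                      # base case: evaluate the factors
        p = 1.0
        for i in ids:
            scope, table = factors[i]
            p *= table[tuple(ctx[v] for v in scope)]
        return p
    key = (ids, tuple(sorted((v, ctx[v]) for v in ctx
                             if any(v in s for s in scopes))))
    if key not in cache:
        comps = components(ids, set(unassigned))
        if len(comps) > 1:                  # independent subproblems: multiply
            val = 1.0
            for comp in comps:
                val *= rc(comp, ctx)
        else:                               # condition on a variable and sum
            x = unassigned[0]
            val = sum(rc(ids, {**ctx, x: d}) for d in DOMAIN)
        cache[key] = val
    return cache[key]

print(rc((0, 1, 2), {"C": 1}))   # P(C = 1), which is 0.35 here
```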
On Finding Minimal w-Cutset Problem
Cited by 16 (8 self)
The complexity of a reasoning task over a graphical model is tied to the induced width of the underlying graph. It is well-known that conditioning on (assigning values to) a subset of variables yields a subproblem of reduced complexity in which the instantiated variables are removed. If the ...
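A w-cutset is a set of variables whose removal leaves a graph of induced width at most w. A minimal sketch of the idea, using a greedy highest-degree heuristic and a min-degree induced-width estimate (both heuristics are simple stand-ins chosen for illustration, not this paper's algorithm):

```python
def induced_width(adj):
    """Induced width along a min-degree elimination order
    (an upper bound on the graph's treewidth)."""
    g = {v: set(ns) for v, ns in adj.items()}
    width = 0
    while g:
        v = min(g, key=lambda u: len(g[u]))   # eliminate a min-degree vertex
        ns = g.pop(v)
        width = max(width, len(ns))
        for a in ns:                          # connect v's neighbours
            g[a] |= ns - {a}
            g[a].discard(v)
    return width

def greedy_w_cutset(adj, w):
    """Remove (condition on) the highest-degree variable until the
    remaining graph's induced width drops to w."""
    g = {v: set(ns) for v, ns in adj.items()}
    cutset = []
    while induced_width(g) > w:
        v = max(g, key=lambda u: len(g[u]))
        for a in g.pop(v):
            g[a].discard(v)
        cutset.append(v)
    return cutset, g

# A 4-cycle has induced width 2; removing any single vertex leaves a
# path of width 1, so a 1-cutset of size one suffices.
cycle = {i: {(i - 1) % 4, (i + 1) % 4} for i in range(4)}
cutset, rest = greedy_w_cutset(cycle, 1)
print(len(cutset))   # 1
```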
Cutset sampling for Bayesian networks
 Journal of Artificial Intelligence Research
Cited by 14 (6 self)
The paper presents a new sampling methodology for Bayesian networks that samples only a subset of variables and applies exact inference to the rest. Cutset sampling is a network structure-exploiting application of the Rao-Blackwellisation principle to sampling in Bayesian networks. It improves convergence by exploiting memory-based inference algorithms. It can also be viewed as an anytime approximation of the exact cutset-conditioning algorithm developed by Pearl. Cutset sampling can be implemented efficiently when the sampled variables constitute a loop-cutset of the Bayesian network and, more generally, when the induced width of the network’s graph conditioned on the observed sampled variables is bounded by a constant w. We demonstrate empirically the benefit of this scheme on a range of benchmarks.
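The Rao-Blackwellisation principle at work here can be seen in a stripped-down sketch: sample only a cutset variable and sum out the rest exactly, so each sample contributes a smooth conditional value rather than a raw 0/1 indicator. The toy chain and all its numbers are invented:

```python
import random
random.seed(0)

# Toy chain C -> X -> Y with binary variables and evidence Y = 1.
P_C = {0: 0.7, 1: 0.3}
P_X_given_C = {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.8}  # (c, x)
P_Y1_given_X = {0: 0.25, 1: 0.6}                                    # P(Y=1 | x)

def exact_given_cutset(c):
    """Exact inference over the non-cutset part: sum out X."""
    return sum(P_X_given_C[(c, x)] * P_Y1_given_X[x] for x in (0, 1))

# Cutset sampling: sample only the cutset variable C; each sample
# contributes P(Y=1 | c), a value strictly between 0 and 1, which gives
# the Rao-Blackwellised estimator lower variance than raw sampling.
N = 5000
est = sum(exact_given_cutset(0 if random.random() < P_C[0] else 1)
          for _ in range(N)) / N

exact = sum(P_C[c] * exact_given_cutset(c) for c in (0, 1))
print(round(exact, 4))   # 0.3585
```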
SampleSearch: Importance Sampling in Presence of Determinism
2009
Cited by 14 (3 self)
The paper focuses on developing effective importance sampling algorithms for mixed probabilistic and deterministic graphical models. The use of importance sampling in such graphical models is problematic because it generates many useless zero-weight samples which are rejected, yielding an inefficient sampling process. To address this rejection problem, we propose the SampleSearch scheme that augments sampling with systematic constraint-based backtracking search. We characterize the bias introduced by the combination of search with sampling, and derive a weighting scheme which yields an unbiased estimate of the desired statistics (e.g., probability of evidence). When computing the weights exactly is too complex, we propose an approximation which has a weaker guarantee of asymptotic unbiasedness. We present results of an extensive empirical evaluation demonstrating that SampleSearch outperforms other schemes in the presence of a significant amount of determinism.
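A minimal sketch of the search component only (the weighting scheme that debiases the resulting samples is not shown): sample variables in order, and on a constraint violation flip the value or backtrack instead of discarding the whole sample. The variables, clauses, and uniform proposal are invented for the example:

```python
import random
random.seed(1)

VARS = ["a", "b", "c"]
# Deterministic part as clauses: each clause is a list of (var, value)
# literals, at least one of which must hold. Invented for illustration.
CLAUSES = [[("a", 1), ("b", 1)], [("b", 0), ("c", 1)], [("a", 0), ("c", 0)]]

def consistent(assign):
    """A clause fails only once all its variables are assigned
    and none of its literals is satisfied."""
    for clause in CLAUSES:
        if all(v in assign for v, _ in clause) and \
           not any(assign.get(v) == val for v, val in clause):
            return False
    return True

def sample_search():
    """Sample each variable from the proposal; on a violation flip the
    value, and if both values fail, backtrack one level. Plain rejection
    sampling would throw away the whole sample instead.
    (Assumes the constraint set is satisfiable.)"""
    assign, tried = {}, {v: set() for v in VARS}
    i = 0
    while i < len(VARS):
        v = VARS[i]
        choices = [d for d in (0, 1) if d not in tried[v]]
        if not choices:                      # dead end: undo previous level
            tried[v] = set()
            i -= 1
            prev = VARS[i]
            tried[prev].add(assign.pop(prev))
            continue
        assign[v] = random.choice(choices)   # proposal: uniform over untried
        if consistent(assign):
            i += 1
        else:
            tried[v].add(assign.pop(v))
    return dict(assign)

solutions = {tuple(sorted(sample_search().items())) for _ in range(200)}
```

Every returned sample is a complete, constraint-consistent assignment; with plain importance sampling, 6 of the 8 assignments here would be rejected.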
Join-Graph Propagation Algorithms
Cited by 10 (6 self)
The paper investigates parameterized approximate message-passing schemes that are based on bounded inference and are inspired by Pearl’s belief propagation algorithm (BP). We start with the bounded inference mini-clustering algorithm and then move to the iterative scheme called Iterative Join-Graph Propagation (IJGP), which combines both iteration and bounded inference. IJGP belongs to the class of Generalized Belief Propagation algorithms, a framework that has enabled connections with approximate algorithms from statistical physics, and is shown empirically to surpass the performance of mini-clustering and belief propagation, as well as a number of other state-of-the-art algorithms, on several classes of networks. We also provide insight into the accuracy of IBP and IJGP by relating these algorithms to well-known classes of constraint propagation schemes.
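Plain iterative BP, the single-factor-per-cluster case that join-graph schemes generalise, can be sketched on a small loopy model. The 3-cycle and its potentials are invented for illustration:

```python
# Minimal loopy sum-product belief propagation on a 3-variable cycle.
phi = {0: [2.0, 1.0], 1: [1.0, 1.0], 2: [1.0, 1.0]}   # unary potentials
SAME, DIFF = 1.0, 0.5                                  # attractive pairwise
nbrs = {0: [1, 2], 1: [0, 2], 2: [0, 1]}

def pair(xi, xj):
    return SAME if xi == xj else DIFF

msg = {(i, j): [0.5, 0.5] for i in nbrs for j in nbrs[i]}

for _ in range(100):                  # iterate messages toward a fixed point
    new = {}
    for (i, j) in msg:
        vec = []
        for xj in (0, 1):
            total = 0.0
            for xi in (0, 1):
                p = phi[i][xi] * pair(xi, xj)
                for k in nbrs[i]:
                    if k != j:        # product of incoming messages, minus j
                        p *= msg[(k, i)][xi]
                total += p
            vec.append(total)
        z = vec[0] + vec[1]
        new[(i, j)] = [vec[0] / z, vec[1] / z]
    msg = new

def belief(i):
    """Approximate marginal: unary potential times incoming messages."""
    b = [phi[i][x] for x in (0, 1)]
    for k in nbrs[i]:
        b = [b[x] * msg[(k, i)][x] for x in (0, 1)]
    z = b[0] + b[1]
    return [b[0] / z, b[1] / z]
```

On this single loop the beliefs are approximate (the exact marginal of variable 0 is 2/3 for state 0) but preserve the bias of the unary potential.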
Online system for faster multipoint linkage analysis via parallel execution on thousands of personal computers
 American Journal of Human Genetics
Cited by 9 (2 self)
Computation of LOD scores is a valuable tool for mapping disease-susceptibility genes in the study of Mendelian and complex diseases. However, computation of exact multipoint likelihoods of large inbred pedigrees with extensive missing data is often beyond the capabilities of a single computer. We present a distributed system called SUPERLINK-ONLINE for the computation of multipoint LOD scores of large inbred pedigrees. It achieves high performance via the efficient parallelization of the algorithms in SUPERLINK, a state-of-the-art serial program for these tasks, and through the use of the idle cycles of thousands of personal computers. The main algorithmic challenge has been to efficiently split a large task for distributed execution in a highly dynamic, non-dedicated running environment. Notably, the system is available online, which allows computationally intensive analyses to be performed with no need for either the installation of software or the maintenance of a complicated distributed environment. As the system was being developed, it was extensively tested by collaborating medical centers worldwide on a variety of real data sets, some of which are presented in this article. The LOD score is defined as log10(LHA/LH0), where LH0 ...
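The LOD score itself is just the base-10 log of a likelihood ratio; a tiny sketch with hypothetical likelihood values:

```python
import math

def lod(l_ha, l_h0):
    """LOD score: base-10 log of the likelihood ratio between the
    linkage hypothesis HA and the no-linkage null H0."""
    return math.log10(l_ha / l_h0)

# Hypothetical likelihoods: a LOD of 3 (odds of 1000:1 in favour of
# linkage) is the conventional significance threshold.
print(lod(1e-12, 1e-15))   # approximately 3.0
```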
Pushing the power of stochastic greedy ordering schemes for inference in graphical models
In AAAI 2011
Cited by 8 (5 self)
We study iterative randomized greedy algorithms for generating (elimination) orderings with small induced width and state space size, two parameters known to bound the complexity of inference in graphical models. We propose and implement the Iterative Greedy Variable Ordering (IGVO) algorithm, a new variant within this algorithm class. An empirical evaluation using different ranking functions and conditions of randomness demonstrates that IGVO finds significantly better orderings than standard greedy ordering implementations when evaluated within an anytime framework. Additional order-of-magnitude improvements are demonstrated on a multi-core system, thus further expanding the set of solvable graphical models. The experiments also confirm the superiority of the MinFill heuristic within the iterative scheme.
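The iterated-randomized-greedy idea can be sketched as a min-fill pass with random tie-breaking, repeated while keeping the best order found. This is a simple sketch of the general scheme, not the IGVO implementation; the graph is invented:

```python
import random

def min_fill_pass(adj, rng):
    """One randomized greedy pass: repeatedly eliminate a variable chosen
    uniformly among those adding the fewest fill edges."""
    g = {v: set(ns) for v, ns in adj.items()}
    order, width = [], 0
    while g:
        def fill(v):
            ns = list(g[v])
            return sum(1 for a in range(len(ns)) for b in range(a + 1, len(ns))
                       if ns[b] not in g[ns[a]])
        best = min(fill(v) for v in g)
        v = rng.choice(sorted(u for u in g if fill(u) == best))  # random tie-break
        ns = g.pop(v)
        width = max(width, len(ns))
        for a in ns:                       # add fill edges among neighbours
            g[a] |= ns - {a}
            g[a].discard(v)
        order.append(v)
    return order, width

def iterative_greedy_ordering(adj, iters=25, seed=0):
    """Repeat the randomized pass and keep the smallest-width order."""
    rng = random.Random(seed)
    return min((min_fill_pass(adj, rng) for _ in range(iters)),
               key=lambda ow: ow[1])

# A 5-cycle has treewidth 2; the iterated scheme finds a width-2 order.
cycle = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
order, width = iterative_greedy_ordering(cycle)
print(width)   # 2
```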
Studies in Lower Bounding Probability of Evidence using the Markov Inequality
Cited by 8 (4 self)
Computing the probability of evidence, even with known error bounds, is NP-hard. In this paper we address this hard problem by settling on an easier one. We propose an approximation which provides high-confidence lower bounds on the probability of evidence but does not have any guarantees in terms of relative or absolute error. Our proposed approximation is a randomized importance sampling scheme that uses the Markov inequality. However, a straightforward application of the Markov inequality may lead to poor lower bounds. We therefore propose several heuristic measures to improve its performance in practice. Empirical evaluation of our scheme against state-of-the-art lower bounding schemes reveals the promise of our approach.
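The basic mechanism: importance-sampling weights are nonnegative with mean Z, so the Markov inequality turns the largest observed weight into a probabilistic lower bound on Z. A toy sketch with an invented two-variable model (the bound is valid but loose here, which is exactly the motivation for the paper's heuristic improvements):

```python
import random
random.seed(2)

# Toy unnormalized model over two binary variables; Z is its partition
# function. In practice Z is unknown; we keep it here for checking.
f = {(0, 0): 0.2, (0, 1): 0.5, (1, 0): 0.1, (1, 1): 0.7}
Z = sum(f.values())                               # 1.5

def weight():
    """One importance-sampling weight under a uniform proposal q = 1/4,
    so E_q[w] = sum_x f(x) = Z (an unbiased estimator of Z)."""
    x = (random.randint(0, 1), random.randint(0, 1))
    return f[x] / 0.25

# Markov inequality: P(max_i w_i > beta * Z) <= N / beta, hence with
# confidence 1 - N/beta (95% here), Z >= max_i w_i / beta.
N, beta = 100, 2000.0
lower = max(weight() for _ in range(N)) / beta
```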
AND/OR Importance Sampling
Cited by 5 (4 self)
The paper introduces AND/OR importance sampling for probabilistic graphical models. In contrast to importance sampling, AND/OR importance sampling caches samples in the AND/OR space and then extracts a new sample mean from the stored samples. We prove that AND/OR importance sampling may have lower variance than importance sampling, thereby providing a theoretical justification for preferring it over importance sampling. Our empirical evaluation demonstrates that AND/OR importance sampling is far more accurate than importance sampling in many cases.
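The variance-reduction effect can be seen in a stripped-down abstraction (not the paper's algorithm on AND/OR search spaces): when a problem decomposes into independent components, averaging each component's samples separately and multiplying the means reuses every sample across component combinations. The functions and thresholds below are invented:

```python
import random
import statistics
random.seed(3)

# Toy decomposition: given the root, the problem splits into independent
# components, so Z = E[f(X)] * E[g(Y)] with X, Y independent U(0, 1).
def f(x): return 1.0 if x < 0.1 else 0.0
def g(y): return 1.0 if y < 0.1 else 0.0

def or_estimate(n):
    """OR-space estimator: average the product over joint samples."""
    return sum(f(random.random()) * g(random.random()) for _ in range(n)) / n

def and_or_estimate(n):
    """AND/OR-style estimator: average each independent component
    separately (reusing all n samples per component), then multiply."""
    mf = sum(f(random.random()) for _ in range(n)) / n
    mg = sum(g(random.random()) for _ in range(n)) / n
    return mf * mg

runs_or = [or_estimate(100) for _ in range(300)]
runs_ao = [and_or_estimate(100) for _ in range(300)]
# True Z = 0.1 * 0.1 = 0.01; the AND/OR-style estimator's variance is
# far smaller because it never multiplies two raw indicator samples.
```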