Blocking Gibbs Sampling in Very Large Probabilistic Expert Systems
 Internat. J. Human–Computer Studies
, 1995
Cited by 46 (0 self)
We introduce a methodology for performing approximate computations in very complex probabilistic systems (e.g. huge pedigrees). Our approach, called blocking Gibbs, combines exact local computations with Gibbs sampling in a way that complements the strengths of both. The methodology is illustrated on a real-world problem involving a heavily inbred pedigree containing 20,000 individuals. We present results showing that blocking Gibbs sampling converges much faster than plain Gibbs sampling for very complex problems.
Estimation of conditional multilocus gene identity among relatives
 STATISTICS IN MOLECULAR BIOLOGY AND GENETICS: SELECTED PROCEEDINGS OF A 1997 JOINT AMS-IMS-SIAM SUMMER CONFERENCE ON STATISTICS IN MOLECULAR BIOLOGY, VOL. 33 OF IMS LECTURE NOTE-MONOGRAPH SERIES, INSTITUTE OF MATHEMATICAL STATISTICS
, 1999
Cited by 12 (2 self)
Genetic Analysis Workshop 10 identified five key factors contributing to the resolution of the genetic factors affecting complex traits. These include analysis with multipoint methods, use of extended pedigrees, and selective sampling of pedigrees. By sampling the affected individuals in an extended pedigree, we obtain individuals who have an increased probability of sharing genes identical by descent (IBD) at marker loci that are linked to the trait locus or loci. Given marker data on specified members of a pedigree, the conditional IBD status among relatives can be assessed, but exact computation is often impractical for multiple linked markers on complex pedigrees. The use of Markov chain Monte Carlo (MCMC) methods greatly extends the range of models and data sets for which analysis is computationally feasible. Many forms of MCMC have now been implemented in the context of genetic analysis. Here we propose a new sampler, which takes as latent variables the segregation indicators at marker loci, and jointly updates all indicators corresponding to a given meiosis. The sampler has good mixing properties. Questions of irreducibility are also addressed.
Data simulation software for whole-genome association and other studies in human genetics
 Proceedings of the Pacific Symposium on Biocomputing
, 2006
Cited by 10 (1 self)
Genome-wide association studies have become a reality in the study of the genetics of complex disease. This technology provides a wealth of genomic information on patient samples, from which we hope to learn novel biology and detect important genetic and environmental factors for disease processes. Because strategies for analyzing these data have not kept pace with the laboratory methods that generate the data, it is unlikely that these advances will immediately lead to an improved understanding of the genetic contribution to common human disease and drug response. Currently, no single analytical method will allow us to extract all information from a whole-genome association study. Thus, many novel methods are being proposed and developed. It will be vital for the success of these new methods to have the ability to simulate datasets consisting of polymorphisms throughout the genome with realistic linkage disequilibrium patterns. Within these datasets, we can embed genetic models of disease whereby we can evaluate the ability of novel methods to detect these simulated effects. This paper describes a new software package, genomeSIM, for the simulation of large-scale genomic data in population-based case-control samples. It allows for single SNP, as well as gene-gene interaction models to be associated with disease risk. We describe the algorithm and demonstrate its utility for future genetic studies of whole-genome association.
Multilocus linkage analysis by blocked Gibbs sampling
 Statistics and Computing
, 2000
Cited by 9 (0 self)
The problem of multilocus linkage analysis is expressed as a graphical model, making explicit a previously implicit connection, and recent developments in the field are described in this context. A novel application of blocked Gibbs sampling for Bayesian networks is developed to generate inheritance matrices from an irreducible Markov chain. This is used as the basis for reconstruction of historical meiotic states and approximate calculation of the likelihood function for the location of an unmapped genetic trait. We believe this to be the only approach that currently makes fully informative multilocus linkage analysis possible on large extended pedigrees.
Problems with the Determination of the Noncommunicating Classes for MCMC Applications in Pedigree Analysis
, 1998
Cited by 4 (2 self)
Exact calculations for probabilities on complex pedigrees are computationally intensive and very often infeasible. Markov chain Monte Carlo methods are frequently used to approximate probabilities and likelihoods of interest. However, when a locus with more than two alleles is considered, the underlying Markov chain is not guaranteed to be irreducible and the results of such analyses are unreliable. A method for finding the noncommunicating classes of the Markov chain would be very useful in designing algorithms that can jump between these classes. In this paper we will examine some existing work on this problem and point out its limitations. We will also comment on the difficulty of developing a useful algorithm.
Keywords: Complex pedigrees, reducibility, islands, Gibbs sampling
1 Introduction
The computation of probabilities on pedigrees is an essential component in any analysis of genetic data on groups of related individuals. Such computations are relevant ...
Linkage Analysis With Sequential Imputation
 GENET EPIDEMIOL
, 2003
Cited by 3 (2 self)
... In this article, we propose a Monte Carlo method for linkage analysis based on sequential imputation. Unlike exact methods, sequential imputation can handle large pedigrees with a moderate number of loci in its current implementation. This Monte Carlo method is an application of importance sampling, in which we sequentially impute ordered genotypes locus by locus, and then impute inheritance vectors conditioned on these genotypes. The resulting inheritance vectors, together with the importance sampling weights, are used to derive a consistent estimator of any linkage statistic of interest. The linkage statistic can be parametric or nonparametric; we focus on nonparametric linkage statistics. We demonstrate that accurate estimates can be achieved within a reasonable computing time. A simulation study illustrates the potential gain in power using our method for multilocus linkage analysis with large pedigrees. We simulated data at six markers under three models. We analyzed them using both sequential imputation and GENEHUNTER. GENEHUNTER had to drop between 38-54% of pedigree members, whereas our method was able to use all pedigree members. The power gains of using all pedigree members were substantial under 2 of the 3 models. We implemented sequential imputation for multilocus linkage analysis in a user-friendly software package called SIMPLE.
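The weighted estimator mentioned in this abstract can be illustrated with a generic self-normalized importance sampling sketch. This is not the SIMPLE implementation and does not perform sequential imputation: the three configurations, the target and proposal probabilities, and the statistic values below are all hypothetical, chosen only so the exact answer can be checked by enumeration.

```python
import random


def self_normalized_estimate(samples, weights, statistic):
    """Weighted average of statistic(x), normalized by the sum of the weights."""
    total_weight = sum(weights)
    weighted_sum = sum(w * statistic(x) for x, w in zip(samples, weights))
    return weighted_sum / total_weight


# Hypothetical toy setup: three configurations standing in for inheritance vectors.
target = {"a": 0.7, "b": 0.2, "c": 0.1}        # distribution we want expectations under
proposal = {"a": 1 / 3, "b": 1 / 3, "c": 1 / 3}  # distribution we actually sample from
score = {"a": 1.0, "b": 4.0, "c": 10.0}         # stand-in for a linkage statistic

rng = random.Random(0)  # fixed seed so the run is repeatable
configs = list(proposal)
samples = rng.choices(configs, weights=[proposal[c] for c in configs], k=20000)

# Importance weight = target probability / proposal probability of the drawn sample.
weights = [target[x] / proposal[x] for x in samples]

estimate = self_normalized_estimate(samples, weights, score.get)
exact = sum(target[c] * score[c] for c in configs)
print(estimate, exact)
```

With more samples the estimate approaches the exact expectation, which is the sense in which such an estimator is consistent.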
Pedigree generation for analysis of genetic linkage and association
, 2004
Cited by 2 (0 self)
We have developed a software package, SIMLA (simulation of linkage and association), which can be used to generate pedigree data under user-specified conditions. The number and location of disease loci, disease penetrances, marker locations, and marker disequilibrium with a disease locus and with other markers can be controlled. In addition, the pedigree size and availability of genotype data may also be specified, and a number of rules for family ascertainment are available. Estimates for power and type I errors can be evaluated under a variety of conditions, as needed by the user. We developed this simulation program because there are no publicly available programs to simulate variable levels of both recombination and linkage disequilibrium (LD) in general pedigrees. Genetic researchers are routinely applying both tests of linkage and family-based tests of association in the search for complex disease genes, and a plethora of different statistical approaches are available. Thus there is a need for the flexible statistical simulation program that we describe. This is the only program that we are aware of that allows simulation of linkage and association for multiple markers in extended pedigrees, nuclear families or in sets of unrelated cases and controls. Furthermore, the program not only allows for variable levels of LD among markers but also between markers and disease loci. SIMLA can simulate the complex and variable levels of LD that have been observed at close markers across the genome and allows for realistic simulation of complex relationships between markers. The program will be useful for studying and comparing existing statistical tests, for developing new genetic linkage and association statistics, planning sample sizes for new studies, and interpreting genetic analysis results.
Genetic Function Analysis
this paper, it can be shown that the vast majority of genetic circuits show partial correlations with all of their loci. There are pathological exceptions to this, such as the parity function, which only shows a correlation when its entire basis of support is included. The parity function yields 1 if the sum of the locus states (which we have represented as 0s and 1s) is even, 0 if odd. All partial measures of mutual information yield 0. We need to know the expectation value of the mutual information. We can enumerate the possible pg mappings as follows: Each entry in a Boolean disease table can be either zero or one. Form the following generating function:
\[ G(D) = \prod_{i_1, \ldots, i_n = 1}^{4} \left( 1 + \left( L_1(i_1) L_2(i_2) \cdots L_n(i_n)\, d \right)^{p_g} \right) \tag{29} \]
where $n$ is the number of loci, $p_g$ is the population frequency of a given genotype, and the indices $i_1, \ldots, i_n$ are genotype indices, reflecting the four possible genotypes that can exist at each locus, assuming all alleles are distinct. The loci are denoted by the $L$, and the population of affecteds is tracked by the exponent of $d$ in the final expansion. This generating function will automatically enumerate all of the genotype frequencies associated with all possible ways of filling out the Boolean disease truth table, and the final expression will have 4 terms in its product, or 3 terms if we assume the disease is autosomal. Each term reflects the presence of a zero (nonaffected status) or a one (affected status) in the disease column. Nonaffected status is represented by the 1 in the generating function factor, as nonaffectedness does not contribute to the genotype-disease correlation. The other term keeps track of the contribution to the genotype-disease correlation when there is a 1 in the disease column
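The parity claim in this excerpt can be checked numerically. The sketch below makes a simplifying assumption that is not in the excerpt: each locus is a binary state (0 or 1) and all joint states are equally likely. The function names are illustrative. Under that assumption, the mutual information between the disease column and any proper subset of loci comes out as 0 bits, and only the full set of loci gives 1 bit.

```python
from collections import Counter
from itertools import combinations, product
from math import log2


def parity(states):
    """Return 1 if the sum of the locus states is even, 0 if odd (as in the text)."""
    return 1 if sum(states) % 2 == 0 else 0


def mutual_information(n_loci, subset):
    """I(loci in `subset`; disease) in bits, assuming all 2**n_loci states are equally likely."""
    total = 2 ** n_loci
    joint = Counter()
    for states in product((0, 1), repeat=n_loci):
        key = tuple(states[i] for i in subset)
        joint[(key, parity(states))] += 1

    # Marginal counts for the observed loci and for the disease column.
    locus_counts = Counter()
    disease_counts = Counter()
    for (key, d), count in joint.items():
        locus_counts[key] += count
        disease_counts[d] += count

    mi = 0.0
    for (key, d), count in joint.items():
        p_joint = count / total
        p_indep = (locus_counts[key] / total) * (disease_counts[d] / total)
        mi += p_joint * log2(p_joint / p_indep)
    return mi


n = 4
for k in range(1, n + 1):
    values = {mutual_information(n, s) for s in combinations(range(n), k)}
    print(k, values)
```

Every subset of size 1 to 3 prints a mutual information of zero, while the single subset of size 4 prints one bit, matching the statement that the parity function shows a correlation only when its entire basis of support is included.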
Linkage Disequilibrium Analysis, Pedigree Data Corresponding Author:
, 2005
"... The impact of using related individuals for haplotype ..."