Model-based inference of haplotype block variation
 Proceedings of the Seventh Annual International Conference on Computational Molecular Biology (RECOMB 2003)
, 2003
Cited by 49 (6 self)
The uneven recombination structure of human DNA has been highlighted by several recent studies. Knowledge of the haplotype blocks generated by this phenomenon can be applied to dramatically increase the statistical power of genetic mapping. Several criteria have already been proposed for identifying these blocks, all of which require haplotypes as input. We propose a comprehensive statistical model of haplotype block variation and show how the parameters of this model can be learned from haplotypes and/or unphased genotype data. Using real-world SNP data, we demonstrate that our approach can be used to resolve genotypes into their constituent haplotypes with greater accuracy than previously known methods.
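To make "resolving genotypes into their constituent haplotypes" concrete, here is a minimal sketch of the search space such a method has to choose from. It is not the model from the paper. The genotype encoding ('0' and '1' for homozygous sites, '2' for a heterozygous site) and the function name `compatible_pairs` are assumptions made for illustration.

```python
from itertools import product

def compatible_pairs(genotype):
    """List every unordered haplotype pair that could produce `genotype`.

    Assumed encoding: '0' or '1' = homozygous for that allele,
    '2' = heterozygous (the two haplotypes differ at this site).
    """
    het_sites = [i for i, g in enumerate(genotype) if g == "2"]
    pairs = set()
    for bits in product("01", repeat=len(het_sites)):
        h1, h2 = list(genotype), list(genotype)
        for site, allele in zip(het_sites, bits):
            h1[site] = allele
            h2[site] = "1" if allele == "0" else "0"
        # Sort so that (a, b) and (b, a) count as the same resolution.
        pairs.add(tuple(sorted(("".join(h1), "".join(h2)))))
    return sorted(pairs)
```

A genotype with k heterozygous sites (k ≥ 1) has 2^(k-1) compatible pairs, which is why a statistical model is needed to pick among them.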
Logical hidden Markov models
 Journal of Artificial Intelligence Research
, 2006
Cited by 42 (10 self)
Logical hidden Markov models (LOHMMs) upgrade traditional hidden Markov models to deal with sequences of structured symbols in the form of logical atoms, rather than flat characters. This note formally introduces LOHMMs and presents solutions to the three central inference problems for LOHMMs: evaluation, most likely hidden state sequence and parameter estimation. The resulting representation and algorithms are experimentally evaluated on problems from the domain of bioinformatics.
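For orientation, the first two of these inference problems can be sketched for an ordinary HMM over flat symbols, which is the setting LOHMMs generalize. This is the standard forward and Viterbi recursion, not the LOHMM algorithms themselves. The two-state toy model and all names below are assumptions for illustration.

```python
def forward(obs, states, start_p, trans_p, emit_p):
    """Evaluation: probability of `obs` summed over all hidden state paths."""
    alpha = {s: start_p[s] * emit_p[s][obs[0]] for s in states}
    for o in obs[1:]:
        alpha = {s: emit_p[s][o] * sum(alpha[r] * trans_p[r][s] for r in states)
                 for s in states}
    return sum(alpha.values())

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Decoding: (probability, path) of the most likely hidden state sequence."""
    best = {s: (start_p[s] * emit_p[s][obs[0]], [s]) for s in states}
    for o in obs[1:]:
        step = {}
        for s in states:
            # Best predecessor r for state s at this position.
            prob, path = max(((best[r][0] * trans_p[r][s], best[r][1]) for r in states),
                             key=lambda t: t[0])
            step[s] = (prob * emit_p[s][o], path + [s])
        best = step
    return max(best.values(), key=lambda t: t[0])

# Hypothetical toy model: two sticky states, each preferring one symbol.
states = ("A", "B")
start_p = {"A": 0.5, "B": 0.5}
trans_p = {"A": {"A": 0.9, "B": 0.1}, "B": {"A": 0.1, "B": 0.9}}
emit_p = {"A": {"x": 0.8, "y": 0.2}, "B": {"x": 0.2, "y": 0.8}}
```

Parameter estimation, the third problem, is usually handled with an EM procedure built on the same forward quantities and is omitted here.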
A Block-Free Hidden Markov Model for Genotypes and Its Application to Disease Association
 J. of Computational Biology
, 2005
Cited by 20 (1 self)
We present a new stochastic model for genotype generation. The model offers a compromise between rigid block structure and no structure altogether: It reflects a general blocky structure of haplotypes, but also allows for “exchange” of haplotypes at non-boundary SNP sites; it also accommodates rare haplotypes and mutations. We use a hidden Markov model and infer its parameters by an expectation-maximization algorithm. The algorithm was implemented in a software package called HINT (haplotype inference tool) and tested on 58 datasets of genotypes. To evaluate the utility of the model in association studies, we used biological human data to create a simple disease association search scenario. When comparing HINT to three other models, HINT predicted association most accurately.
Segmentation algorithms for time series and sequence data
 In A Tutorial in the SIAM International Conference on Data Mining
, 2005
Hidden Markov Modelling Techniques for Haplotype Analysis
 In
, 2004
Cited by 3 (0 self)
A hidden Markov model is introduced for descriptive modelling of the mosaic-like structures of haplotypes that arise from iterated recombinations within a population. Methods using the minimum description length principle are given for fitting such models to training data. Possible applications of the models are delineated, and some preliminary analysis results on real sets of haplotypes are reported, demonstrating the potential of our methods.
Aggregating Time Partitions
Cited by 2 (0 self)
Partitions of sequential data exist either per se or as a result of sequence segmentation algorithms. It is often the case that the same timeline is partitioned in many different ways. For example, different segmentation algorithms produce different partitions of the same underlying data points. In such cases, we are interested in producing an aggregate partition, i.e., a segmentation that agrees as much as possible with the input segmentations. Each partition is defined as a set of continuous non-overlapping segments of the timeline. We show that this problem can be solved optimally in polynomial time using dynamic programming. We also propose faster greedy heuristics that work well in practice. We experiment with our algorithms and we demonstrate their utility in clustering the behavior of mobile-phone users and combining the results of different segmentation algorithms on genomic sequences.
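The dynamic programming idea can be sketched as follows, under assumptions the abstract does not state: the timeline is taken to be n discrete points 0..n-1, a partition is given as a list of cut positions (a cut at c separates point c-1 from point c), and "agreement" is measured by counting pairs of points that one partition joins and the other separates. The function names are hypothetical, and the unoptimized loops favor clarity over speed.

```python
def labels(cuts, n):
    """Segment index of each point 0..n-1 for a partition given by its cuts."""
    cutset, seg, out = set(cuts), 0, []
    for p in range(n):
        if p in cutset:
            seg += 1
        out.append(seg)
    return out

def disagreement(cuts_a, cuts_b, n):
    """Number of point pairs joined by one partition and separated by the other."""
    la, lb = labels(cuts_a, n), labels(cuts_b, n)
    return sum((la[x] == la[y]) != (lb[x] == lb[y])
               for x in range(n) for y in range(x + 1, n))

def aggregate(inputs, n):
    """Cuts of a partition minimizing total disagreement with all inputs."""
    m = len(inputs)
    labs = [labels(c, n) for c in inputs]

    def pair_cost(x, y):
        # Extra cost of joining x and y compared with separating them.
        together = sum(l[x] == l[y] for l in labs)
        return m - 2 * together

    best = [0] + [float("inf")] * n   # best[b]: optimum for points 0..b-1
    back = [0] * (n + 1)              # start of the last segment in that optimum
    for b in range(1, n + 1):
        for a in range(b):
            w = sum(pair_cost(x, y) for x in range(a, b) for y in range(x + 1, b))
            if best[a] + w < best[b]:
                best[b], back[b] = best[a] + w, a
    cuts, j = [], n
    while j > 0:
        j = back[j]
        if j > 0:
            cuts.append(j)
    return sorted(cuts)
```

The decomposition works because the total disagreement equals a constant plus the sum of `pair_cost` over pairs that the aggregate places in the same segment, so each candidate segment can be scored independently.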
Dynamic programming algorithms for haplotype blocks partitioning with tagSNPs minimization
 In Proceedings of the 24th Workshop on Combinatorial Mathematics and Computation Theory
Cited by 1 (1 self)
Recent studies show that the patterns of linkage disequilibrium (LD) observed in human chromosomes reveal a block-like structure; the high-LD regions are called haplotype blocks. The existence of haplotype block structures has serious implications for association-based methods in the mapping of disease genes. A single nucleotide polymorphism (SNP) is a DNA sequence variation that occurs when a single nucleotide in the genome differs between members of a species. In this paper, we propose several efficient algorithms for identifying haplotype blocks in the genome. In particular, we develop a dynamic programming algorithm for haplotype block partitioning that minimizes the number of tagSNPs required to account for most of the common haplotypes in each block. We implement these algorithms and analyze the chromosome 21 haplotype data given by Patil et al. [14]. As a result, we identify a total of 2,266 blocks (3,260 tagSNPs), which is 45.2% (28.6%) smaller than those identified by Patil et al. or Zhang et al. [18].
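The general shape of such a dynamic program can be sketched as below. This is a toy version, not the authors' algorithm. It assumes that a run of SNPs is an acceptable block when it shows at most `max_patterns` distinct haplotype patterns (a stand-in for the paper's common-haplotype criterion), and it finds the tag SNPs of a block by brute force. Haplotypes are strings over '0' and '1', and all names are hypothetical.

```python
from itertools import combinations

def min_tags(haps, i, j):
    """Fewest SNP columns in [i, j) that distinguish all distinct patterns there."""
    patterns = {h[i:j] for h in haps}
    for k in range(j - i + 1):
        for subset in combinations(range(j - i), k):
            if len({tuple(p[c] for c in subset) for p in patterns}) == len(patterns):
                return k
    return j - i  # not reached: using every column always distinguishes

def partition(haps, max_patterns=2):
    """Split SNPs 0..n-1 into blocks so that the total tag SNP count is minimal."""
    n = len(haps[0])
    best = [0] + [float("inf")] * n   # best[j]: fewest tags for SNPs 0..j-1
    cut = [0] * (n + 1)               # start of the last block in that optimum
    for j in range(1, n + 1):
        for i in range(j):
            if len({h[i:j] for h in haps}) > max_patterns:
                continue  # [i, j) is too diverse to count as one block
            cost = best[i] + min_tags(haps, i, j)
            if cost < best[j]:
                best[j], cut[j] = cost, i
    blocks, j = [], n
    while j > 0:
        blocks.append((cut[j], j))
        j = cut[j]
    return best[n], blocks[::-1]
```

With biallelic data and `max_patterns` of at least 2, every single SNP is a valid block, so a partition always exists. The brute-force `min_tags` is exponential in block length and would need replacing for real data.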
The Block Partitioning Results Using Different � and � a
, 2004
service This article cites 49 articles, 10 of which can be accessed free at:
A Linear Space Algorithm for Haplotype Blocks Partitioning Using Limited Number of Tag SNPs
The pattern of linkage disequilibrium (LD) plays a central role in genome-wide association studies that aim to identify the genetic variation responsible for common human diseases. A single nucleotide polymorphism (SNP) is a DNA sequence variation that occurs when a single nucleotide in the genome differs between members of a species. Recent studies show that the patterns of linkage disequilibrium observed in human chromosomes reveal a block-like structure; the high-LD regions are called haplotype blocks, and a small subset of SNPs, called tag SNPs, is sufficient to capture the haplotype patterns in each haplotype block. Both Patil et al. [18] and Zhang et al. [24] have proposed algorithms that fully partition a haplotype sample into blocks while requiring a minimal number of tag SNPs. However, when resources are limited, investigators and biologists may not be able to genotype all the tag SNPs and instead must restrict the number of tag SNPs used in their studies. In this paper, we examine several haplotype block diversity evaluation functions and propose dynamic programming algorithms for haplotype block partitioning using a limited number of tag SNPs. We implement these algorithms and analyze the chromosome 21 haplotype data given by Patil et al. [18]. When the sample is fully partitioned into blocks, we identify a total of 2,266 blocks and 3,260 tag SNPs, which is fewer than those identified by Zhang et al. [24]. We demonstrate that Zhang’s algorithm does not find the optimal solution because it ignores the non-monotonic property of the common haplotype evaluation function. The algorithms described have been implemented in the web-based system as the analysis tools for bioinformaticists and
∗ This work is supported by grants from the Taichung