Logical hidden markov models
- Journal of Artificial Intelligence Research
, 2006
Cited by 33 (10 self)
Logical hidden Markov models (LOHMMs) upgrade traditional hidden Markov models to deal with sequences of structured symbols in the form of logical atoms, rather than flat characters. This note formally introduces LOHMMs and presents solutions to the three central inference problems for LOHMMs: evaluation, most likely hidden state sequence and parameter estimation. The resulting representation and algorithms are experimentally evaluated on problems from the domain of bioinformatics.
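For orientation, the first two inference problems named in this abstract (evaluation and most likely hidden state sequence) can be sketched for an ordinary flat HMM, which is the baseline that LOHMMs upgrade. This is not the LOHMM algorithm from the paper: the two-state toy model, its probabilities, and all function names below are assumptions made purely for illustration.

```python
from itertools import product

# Hypothetical two-state toy model; every number here is illustrative only.
STATES = ("A", "B")
START = {"A": 0.5, "B": 0.5}
TRANS = {"A": {"A": 0.9, "B": 0.1}, "B": {"A": 0.2, "B": 0.8}}
EMIT = {"A": {"x": 0.8, "y": 0.2}, "B": {"x": 0.3, "y": 0.7}}


def forward(obs):
    """Evaluation: probability of the observation sequence (forward algorithm)."""
    alpha = {s: START[s] * EMIT[s][obs[0]] for s in STATES}
    for o in obs[1:]:
        alpha = {
            s: EMIT[s][o] * sum(alpha[r] * TRANS[r][s] for r in STATES)
            for s in STATES
        }
    return sum(alpha.values())


def viterbi(obs):
    """Decoding: (joint probability, path) of the most likely hidden state sequence."""
    best = {s: (START[s] * EMIT[s][obs[0]], [s]) for s in STATES}
    for o in obs[1:]:
        new_best = {}
        for s in STATES:
            # Pick the predecessor state that maximises the path probability into s.
            prob, path = max(
                ((best[r][0] * TRANS[r][s], best[r][1]) for r in STATES),
                key=lambda t: t[0],
            )
            new_best[s] = (prob * EMIT[s][o], path + [s])
        best = new_best
    return max(best.values(), key=lambda t: t[0])


def total_probability(length):
    """Sanity check: forward probabilities over all sequences of a length sum to 1."""
    return sum(forward(seq) for seq in product("xy", repeat=length))
```

The Viterbi probability of a sequence never exceeds its forward probability, since the former keeps only the single best path while the latter sums over all paths. Parameter estimation, the third problem, is typically handled by an EM procedure built on the same forward quantities and is omitted here.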
A Block-Free Hidden Markov Model for Genotypes and Its Application to Disease Association
- J. of Computational Biology
, 2005
Cited by 17 (1 self)
We present a new stochastic model for genotype generation. The model offers a compromise between rigid block structure and no structure altogether: It reflects a general blocky structure of haplotypes, but also allows for “exchange” of haplotypes at nonboundary SNP sites; it also accommodates rare haplotypes and mutations. We use a hidden Markov model and infer its parameters by an expectation-maximization algorithm. The algorithm was implemented in a software package called HINT (haplotype inference tool) and tested on 58 datasets of genotypes. To evaluate the utility of the model in association studies, we used biological human data to create a simple disease association search scenario. When comparing HINT to three other models, HINT predicted association most accurately.
Segmentation algorithms for time series and sequence data
- In A Tutorial in the SIAM International Conference on Data Mining
, 2005
Hidden Markov Modelling Techniques for Haplotype Analysis
- In
, 2004
Cited by 3 (0 self)
A hidden Markov model is introduced for descriptive modelling of the mosaic-like structures of haplotypes that arise from iterated recombinations within a population. Methods using the minimum description length principle are given for fitting such models to training data. Possible applications of the models are delineated, and some preliminary analysis results on real sets of haplotypes are reported, demonstrating the potential of our methods.
Aggregating Time Partitions
Cited by 2 (0 self)
Partitions of sequential data exist either per se or as a result of sequence segmentation algorithms. It is often the case that the same timeline is partitioned in many different ways. For example, different segmentation algorithms produce different partitions of the same underlying data points. In such cases, we are interested in producing an aggregate partition, i.e., a segmentation that agrees as much as possible with the input segmentations. Each partition is defined as a set of continuous non-overlapping segments of the timeline. We show that this problem can be solved optimally in polynomial time using dynamic programming. We also propose faster greedy heuristics that work well in practice. We experiment with our algorithms and we demonstrate their utility in clustering the behavior of mobile-phone users and combining the results of different segmentation algorithms on genomic sequences.
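The dynamic-programming idea in this abstract can be sketched as follows. The sketch assumes one particular notion of agreement, the disagreement distance (the number of point pairs that one partition places in the same segment and the other separates); the abstract does not name its distance function, so this choice, the cut-list representation, and all function names are assumptions. Under this distance, the total cost of an aggregate partition decomposes into a constant plus a sum of per-segment costs, which is what makes a segment-by-segment recurrence possible.

```python
from bisect import bisect_right


def labels(cuts, n):
    """Segment id of each point 0..n-1; `cuts` are sorted indices where a new segment starts."""
    return [bisect_right(cuts, x) for x in range(n)]


def disagreement(p, q, n):
    """Number of point pairs kept together by one partition and separated by the other."""
    lp, lq = labels(p, n), labels(q, n)
    return sum(
        (lp[x] == lp[y]) != (lq[x] == lq[y])
        for x in range(n)
        for y in range(x + 1, n)
    )


def aggregate(partitions, n):
    """Cut list minimising the summed disagreement distance to all input partitions."""
    labs = [labels(p, n) for p in partitions]

    def weight(x, y):
        # (#inputs separating x and y) - (#inputs joining them): the net cost
        # of putting x and y into the same aggregate segment.
        return sum(1 if l[x] != l[y] else -1 for l in labs)

    # seg[i][j] = cost of making points i..j-1 a single segment.
    seg = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(i + 1, n + 1):
            seg[i][j] = seg[i][j - 1] + sum(weight(x, j - 1) for x in range(i, j - 1))

    best = [0] + [float("inf")] * n  # best[j] = optimal cost for the first j points
    back = [0] * (n + 1)             # back[j] = start index of the last segment
    for j in range(1, n + 1):
        for i in range(j):
            cost = best[i] + seg[i][j]
            if cost < best[j]:
                best[j], back[j] = cost, i

    cuts, j = [], n
    while j > 0:
        j = back[j]
        if j > 0:
            cuts.append(j)
    return sorted(cuts)
```

For example, with three input partitions of eight points cut at [3], [3], and [5], the aggregate is the single cut [3], at a total distance of 12. The naive cost table here is only suitable for short timelines.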
Dynamic programming algorithms for haplotype blocks partitioning with tagSNPs minimization
- In Proceedings of the 24rd Workshop on Combinatorial Mathematics and Computation Theory
Cited by 1 (1 self)
Recent studies show that the patterns of linkage disequilibrium (LD) observed in human chromosomes reveal a block-like structure; the high LD regions are called haplotype blocks. The existence of haplotype block structures has serious implications for association-based methods in mapping of disease genes. A Single Nucleotide Polymorphism or SNP is a DNA sequence variation occurring when a single nucleotide in the genome differs between members of a species. In this paper, we propose several efficient algorithms for identifying haplotype blocks in the genome. In particular, we develop a dynamic programming algorithm for haplotype block partitioning to minimize the number of tagSNPs required to account for most of the common haplotypes in each block. We implement these algorithms and analyze the chromosome 21 haplotype data given by Patil et al. [14]. As a result, we identify a total of 2,266 blocks (3,260 tagSNPs), which is 45.2% (28.6%) fewer than those identified by Patil et al. or Zhang et al. [18].
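A toy sketch of this kind of dynamic program is given below. It is not the authors' implementation: it assumes that a haplotype is "common" when it appears more than once in the sample, that a block is valid when its common haplotypes cover at least a fraction alpha = 0.8 of the sample, and that ties in tag SNP count are broken by fewer blocks. These thresholds and all function names are assumptions, and the tag SNP count is found by brute force, which is only feasible for very small blocks.

```python
from collections import Counter
from itertools import combinations


def common_haplotypes(rows, i, j, alpha=0.8):
    """Common haplotypes of SNP columns i..j-1, or None if the block is invalid."""
    counts = Counter(r[i:j] for r in rows)
    common = [h for h, c in counts.items() if c > 1]  # assumption: seen more than once
    covered = sum(counts[h] for h in common)
    if not common or covered < alpha * len(rows):
        return None
    return common


def min_tag_snps(common):
    """Fewest columns that distinguish all common haplotypes (brute force)."""
    width = len(common[0])
    for k in range(width + 1):
        for cols in combinations(range(width), k):
            if len({tuple(h[c] for c in cols) for h in common}) == len(common):
                return k
    return width


def partition(rows, alpha=0.8):
    """Return (total tag SNPs, blocks as half-open column ranges), or None if infeasible."""
    n = len(rows[0])
    best = [(0, 0)] + [None] * n  # best[j] = (tag SNPs, block count) for the first j SNPs
    back = [0] * (n + 1)          # back[j] = start column of the last block
    for j in range(1, n + 1):
        for i in range(j):
            if best[i] is None:
                continue
            common = common_haplotypes(rows, i, j, alpha)
            if common is None:
                continue
            cand = (best[i][0] + min_tag_snps(common), best[i][1] + 1)
            if best[j] is None or cand < best[j]:
                best[j], back[j] = cand, i
    if best[n] is None:
        return None
    blocks, j = [], n
    while j > 0:
        blocks.append((back[j], j))
        j = back[j]
    return best[n][0], blocks[::-1]
```

With the four haplotypes "00", "01", "10", "11" each seen once, the two-SNP block has no common haplotype and is rejected, so the sketch falls back to two single-SNP blocks with one tag SNP each.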
The Block Partitioning Results Using Different � and � a
, 2004
A Linear Space Algorithm for Haplotype Blocks Partitioning Using Limited Number of Tag SNPs
The pattern of linkage disequilibrium (LD) plays a central role in genome-wide association studies that aim to identify the genetic variation responsible for common human diseases. A Single Nucleotide Polymorphism or SNP is a DNA sequence variation occurring when a single nucleotide in the genome differs between members of a species. Recent studies show that the patterns of linkage disequilibrium observed in human chromosomes reveal a block-like structure; the high LD regions are called haplotype blocks, and furthermore, a small subset of SNPs, called tag SNPs, is sufficient to capture the haplotype patterns in each haplotype block. Both Patil et al. [18] and Zhang et al. [24] have proposed algorithms that fully partition a haplotype sample into blocks while requiring a minimal number of tag SNPs. However, when resources are limited, investigators and biologists may not be able to genotype all the tag SNPs and instead must restrict the number of tag SNPs used in their studies. In this paper, we examine several haplotype block diversity evaluation functions and propose dynamic programming algorithms for haplotype block partitioning using a limited number of tag SNPs. We implement these algorithms and analyze the chromosome 21 haplotype data given by Patil et al. [18]. When the sample is fully partitioned into blocks, we identify a total of 2,266 blocks and 3,260 tag SNPs, fewer than those identified by Zhang et al. [24]. We demonstrate that Zhang’s algorithm does not find the optimal solution because it ignores the non-monotonic property of the common haplotype evaluation function. The algorithms described have been implemented in a web-based system as analysis tools for bioinformaticists and ...
Efficient Algorithms for SNP Haplotype Block Selection Problems
Global patterns of human DNA sequence variation (haplotypes) defined by common single nucleotide polymorphisms (SNPs) have important implications for identifying disease associations and human traits. Recent genetics research reveals that SNPs within certain haplotype blocks induce only a few distinct common haplotypes in the majority of the population. The existence of haplotype block structure has serious implications for association-based methods for the mapping of disease genes. Our ultimate goal is to select haplotype block designations that best capture the structure within the data. In this paper we propose several efficient combinatorial algorithms for selecting interesting haplotype blocks under different diversity functions, generalizing many previous results in the literature. In particular, given an m × n haplotype matrix A, we show linear time algorithms for finding all interval diversities, farthest sites, and the longest block within A. For selecting multiple long blocks under a diversity constraint, we show that the k blocks with the longest total length can be found in O(nk) time. We also propose linear time algorithms for calculating all intra-longest-blocks and all intra-k-longest-blocks.

