Hidden Markov models in computational biology: applications to protein modeling
- JOURNAL OF MOLECULAR BIOLOGY
, 1994
Cited by 436 (29 self)
Hidden Markov Models (HMMs) are applied to the problems of statistical modeling, database searching and multiple sequence alignment of protein families and protein domains. These methods are demonstrated on the globin family, the protein kinase catalytic domain, and the EF-hand calcium binding motif. In each case the parameters of an HMM are estimated from a training set of unaligned sequences. After the HMM is built, it is used to obtain a multiple alignment of all the training sequences. It is also used to search the SWISS-PROT 22 database for other sequences that are members of the given protein family, or contain the given domain. The HMM produces multiple alignments of good quality that agree closely with the alignments produced by programs that incorporate three-dimensional structural information. When employed in discrimination tests (by examining how closely the sequences in a database fit the globin, kinase and EF-hand HMMs), the HMM is able to distinguish members of these families from non-members with a high degree of accuracy. Both the HMM and PROFILESEARCH (a technique used to search for relationships between a protein sequence and multiply aligned sequences) perform better in these tests than PROSITE (a dictionary of sites and patterns in proteins). The HMM appears to have a slight advantage over PROFILESEARCH in terms of lower rates of false
Dirichlet Mixtures: A Method for Improving Detection of Weak but Significant Protein Sequence Homology
, 1996
Cited by 105 (20 self)
This paper presents the mathematical foundations of Dirichlet mixtures, which have been used to improve database search results for homologous sequences, when a variable number of sequences from a protein family or domain are known. We present a method for condensing the information in a protein database into a mixture of Dirichlet densities. These mixtures are designed to be combined with observed amino acid frequencies, to form estimates of expected amino acid probabilities at each position in a profile, hidden Markov model, or other statistical model. These estimates give a statistical model greater generalization capacity, such that remotely related family members can be more reliably recognized by the model. Dirichlet mixtures have been shown to outperform substitution matrices and other methods for computing these expected amino acid distributions in database search, resulting in fewer false positives and false negatives for the families tested. This paper corrects a previously p...
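As an illustration of how a Dirichlet mixture can be combined with observed counts, the following is a minimal sketch of the standard posterior-mean estimate under a mixture-of-Dirichlets prior. It is not taken from the paper: the function names, the three-letter toy alphabet, and all numeric parameters are assumptions made up for the example.

```python
import math

def log_marginal(counts, alpha):
    """Log-probability of the counts under one Dirichlet component.

    The multinomial coefficient is omitted because it is identical for
    every component and cancels when the posteriors are normalized.
    """
    n, a = sum(counts), sum(alpha)
    out = math.lgamma(a) - math.lgamma(n + a)
    for c, al in zip(counts, alpha):
        out += math.lgamma(c + al) - math.lgamma(al)
    return out

def posterior_mean(counts, weights, alphas):
    """Estimate letter probabilities from counts and a Dirichlet mixture."""
    logs = [math.log(q) + log_marginal(counts, al)
            for q, al in zip(weights, alphas)]
    top = max(logs)  # subtract the maximum for numerical stability
    post = [math.exp(v - top) for v in logs]
    total = sum(post)
    post = [p / total for p in post]  # posterior weight of each component

    n = sum(counts)
    estimate = [0.0] * len(counts)
    for w, alpha in zip(post, alphas):
        a = sum(alpha)
        for i, (c, al) in enumerate(zip(counts, alpha)):
            estimate[i] += w * (c + al) / (n + a)
    return estimate

# Hypothetical two-component prior over a three-letter alphabet.
weights = [0.5, 0.5]
alphas = [[2.0, 0.5, 0.5], [0.5, 0.5, 2.0]]
print(posterior_mean([3, 0, 0], weights, alphas))
```

With no observations the estimate reduces to the weighted mean of the component means, and as counts accumulate it moves toward the observed frequencies.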
Using Dirichlet Mixture Priors to Derive Hidden Markov Models for Protein Families
- PROCEEDINGS OF THE THIRD INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS FOR MOLECULAR BIOLOGY
, 1993
Cited by 56 (6 self)
A Bayesian method for estimating the amino acid distributions in the states of a hidden Markov model (HMM) for a protein family or the columns of a multiple alignment of that family is introduced. This method uses Dirichlet mixture densities as priors over amino acid distributions. These mixture densities are determined from examination of previously constructed HMMs or multiple alignments. It is shown that this Bayesian method can improve the quality of HMMs produced from small training sets. Specific experiments on the EF-hand motif are reported, for which these priors are shown to produce HMMs with higher likelihood on unseen data, and fewer false positives and false negatives in a database search task.
Scoring Hidden Markov Models
Cited by 31 (5 self)
Motivation: Statistical sequence comparison techniques, such as hidden Markov models and generalized profiles, calculate the probability that a sequence was generated by a given model. Log-odds scoring is a means of evaluating this probability by comparing it to a null hypothesis, usually a simpler statistical model intended to represent the universe of sequences as a whole, rather than the group of interest. Such scoring leads to two immediate questions: what should the null model be, and what threshold of log-odds score should be deemed a match to the model. Results: This paper experimentally analyses these two issues. Within the context of the Sequence Alignment and Modeling software suite (SAM), we consider a variety of null models and suitable thresholds. Additionally, we consider HMMer's log-odds scoring and SAM's original Z-scoring method. Among the null model choices, a simple looping null model that emits characters according to the geometric mean of the character probabilities in the columns modeled by the HMM performs well or best across all four discrimination experiments.
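The idea of a geometric-mean null model can be sketched in a few lines. The example below is a deliberate simplification and not SAM's or HMMer's implementation: it assumes an ungapped model with one emission column per sequence position, ignores transition probabilities, and uses a made-up two-letter alphabet. The function names and all numbers are hypothetical.

```python
import math

def geometric_mean_null(columns):
    """Build a null distribution from the per-column emission probabilities.

    Each character's null probability is the geometric mean of its
    probability across the columns, renormalized to sum to one.
    All probabilities are assumed to be strictly positive.
    """
    alphabet = list(columns[0])
    raw = {
        ch: math.exp(sum(math.log(col[ch]) for col in columns) / len(columns))
        for ch in alphabet
    }
    total = sum(raw.values())
    return {ch: value / total for ch, value in raw.items()}

def log_odds(seq, columns, null):
    """Log-odds score in bits of seq under the columns versus the null.

    Assumes len(seq) == len(columns), i.e. no insertions or deletions.
    """
    return sum(math.log2(col[ch] / null[ch]) for ch, col in zip(seq, columns))

# Hypothetical two-column model over the alphabet {A, B}.
columns = [{"A": 0.9, "B": 0.1}, {"A": 0.2, "B": 0.8}]
null = geometric_mean_null(columns)
print(null)
print(log_odds("AB", columns, null), log_odds("BA", columns, null))
```

A sequence that follows the columns' preferred characters scores above zero and one that goes against them scores below zero, which is what makes a score threshold meaningful.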
Calsensin: A Novel Calcium-binding Protein Expressed in a Subset of Peripheral Leech Neurons Fasciculating in a Single Axon Tract
Cited by 1 (1 self)
The mAb lan3-6 recognizes a cytosolic antigen which is selectively expressed in the growth cones and axons of a small subset of peripheral sensory neurons fasciculating in a single tract common to all hirudinid leeches. We have used this antibody to clone a novel EF-hand calcium-binding protein, calsensin, by screening an expression vector library. A full-length clone of 1.1 kb identified by the antibody was isolated and sequenced. In situ hybridizations with calsensin probes and antibody staining using new polyclonal antisera generated against calsensin sequence demonstrate that calsensin indeed corresponds to the lan3-6 antigen. Calsensin consists of 83 residues with a calculated molecular mass of 9.1 kD that contains two helix-loop-helix domains. The calcium-binding domains are likely to be
.2 Kinase experiments
this paper will be made available in electronic form, and can be obtained by anonymous ftp from ftp.cse.ucsc.edu. Our HMM building program and other tools (written in C) will also be made available from the same ftp site.

