Hidden Markov models for detecting remote protein homologies
- Bioinformatics
, 1998
Cited by 229 (12 self)
A new hidden Markov model method (SAM-T98) for finding remote homologs of protein sequences is described and evaluated. The method begins with a single target sequence and iteratively builds a hidden Markov model (HMM) from the sequence and homologs found using the HMM for database search. SAM-T98 is also used to construct model libraries automatically from sequences in structural databases. We evaluate the SAM-T98 method with four datasets. Three of the test sets are fold-recognition tests, where the correct answers are determined by structural similarity. The fourth uses a curated database. The method is compared against wu-blastp and against double-blast, a two-step method similar to ISS, but using blast instead of fasta. Results: SAM-T98 had the fewest errors in all tests, dramatically so for the fold-recognition tests. At the minimum-error point on the SCOP-domains test, SAM-T98 got 880 true positives and 68 false positives, double-blast got 533 true positives with 71 false positives, and wu-blastp got 353 true positives with 24 false positives. The method is optimized to recognize superfamilies, and would require parameter adjustment to be used to find family or fold relationships. One key to the performance of the HMM method is a new score-normalization technique that compares the score to the score with a reversed model rather than to a uniform null model. Availability: A World Wide Web server, as well as information on obtaining the Sequence Alignment and ... (Preprint, to appear in Bioinformatics, 1999.)
A Discriminative Framework for Detecting Remote Protein Homologies
, 1999
Cited by 163 (4 self)
A new method for detecting remote protein homologies is introduced and shown to perform well in classifying protein domains by SCOP superfamily. The method is a variant of support vector machines using a new kernel function. The kernel function is derived from a generative statistical model for a protein family, in this case a hidden Markov model. This general approach of combining generative models like HMMs with discriminative methods such as support vector machines may have applications in other areas of biosequence analysis as well.
Integrating Database Homology in a Probabilistic Gene Structure Model
- Proceedings of the Pacific Symposium on Biocomputing
, 1997
Cited by 17 (5 self)
We present an improved stochastic model of genes in DNA, and describe a method for integrating database homology into the probabilistic framework. A generalized hidden Markov model (GHMM) describes the grammar of a legal parse of a DNA sequence. Probabilities are estimated for gene features by using dynamic programming to combine information from multiple sensors. We show how matches to homologous sequences from a database can be integrated into the probability estimation by interpreting the likelihood of a sequence in terms of the bit-cost to encode a sequence given a homology match. We also demonstrate how homology matches in protein databases can be exploited to help identify splice sites. Our experiments show significant improvements in the sensitivity and specificity of gene structure identification when these new features are added to our gene-finding system, Genie. Experimental results in tests using a standard set of annotated genes showed that Genie identified 95% of coding nucleotides correctly with a specificity of 91%, and 77% of exons were identified exactly.
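The nucleotide-level figures quoted in this abstract can be illustrated with a minimal sketch. It assumes the convention common in gene-finding evaluations, where sensitivity is the fraction of truly coding nucleotides that are predicted as coding and specificity is the fraction of predicted coding nucleotides that are truly coding; the abstract itself does not spell out its definitions. The function name and the boolean-mask input format are hypothetical and are not part of Genie.

```python
def nucleotide_accuracy(true_coding, predicted_coding):
    """Return (sensitivity, specificity) over per-nucleotide coding labels.

    Both arguments are equal-length sequences of truthy/falsy values,
    one per nucleotide (an assumed input format, not Genie's).
    """
    if len(true_coding) != len(predicted_coding):
        raise ValueError("label sequences must have the same length")

    pairs = list(zip(true_coding, predicted_coding))
    tp = sum(1 for t, p in pairs if t and p)        # coding, predicted coding
    fn = sum(1 for t, p in pairs if t and not p)    # coding, missed
    fp = sum(1 for t, p in pairs if p and not t)    # non-coding, predicted coding

    # Assumed convention: specificity = TP / (TP + FP).
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tp / (tp + fp) if (tp + fp) else 0.0
    return sensitivity, specificity


if __name__ == "__main__":
    truth = [1, 1, 1, 1, 0, 0, 0, 0]
    pred = [1, 1, 1, 0, 1, 0, 0, 0]
    print(nucleotide_accuracy(truth, pred))
```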
Pex13p is an SH3 protein of the peroxisome membrane and a docking factor for the predominantly cytoplasmic PTS1 receptor
- J. Cell
, 1996
Cited by 11 (6 self)
Abstract. Import of newly synthesized PTS1 proteins into the peroxisome requires the PTS1 receptor (Pex5p), a predominantly cytoplasmic protein that cycles between the cytoplasm and peroxisome. We have identified Pex13p, a novel integral peroxisomal membrane protein from both yeast and humans that binds the PTS1 receptor via a cytoplasmically oriented SH3 domain. Although only a small amount of Pex5p is bound to peroxisomes at steady state (<5%), loss of Pex13p further reduces the amount of peroxisome-associated Pex5p by ~40-fold. Furthermore, loss of Pex13p eliminates import of peroxisomal matrix proteins that contain either the type-1 or type-2 peroxisomal targeting signal but does not affect targeting and insertion of integral peroxisomal membrane proteins. We conclude
A new computational method for detection of chimeric 16S rRNA artifacts generated by PCR amplification from mixed bacterial populations
- Appl. Environ. Microbiol. 63(6)
, 1997
Cited by 5 (0 self)
A new computational method (chimeric alignment) has been developed to detect chimeric 16S rRNA artifacts generated during PCR amplification from mixed bacterial populations. In contrast to other nearest-neighbor methods (e.g., CHECK_CHIMERA) that define sequence similarity by k-tuple matching, the chimeric alignment method uses the score from dynamic programming alignments. Further, the chimeric alignments are displayed to the user to assist in sequence classification. The distribution of improvement scores for 500 authentic, nonchimeric sequences and 300 artificial chimeras (constructed from authentic sequences) was used to study the sensitivity and accuracy of both chimeric alignment and CHECK_CHIMERA. At a constant rate of authentic sequence misclassification (5%), chimeric alignment incorrectly classified 13% of the artificial chimeras versus 14% for CHECK_CHIMERA. Interestingly, only 1% of nonchimeras and 10% of chimeras were misclassified by both programs, suggesting that optimum performance is obtained by using the two methods to assign sequences to three classes: high-probability nonchimeras, high-probability chimeras, and sequences that need further study by other means. This study suggests that k-tuple-based matching methods are more sensitive than alignment-based methods when there is significant parental sequence similarity, while the opposite becomes true as the sequences become more distantly related. The software and a World Wide Web-based server are available at
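To make the contrast with alignment scores concrete, the following is a minimal sketch of similarity by k-tuple matching in general. It is not CHECK_CHIMERA's actual scoring scheme: the function names, the default k = 7, and the normalization by the smaller tuple set are all illustrative assumptions.

```python
def ktuples(seq, k):
    """Return the set of all overlapping length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def ktuple_similarity(a, b, k=7):
    """Fraction of shared k-tuples, relative to the smaller tuple set.

    Hypothetical scoring for illustration only; k and the normalization
    are assumptions, not taken from any published tool.
    """
    ta, tb = ktuples(a, k), ktuples(b, k)
    if not ta or not tb:
        return 0.0  # at least one sequence is shorter than k
    return len(ta & tb) / min(len(ta), len(tb))


if __name__ == "__main__":
    print(ktuple_similarity("ACGTACGT", "ACGTTTTT", k=4))
```

Unlike a dynamic programming alignment, this score ignores the order and spacing of the matching tuples, which is one reason the two approaches can disagree on the same pair of sequences.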
Fission yeast bub1 is a mitotic centromere protein essential for the spindle checkpoint and the preservation of correct ploidy through mitosis
- J. Cell
, 1998
Cited by 4 (2 self)
Abstract. The spindle checkpoint ensures proper chromosome segregation by delaying anaphase until all chromosomes are correctly attached to the mitotic spindle. We investigated the role of the fission yeast bub1 gene in spindle checkpoint function and in unperturbed mitoses. We find that bub1+ is essential for the fission yeast spindle checkpoint response to spindle damage and to defects in centromere function. Activation of the checkpoint results in the recruitment of Bub1 to centromeres and a delay in the completion of mitosis. We show that Bub1 also has a crucial role in normal, unperturbed mitoses. Loss of bub1 function causes chromosomes to lag on the anaphase spindle and an increased frequency of chromosome loss. Such genomic instability is even more dramatic in Δbub1 diploids, leading to massive chromosome missegregation events and loss of the diploid state, demonstrating that bub1+ function is essential to maintain correct ploidy through mitosis. As in larger eukaryotes, Bub1 is recruited to kinetochores during the early stages of mitosis. However, unlike its vertebrate counterpart, a pool of Bub1 remains centromere-associated at metaphase and even until telophase. We discuss the possibility of a role for the Bub1 kinase after the metaphase–anaphase transition.
Enlarged similarity of nucleic acid sequences
- DNA Res
, 1996
Cited by 1 (1 self)
The concept of nucleic acid sequence base alterations is presented. The number of base alterations for sequences of different lengths is established. The definition of "enlarged similarity" of nucleic acid sequences on the basis of sequence base alterations is introduced. Mutual information between sequences is used as a quantitative measure of enlarged similarity for two compared sequences. The method of mutual information calculation is developed considering the correlation of bases in compared sequences. The definitions of correlated similarity and evolution similarity between compared sequences are given. Results of the use of the enlarged similarity approach for DNA sequence analysis are discussed. Key words: DNA sequence; computer analysis; sequence base alteration; mutual information; enlarged similarity
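As a rough illustration of mutual information as a similarity measure, the sketch below computes the standard position-wise mutual information between two equal-length sequences. It assumes the sequences are already aligned position by position and does not include the base-alteration or base-correlation treatment the abstract mentions; the function name is hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(a, b):
    """Mutual information (in bits) between two position-aligned sequences.

    Assumes len(a) == len(b) and that position i of a corresponds to
    position i of b. Illustrative only; not the paper's full method.
    """
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")

    n = len(a)
    joint = Counter(zip(a, b))   # counts of (base in a, base in b) pairs
    count_a = Counter(a)         # marginal counts for a
    count_b = Counter(b)         # marginal counts for b

    mi = 0.0
    for (x, y), c in joint.items():
        p_xy = c / n
        p_x = count_a[x] / n
        p_y = count_b[y] / n
        mi += p_xy * log2(p_xy / (p_x * p_y))
    return mi


if __name__ == "__main__":
    print(mutual_information("ACGT", "ACGT"))  # identical sequences
    print(mutual_information("AACC", "ACAC"))  # statistically independent bases
```

A value near zero means that knowing the base in one sequence says little about the base at the same position in the other, while larger values indicate a consistent relationship, including consistent substitutions that a plain identity count would miss.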
Algorithms for Molecular Biology BioMed Central
, 2007
"... Software article A basic analysis toolkit for biological sequences ..."
NEW APPROACHES TO ANALYSIS OF BIOMOLECULAR DATA AND PROCESSES COMPUTER ANALYSIS OF MULTIPLE REPEATS IN BACTERIA
Motivation: The presence of repeated sequences is a well-known feature of bacterial genomes, and the interpretation and classification of those repeats is a current problem. Results: We describe a method for computing multiple repeats, that is, sequences that have multiple (two or more) occurrences in a genome. In order to identify multiple repeats in bacterial genomes, we apply the YASS software (Noe, Kucherov, 2004) and develop a novel algorithm for multiple repeat clusterization. Exhaustive computation and analysis of those “clusters of repeated sequences” in bacteria is the subject of the present work. Availability: Program is available by
unknown title
, 2005
doi:10.1093/nar/gki925 Structural dynamics of cereal mitochondrial genomes as revealed by complete nucleotide sequencing of the wheat mitochondrial genome

