Hidden Markov models in computational biology: applications to protein modeling
- JOURNAL OF MOLECULAR BIOLOGY
, 1994
Cited by 436 (29 self)
Hidden Markov Models (HMMs) are applied to the problems of statistical modeling, database searching and multiple sequence alignment of protein families and protein domains. These methods are demonstrated on the globin family, the protein kinase catalytic domain, and the EF-hand calcium binding motif. In each case the parameters of an HMM are estimated from a training set of unaligned sequences. After the HMM is built, it is used to obtain a multiple alignment of all the training sequences. It is also used to search the SWISS-PROT 22 database for other sequences that are members of the given protein family, or contain the given domain. The HMM produces multiple alignments of good quality that agree closely with the alignments produced by programs that incorporate three-dimensional structural information. When employed in discrimination tests (by examining how closely the sequences in a database fit the globin, kinase and EF-hand HMMs), the HMM is able to distinguish members of these families from non-members with a high degree of accuracy. Both the HMM and PROFILESEARCH (a technique used to search for relationships between a protein sequence and multiply aligned sequences) perform better in these tests than PROSITE (a dictionary of sites and patterns in proteins). The HMM appears to have a slight advantage over PROFILESEARCH in terms of lower rates of false
New Techniques for DNA Sequence Classification
, 1999
Cited by 9 (5 self)
DNA sequence classification is the activity of determining whether or not an unlabeled sequence S belongs to an existing class C. This paper proposes two new techniques for DNA sequence classification. The first technique works by comparing the unlabeled sequence S with a group of active motifs discovered from the elements of C and by distinction with elements outside of C. The second technique generates and matches gapped fingerprints of S with elements of C. Experimental results obtained by running these algorithms on long and well conserved Alu sequences demonstrate the good performance of the presented methods compared with FASTA. When applied to less conserved and relatively short functional sites such as splice-junctions, a variation of the second technique combining fingerprinting with consensus sequence analysis gives better results than the current classifiers employing text compression and machine learning algorithms.
Pattern Discovery In Sequence Databases: Algorithms And Applications To DNA/Protein Classification
, 1997
Cited by 2 (0 self)
Sequence databases comprise sequence data, which are linear structural descriptions of many natural entities. Approximate pattern discovery in a sequence database can lead to important conclusions or prediction of new phenomena. Traditional database technology is not suitable for accomplishing the task, and new techniques need to be developed. In this dissertation, we propose several new techniques for discovering patterns in sequence databases. Our techniques incorporate pattern matching algorithms and novel heuristics for discovery and optimization. Experimental results of applying the techniques to both generated data and DNA/proteins show the effectiveness of the proposed techniques. We then develop several classifiers using our pattern discovery algorithms and a previously published fingerprint technique. When we apply the classifiers to classify DNA and protein seq...
.2 Kinase experiments
this paper will be made available in electronic form, and can be obtained by anonymous ftp from ftp.cse.ucsc.edu. Our HMM building program and other tools (written in C) will also be made available from the same ftp site.
Tracking Alu evolution in New World primates
- BMC Evolutionary Biology
, 2005
© 2005 Ray and Batzer; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License
Evolution and distribution of RNA polymerase II regulatory sites from RNA polymerase III dependant mobile Alu elements
- BMC Evolutionary Biology
, 2004
Analysis of the features and source gene composition of the AluYg6 ...
- BMC Evolutionary Biology
, 2007

