SCOP, Structural Classification of Proteins Database: Applications to Evaluation of the Effectiveness of Sequence Alignment Methods and Statistics of Protein Structural Data
, 1998
Cited by 703 (16 self)
The Structural Classification of Proteins (SCOP) database provides a detailed and comprehensive description of the relationships of all known protein structures. The classification is on hierarchical levels: the first two levels, family and superfamily, describe near and far evolutionary relationships; the third, fold, describes geometrical relationships. The distinction between evolutionary relationships and those that arise from the physics and chemistry of proteins is a feature that is unique to this database, so far. The database can be used as a source of data to calibrate sequence search algorithms and for the generation of population statistics on protein structures. The database and its associated files are freely accessible from a number of WWW sites mirrored from URL http://scop.mrc-lmb.cam.ac.uk/scop/.
Hidden Markov models for detecting remote protein homologies
- Bioinformatics
, 1998
Cited by 229 (12 self)
A new hidden Markov model method (SAM-T98) for finding remote homologs of protein sequences is described and evaluated. The method begins with a single target sequence and iteratively builds a hidden Markov model (hmm) from the sequence and homologs found using the hmm for database search. SAM-T98 is also used to construct model libraries automatically from sequences in structural databases. We evaluate the SAM-T98 method with four datasets. Three of the test sets are fold-recognition tests, where the correct answers are determined by structural similarity. The fourth uses a curated database. The method is compared against wu-blastp and against double-blast, a two-step method similar to ISS, but using blast instead of fasta. Results: SAM-T98 had the fewest errors in all tests, dramatically so for the fold-recognition tests. At the minimum-error point on the SCOP-domains test, SAM-T98 got 880 true positives and 68 false positives, double-blast got 533 true positives with 71 false positives, and wu-blastp got 353 true positives with 24 false positives. The method is optimized to recognize superfamilies, and would require parameter adjustment to be used to find family or fold relationships. One key to the performance of the hmm method is a new score-normalization technique that compares the score to the score with a reversed model rather than to a uniform null model. Availability: A World Wide Web server, as well as information on obtaining the Sequence Alignment and ... (Preprint; to appear in Bioinformatics, 1999.)
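The true-positive and false-positive counts quoted above are reported at a "minimum-error point". As a rough illustration only, the sketch below assumes this means the score cutoff that minimizes false positives plus false negatives (missed true homologs); the abstract does not define the term. The function name `minimum_error_point` and the input layout are hypothetical and are not taken from SAM-T98.

```python
def minimum_error_point(hits, total_true):
    """Find the cutoff with the fewest errors in a ranked hit list.

    hits: list of (score, is_true_homolog) pairs; higher score = more confident.
    total_true: number of true homolog pairs in the whole test set.
    Returns (errors, true_positives, false_positives) at the best cutoff.
    Assumption: errors = false positives + false negatives.
    """
    ranked = sorted(hits, key=lambda h: h[0], reverse=True)
    # Baseline: accept nothing, so every true pair counts as a miss.
    best = (total_true, 0, 0)
    tp = fp = 0
    for _score, is_true in ranked:
        if is_true:
            tp += 1
        else:
            fp += 1
        errors = fp + (total_true - tp)
        if errors < best[0]:  # keep the earliest cutoff on ties
            best = (errors, tp, fp)
    return best


# Toy data, invented for illustration only.
example_hits = [(9.0, True), (8.0, True), (7.0, False),
                (6.0, True), (5.0, False), (4.0, False)]
print(minimum_error_point(example_hits, total_true=4))
```

Tied scores are treated as distinct cutoffs here; a real evaluation would need to handle ties explicitly.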
Comprehensive assessment of automatic structural alignment against a manual standard, the Scop classification of proteins
- Protein Sci
, 1998
Structure Comparison and Structure Patterns
- JOURNAL OF COMPUTATIONAL BIOLOGY
, 1999
Cited by 69 (2 self)
This article investigates different aspects of pairwise and multiple structure comparison, and the problem of automatically discovering common patterns in a set of structures. Descriptions and representations of structures and patterns are investigated, as well as scoring and algorithms for comparison and discovery. A framework and nomenclature are developed, and many methods are reviewed and placed within this framework.
Hierarchical Protein Structure Superposition using both Secondary Structure and Atomic Representations
, 1997
Cited by 55 (13 self)
The structural comparison of proteins has become increasingly important as a means to identify protein motifs and fold families. In this paper we present a new algorithm for the comparison of proteins based on a hierarchy of structural representations, from the secondary structure level to the atomic level. Our technique represents α-helices and β-strands as vectors and uses a set of seven scoring functions to compare pairs of vectors from different proteins. The scores obtained are used in a dynamic programming algorithm that finds the best local alignment of the two sets of vectors. The second step in our algorithm is based on the atomic coordinates of the protein structures and improves the initial vector alignment by iteratively minimizing the RMSD between pairs of nearest atoms from the two proteins. We refine the final alignment by determining a core of well aligned atoms and minimizing the RMSD of this core. In a comparison of our method to Holm and Sander's DA...
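The atomic-level step described above relies on two basic quantities: a pairing of nearest atoms and the RMSD over those pairs. The sketch below shows only these two pieces for coordinates that are already in a common frame. It does not perform the superposition (rotation and translation) that an iterative refinement would recompute each round. The function names and the distance cutoff are hypothetical and are not taken from the paper.

```python
import math


def nearest_pairs(atoms_a, atoms_b, cutoff):
    """Pair each atom of A with its nearest atom of B, if within cutoff.

    atoms_a, atoms_b: lists of (x, y, z) tuples in the same coordinate frame.
    Note: this simple pairing is not forced to be one-to-one.
    """
    pairs = []
    for a in atoms_a:
        nearest = min(atoms_b, key=lambda b: math.dist(a, b))
        if math.dist(a, nearest) <= cutoff:
            pairs.append((a, nearest))
    return pairs


def rmsd(pairs):
    """Root-mean-square deviation over a list of (atom_a, atom_b) pairs."""
    if not pairs:
        raise ValueError("no atom pairs to compare")
    squared = sum(math.dist(a, b) ** 2 for a, b in pairs)
    return math.sqrt(squared / len(pairs))


# Toy coordinates, invented for illustration only.
structure_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
structure_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (10.0, 10.0, 10.0)]
matched = nearest_pairs(structure_a, structure_b, cutoff=2.0)
print(len(matched), rmsd(matched))
```

In an iterative scheme, one would alternate between re-pairing nearest atoms and re-superposing the structures on the current pairs until the RMSD stops improving.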
Using iterative dynamic programming to obtain accurate pairwise and multiple alignments of protein structures
- Proc. of ISMB’96 Intelligent Systems for Molecular Biology
, 1996
Cited by 55 (9 self)
We show how a basic pairwise alignment procedure can be improved to more accurately align conserved structural regions, by using variable, position-dependent gap penalties that depend on secondary structure and by taking the consensus of a number of suboptimal alignments. These improvements, which are novel for structural alignment, are direct analogs of what is possible with normal sequence alignment. They are feasible for us since our basic structural alignment procedure, unlike others, is so similar to normal sequence alignment. We further present preliminary results that show how our procedure can be generalized to produce a multiple alignment of a family of structures. Our approach is based on finding a “median” structure from doing all possible pairwise alignments and then aligning everything to it.
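To make the idea of position-dependent gap penalties concrete, here is a minimal sketch of a global, Needleman-Wunsch-style dynamic program in which the cost of leaving a residue unaligned depends on that residue's position. The global formulation, the linear (non-affine) gap cost, and the name `align_score` are assumptions chosen for illustration. The abstract does not specify these details, and the consensus over suboptimal alignments is not shown.

```python
def align_score(sim, gap_a, gap_b):
    """Best global alignment score with position-dependent gap penalties.

    sim[i][j]: similarity of residue i in structure A and residue j in B.
    gap_a[i]:  penalty for leaving residue i of A unaligned.
    gap_b[j]:  penalty for leaving residue j of B unaligned.
    Penalties could be set low in loops and high inside helices or strands.
    """
    n, m = len(gap_a), len(gap_b)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = dp[i - 1][0] - gap_a[i - 1]
    for j in range(1, m + 1):
        dp[0][j] = dp[0][j - 1] - gap_b[j - 1]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + sim[i - 1][j - 1],  # align i with j
                dp[i - 1][j] - gap_a[i - 1],           # skip residue i of A
                dp[i][j - 1] - gap_b[j - 1],           # skip residue j of B
            )
    return dp[n][m]


# Toy example: A has a cheap-to-skip loop residue in the middle.
similarity = [[2, -1], [-1, -1], [-1, 2]]
print(align_score(similarity, gap_a=[3, 0.5, 3], gap_b=[3, 3]))
```

With the cheap gap at the middle position of A, the best path aligns the two outer residues and skips the loop residue, which a uniform gap penalty of 3 would discourage.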
Comprehensive evaluation of protein structure alignment methods: scoring by geometric measures
- J Mol Biol
, 2005
Cited by 53 (0 self)
The problem of aligning, or establishing a correspondence between, residues of two protein ...
TM-align: A protein structure alignment algorithm based on TM-score
- Nucleic Acids Research
"... TM-score ..."
Algorithmic aspects of protein structure similarity
- In 40th Annual Symposium on Foundations of Computer Science
, 1999
Cited by 46 (3 self)
We show that calculating contact map overlap (a measure of similarity of protein structures) is NP-hard, but can be solved in polynomial time for several interesting and relevant special cases. We identify an important special case of this problem corresponding to self-avoiding walks, and prove a decomposition theorem and a corollary approximation result for this special case. These are the first approximation algorithms with guaranteed error bounds, and NP-completeness results in the literature in the area of protein structure alignment/fold recognition for measures of structure similarity of practical interest.
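For readers unfamiliar with the measure, the sketch below evaluates the contact map overlap of one fixed alignment: the number of contacts in the first structure whose two residues are both aligned and map onto a contact in the second. Finding the alignment that maximizes this count is the hard part and is not attempted here. The function name and the data layout are hypothetical.

```python
def contact_map_overlap(contacts_a, contacts_b, alignment):
    """Count contacts of A that the alignment maps onto contacts of B.

    contacts_a, contacts_b: sets of (i, j) residue-index pairs with i < j.
    alignment: dict mapping residue indices of A to residue indices of B
               (assumed one-to-one and order-preserving).
    """
    shared = 0
    for i, j in contacts_a:
        if i in alignment and j in alignment:
            p, q = alignment[i], alignment[j]
            if (min(p, q), max(p, q)) in contacts_b:
                shared += 1
    return shared


# Toy contact maps, invented for illustration only.
map_a = {(0, 2), (1, 3), (2, 4)}
map_b = {(0, 1), (1, 2)}
mapping = {0: 0, 2: 1, 4: 2}
print(contact_map_overlap(map_a, map_b, mapping))
```

Here the contact (1, 3) in the first map contributes nothing because neither residue is aligned.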
Predicting protein structure using hidden Markov models
, 1997
Cited by 46 (18 self)
We discuss how methods based on hidden Markov models performed in the fold recognition section of the CASP2 experiment. Hidden Markov models were built for a set of about a thousand structures from the PDB database, and each CASP2 target sequence was scored against this library of hidden Markov models. In addition, a hidden Markov model was built for each of the target sequences, and all of the sequences in PDB were scored against that target model. Having high scores from both methods was found to be highly indicative of the target and a structure being homologous. Predictions were made based on several criteria: the scores with the structure models, the scores with the target models, consistency between the secondary structure in the known structure and predictions for the target (using the program PhD), human examination of predicted alignments between target and structure (using RASMOL), and solvation preferences in the alignment of the target and structure. The method worked well in comparison to other methods used at CASP2 for targets of moderate difficulty, where the closest structure in PDB could be aligned to the target with at least 15% residue identity. There was no evidence for the method's effectiveness for harder cases, where the residue identity was much lower than 15%.

