Results 1 - 10 of 42
3D-Jury: A simple approach to improve protein structure predictions
- Bioinformatics
Cited by 50 (10 self)
Motivation: Consensus structure prediction methods (meta-predictors) have higher accuracy than the individual structure prediction algorithms they combine (their components). The goal of the 3D-Jury system is to provide a simple but powerful procedure for generating meta-predictions from variable sets of models obtained from diverse sources. The resulting protocol should help to improve the quality of structural annotations of novel proteins. Results: The 3D-Jury system generates meta-predictions from sets of models created using various methods. No prior knowledge of the methods' characteristics is required. The system can immediately use new components (additional prediction providers). Its accuracy is comparable with that of other well-tuned prediction servers. The algorithm resembles methods for selecting models generated by ab initio folding simulations. It is simple and offers a portable way to improve the accuracy of other protein structure prediction protocols. Availability: The 3D-Jury system is available via the Structure Prediction Meta Server
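The consensus idea described in this abstract can be illustrated with a minimal sketch. It is not the published 3D-Jury scoring function: the precomputed similarity matrix, the threshold value, and the function names are all assumptions made for illustration. Each model is scored by how strongly the other models agree with it, and the best-supported model is selected.

```python
from typing import List


def consensus_scores(similarity: List[List[float]], threshold: float = 0.0) -> List[float]:
    """Score each model by its agreement with all other models.

    similarity[i][j] is a precomputed, symmetric structural similarity
    between models i and j (higher means more alike). Pairs at or below
    `threshold` are ignored. Both inputs are hypothetical.
    """
    n = len(similarity)
    scores = []
    for i in range(n):
        supporting = [
            similarity[i][j]
            for j in range(n)
            if j != i and similarity[i][j] > threshold
        ]
        # Dividing by (count + 1) damps the score of a model that is
        # supported by only a few other models.
        scores.append(sum(supporting) / (len(supporting) + 1))
    return scores


def select_consensus_model(similarity: List[List[float]], threshold: float = 0.0) -> int:
    """Return the index of the model with the highest consensus score."""
    scores = consensus_scores(similarity, threshold)
    return max(range(len(scores)), key=scores.__getitem__)


# Toy data: models 0-2 agree with each other, model 3 is an outlier.
toy_similarity = [
    [0, 60, 55, 5],
    [60, 0, 58, 4],
    [55, 58, 0, 6],
    [5, 4, 6, 0],
]
best_model = select_consensus_model(toy_similarity, threshold=10)
```

Because the score depends only on pairwise comparisons between models, a sketch like this needs no information about which method produced which model, which matches the abstract's point that new prediction providers can be added without retuning.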
Pcons: A neural-network-based consensus predictor that improves fold recognition
- Protein Sci
, 2001
A Machine Learning Information Retrieval Approach to Protein Fold Recognition
Cited by 27 (5 self)
Motivation: Recognizing proteins that have similar tertiary structure is the key step of template-based protein structure prediction methods. Traditionally, a variety of alignment methods are used to identify similar folds, based on sequence similarity and sequence-structure compatibility. Although these methods are complementary, their integration has not been thoroughly exploited. Statistical machine learning methods provide tools for integrating multiple features, but so far these methods have been used primarily for protein and fold classification, rather than for the retrieval problem of fold recognition: finding a proper template for a given query protein. Results: Here we present a two-stage approach to fold recognition that combines machine learning and information retrieval. First, we use alignment methods to derive pairwise similarity features for query-template protein pairs. We also use global profile-profile alignments in combination with predicted secondary structure, relative solvent accessibility, contact map, and beta-strand pairing to extract pairwise structural compatibility features. Second, we apply support vector machines to these features to predict the structural relevance (i.e., in the same fold or not) of the query-template pairs. For each query, the continuous relevance scores are used to rank the templates. The FOLDpro approach is modular, scalable, and effective. Compared to 11 other fold recognition methods, FOLDpro yields the best results in almost all standard categories on a comprehensive benchmark dataset. Using predictions of the top-ranked template, the sensitivity is about 85%, 56%, and 27% at the family, superfamily, and fold levels, respectively. Using the 5 top-ranked templates, the sensitivity increases to 90%, 70%, and 48%. Availability: The FOLDpro server is available with the SCRATCH
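The ranking and evaluation step in this abstract can be sketched as follows. This is an illustration under stated assumptions, not FOLDpro itself: the relevance scores are taken as given (standing in for the SVM outputs), the query and template identifiers are invented, and sensitivity is read here as the fraction of queries with at least one correct template among the top k.

```python
from typing import Dict, List, Set


def rank_templates(scores: Dict[str, float]) -> List[str]:
    """Return template ids sorted by decreasing relevance score."""
    return sorted(scores, key=lambda template: scores[template], reverse=True)


def top_k_sensitivity(
    all_scores: Dict[str, Dict[str, float]],
    relevant: Dict[str, Set[str]],
    k: int,
) -> float:
    """Fraction of queries with a relevant template among the top k."""
    hits = 0
    for query, scores in all_scores.items():
        top = rank_templates(scores)[:k]
        if any(template in relevant[query] for template in top):
            hits += 1
    return hits / len(all_scores)


# Hypothetical relevance scores for two queries against three templates.
toy_scores = {
    "q1": {"tA": 0.9, "tB": 0.2, "tC": -0.4},
    "q2": {"tA": -0.1, "tB": 0.3, "tC": 0.7},
}
# Hypothetical ground truth: templates sharing the query's fold.
toy_relevant = {"q1": {"tA"}, "q2": {"tB"}}

sens_top1 = top_k_sensitivity(toy_scores, toy_relevant, k=1)
sens_top2 = top_k_sensitivity(toy_scores, toy_relevant, k=2)
```

In the toy data the correct template for q2 is ranked second, so sensitivity rises when more top-ranked templates are considered, the same pattern the abstract reports between the top-1 and top-5 results.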
LiveBench1: Continuous benchmarking of protein structure prediction servers
- Protein Sci
, 2001
Cited by 15 (5 self)
We present a novel, continuous approach aimed at the large-scale assessment of the performance of available fold-recognition servers. Six popular servers were investigated: PDB-Blast, FFAS, T98-lib, Gen-THREADER, 3D-PSSM, and INBGU. The assessment used as prediction targets a large number of selected protein structures released from October 1999 to April 2000. A target was selected if its sequence showed no significant similarity to any of the proteins previously available in the structural database. Overall, the servers were able to produce structurally similar models for one-half of the targets, but significantly accurate sequence-structure alignments were produced for only one-third of the targets. We further classified the targets into two sets: easy and hard. We found that all servers were able to find the correct answer for the vast majority of the easy targets if a structurally similar fold was present in the server’s fold libraries. However, among the hard targets, where standard methods such as PSI-BLAST fail, the most sensitive fold-recognition servers were able to produce similar models for only 40% of the cases, half of which had a significantly accurate sequence-structure alignment. Among the hard targets, the presence of updated libraries appeared to be less critical for the ranking. An “ideally combined consensus” prediction, where the results of all servers are considered, would increase the percentage of correct assignments by 50%. Each server had a number of cases with a correct assignment where the assignments of all the other servers were wrong. This emphasizes the benefits of considering more than one server in difficult prediction tasks. The LiveBench program
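The "ideally combined consensus" and the per-server unique hits mentioned in this abstract can be made concrete with a small sketch. The server names, target ids, and outcomes below are invented for illustration and are not LiveBench data; the sketch assumes that a target counts as solved by the ideal consensus if at least one server assigns it correctly.

```python
from typing import Dict, List, Set


def ideal_consensus_rate(correct_by_server: Dict[str, Set[str]], targets: List[str]) -> float:
    """Fraction of targets solved by at least one server."""
    solved = set().union(*correct_by_server.values())
    return len(solved & set(targets)) / len(targets)


def unique_hits(correct_by_server: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """For each server, the targets that only this server solved."""
    unique = {}
    for server, solved in correct_by_server.items():
        others = set().union(
            *(hits for name, hits in correct_by_server.items() if name != server)
        )
        unique[server] = solved - others
    return unique


# Hypothetical benchmark: six targets, three servers.
toy_targets = ["t1", "t2", "t3", "t4", "t5", "t6"]
toy_correct = {
    "server_A": {"t1", "t2"},
    "server_B": {"t2", "t3"},
    "server_C": {"t2"},
}

best_single = max(len(hits) for hits in toy_correct.values()) / len(toy_targets)
ideal = ideal_consensus_rate(toy_correct, toy_targets)
only_here = unique_hits(toy_correct)
```

In this toy case the best single server solves two of six targets while the ideal combination solves three, because server_A and server_B each contribute one target that no other server gets right.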
Pcons5: combining consensus, structural evaluation and fold recognition scores
- Bioinformatics
, 2005
doi:10.1093/bioinformatics/bti702
Protein Structure Prediction by Linear Programming
, 2003
Cited by 9 (0 self)
If the primary sequence of a protein is given, what is its three-dimensional structure? This is one of the most important and difficult problems in molecular biology and has tremendous implications for proteomics. Over the last three decades, this issue has been intensely researched. Protein threading represents one of the most promising techniques. So far, there are many protein structure prediction computer programs based on protein threading; however, almost none incorporates the pairwise contact (interaction) potential explicitly in its energy function, although scientists believe that pairwise interactions are important for fold recognition targets. The underlying reason for ignoring the pairwise potential is that the protein threading problem is NP-hard (i.e., it is unlikely to have a polynomial-time algorithm) if the pairwise interactions are treated rigorously.
ACE: Consensus Fold Recognition by Predicted Model Quality
, 2004
Cited by 5 (0 self)
Protein structure prediction has been a fundamental challenge in the biological field. In this post-genomic era, the need for automated protein structure prediction has never been more evident, and researchers are now focusing on developing computational techniques to predict three-dimensional structures with high throughput. Consensus-based protein structure prediction methods are state-of-the-art in automatic protein structure prediction. A consensus-based server combines the outputs of several individual servers and tends to generate better predictions than any individual server. Consensus-based methods have proved successful in recent CASP (Critical Assessment of Structure Prediction) experiments. In this thesis, a Support Vector Machine (SVM) regression-based consensus method is proposed for protein fold recognition, a key component for high-throughput protein structure prediction and protein function annotation. The SVM first extracts the features of a structural model by comparing the model to the other models produced by all the individual servers. Then, the SVM predicts the quality of each model. The experimental results from several LiveBench data sets confirm that our proposed consensus method, SVM regression, consistently performs better than any individual server. Based on this method, we developed a meta server, the Alignment by Consensus Estimation (ACE).
Profile-profile comparisons by COMPASS predict intricate homologies between protein families
- Protein Sci
, 2003
Cited by 5 (1 self)
Recently we proposed a novel method of alignment–alignment comparison, COMPASS (the tool for COmparison of Multiple Protein Alignments with Assessment of Statistical Significance). Here we present several examples of the relations between PFAM protein families that were detected by COMPASS and that lead to predictions of presently unresolved protein structures. We discuss relatively straightforward COMPASS predictions that are new and interesting to us, and that would require substantial time and effort to justify even for a skilled PSI-BLAST user. All of the presented COMPASS hits are independently confirmed by other methods, including the ab initio structure-prediction method ROSETTA. The tertiary structure predictions made by ROSETTA proved to be useful for improving sequence-derived alignments, because they are based on a reasonable folding of the polypeptide chain rather than on the information from sequence databases. The ability of COMPASS to predict new relations within the PFAM database indicates the high sensitivity of COMPASS searches and substantiates its potential value for the discovery of previously unknown similarities between protein families.
Classification and evolutionary history of the single-strand annealing proteins, RecT, Redbeta, ERF and RAD52
, 2002
Cited by 2 (1 self)
Background: The DNA single-strand annealing proteins (SSAPs), such as RecT, Redβ, ERF and Rad52, function in RecA-dependent and RecA-independent DNA recombination pathways.

