Results 1-10 of 67
Rapid protein side-chain packing via tree decomposition
 Research in Computational Molecular Biology, Lecture Notes in Computer Science
, 2005
Cited by 20 (1 self)
Abstract. This paper proposes a novel tree-decomposition-based side-chain assignment algorithm that obtains the globally optimal solution of the side-chain packing problem very efficiently. Theoretically, the computational complexity of this algorithm is $O((N+M)\,n_{rot}^{tw+1})$, where N is the number of residues in the protein, M the number of interacting residue pairs, $n_{rot}$ the average number of rotamers per residue, and $tw = O(N^{2/3}\log N)$ the tree width of the residue interaction graph. Based on this algorithm, we have developed a side-chain prediction program, SCATD (Side Chain Assignment via Tree Decomposition). Experimental results show that after Goldstein DEE is conducted, $n_{rot}$ is around 3.5, and tw is only 3 or 4 for most of the test proteins in the SCWRL benchmark and less than 10 for all of them. SCATD runs up to 90 times faster than SCWRL 3.0 on some large proteins in the SCWRL benchmark and is on average five times faster across all the test proteins. If only the post-DEE stage is considered, our tree-decomposition-based energy minimization algorithm is more than 200 times faster than that in SCWRL 3.0 on some large proteins. SCATD is freely available for academic research upon request.
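The idea behind the abstract above can be illustrated on the simplest case, a residue interaction graph that is itself a tree (tree width 1). The following is a minimal sketch under that assumption, not the SCATD implementation: the function names, the dictionary layout, and the toy integer energies are all hypothetical. Each residue is solved once per rotamer of its parent, so the work per edge is on the order of the number of rotamers squared, which matches the general rotamers-to-the-power-(tw+1) factor at tw = 1. A brute-force search is included only to check the result.

```python
from itertools import product


def pack_tree(children, self_e, pair_e, root=0):
    """Globally optimal rotamer assignment on a tree-shaped interaction graph.

    children: dict node -> list of child nodes (hypothetical layout)
    self_e:   dict node -> list of self energies, one per rotamer
    pair_e:   dict (parent, child) -> table[parent_rotamer][child_rotamer]
    Returns (minimum total energy, dict node -> chosen rotamer index).
    """
    best = {}    # best[node][r]: min energy of node's subtree with node at rotamer r
    choice = {}  # choice[(node, child)][r]: best child rotamer given node rotamer r

    def solve(node):
        for c in children.get(node, []):
            solve(c)
        best[node] = []
        for r, energy in enumerate(self_e[node]):
            total = energy
            for c in children.get(node, []):
                options = [pair_e[(node, c)][r][s] + best[c][s]
                           for s in range(len(self_e[c]))]
                s_best = min(range(len(options)), key=options.__getitem__)
                choice.setdefault((node, c), {})[r] = s_best
                total += options[s_best]
            best[node].append(total)

    def trace(node, r, assignment):
        assignment[node] = r
        for c in children.get(node, []):
            trace(c, choice[(node, c)][r], assignment)

    solve(root)
    r0 = min(range(len(best[root])), key=best[root].__getitem__)
    assignment = {}
    trace(root, r0, assignment)
    return best[root][r0], assignment


def total_energy(assignment, self_e, pair_e):
    """Sum of self energies and pairwise energies for a full assignment."""
    e = sum(self_e[i][r] for i, r in assignment.items())
    e += sum(t[assignment[i]][assignment[j]] for (i, j), t in pair_e.items())
    return e


def brute_force(self_e, pair_e):
    """Exhaustive search over all rotamer combinations (for checking only)."""
    nodes = sorted(self_e)
    best = None
    for combo in product(*(range(len(self_e[i])) for i in nodes)):
        assignment = dict(zip(nodes, combo))
        e = total_energy(assignment, self_e, pair_e)
        if best is None or e < best[0]:
            best = (e, assignment)
    return best


# Toy example: four residues, made-up integer energies.
children = {0: [1, 2], 1: [3]}
self_e = {0: [2, 0], 1: [1, 0, 3], 2: [0, 1], 3: [1, 0]}
pair_e = {
    (0, 1): [[0, 2, 1], [3, 4, 0]],
    (0, 2): [[1, 0], [0, 5]],
    (1, 3): [[0, 2], [1, 1], [4, 0]],
}
energy, assignment = pack_tree(children, self_e, pair_e)
```

For graphs that are not trees, the same dynamic program would run over the bags of a tree decomposition rather than over single residues, which is where the dependence on tw comes from.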
Pcons5: combining consensus, structural evaluation and fold recognition scores
 Bioinformatics
, 2005
doi:10.1093/bioinformatics/bti702
Fold recognition by predicted alignment accuracy
 ACM/IEEE Transactions on Computational Biology and Bioinformatics
, 2005
Cited by 15 (6 self)
Abstract: One of the key components of protein structure prediction by protein threading is choosing the best overall template for a given target sequence after all the optimal sequence-template alignments have been generated. The chosen template should have the best alignment with the target sequence, since the three-dimensional structure of the target sequence is built on the sequence-template alignment. The traditional method for template selection is the Z-score, which uses a statistical test to rank all the sequence-template alignments and then chooses the first-ranked template for the sequence. However, calculating the Z-score is time-consuming and not suitable for genome-scale structure prediction. Z-scores are also hard to interpret when the threading scoring function is a weighted sum of several energy terms with different physical meanings. This paper presents a Support Vector Machine (SVM) regression approach that directly predicts the alignment accuracy of a sequence-template alignment, which is then used to rank all the templates for a specific target sequence. Experimental results on a large-scale benchmark demonstrate that SVM regression performs much better than the composition-corrected Z-score method. SVM regression also runs much faster than the Z-score method. Index Terms: protein structure prediction, protein threading, protein fold recognition, SVM regression.
Fold recognition by combining profile–profile alignment and support vector machine
 Bioinformatics
, 2005
Cited by 9 (1 self)
Motivation: Currently, the most accurate fold recognition method is to perform profile-profile alignments and estimate the statistical significance of those alignments by calculating a z-score or E-value. Although this scheme is reliable in recognizing relatively close homologs related at the family level, it has difficulty finding remote homologs that are related at the superfamily or fold level. Results: Here, we present an alternative way to estimate the significance of the alignments. The alignment between a query protein and a template of length n in the fold library is transformed into a feature vector of length n+1, which is then evaluated by a support vector machine (SVM). The SVM output is converted to a posterior probability that the query sequence is related to the template. The new method performs significantly better than PSI-BLAST and profile-profile alignment with the z-score scheme. While PSI-BLAST and the z-score scheme detect 16% and 20% of superfamily-related proteins, respectively, at 90% specificity, the new method detects 46% of these proteins, a more than twofold increase in sensitivity. More significantly, at the fold level, the new method can detect 14% of remotely related proteins at 90% specificity, a remarkable result considering that the other methods can detect almost none at the same level of specificity. Contact:
Fast and accurate algorithms for protein side-chain packing
, 2006
Cited by 8 (1 self)
This article studies the protein side-chain packing problem using the tree decomposition of a protein structure. To obtain fast and accurate protein side-chain packing, protein structures are modeled using a geometric neighborhood graph, which can be easily decomposed into smaller blocks. Therefore, the side-chain assignment of the whole protein can be assembled from the assignments of the small blocks. Although we will show that the side-chain packing problem is still NP-hard, we can achieve a tree-decomposition-based globally optimal algorithm with time complexity $O(N\,n_{rot}^{tw+1})$ and several polynomial-time approximation schemes (PTAS), where N is the number of residues in the protein, $n_{rot}$ the average number of rotamers per residue, and $tw = O(N^{2/3}\log N)$ the treewidth of the protein structure graph. Experimental results indicate that after Goldstein dead-end elimination is conducted, $n_{rot}$ is very small and tw is equal to 3 or 4 most of the time. Based on the globally optimal algorithm, we developed a protein side-chain assignment program, TreePack, which runs up to 90 times faster than SCWRL 3.0, a widely used side-chain packing program, on some large test proteins in the SCWRL benchmark database and an average of five times faster on all the test proteins in this database. There are also some real-world
Improved pairwise alignments of proteins in the Twilight Zone using local structure predictions
 Bioinformatics
, 2006
doi:10.1093/bioinformatics/bti828
Assessment of RAPTOR's Linear Programming Approach in CAFASP3
 Proteins
, 2003
Cited by 7 (2 self)
We have developed a new algorithm based on the mathematical theory of linear programming (LP) and implemented it in our program RAPTOR. Our new approach provides an elegant formulation of the protein threading problem, overcomes the intractability of protein threading in practice, and allows us to use existing powerful linear programming software to obtain optimal protein threading solutions. CASP5 and CAFASP3 gave us the first chance to test RAPTOR in an unbiased way. RAPTOR was ranked as the top individual (automatic) server for fold recognition by the CAFASP3 organizers. In this short paper, we describe RAPTOR's LP formulation, assess RAPTOR's performance in CAFASP3/CASP5, explain why it has superseded other existing automatic individual methods, and point out its strengths, limitations, extensions, and prospects for improvement.
IPASS: error tolerant NMR backbone resonance assignment by linear programming
, 2009
Cited by 6 (5 self)
Abstract. Automating the entire NMR protein structure determination process requires a superior error-tolerant backbone resonance assignment method. Although a variety of assignment approaches have been developed, none works well on noisy, automatically picked peaks. IPASS is proposed as a novel integer linear programming (ILP) based assignment method. To reduce the size of the problem, IPASS employs probabilistic spin system typing based on chemical shifts and secondary structure predictions. Furthermore, IPASS extracts connectivity information from the inter-residue information and the 15N-edited NOESY peaks, which are then used to fix reliable fragments. The experimental results demonstrate that IPASS significantly outperforms previous assignment methods on the synthetic data sets. It achieves an average of 99% precision and 96% recall on the synthesized spin systems, and an average of 96% precision and 90% recall on the synthesized peak lists. When applied to automatically picked peaks from experimentally derived data sets, it achieves an average precision and recall of 78% and 67%, respectively. In contrast, the next best method, MARS, achieved an average precision and recall of 50% and 40%, respectively. Availability: IPASS is available upon request, and the web server for IPASS is under construction.
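The precision and recall figures quoted above can be read with the usual definitions, which the abstract itself does not spell out. The sketch below assumes that precision is the fraction of assignments made that are correct and recall is the fraction of reference assignments that were recovered. The function name, the dictionary representation (spin-system identifier mapped to residue index), and the toy data are hypothetical.

```python
def precision_recall(predicted, truth):
    """Precision and recall of an assignment against a reference.

    predicted, truth: dicts mapping a spin-system id to a residue index
    (hypothetical representation, for illustration only).
    """
    # An assignment counts as correct when it matches the reference exactly.
    correct = sum(1 for key, residue in predicted.items()
                  if truth.get(key) == residue)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(truth) if truth else 0.0
    return precision, recall


# Toy data: three assignments made, two of them correct, four in the reference.
predicted = {"s1": 1, "s2": 2, "s3": 5}
truth = {"s1": 1, "s2": 2, "s3": 3, "s4": 4}
precision, recall = precision_recall(predicted, truth)
```

Under these definitions a method that leaves uncertain spin systems unassigned keeps its precision high at the cost of recall, which is why the two figures are reported together.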
Efficient Parameterized Algorithm for Biopolymer Structure-Sequence Alignment
 In Proceedings of Workshop on Algorithms for Bioinformatics
, 2005
Cited by 6 (2 self)
Abstract. Computational alignment of a biopolymer sequence (e.g., an RNA or a protein) to a structure is an effective approach to predicting and searching for the structure of new sequences. To identify the structure of remote homologs, the structure-sequence alignment has to consider not only sequence similarity but also spatially conserved conformations caused by residue interactions, and consequently is computationally intractable. It is difficult to cope with this inefficiency without compromising alignment accuracy, especially for structure search in genomes or large databases. This paper introduces a novel method and a parameterized algorithm for structure-sequence alignment. Both the structure and the sequence are represented as graphs, where in general the graph for a biopolymer structure has a naturally small tree width. The algorithm constructs an optimal alignment by finding in the sequence graph the maximum-valued subgraph isomorphic to the structure graph. It has computational time complexity $O(k^t N^2)$ for a structure of N residues and a tree decomposition of width t. The parameter k, small in nature, is determined by a statistical cutoff for the correspondence between the structure and the sequence. The paper demonstrates a successful application of the algorithm to developing a fast program for RNA structural homology search.
FragQA: predicting local fragment quality of a sequence-structure alignment
Cited by 5 (1 self)