Protein Folding in the Hydrophobic-Hydrophilic (HP) Model is NP-complete
1998
Cited by 99 (0 self)
One of the simplest and most popular biophysical models of protein folding is the hydrophobic-hydrophilic (HP) model. The HP model abstracts the hydrophobic interaction in protein folding by labeling the amino acids as hydrophobic (H for nonpolar) or hydrophilic (P for polar). Chains of amino acids are configured as self-avoiding walks on the 3D cubic lattice, where an optimal conformation maximizes the number of adjacencies between H's. In this paper, the protein folding problem under the HP model on the cubic lattice is shown to be NP-complete. This means that the protein folding problem belongs to a large set of problems that are believed to be computationally intractable.
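As a minimal sketch of the objective described in this abstract, the following Python function scores one given conformation by counting adjacencies between H residues on the 3D cubic lattice. The function name `hh_contacts`, the coordinate representation, and the convention of ignoring chain-consecutive residues are assumptions made for illustration; they are not taken from the paper. The code only evaluates a conformation; finding the maximizing conformation is the problem the paper shows to be NP-complete.

```python
from itertools import combinations


def manhattan(a, b):
    """Lattice (L1) distance between two integer 3D points."""
    return sum(abs(x - y) for x, y in zip(a, b))


def hh_contacts(sequence, coords):
    """Count H-H adjacencies of one conformation on the cubic lattice.

    sequence: string over {'H', 'P'}.
    coords:   list of (x, y, z) integer tuples, one per residue.
    Assumption: residues adjacent along the chain are not counted.
    """
    if len(sequence) != len(coords):
        raise ValueError("one lattice point per residue is required")
    if len(set(coords)) != len(coords):
        raise ValueError("the walk is not self-avoiding")
    # Consecutive residues must sit on neighbouring lattice points.
    for a, b in zip(coords, coords[1:]):
        if manhattan(a, b) != 1:
            raise ValueError("consecutive residues must be lattice neighbours")

    contacts = 0
    for i, j in combinations(range(len(sequence)), 2):
        if j - i < 2:
            continue  # skip chain neighbours
        if sequence[i] == "H" and sequence[j] == "H":
            if manhattan(coords[i], coords[j]) == 1:
                contacts += 1
    return contacts


# A four-residue chain folded around a unit square.
square = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
print(hh_contacts("HPPH", square))
```

In the square conformation only the first and last residues are non-consecutive lattice neighbours, so the count depends on whether both of them are labeled H.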
Algorithmic aspects of protein structure similarity
In 40th Annual Symposium on Foundations of Computer Science, 1999
Cited by 46 (3 self)
We show that calculating contact map overlap (a measure of similarity of protein structures) is NP-hard, but can be solved in polynomial time for several interesting and relevant special cases. We identify an important special case of this problem corresponding to self-avoiding walks, and prove a decomposition theorem and a corollary approximation result for this special case. These are the first approximation algorithms with guaranteed error bounds, and the first NP-completeness results, in the literature in the area of protein structure alignment/fold recognition for measures of structure similarity of practical interest.
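To make the contact map overlap measure concrete, here is a small Python sketch under stated assumptions: each contact map is a set of residue-index pairs `(i, j)` with `i < j`, and an alignment is an order-preserving mapping from residues of one structure to residues of the other. The names `overlap` and `best_overlap_bruteforce` are hypothetical. The exhaustive search is exponential and only meant for tiny inputs; it is not one of the paper's algorithms.

```python
from itertools import combinations


def overlap(contacts_a, contacts_b, alignment):
    """Number of contacts of A that map onto contacts of B.

    alignment: dict from residue index in A to residue index in B,
    assumed order-preserving (i < j implies alignment[i] < alignment[j]).
    """
    shared = 0
    for i, j in contacts_a:
        if i in alignment and j in alignment:
            if (alignment[i], alignment[j]) in contacts_b:
                shared += 1
    return shared


def best_overlap_bruteforce(n_a, n_b, contacts_a, contacts_b):
    """Maximum overlap over all order-preserving alignments (tiny inputs only)."""
    best = 0
    for k in range(1, min(n_a, n_b) + 1):
        # Pairing two sorted index subsets of equal size keeps the order.
        for sub_a in combinations(range(n_a), k):
            for sub_b in combinations(range(n_b), k):
                alignment = dict(zip(sub_a, sub_b))
                best = max(best, overlap(contacts_a, contacts_b, alignment))
    return best


contacts_a = {(0, 2), (1, 3)}   # structure A, 4 residues
contacts_b = {(0, 2), (1, 4)}   # structure B, 5 residues
identity = {0: 0, 1: 1, 2: 2, 3: 3}
print(overlap(contacts_a, contacts_b, identity))
print(best_overlap_bruteforce(4, 5, contacts_a, contacts_b))
```

The identity alignment preserves only one contact here, while mapping residue 3 of A to residue 4 of B preserves both, which illustrates why the score depends on the choice of alignment.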
Rapid protein side-chain packing via tree decomposition
Research in Computational Molecular Biology, Lecture Notes in Computer Science, 2005
Cited by 17 (1 self)
This paper proposes a novel tree-decomposition-based side-chain assignment algorithm, which can obtain the globally optimal solution of the side-chain packing problem very efficiently. Theoretically, the computational complexity of this algorithm is $O((N + M)\, n_{rot}^{tw+1})$, where $N$ is the number of residues in the protein, $M$ the number of interacting residue pairs, $n_{rot}$ the average number of rotamers for each residue, and $tw$ ($= O(N^{2/3} \log N)$) the tree width of the residue interaction graph. Based on this algorithm, we have developed a side-chain prediction program SCATD (Side Chain Assignment via Tree Decomposition). Experimental results show that after the Goldstein DEE is conducted, $n_{rot}$ is around 3.5, and $tw$ is only 3 or 4 for most of the test proteins in the SCWRL benchmark and less than 10 for all the test proteins. SCATD runs up to 90 times faster than SCWRL 3.0 on some large proteins in the SCWRL benchmark and is on average five times faster over all the test proteins. If only the post-DEE stage is taken into consideration, then our tree-decomposition-based energy minimization algorithm is more than 200 times faster than that in SCWRL 3.0 on some large proteins. SCATD is freely available for academic research upon request.
Fold recognition by predicted alignment accuracy
ACM/IEEE Transactions on Computational Biology and Bioinformatics, 2005
Cited by 10 (3 self)
One of the key components in protein structure prediction by the protein threading technique is choosing the best overall template for a given target sequence after all the optimal sequence-template alignments are generated. The chosen template should have the best alignment with the target sequence, since the three-dimensional structure of the target sequence is built on the sequence-template alignment. The traditional method for template selection is called Z-score, which uses a statistical test to rank all the sequence-template alignments and then chooses the first-ranked template for the sequence. However, the calculation of Z-score is time-consuming and not suitable for genome-scale structure prediction. Z-scores are also hard to interpret when the threading scoring function is the weighted sum of several energy items of different physical meanings. This paper presents a Support Vector Machine (SVM) regression approach to directly predict the alignment accuracy of a sequence-template alignment, which is used to rank all the templates for a specific target sequence. Experimental results on a large-scale benchmark demonstrate that SVM regression performs much better than the composition-corrected Z-score method. SVM regression also runs much faster than the Z-score method. Index Terms: protein structure prediction, protein threading, protein fold recognition, SVM regression.
Protein Structure Prediction by Linear Programming
2003
Cited by 9 (0 self)
If the primary sequence of a protein is given, what is its three-dimensional structure? This is one of the most important and difficult problems in molecular biology and has tremendous implications for proteomics. Over the last three decades, this issue has been intensely researched. Protein threading represents one of the most promising techniques. So far, there are many protein structure prediction computer programs based on protein threading; however, almost none incorporates the pairwise contact (interaction) potential explicitly in its energy function, although scientists believe that pairwise interactions are important for fold recognition targets. The underlying reason for ignoring the pairwise potential is that the protein threading problem is NP-hard (i.e., it is unlikely to have a polynomial-time algorithm) if the pairwise interactions are treated rigorously.
Solving the Protein Threading Problem in Parallel
In IPDPS ’03: Proceedings of the 17th International Symposium on Parallel and Distributed Processing, 2003
Cited by 6 (3 self)
We propose a network flow formulation for protein threading and show its equivalence with the shortest path problem on a graph with a very particular structure. The underlying Mixed Integer Programming (MIP) model proves to be very appropriate for the protein threading problem: huge real-life instances have been solved in a reasonable time by using only a Mixed Integer Optimizer instead of a special-purpose branch-and-bound algorithm. The properties of the MIP model allow decomposition of the main problem into a large number of subproblems (tasks). We show in this paper that a branch-and-bound-like algorithm can be efficiently applied to solving these tasks in parallel, which leads to a significant reduction in the total running time. Computational experiments with huge problem instances are presented.
A tree-decomposition approach to protein structure prediction
In Proc. 4th International IEEE Computer Society Computational Systems Bioinformatics Conference (CSB 2005), 2005
Cited by 5 (1 self)
This paper proposes a tree decomposition of protein structures, which can be used to efficiently solve two key subproblems of protein structure prediction: protein threading for backbone prediction and protein side-chain prediction. To develop a unified tree-decomposition-based approach to these two subproblems, we model them as a geometric neighborhood graph labeling problem. Theoretically, we can have a low-degree polynomial-time algorithm to decompose a geometric neighborhood graph $G = (V, E)$ into components with size $O(|V|^{2/3} \log |V|)$. The computational complexity of the tree-decomposition-based graph labeling algorithms is $O(|V|\, \Delta^{tw+1})$, where $\Delta$ is the average number of possible labels for each vertex and $tw$ ($= O(|V|^{2/3} \log |V|)$) the tree width of $G$. Empirically, $tw$ is very small and the tree-decomposition method can solve these two problems very efficiently. This paper also compares the computational efficiency of the tree-decomposition approach with the linear programming approach to these two problems and identifies the condition under which the tree-decomposition approach is more efficient than the linear programming approach. Experimental results indicate that the tree-decomposition approach is more efficient most of the time.
Protein threading based on multiple protein structure alignment
Genome Inform. Ser. Workshop Genome Inform., 1999
Cited by 4 (0 self)
Protein threading, a method employed in protein three-dimensional (3D) structure prediction, was only proposed in the early 1990s, although the prediction of protein 3D structure from a given amino acid sequence has been studied since the 1970s. Here we describe a protein threading method/system that we have developed based on multiple protein structure alignment. In order to compute multiple structure alignments, we developed a similar-structure search program on massively parallel computers and a program for constructing a multiple structure alignment from pairwise structure alignments, where the latter is based on the center star method for sequence alignment. A simple dynamic-programming-based algorithm, which uses a profile matrix obtained from the result of multiple structure alignment, was also developed to compute a threading (i.e., an alignment between a target sequence and a known structure). Using this system, we participated in the threading category (category AL) of CASP3 (Third Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction). The results are discussed.
Recent Advances in Solving the Protein Threading Problem
Rapport de recherche (research report), ISSN 0249-6399, ISRN INRIA/RR--6253--FR+ENG
Cited by 2 (0 self)
Fast molecular shape matching using contact maps
2002
Cited by 2 (1 self)
In this paper, we study the problem of computing the similarity of two protein structures by measuring their contact-map overlap. Contact-map overlap abstracts the problem of computing the similarity of two polygonal chains as a graph-theoretic problem. In $\mathbb{R}^3$, we present the first polynomial-time algorithm with any guarantee on the approximation ratio for the 3-dimensional problem. More precisely, we give an algorithm for the contact-map overlap problem with an approximation ratio of $\sigma$, where $\sigma = \min\{\sigma(P_1), \sigma(P_2)\} \le O(n^{1/2})$ is a decomposition parameter depending on the input polygonal chains $P_1$ and $P_2$. In $\mathbb{R}^2$, we improve the running time of the previous best known approximation algorithm from $O(n^6)$ to $O(n^3 \log n)$ at the cost of decreasing the approximation ratio by half. We also give hardness results for the problem in three dimensions, suggesting that approximating it better than $O(n^{\varepsilon})$, for some $\varepsilon > 0$, is hard. Key words: shape matching, molecular structures, contact maps, graph theory.

