## Algorithmic aspects of protein structure similarity (1999)

Venue: 40th Annual Symposium on Foundations of Computer Science

Citations: 55 (3 self)

### BibTeX

@INPROCEEDINGS{Goldman99algorithmicaspects,

author = {Deborah Goldman and Sorin Istrail and Christos H. Papadimitriou},

title = {Algorithmic aspects of protein structure similarity},

booktitle = {40th Annual Symposium on Foundations of Computer Science},

year = {1999},

pages = {512--521}

}

### Abstract

We show that calculating contact map overlap (a measure of similarity of protein structures) is NP-hard, but can be solved in polynomial time for several interesting and relevant special cases. We identify an important special case of this problem corresponding to self-avoiding walks, and prove a decomposition theorem and a corollary approximation result for this special case. These are the first approximation algorithms with guaranteed error bounds, and the first NP-completeness results, in the literature in the area of protein structure alignment/fold recognition for measures of structure similarity of practical interest.
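To make the quantity in the abstract concrete, the following is a minimal sketch of contact map overlap, not the paper's algorithms. It assumes a contact map is given as a number of residues plus a list of contact pairs, and that an alignment is an order-preserving one-to-one matching between subsets of residues of the two proteins. The function names `overlap` and `max_overlap_bruteforce` are hypothetical, and the exhaustive search is exponential, so it is only usable on toy inputs.

```python
from itertools import combinations


def overlap(e1, e2, pairs):
    """Count contacts of map 1 that land on contacts of map 2.

    e1, e2: lists of contacts (i, j) between residue indices.
    pairs:  alignment as (residue in map 1, residue in map 2) pairs.
    """
    f = dict(pairs)
    contacts2 = {frozenset(e) for e in e2}
    return sum(
        1
        for (i, j) in e1
        if i in f and j in f and frozenset((f[i], f[j])) in contacts2
    )


def max_overlap_bruteforce(n1, e1, n2, e2):
    """Try every order-preserving alignment (assumed definition) and
    return the largest overlap. Exponential time; toy inputs only."""
    best = 0
    for k in range(1, min(n1, n2) + 1):
        # Two sorted k-subsets zipped together give an order-preserving matching.
        for a in combinations(range(n1), k):
            for b in combinations(range(n2), k):
                best = max(best, overlap(e1, e2, zip(a, b)))
    return best


if __name__ == "__main__":
    nested = [(0, 3), (1, 2)]    # stack-like pattern
    crossing = [(0, 2), (1, 3)]  # queue-like pattern
    print(max_overlap_bruteforce(4, crossing, 4, crossing))
    print(max_overlap_bruteforce(4, nested, 4, crossing))
```

In this toy example two identical maps share both contacts, while a nested pair and a crossing pair can share at most one, because no order-preserving matching maps both nested contacts onto crossing ones.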

### Citations

527 |
Protein Structure Comparison by Alignment of Distance Matrices
- Holm, Sander
- 1993
Citation Context ...d over the past few years, attempting to assign to each pair of proteins a distance, presumably capturing the extent to which the two proteins "resemble" each other in structure, origin, and function [17, 7, 34, 8, 28, 9, 16, 25, 27, 30, 29, 22, 17, 4, 36, 5, 24, 35, 37, 38, 32, 21, 32, 19, 17, 18]. The most important and popular such measures used in the Protein Science literature are these: The root-mean-square distance (RMSD) of the two proteins; the two three-dimensional structures are supe...

263 |
Mapping the protein universe
- Holm, Sander
- 1996
Citation Context ...d over the past few years, attempting to assign to each pair of proteins a distance, presumably capturing the extent to which the two proteins "resemble" each other in structure, origin, and function [17, 7, 34, 8, 28, 9, 16, 25, 27, 30, 29, 22, 17, 4, 36, 5, 24, 35, 37, 38, 32, 21, 32, 19, 17, 18]. The most important and popular such measures used in the Protein Science literature are these: The root-mean-square distance (RMSD) of the two proteins; the two three-dimensional structures are supe...

225 |
Database of homology-derived protein structures and the structural meaning of sequence alignment
- Sander, Schneider
- 1991
Citation Context ...d over the past few years, attempting to assign to each pair of proteins a distance, presumably capturing the extent to which the two proteins "resemble" each other in structure, origin, and function [17, 7, 34, 8, 28, 9, 16, 25, 27, 30, 29, 22, 17, 4, 36, 5, 24, 35, 37, 38, 32, 21, 32, 19, 17, 18]. The most important and popular such measures used in the Protein Science literature are these: The root-mean-square distance (RMSD) of the two proteins; the two three-dimensional structures are supe...

218 |
A method to identify protein sequences that fold into a known three dimensional structure
- Bowie, Luthy, et al.
- 1991

215 |
A Solution for the Best Rotation to Relate Two Sets of Vectors. Acta Cryst
- Kabsch
- 1976

165 |
A new approach to protein fold recognition. Nature
- Jones, Taylor, et al.
- 1992

158 |
Protein Structure Alignment
- Taylor, Orengo
- 1989

124 |
Principles of protein folding: a perspective from simple exact models. Protein Sci
- Dill, et al.
- 1995
Citation Context ...algorithms introduce significant biases by disregarding this point. Also, the hydrophobic/hydrophilic character of the residues (believed by many to be the single most important predictor of structure [14]) is often not reflected in the distance calculation (this is especially true in the RMSD distance, and even more serious in the so-called C-alpha alignments [17]). Further, most models fail to take int...

115 | Protein folding in the hydrophobic-hydrophilic (HP) model is NP-complete
- Berger, Leighton
- 1998
Citation Context ...ures. First, some of them are notoriously non-robust [28, 9, 29, 17, 22, 4]. It is well-known that the mapping from sequence to structure ("the protein folding problem") is very complex and non-local [10, 3]; this means that there is very little relationship between the edit distance of two proteins and their three-dimensional similarity. Unfortunately, many alignment algorithms introduce significant bias...

100 |
The structural alignment between two proteins: Is there a unique answer
- Godzik
- 1996

96 | On the complexity of protein folding
- Crescenzi, Goldman, et al.
- 1998
Citation Context ...ures. First, some of them are notoriously non-robust [28, 9, 29, 17, 22, 4]. It is well-known that the mapping from sequence to structure ("the protein folding problem") is very complex and non-local [10, 3]; this means that there is very little relationship between the edit distance of two proteins and their three-dimensional similarity. Unfortunately, many alignment algorithms introduce significant bias...

48 | Embedding graphs in books: a layout problem with applications to VLSI design
- Chung, Leighton, et al.
- 1987
Citation Context ...act maps are the contact maps of two-dimensional self-avoiding walks. We identify two important special cases of contact map graphs, the queue and the stack (previously studied in the context of VLSI [6, 20]), as well as the staircase (a special case of the queue) and the augmented staircase (a staircase with a stack embedded in it in a restricted way). We develop polynomial-time dynamic programming algo...

37 |
Detection of common three-dimensional substructures in proteins. Proteins 11(1
- Vriend, Sander
- 1991

36 | Computing similarity between RNA strings
- Bafna, Muthukrishnan, et al.
- 1995
Citation Context ...gn. Contact maps are also used extensively in the study of RNA structure. The three-dimensional structure of RNA is also the object of current intense study, and contact maps have been employed in it [2, 26, 33]. Calculating the contact map overlap distance of two RNA structures is another fundamental problem. It had been known that the three-dimensional structure of RNA is more dominated by its two-dimensio...

36 | A polyhedral approach to RNA sequence structure alignment
- Lenhof, Reinert, et al.
- 1998
Citation Context ...gn. Contact maps are also used extensively in the study of RNA structure. The three-dimensional structure of RNA is also the object of current intense study, and contact maps have been employed in it [2, 26, 33]. Calculating the contact map overlap distance of two RNA structures is another fundamental problem. It had been known that the three-dimensional structure of RNA is more dominated by its two-dimensio...

35 |
Finding common subsequence with arcs and pseudoknots
- Evans
- 1999
Citation Context ... knowledge, this is the first theoretical study of protein structure similarity. An independent formulation of the problem and NP-completeness proof for a different measure involving RNA can be found in [15]. Also, a more general measure was used by [1] to provide hardness results and algorithms for threading. B. Problem Formulation and NP-completeness. A contact map (n, E) is an undirected graph G = (V, E)...

27 | On the approximation of protein threading
- Akutsu, Miyano
- 1999
Citation Context ...f protein structure similarity. An independent formulation of the problem and NP-completeness proof for a different measure involving RNA can be found in [15]. Also, a more general measure was used by [1] to provide hardness results and algorithms for threading. B. Problem Formulation and NP-completeness. A contact map (n, E) is an undirected graph G = (V, E) such that the set of vertices V = {1, 2, ..., n...

16 |
Regularities in interaction patterns of globular proteins
- Godzik, Skolnick, et al.
- 1993

13 |
Simultaneous solution of the RNA folding, alignment and protosequence problems
- Sankoff
- 1985
Citation Context ...gn. Contact maps are also used extensively in the study of RNA structure. The three-dimensional structure of RNA is also the object of current intense study, and contact maps have been employed in it [2, 26, 33]. Calculating the contact map overlap distance of two RNA structures is another fundamental problem. It had been known that the three-dimensional structure of RNA is more dominated by its two-dimensio...

9 |
The alignment of protein structures in three dimensions
- Zuker, Somorjai
- 1989

8 |
Catching a common fold
- Blundell, Johnson
- 1993

8 |
On the comparison of conformations using linear and quadratic transformations. Acta Crystallographica A32:1-10
- Diamond
- 1976
Citation Context ... the Protein Science literature are these: The root-mean-square distance (RMSD) of the two proteins; the two three-dimensional structures are superposed in such a way that their L2 metric is minimized [28, 9, 11, 12, 13, 16, 25, 27, 30, 29]. A related measure is the difference of the distance matrices [21, 32, 22]. In this paper we examine an emerging important distance measure called contact map overlap. To compute this distance between t...

7 |
A toolkit for computational molecular biology. II. On the optimal superposition of two sets of molecules
- Lesk
- 1986

7 |
Size-independent comparison of protein three-dimensional structures
- Maiorov, Gordon
- 1995

4 |
Superimposing Several Sets of Atomic Coordinates
- Gerber, Müller
- 1987

4 |
The page number of genus g graphs is o(g
- Heath, Istrail
- 1992
Citation Context ...act maps are the contact maps of two-dimensional self-avoiding walks. We identify two important special cases of contact map graphs, the queue and the stack (previously studied in the context of VLSI [6, 20]), as well as the staircase (a special case of the queue) and the augmented staircase (a staircase with a stack embedded in it in a restricted way). We develop polynomial-time dynamic programming algo...

3 |
A note on the rotational superposition problem. Acta Crystallogr. sect
- Diamond
- 1992
Citation Context ... the Protein Science literature are these: The root-mean-square distance (RMSD) of the two proteins; the two three-dimensional structures are superposed in such a way that their L2 metric is minimized [28, 9, 11, 12, 13, 16, 25, 27, 30, 29]. A related measure is the difference of the distance matrices [21, 32, 22]. In this paper we examine an emerging important distance measure called contact map overlap. To compute this distance between t...

1 |
On the prediction of protein structure: the significance of the root-mean-square deviation
- Cohen, Sternberg
- 1980

1 |
A topology fingerprint approach to inverse protein folding problem
- Godzik, Skolnick, et al.
- 1992

1 |
Significance of root-mean-square deviation in comparing three-dimensional structures of globular proteins
- Maiorov, Crippen
- 1994

1 |
A mathematical procedure for superimposing atomic coordinates of proteins
- McLachlan
- 1972

1 |
Estimation of effective interresidue contact energies from protein crystal structures: quasi-chemical approximation
- Miyazawa, Jernigan
- 1985
Citation Context ...ignment, protein structure alignment, and threading [34, 8, 7]. They are also used extensively for the calculation of statistical potentials, a most popular example being the Miyazawa-Jernigan matrix [31], a 20 × 20 matrix, whose entries reflect the frequency of contact between pairs of amino acids in a protein database. These potentials are in turn used for simulating protein folding, judging the qualit...

1 |
A holistic approach to protein structure alignment
- Taylor, Orengo
- 1989