Inferring Functional Relationships of Proteins from Local Sequence and Spatial Surface Patterns
- J. Mol. Biol., 2003
Cited by 74 (15 self)
es, and for further inquiries on evolutionary origins of structural elements important for protein function. © 2003 Elsevier Ltd. All rights reserved.
Keywords: protein surface; surface pattern; protein function; pocket sequence; pocket shape
Introduction. With rapid progress in the determination of protein structures [1,2], protein structural analysis has become an important source of information for understanding the functional roles of proteins. Conservation of protein structures often reveals very distant evolutionary relationships, which are otherwise difficult to detect by sequence analysis alone. Analysis of protein structure can provide insight into the biochemical functions and mechanisms of proteins (e.g. active sites, catalytic residues, and substrate interactions) [9-11]. An important approach to studying protein structures is fold analysis. Identifying the correct tertiary fold of a protein is often helpful for inferring protein funct
Finding the consensus shape of a protein family
- Proc. 18th Annual ACM Symposium on Computational Geometry
Cited by 21 (1 self)
Abstract. We define and prove properties of the consensus shape for a family of proteins, a protein-like structure that provides a compact summary of the significant structural information for a protein family. If all members of a protein family exhibit a geometric relationship between corresponding alpha carbons, then that relationship is preserved in the consensus shape. In particular, distances and angles that are consistent across family members are preserved. For the consensus shape, the spacing between successive alpha carbons is variable, with small distances in regions where the members of the protein family exhibit significant variation and large distances (up to the standard spacing of about 4 Å) in regions where the family members agree. Despite this non-protein-like characteristic, the consensus shape preserves and highlights important structural information. We describe an iterative algorithm for computing the consensus shape and prove that the algorithm converges. We also present the results of experiments in which we build consensus shapes for several known protein families.
FSSA: a novel method for identifying functional signatures from structural alignments
- Bioinformatics, 2005
"... doi:10.1093/bioinformatics/bti471 ..."
The URMS-RMS hybrid algorithm for fast and sensitive local protein structure alignment
- Journal of Computational Biology, 2005
"... structure alignment ..."
SARA: a server for function annotation of RNA structures
- Nucleic Acids Res., 2009
Cited by 12 (1 self)
Recent interest in non-coding RNA transcripts has resulted in a rapid increase of deposited RNA structures in the Protein Data Bank. However, the characterization and functional classification of the RNA structure and function space have only been partially addressed. Here, we introduce the SARA program for pair-wise alignment of RNA structures as a web server for structure-based RNA function assignment. The SARA server relies on the SARA program, which aligns two RNA structures based on a unit-vector root-mean-square approach. The likely accuracy of the SARA alignments is assessed by three different P-values estimating the statistical significance of the sequence, secondary structure and tertiary structure identity scores, respectively. Our benchmarks, which relied on a set of 419 RNA structures with known SCOR structural class, indicate that at a negative logarithm of mean P-value greater than or equal to 2.5, SARA can assign the correct or a similar SCOR class to 81.4% and 95.3% of the benchmark set, respectively. The SARA server is freely accessible via the World Wide Web at
Geometric Suffix Tree: A New Index Structure for Protein 3-D Structures
- Combinatorial Pattern Matching 2006 (CPM 2006), LNCS 4009, 2006
Cited by 11 (3 self)
Abstract. Protein structure analysis is one of the most important research issues in the post-genomic era, and faster and more accurate query data structures for such 3-D structures are highly desired for research on proteins. This paper proposes a new data structure for indexing protein 3-D structures. For strings, there are many efficient indexing structures such as suffix trees, but it has been considered very difficult to design such sophisticated data structures for 3-D structures like proteins. Our index structure is based on suffix trees and is called the geometric suffix tree. By using the geometric suffix tree for a set of protein structures, we can search for all of their substructures whose RMSDs (root mean square deviations) or URMSDs (unit-vector root mean square deviations) to a given query 3-D structure are not larger than a given bound. Though there are O(N^2) substructures, our data structure requires only O(N) space, where N is the sum of the lengths of the set of proteins. We propose an O(N^2) construction algorithm for it, while a naive algorithm would require O(N^3) time to construct it. Moreover, we propose an efficient search algorithm. We also show computational experiments to demonstrate the practicality of our data structure. The experiments show that the construction time of the geometric suffix tree is, in practice, almost linear in the size of the database when applied to a protein structure database.
Approximation of protein structure for fast similarity measures
- in Proceedings of the Seventh ACM Annual International Conference on Computational Biology (RECOMB), 2003
Cited by 9 (0 self)
It is shown that structural similarity between proteins can be decided well with much less information than what is used in common similarity measures. The full Cα representation contains redundant information because of the inherent chain topology of proteins and a limit on their compactness due to excluded volume. A wavelet analysis on random chains and proteins justifies approximating subchains by their centers of mass. For chain-like structures that are not too compact in general, and for proteins in particular, similarity measures that use this approximation are highly correlated with the exact similarity measures and are therefore useful, e.g., as fast filters. Experimental results with such simplified similarity measures in two applications, nearest neighbor search and automatic structural classification, show a significant speedup.
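The idea of approximating subchains by their centers of mass can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `coarse_grain`, the fixed block length `k`, and the equal weighting of all points (as if every Cα atom had the same mass) are assumptions made only for illustration.

```python
def coarse_grain(chain, k):
    """Replace each run of k consecutive 3-D points by its centroid.

    chain: list of (x, y, z) tuples, e.g. C-alpha coordinates.
    k:     hypothetical block length; the last block may be shorter.
    Equal weights are assumed, so the centroid stands in for the
    center of mass.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    reduced = []
    for start in range(0, len(chain), k):
        block = chain[start:start + k]
        n = len(block)
        # Average each coordinate axis over the block.
        reduced.append(tuple(sum(pt[axis] for pt in block) / n
                             for axis in range(3)))
    return reduced


if __name__ == "__main__":
    toy_chain = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0),
                 (4.0, 0.0, 0.0), (6.0, 0.0, 0.0)]
    print(coarse_grain(toy_chain, 2))
```

A similarity measure applied to the reduced chains touches roughly 1/k as many points, which is where a filter of this kind would save time.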
Continuous Optimization Methods for Structural Alignment
- Mathematical Programming, 2007
Cited by 7 (3 self)
Structural Alignment is an important tool for fold identification of proteins, structural screening on ligand databases, pharmacophore identification and other applications. In the general case, the optimization problem of superimposing two structures is nonsmooth and nonconvex, so that most popular methods are heuristic and do not employ derivative information. Usually, these methods do not admit convergence theories of practical significance. In this work it is shown that the optimization of the superposition of two structures may be addressed using continuous smooth minimization. It is proved that, using a Low Order-Value Optimization approach, the nonsmoothness may be essentially ignored and classical optimization algorithms may be used. Within this context, a Gauss-Newton method is introduced for structural alignments incorporating (or not) transformations (as flexibility) on the structures. Convergence theorems are provided and practical aspects of implementation are described. Numerical experiments suggest that the Gauss-Newton methodology is competitive with state-of-the-art algorithms for protein alignment both in terms of quality and speed. Additional experiments on binding site identification, ligand and cofactor alignments illustrate the generality of this approach.
Efficient Substructure RMSD Query Algorithms
Cited by 4 (4 self)
protein 3-D structure comparison
Protein structure analysis is a very important research topic in the molecular biology of the postgenomic era. The RMSD (root mean square deviation) is the most frequently used measure for comparing two protein 3-D structures. In this paper, we deal with two fundamental problems related to the RMSD. We first deal with a problem called the 'range RMSD query' problem. Given an aligned pair of structures, the problem is to compute the RMSD between two aligned substructures of them without gaps. This problem has many applications in protein structure analysis. We propose a linear-time preprocessing algorithm that enables constant-time RMSD computation. Next, we consider a problem called the 'substructure RMSD query' problem, which is a generalization of the range RMSD query problem. The problem is to compute the RMSD between any substructures of two unaligned structures without gaps. Based on the algorithm for the range RMSD problem, we propose an O(nm) preprocessing algorithm that enables constant-time RMSD computation, where n and m are the lengths of the given structures. Moreover, we propose an O(nm log r/r)-time and O(nm/r)-space preprocessing algorithm that enables O(r) queries, where r is an arbitrary integer such that 1 ≤ r ≤ min(n, m). We also show that our strategy works for another measure called the URMSD (unit-vector root mean square deviation), which is a variant of the RMSD.
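The combination of linear-time preprocessing and constant-time queries can be illustrated with prefix sums. The sketch below is a deliberately simplified variant and not the algorithm of the paper: it minimizes the RMSD over translations only and omits the optimal rotation. The function names `build_prefix` and `range_rmsd` are hypothetical. With d_i = p_i - q_i, the translation-only RMSD of a range of m residues satisfies RMSD^2 = (1/m) Σ|d_i|^2 - |(1/m) Σ d_i|^2, and both sums are differences of prefix sums.

```python
import math


def build_prefix(p, q):
    """Linear-time preprocessing for an aligned pair of structures.

    p, q: equal-length lists of (x, y, z) tuples.
    Returns prefix sums of d_i = p_i - q_i and of |d_i|^2.
    """
    if len(p) != len(q):
        raise ValueError("structures must have equal length")
    sum_d = [(0.0, 0.0, 0.0)]
    sum_sq = [0.0]
    for (px, py, pz), (qx, qy, qz) in zip(p, q):
        dx, dy, dz = px - qx, py - qy, pz - qz
        sx, sy, sz = sum_d[-1]
        sum_d.append((sx + dx, sy + dy, sz + dz))
        sum_sq.append(sum_sq[-1] + dx * dx + dy * dy + dz * dz)
    return sum_d, sum_sq


def range_rmsd(prefix, i, j):
    """Translation-only RMSD of residues i..j-1 (half-open), in O(1).

    Assumption: rotation is NOT optimized here, unlike the full RMSD.
    """
    sum_d, sum_sq = prefix
    m = j - i
    if m <= 0:
        raise ValueError("empty range")
    mean_sq = (sum_sq[j] - sum_sq[i]) / m
    mean_d = [(sum_d[j][a] - sum_d[i][a]) / m for a in range(3)]
    variance = mean_sq - sum(c * c for c in mean_d)
    # Guard against tiny negative values caused by rounding.
    return math.sqrt(max(variance, 0.0))


if __name__ == "__main__":
    p = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (5.0, 1.0, 0.0)]
    q = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
    prefix = build_prefix(p, q)
    print(range_rmsd(prefix, 0, 2))
```

Handling rotation as well would plausibly require accumulating further per-residue quantities in the same prefix-sum fashion, followed by a constant-size matrix computation per query; that step is left out here.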
Classification, Clustering and Data-Mining of Biological Data
, 2009
The proliferation of biological databases and the easy access enabled by the Internet is having a beneficial impact on biological sciences and transforming the way research is conducted. There are currently over 1100 molecular biology databases dispersed throughout the Internet. However, very few of them integrate data from multiple sources. To assist in the functional and evolutionary analysis of the abundant number of novel proteins, we introduce the PROFESS (PROtein Function, Evolution, Structure and Sequence) database that integrates data from various biological sources. PROFESS is freely available at