Approximate Protein Structural Alignment in Polynomial Time
 Proc. Natl Acad. Sci. USA
, 2004
Cited by 50 (1 self)
Alignment of protein structures is a fundamental task in computational molecular biology. Good structural alignments can help detect distant evolutionary relationships that are hard or impossible to discern from protein sequences alone. Here, we study the structural alignment problem as a family of optimization problems and develop an approximate polynomial time algorithm to solve them. For a commonly used scoring function, the algorithm runs in O(n ) time, for globular proteins of length n, when we wish to detect all scores that are at most ε distance away from the optimum. We argue that such approximate solutions are, in fact, of greater interest than exact ones, due to the noisy nature of experimentally determined protein coordinates. The measurement of similarity between a pair of protein structures used by the algorithm involves the Euclidean distance between the structures, after rigidly transforming them. We show that an alternative approach, which relies on internal distance matrices, must incorporate sophisticated geometric ingredients in order to both guarantee optimality and run in polynomial time. We use these observations to visualize the scoring function for several real instances of the problem. Our investigations yield new insights on the computational complexity of protein alignment under various scoring functions. These insights can be used in the design of new scoring functions for which the optimum can be approximated efficiently, and perhaps in the development of efficient algorithms for the multiple structural alignment problem.
Curve matching, time warping, and light fields: New algorithms for computing similarity between curves
 J. Mathematical Imaging and Vision
Cited by 21 (0 self)
The problem of curve matching appears in many application domains, like time series analysis, shape matching, speech recognition, and signature verification, among others. Curve matching has been studied extensively by computational geometers, and many measures of similarity have been examined, among them the Fréchet distance (sometimes referred to in folklore as the “dog-man” distance). A measure that is very closely related to the Fréchet distance but has never been studied in a geometric context is the Dynamic Time Warping measure (DTW), first used in the context of speech recognition. This measure is ubiquitous across different domains, a surprising fact because notions of similarity usually vary significantly depending on the application. However, this measure suffers from some drawbacks, most importantly the fact that it is defined between sequences of points rather than curves. Thus, the way in which a curve is sampled to yield such a sequence can dramatically affect the quality of the result. Some attempts have been made to generalize the DTW to continuous domains, but the resulting algorithms have exponential complexity. In this paper we propose similarity measures that attempt to capture the “spirit” of dynamic time warping while being defined over continuous domains, and present efficient algorithms for computing them. Our formulation leads to a very interesting connection with finding short paths in a combinatorial manifold defined on the input chains, and in a deeper sense relates to the way light travels in a medium of variable refractivity.
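For reference, the discrete measure discussed in this abstract can be sketched as follows. This is the standard textbook DTW recurrence over two point sequences, not the continuous measures the paper proposes; the function name `dtw` and the sample curves are illustrative assumptions. The example also hints at the sampling sensitivity the abstract mentions: resampling one curve more densely changes the score even though the underlying geometry is the same.

```python
import math

def dtw(a, b):
    """Classic discrete DTW cost between two sequences of points.

    cost[i][j] is the cheapest warping of a[:i] onto b[:j]; each step
    extends the warping by a match, or by repeating a point of a or b.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean distance (Python 3.8+)
            cost[i][j] = d + min(cost[i - 1][j],      # repeat b[j-1]
                                 cost[i][j - 1],      # repeat a[i-1]
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]

# Hypothetical data: a horizontal segment, and a parallel segment one unit above,
# sampled once coarsely and once densely.
curve_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
coarse_b = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
dense_b = [(0.0, 1.0), (0.5, 1.0), (1.0, 1.0), (1.5, 1.0), (2.0, 1.0)]

coarse_score = dtw(curve_a, coarse_b)  # 3.0: three matched pairs at distance 1
dense_score = dtw(curve_a, dense_b)    # larger: five pairs must be matched
```

The two scores differ only because of how the second segment was sampled, which is the drawback that motivates measures defined directly on curves.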
Self Generating Metaheuristics in Bioinformatics: The Proteins Structure Comparison Case
 Genetic Programming and Evolvable Machines
, 2004
Cited by 16 (5 self)
In this paper we describe the application of a so-called "Self-Generating" Memetic Algorithm to the Maximum Contact Map Overlap problem (MAXCMO). The maximum overlap of contact maps is emerging as a leading modeling technique to obtain structural alignment among pairs of protein structures. Identifying structural alignments (and hence similarity among proteins) is essential to the correct assessment of the relation between protein structure and function. A robust methodology for structural comparison could have an impact on the process of rational drug design.
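To make the objective concrete, the sketch below scores one fixed alignment by counting the contacts it preserves. It is only an illustration of the overlap count: the names `contact_map` and `overlap`, the distance threshold, and the toy data are assumptions, and the search for the alignment that maximizes this count (the part the memetic algorithm addresses) is not shown.

```python
import math
from itertools import combinations

def contact_map(coords, threshold):
    """Set of residue index pairs (i, j), i < j, lying within `threshold`.

    Assumption: one 3D point per residue; no pairs are excluded for being
    sequence neighbours, which real pipelines often do.
    """
    return {
        (i, j)
        for i, j in combinations(range(len(coords)), 2)
        if math.dist(coords[i], coords[j]) <= threshold
    }

def overlap(contacts_a, contacts_b, alignment):
    """Count contacts of A that land on contacts of B under `alignment`.

    `alignment` maps residue indices of A to residue indices of B and is
    assumed order-preserving, so a mapped pair keeps i < j.
    """
    shared = 0
    for i, k in contacts_a:
        if i in alignment and k in alignment:
            if (alignment[i], alignment[k]) in contacts_b:
                shared += 1
    return shared

# Hypothetical toy contact maps and alignment.
contacts_a = {(0, 2), (1, 3), (2, 4)}
contacts_b = {(0, 2), (1, 4), (3, 5)}
alignment = {0: 0, 1: 1, 2: 2, 3: 4, 4: 5}

score = overlap(contacts_a, contacts_b, alignment)  # 2 of the 3 contacts of A are preserved

# Building a contact map from coordinates (threshold chosen arbitrarily here).
toy_coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
toy_contacts = contact_map(toy_coords, 1.5)  # only residues 0 and 1 are close enough
```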
Bayesian protein structure alignment
, 2006
Cited by 3 (0 self)
The analysis of the three-dimensional structure of proteins is an important topic in molecular biochemistry. Structure plays a critical role in defining the function of proteins, and is more strongly conserved than amino acid sequence over evolutionary timescales. A key challenge is the identification and evaluation of structural similarity between proteins; such analysis can aid in understanding the role of newly discovered proteins, and help elucidate evolutionary relationships between organisms. Computational biologists have developed many clever algorithmic techniques for comparing protein structures; however, all are based on heuristic optimization criteria, making statistical interpretation somewhat difficult. Here we present a fully probabilistic framework for pairwise structural alignment of proteins. Our approach has several advantages, including the ability to capture alignment uncertainty, and to estimate key 'gap' parameters which critically affect the quality of the alignment. We show that several existing alignment methods arise as maximum a posteriori estimates under specific choices of prior distributions and error models. Our probabilistic framework is also easily extended to incorporate additional information; we demonstrate this by inclusion of primary sequence information to generate simultaneous sequence-structure alignments that can resolve ambiguities obtained using structure alone. This combined model also provides a natural approach for the difficult task of estimating evolutionary distance based on structural alignments. The model is illustrated by comparison with well-established methods on several challenging protein alignment examples.
A new descriptor for 3d trajectory recognition
 The Ninth International Symposium on Operations Research and Its Applications
, 2010
Cited by 2 (0 self)
Motion trajectories contain plentiful motion information, which is useful for motion analysis in many tasks. Motion recognition via trajectory is important in motion analysis for many human and robotic tasks. An effective descriptor for motion trajectories plays an important role in the recognition algorithm. In this paper, we propose a new descriptor with a modified data alignment method for motion trajectory recognition. Experimental results demonstrate the effectiveness of our method.
A Relational Extension to the Notion of Motifs: An application to the Common 3D Protein Substructures Searching Problem
, 2009
Cited by 1 (0 self)
The geometric configurations of atoms in protein structures can be viewed as approximate relations among them. Then, finding similar common substructures within a set of protein structures belongs to a new class of problems that generalizes that of finding repeated motifs. The novelty lies in the addition of constraints on the motifs in terms of relations that must hold between pairs of positions of the motifs. We will hence denote them as relational motifs. For this class of problems we present an algorithm that is a suitable extension of the KMR (Karp et al., 1972) paradigm and, in particular, of the KMRC (Soldano et al., 1995), as it uses a degenerate alphabet. Our algorithm contains several improvements with respect to (Soldano et al., 1995) that become especially useful when, as is required for relational motifs, the inference is made by partially overlapping shorter motifs, rather than concatenating them as in (Karp et al., 1972). The efficiency, correctness, and completeness of the algorithm are ensured by several nontrivial properties that are proven in this paper. The algorithm has been applied in the important field of protein common 3D substructure searching. The methods implemented have been tested on several examples of protein families such as serine proteases, globins, and cytochromes P450. The detected motifs have been compared to those found by multiple structural alignment methods.
A Comparison of Computational Methods for the Maximum Contact Map Overlap of Protein Pairs
Cited by 1 (0 self)
... this paper to discuss the mathematical properties of MAXCMO in detail, as this has been dealt with elsewhere [13], [23], [1]. In this paper we compare three algorithms that can be used to obtain maximum contact map overlaps between protein structures. We will point to the weaknesses and strengths of each one. It is our hope that this paper will encourage researchers to develop new and improved methods for protein comparison based on MAXCMO.
Bayesian alignment using hierarchical models, with applications in protein bioinformatics
An important problem in shape analysis is to match configurations of points in space after filtering out some geometrical transformation. In this paper we introduce hierarchical models for such tasks, in which the points in the configurations are either unlabelled or have at most a partial labelling constraining the matching, and in which some points may only appear in one of the configurations. We derive procedures for simultaneous inference about the matching and the transformation, using a Bayesian approach. Our hierarchical model is based on a Poisson process for hidden true point locations; this leads to considerable mathematical simplification and efficiency of implementation of EM and Markov chain Monte Carlo algorithms. We find a novel use for classical distributions from directional statistics in a conditionally conjugate specification for the case where the geometrical transformation includes an unknown rotation. Throughout, we focus on the case of affine or rigid motion transformations. Under a broad parametric family of loss functions, an optimal Bayesian point estimate of the matching matrix can be constructed that depends only on a single parameter of the family. Our methods are illustrated by two applications from bioinformatics. The first problem is of matching protein gels in two dimensions, and the second consists of aligning active sites of proteins in three dimensions. In the latter case, we also use information related to the grouping of the amino acids, as an example of a more general capability of our methodology to include partial labelling information. We discuss some open problems and suggest directions for future work.
References
Methods developed in the statistical theory of shape provide a natural approach to modeling variability in macromolecular structure. In previous work, we have utilized generalized Procrustes analysis to solve the multiple structure superposition problem for families of protein structures (Wu et al. 1998a,b). In this talk I will describe recent directions in applying concepts from shape analysis to the modeling of protein structure families. Examples include model-based clustering of protein structures using affine-invariant models; methods for structure alignment and comparison allowing for backbone flexibility; identification of hinge regions in flexible proteins using Bayesian changepoint analyses; and a fully Bayesian approach to accounting for uncertainty in structure comparison arising from the alignment phase (“landmark registration”). Implications for problems of protein structure analysis including classification, database search, motif discovery, and active site prediction will be discussed. Further details can be found in Schmidler (2003).