Shape Distributions
- ACM Transactions on Graphics
, 2002
Cited by 295 (2 self)
In this paper, we propose and analyze a method for computing shape signatures for arbitrary (possibly degenerate) 3D polygonal models. The key idea is to represent the signature of an object as a shape distribution sampled from a shape function measuring global geometric properties of an object. The primary motivation for this approach is to reduce the shape matching problem to the comparison of probability distributions, which is simpler than traditional shape matching methods that require pose registration, feature correspondence, or model fitting.
Matching 3D Models with Shape Distributions
, 2001
Cited by 215 (7 self)
Measuring the similarity between 3D shapes is a fundamental problem, with applications in computer vision, molecular biology, computer graphics, and a variety of other fields. A challenging aspect of this problem is to find a suitable shape signature that can be constructed and compared quickly, while still discriminating between similar and dissimilar shapes. In this paper, we propose and analyze a method for computing shape signatures for arbitrary (possibly degenerate) 3D polygonal models. The key idea is to represent the signature of an object as a shape distribution sampled from a shape function measuring global geometric properties of an object. The primary motivation for this approach is to reduce the shape matching problem to the comparison of probability distributions, which is simpler than traditional shape matching methods that require pose registration, feature correspondence, or model fitting. We find that the dissimilarities between sampled distributions of simple shape functions (e.g., the distance between two random points on a surface) provide a robust method for discriminating between classes of objects (e.g., cars versus airplanes) in a moderately sized database, despite the presence of arbitrary translations, rotations, scales, mirrors, tessellations, simplifications, and model degeneracies. They can be evaluated quickly, and thus the proposed method could be applied as a pre-classifier in an object recognition system or in an interactive content-based retrieval application.
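The shape-distribution idea in this abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it samples the shape function named in the abstract (distance between two random surface points), bins the samples into a histogram, and compares two histograms. The function names, the sample count, the bin count, the normalization by mean distance, and the L1 comparison are all assumptions made for this sketch.

```python
import numpy as np

def sample_surface_points(vertices, triangles, n, rng):
    """Draw n points uniformly from the surface of a triangle mesh."""
    v = np.asarray(vertices, dtype=float)
    t = np.asarray(triangles, dtype=int)
    a, b, c = v[t[:, 0]], v[t[:, 1]], v[t[:, 2]]
    # Pick triangles with probability proportional to their area.
    areas = 0.5 * np.linalg.norm(np.cross(b - a, c - a), axis=1)
    idx = rng.choice(len(t), size=n, p=areas / areas.sum())
    # Uniform point inside each chosen triangle (square-root trick).
    r1 = np.sqrt(rng.random(n))[:, None]
    r2 = rng.random(n)[:, None]
    return (1 - r1) * a[idx] + r1 * (1 - r2) * b[idx] + r1 * r2 * c[idx]

def d2_distribution(vertices, triangles, n_pairs=20000, bins=64, seed=0):
    """Histogram of distances between random pairs of surface points."""
    rng = np.random.default_rng(seed)
    p = sample_surface_points(vertices, triangles, n_pairs, rng)
    q = sample_surface_points(vertices, triangles, n_pairs, rng)
    d = np.linalg.norm(p - q, axis=1)
    d /= d.mean()  # assumption: divide by the mean distance to remove scale
    hist, _ = np.histogram(d, bins=bins, range=(0.0, 3.0))
    return hist / hist.sum()

def l1_dissimilarity(h1, h2):
    """Assumed comparison: L1 distance between two normalized histograms."""
    return float(np.abs(h1 - h2).sum())

# Usage: a tetrahedron and a scaled, translated copy of it.
verts = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
tris = np.array([[0, 1, 2], [0, 1, 3], [0, 2, 3], [1, 2, 3]])
h_a = d2_distribution(verts, tris)
h_b = d2_distribution(verts * 2.0 + 5.0, tris)
print(l1_dissimilarity(h_a, h_b))
```

Because pairwise distances do not change under rotation or translation, and the mean-distance normalization removes scale, the two histograms in the usage example should be nearly identical. No registration or correspondence step is needed, which is the point the abstract makes.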
3D shape histograms for similarity search and classification in spatial databases
- SSD'99
, 1999
Cited by 179 (10 self)
Classification is one of the basic tasks of data mining in modern database applications including molecular biology, astronomy, mechanical engineering, medical imaging or meteorology. The underlying models have to consider spatial properties such as shape or extension as well as thematic attributes. We introduce 3D shape histograms as an intuitive and powerful similarity model for 3D objects. Particular flexibility is provided by using quadratic form distance functions in order to account for errors of measurement, sampling, and numerical rounding that all may result in small displacements and rotations of shapes. For query processing, a general filter-refinement architecture is employed that efficiently supports similarity search based on quadratic forms. An experimental evaluation in the context of molecular biology demonstrates both the high classification accuracy of more than 90% and the good performance of the approach.
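A quadratic form distance between two histograms x and y has the general shape sqrt((x - y)^T A (x - y)), where A encodes how similar the bins are to one another. The sketch below illustrates why this tolerates small displacements. The choice A[i, j] = exp(-sigma * d(i, j)) over distances between bin centers, the parameter sigma, and the function names are assumptions for illustration; the abstract does not specify how the matrix is built.

```python
import numpy as np

def similarity_matrix(bin_centers, sigma=1.0):
    """Assumed bin-similarity matrix: A[i, j] = exp(-sigma * dist(i, j))."""
    c = np.asarray(bin_centers, dtype=float)
    if c.ndim == 1:
        c = c[:, None]  # treat scalars as 1D coordinates
    dist = np.linalg.norm(c[:, None, :] - c[None, :, :], axis=2)
    return np.exp(-sigma * dist)

def quadratic_form_distance(x, y, A):
    """sqrt((x - y)^T A (x - y)); with A = identity this is Euclidean."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    # Guard against tiny negative values caused by rounding.
    return float(np.sqrt(max(diff @ A @ diff, 0.0)))

# Usage: mass moved to a neighbouring bin vs. mass moved to a distant bin.
A = similarity_matrix(np.arange(5))
x = np.array([1.0, 0.0, 0.0, 0.0, 0.0])
near = np.array([0.0, 1.0, 0.0, 0.0, 0.0])
far = np.array([0.0, 0.0, 0.0, 0.0, 1.0])
print(quadratic_form_distance(x, near, A), quadratic_form_distance(x, far, A))
```

With the identity matrix, both shifts are equally far from x. With the similarity matrix, the shift into the adjacent bin is penalized less than the shift into the distant bin, so small perturbations of a shape produce small distances.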
The relationship between protein structure and function: a comprehensive survey with application to the yeast genome
- J Mol Biol
Cited by 158 (26 self)
(Version ff225rev sent to the Journal of Molecular Biology) For most proteins in the genome databases, function is predicted via sequence comparison. In spite of the popularity of this approach, the extent to which it can be reliably applied is unknown. We address this issue by systematically investigating the relationship between protein function and structure. We focus initially on enzymes classified by the Enzyme Commission (EC) and relate these to structurally classified proteins in the SCOP database. We find that the major SCOP fold classes have different propensities to carry out certain broad categories of functions. For instance, alpha/beta folds are disproportionately associated with enzymes, especially transferases and hydrolases, and all-alpha and small folds with non-enzymes, while alpha+beta folds have an equal tendency either way. These observations for the database overall are largely true for specific genomes. We focus, in particular, on yeast, analyzing it with many classifications in addition to SCOP and EC (i.e. COGs, CATH, MIPS), and find clear tendencies for fold-function association, across a broad spectrum of functions. Analysis with the COGs scheme also suggests that the functions of the most ancient proteins are more evenly distributed among different structural classes
Predicting the functional consequences of non-synonymous single nucleotide polymorphisms: structure-based assessment of amino acid variation.
- J. Mol. Biol.
, 2001
Cited by 126 (1 self)
We have developed a formalism and a computational method for analyzing the potential functional consequences of non-synonymous single nucleotide polymorphisms. Our approach uses a structural model and phylogenetic information to derive a selection of structure and sequence-based features serving as indicators of an amino acid polymorphism's effect on function. The feature values can be integrated into a probabilistic assessment of whether an amino acid polymorphism will affect the function or stability of a target protein. The method has been validated with data sets of unbiased mutations in the lac repressor and lysozyme. Applying our methodology to recent surveys of genetic variation in the coding regions of clinically important genes, we estimate that approximately 26-32% of the natural non-synonymous single nucleotide polymorphisms have effects on function. This estimate suggests that a typical person will have about 6240-12,800 heterozygous loci that encode proteins with functional variation due to natural amino acid polymorphism.
COMPASS: a tool for comparison of multiple protein alignments with assessment of statistical significance
- J. Mol. Biol
, 2003
Cited by 125 (33 self)
We present a novel method for the comparison of multiple protein alignments with assessment of statistical significance (COMPASS). The method derives numerical profiles from alignments, constructs optimal local profile–profile alignments and analytically estimates E-values for the detected similarities. The scoring system and E-value calculation are based on a generalization of the PSI-BLAST approach to profile–sequence comparison, which is adapted for the profile–profile case. Tested along with existing methods for profile–sequence (PSI-BLAST) and profile–profile (prof_sim) comparison, COMPASS shows increased abilities for sensitive and selective detection of remote sequence similarities, as well as improved quality of local alignments. The method allows prediction of relationships between protein families in the PFAM database beyond the range of conventional methods. Two predicted relations with high significance are similarities between various Rossmann-type folds and between various helix-turn-helix-containing families. The potential value of COMPASS for structure/function predictions is illustrated by the detection of an intricate homology between the DNA-binding domain of the CTF/NFI family and the MH1 domain of the Smad family.
Phylogenomic inference of protein molecular function: advances and challenges
- Bioinformatics
, 2004
Cited by 76 (3 self)
Motivation: Protein families evolve a multiplicity of functions through gene duplication, speciation and other processes. As a number of studies have shown, standard methods of protein function prediction produce systematic errors on these data. Phylogenomic analysis—combining phylogenetic tree construction, integration of experimental data and differentiation of orthologs and paralogs—has been proposed to address these errors and improve the accuracy of functional classification. The explicit integration of structure prediction and analysis in this framework, which we call structural phylogenomics, provides additional insights into protein superfamily evolution. Results: Results of protein functional classification using phylogenomic analysis show fewer expected false positives overall than when pairwise methods of functional classification are employed. We present an overview of the motivations and fundamental principles of phylogenomic analysis, new methods developed for the key tasks, benchmark datasets for these tasks (when available) and suggest procedures to increase accuracy. We also discuss some of the methods used in the Celera Genomics high-throughput phylogenomic classification of the human genome. Availability: Software tools from the Berkeley Phylogenomics Group are available at
Large-Scale Comparison of Protein Sequence Alignment Algorithms With Structure Alignments
- Proteins
, 2000
Cited by 64 (2 self)
Sequence alignment programs such as BLAST and PSI-BLAST are used routinely in pairwise, profile-based, or intermediate-sequence-search (ISS) methods to detect remote homologies for the purposes of fold assignment and comparative modeling. Yet, the sequence alignment quality of these methods at low sequence identity is not known. We have used the CE structure alignment program (Shindyalov and Bourne, Prot Eng 1998;11: 739) to derive sequence alignments for all superfamily and family-level related proteins in the SCOP domain database. CE aligns structures and their sequences based on distances within each protein, rather than on interprotein distances. We compared BLAST, PSI-BLAST, CLUSTALW, and ISS alignments with the CE structural alignments. We found that global alignments with CLUSTALW were very poor at low sequence identity (<25%), as judged by the CE alignments. We used PSI-BLAST to search the nonredundant sequence database (nr) with every sequence in SCOP using up to four iterations. The resulting matrix was used to search a database of SCOP sequences. PSI-BLAST is only slightly better than BLAST in alignment accuracy on a per-residue basis, but PSI-BLAST matrix alignments are much longer than BLAST's, and so align correctly a larger fraction of the total number of aligned residues in the structure alignments. Any two SCOP sequences in the same superfamily that shared a hit or hits in the nr PSI-BLAST searches were identified as linked by the shared intermediate sequence. We examined the quality of the longest SCOP-query/SCOP-hit alignment via an intermediate sequence, and found that ISS produced longer alignments than PSI-BLAST searches alone, of nearly comparable per-residue quality. At 10-15% sequence identity, BLAST correctly aligns 28%, PSI-BLAST 40%, and ISS ...
Nearest Neighbor Classification in 3D Protein Databases
- In Proc. ISMB
, 1999
Cited by 49 (1 self)
In molecular databases, structural classification is a basic task that can be successfully approached by nearest neighbor methods. The underlying similarity models consider spatial properties such as shape and extension as well as thematic attributes. We introduce 3D shape histograms as an intuitive and powerful approach to model similarity for solid objects such as molecules. Errors of measurement, sampling, and numerical rounding may result in small displacements of atomic coordinates. These effects may be handled by using quadratic form distance functions. An efficient processing of similarity queries based on quadratic forms is supported by a filter-refinement architecture. Experiments on our 3D protein database demonstrate the high classification accuracy of more than 90% and the good performance of the technique. Keywords: 3D Protein Databases, Nearest Neighbor Classification, Geometric Similarity Search, Machine Learning
PROMALS3D: a tool for multiple protein sequence and structure alignments
- Nucleic Acids Res
, 2008