De novo prediction of three-dimensional structures for major protein families
- J. Mol. Biol, 2002
Cited by 25 (10 self)
As the number of gene sequences in databases, public and private, increases dramatically, so does the number of genes of unknown function. Of the protein sequences currently available approximately
Sali A: Statistical potentials for fold assessment
- Protein Sci, 2002
Cited by 17 (3 self)
A protein structure model generally needs to be evaluated to assess whether or not it has the correct fold. To improve fold assessment, four types of a residue-level statistical potential were optimized, including distance-dependent, contact, φ/ψ dihedral angle, and accessible surface statistical potentials. Approximately 10,000 test models with the correct and incorrect folds were built by automated comparative modeling of protein sequences of known structure. The criterion used to discriminate between the correct and incorrect models was the Z-score of the model energy. The performance of a Z-score was determined as a function of many variables in the derivation and use of the corresponding statistical potential. The performance was measured by the fractions of the correctly and incorrectly assessed test models. The most discriminating combination of any one of the four tested potentials is the sum of the normalized distance-dependent and accessible surface potentials. The distance-dependent potential that is optimal for assessing models of all sizes uses both Cα and Cβ atoms as interaction centers, distinguishes between all 20 standard residue types, has the distance range of 30 Å, and is derived and used by taking into account the sequence separation of the interacting atom pairs. The terms for the sequentially local interactions are significantly less informative than those for the sequentially nonlocal interactions. The accessible surface potential that
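The Z-score criterion mentioned in this abstract can be illustrated with a minimal sketch. It assumes the reference distribution is a list of energies computed for some set of decoy or shuffled models; the abstract does not say how that distribution is built. The function names, the cutoff of -3.0, and the choice to combine the two potentials by adding their Z-scores are hypothetical and chosen only for illustration.

```python
from statistics import mean, pstdev


def z_score(model_energy, reference_energies):
    """Z-score of a model energy against a reference energy distribution.

    Assumption: reference_energies come from decoy or shuffled models;
    the abstract does not specify the reference set.
    """
    mu = mean(reference_energies)
    sigma = pstdev(reference_energies)  # population standard deviation
    if sigma == 0:
        raise ValueError("reference energies have zero spread")
    return (model_energy - mu) / sigma


def combined_score(z_distance, z_surface):
    """Sum of two normalized terms (hypothetical reading of 'normalized')."""
    return z_distance + z_surface


def looks_correct(z, cutoff=-3.0):
    """Flag a model as correctly folded when its Z-score is low enough.

    The cutoff is an illustrative value, not one taken from the paper.
    """
    return z <= cutoff


if __name__ == "__main__":
    reference = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
    z = z_score(-1.0, reference)
    print(z, looks_correct(z))
```

A lower (more negative) Z-score means the model energy lies further below the reference distribution, which is why the sketch treats values at or below the cutoff as the favorable case.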
Efficient prediction of nucleic acid binding function from low-resolution protein structures
- J. Mol. Biol, 2006
Cited by 14 (1 self)
Structural genomics projects aim to solve the experimental structures of all existing protein folds. The rationale behind these projects is that knowing a protein’s structure will help with identifying its
New developments in the InterPro database
- Nucl Acids Res, 2007
Cited by 13 (1 self)
InterPro is an integrated resource for protein families, domains and functional sites, which integrates the following protein signature databases: PROSITE,
Implications of structural genomics target selection strategies: Pfam5000, whole genome, and random approaches
- Proteins, 2005
Cited by 5 (0 self)
Structural genomics is an international effort to determine the three-dimensional shapes of all important biological macromolecules, with a primary focus on proteins. Target proteins should be selected according to a strategy that is medically and biologically relevant, of good value, and tractable. As an option to consider, we present the “Pfam5000” strategy, which involves selecting the 5000 most important families from the Pfam database as sources for targets. We compare the Pfam5000 strategy to several other proposed strategies that would require similar numbers of targets. These strategies include complete solution of several small to moderately sized bacterial proteomes, partial coverage of the human proteome, and random
Visualization and Integration of Protein-Protein Interactions
- 2002
Cited by 4 (0 self)
Contents: Introduction; Why Do We Need Visualization?; Protein Interaction Maps Versus Metabolic Pathways; Protein Networks, Protein Complexes, and Dynamic Protein Interactions; Protein-Protein Interactions and Associated Information; Visualization; Relational Visualization ...
Genomic Fold Assignment and Rational Modeling of Proteins of Biological Interest
Cited by 3 (0 self)
The first available genome of a multicellular organism, C. elegans, was used as a test case for protein fold assignment using PSI-BLAST, followed by rational structure modeling and interpretation of experimental mutagenesis data in the context of collaboration with biologists. Similar results are demonstrated for human disease proteins with known polymorphisms.
Introduction: The availability of entire genomic sequences in recent years has made it possible to compare the genomes of different organisms, as well as evaluate the distribution of known protein structures expressed by a particular organism (for examples, see Gerstein, 1998; Wolf et al., 1999). The number of sequenced genomes will most certainly expand rapidly, as will the number of sequenced human genomes, the first of which will be available this year. The expected redundancy of genomic data for a given species (esp. Homo sapiens) will also allow wide-scale classification of polymorphisms, or natural amino acid varian...
Protein complex compositions predicted by structural similarity. Nucleic Acids Res. 34: 2943–2952. doi: 10.1093/nar/gkl353
- Nucl. Acids Res, 2006
Cited by 3 (0 self)
Proteins function through interactions with other molecules. Thus, the network of physical interactions among proteins is of great interest to both experimental and computational biologists. Here we present structure-based predictions of 3387 binary and 1234 higher order protein complexes in Saccharomyces cerevisiae involving 924 and 195 proteins, respectively. To generate candidate complexes, comparative models of individual proteins were built and combined together using complexes of known structure as templates. These candidate complexes were then assessed using a statistical potential, derived from binary domain interfaces in PIBASE
The UCSC Known Genes
- Bioinformatics, 2006
Cited by 3 (0 self)
The University of California Santa Cruz (UCSC) Known Genes data set is constructed by a fully automated process, based on protein data from Swiss-Prot/TrEMBL (UniProt) and the associated mRNA data from GenBank. The detailed steps of this process are described. Extensive cross-references from this data set to other genomic and proteomic data were constructed. For each known gene, a details page is provided containing rich information about the gene, together with extensive links to other relevant genomic, proteomic, and pathway data. As of July 2005, the UCSC Known Genes are available for human, mouse, and rat genomes. The Known Genes serves as a foundation to support several key

