The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000 (2000)

by A Bairoch, R Apweiler
Venue: Nucl. Acids Res

Results 1 - 10 of 773

Improved prediction of signal peptides -- SignalP 3.0

by Jannick Dyrløv Bendtsen, Henrik Nielsen, Gunnar von Heijne, Søren Brunak - J. MOL. BIOL. , 2004
Cited by 654 (7 self)
We describe improvements of the currently most popular method for prediction of classically secreted proteins, SignalP. SignalP consists of two different predictors based on neural network and hidden Markov model algorithms, where both components have been updated. Motivated by the idea that the cleavage site position and the amino acid composition of the signal peptide are correlated, new features have been included as input to the neural network. This addition, combined with a thorough error-correction of a new data set, has improved the performance of the predictor significantly over SignalP version 2. In version 3, correctness of the cleavage site predictions has increased notably for all three organism groups: eukaryotes, Gram-negative and Gram-positive bacteria. The accuracy of cleavage site prediction has increased in the range from 6-17 % over the previous version, whereas the signal peptide discrimination improvement is mainly due to the elimination of false positive predictions, as well as the introduction of a new discrimination score for the neural network. The new method has also been benchmarked against other available methods. Predictions can be made at the publicly available web server

Investigating semantic similarity measures across the Gene Ontology: the relationship between sequence and annotation

by P. W. Lord, R. D. Stevens, A. Brass, C. A. Goble - Bioinformatics , 2003
Cited by 247 (5 self)

Citation Context

.... The terms held within this structure are used to annotate database entries (http://www.geneontology.org/goa). As they form a standard vocabulary across many biological resources such as SWISS-PROT (Bairoch and Apweiler, 2000), this shared understanding provides a valuable, computationally accessible form of the community’s knowledge about these attributes. Information about the evidence for this knowledge is also provide...

The InterPro database, 2003 brings increased coverage and new features

by Nicola J. Mulder, Rolf Apweiler, Teresa K. Attwood, Amos Bairoch, Daniel Barrell, Alex Bateman, David Binns, Margaret Biswas, Paul Bradley, Peer Bork, Phillip Bucher, Richard R. Copley, Emmanuel Courcelle, Ujjwal Das, Richard Durbin, Laurent Falquet, Wolfgang Fleischmann, Sam Griffiths-Jones, Daniel Haft, Nicola Harte, Nicolas Hulo, Daniel Kahn, Alexander Kanapin, Maria Krestyaninova, Rodrigo Lopez, Ivica Letunic, David Lonsdale, Ville Silventoinen, Sandra E. Orchard, Marco Pagni, David Peyruc, Chris P. Ponting, Jeremy D. Selengut, Florence Servant, Evgueni M. Zdobnov - Nucleic Acids Res , 2003
Cited by 239 (17 self)
InterPro, an integrated documentation resource of protein families, domains and functional sites, was created in 1999 as a means of amalgamating the major protein signature databases into one comprehensive resource. PROSITE, Pfam, PRINTS, ProDom, SMART and TIGRFAMs have been manually integrated and curated and are available in InterPro for text- and sequence-based searching. The results are provided in a single format that rationalises the results that would be obtained by searching the member databases individually. The latest release of

Citation Context

...ontains/found in relationship generally refers to the presence of genetically mobile domains. All hits of the protein signatures in InterPro against a composite of the SWISS-PROT and TrEMBL databases (8) (SPTR) are precomputed. The matches are...

The InterPro database, an integrated documentation resource for protein families, domains and functional sites

by R Apweiler, T K Attwood, A Bairoch - Nucleic Acids Research , 2001
Cited by 231 (18 self)

The PredictProtein server

by Burkhard Rost, Guy Yachdav, Jinfeng Liu , 2004
Cited by 228 (21 self)

Improving the Prediction of Protein Secondary Structure in Three and Eight Classes Using Recurrent Neural Networks and Profiles

by Gianluca Pollastri, Dariusz Przybylski, Burkhard Rost, Pierre Baldi , 2001
Cited by 216 (43 self)
Secondary structure predictions are increasingly becoming the workhorse for several methods aiming at predicting protein structure and function. Here we use ensembles of bidirectional recurrent neural network architectures, PSI-BLAST-derived profiles, and a large nonredundant training set to derive two new predictors: (a) the second version of the SSpro program for secondary structure classification into three categories and (b) the first version of the SSpro8 program for secondary structure classification into the eight classes produced by the DSSP program. We describe the results of three different test sets on which SSpro achieved a sustained performance of about 78% correct prediction. We report confusion matrices, compare PSI-BLAST to BLAST-derived profiles, and assess the corresponding performance improvements. SSpro and SSpro8 are implemented as web servers, available together with other structural feature predictors at: http://promoter.ics.uci.edu/BRNN-PRED/. Proteins 2002;47:228-235.

Citation Context

...yield better results than using profiles at the output level [6]. BLAST: Input profiles for SSpro 1.0 were constructed primarily by running the BLAST program [1] against the NR (non-redundant) database [3, 10], with standard default parameters (E=10.0, BLOSUM62 matrix). The version used was available online in October 1999 and contained approximately 420,000 protein sequences. For redundancy reduction, ins...

Prediction of the coding sequences of unidentified human genes. VI. The coding sequences of 80 new genes (KIAA0201-KIAA0280) deduced by analysis of cDNA clones from cell line KG-1 and brain

by Takahiro Nagase, Ken-ichi Ishikawa, Daisuke Nakajima, Miki Ohira, Naohiko Seki, Nobuyuki Miyajima, Ayako Tanaka, Hirokazu Kotani, Nobuo Nomura, Osamu Ohara - DNA Res , 1996
Cited by 194 (15 self)
In this series of projects of sequencing human cDNA clones which correspond to relatively long transcripts, we newly determined the entire sequences of 100 cDNA clones which were screened on the basis of the potentiality of coding for large proteins in vitro. The cDNA libraries used were the fractions with average insert sizes from 5.3 to 7.0 kb of the size-fractionated cDNA libraries from human brain. The randomly sampled clones were single-pass sequenced from both the ends to select clones that are not registered in the public database. Then their protein-coding potentialities were examined by an in vitro transcription/translation system, and the clones that generated proteins larger than 60 kDa were entirely sequenced. Each clone gave a distinct open reading frame (ORF), and the length of the ORF was roughly coincident with the approximate molecular mass of the in vitro product estimated from its mobility on SDS-polyacrylamide gel electrophoresis. The average size of the cDNA clones sequenced was 6.1 kb, and that of the ORFs corresponded to 1200 amino acid residues. By computer-assisted analysis of the sequences with DNA and protein-motif databases (GenBank and PROSITE databases), the functions of at least 73% of the gene products could be anticipated, and 88 % of them (the products of 64 clones) were assigned to the functional categories of proteins relating to cell signaling/communication, nucleic acid managing,

STRING: a database of predicted functional associations between proteins

by Christian von Mering, Martijn Huynen, Daniel Jaeggi, Steffen Schmidt, Peer Bork, Berend Snel - Nucleic Acids Res , 2003
Cited by 180 (12 self)
Functional links between proteins can often be inferred from genomic associations between the genes that encode them: groups of genes that are required for the same function tend to show similar species coverage, are often located in close proximity on the genome (in prokaryotes), and tend to be involved in gene-fusion events. The database STRING is a precomputed global resource for the exploration and analysis of these associations. Since the three types of evidence differ conceptually, and the number of predicted interactions is very large, it is essential to be able to assess and compare the significance of individual predictions. Thus, STRING contains a unique scoring-framework based on benchmarks of the different types of associations against a common reference set, integrated in a single confidence score per prediction. The graphical representation of the network of inferred, weighted protein interactions provides a high-level view of functional linkage, facilitating the analysis of modularity in biological processes. STRING is updated continuously, and currently contains 261 033 orthologs in 89 fully sequenced genomes. The database predicts functional interactions at an expected level of accuracy of at least 80 % for more than half of the genes; it is online at

Citation Context

...1). DATA SOURCES, ORTHOLOGY: For information on genomes, genes, and encoded proteins, STRING relies on the annotated proteomes maintained by SWISS-PROT (23). Assignment of functional equivalence of genes across these genomes is essential for the predictions, and this information is derived from the manually curated orthology database, COGs (15). For any ...

Review: Protein Secondary Structure Prediction Continues to Rise

by Burkhard Rost - J. Struct. Biol , 2001
Cited by 180 (22 self)
f prediction accuracy? We shall see. 2001 Academic Press INTRODUCTION History. Linus Pauling correctly guessed the formation of helices and strands (14, 15) (and falsely hypothesized other structures). Three years before Pauling's guess was verified by the publications of the first X-ray structures (16, 17), one group had already ventured to predict secondary structure from sequence (18). The first-generation prediction methods following in the 1960s and 1970s were all based on single amino acid propensities (19). The second-generation methods dominating the scene until the early 1990s used propensities for segments of 3--51 adjacent residues (19). Basically any imaginable theoretical algorithm had been applied to the problem of predicting secondary structure from sequence. However, it seemed that prediction accuracy stalled at levels slightly above 60% (percentage of residues predicted correctly in one of the three states: helix, strand, and other). The reason for this limit was the
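
The three-state accuracy described in the abstract above (the percentage of residues predicted correctly as helix, strand, or other) can be illustrated with a minimal sketch. The function name `three_state_accuracy` and the one-letter state codes `H`, `E`, and `L` are assumptions made for this illustration; they are not taken from the abstract.

```python
def three_state_accuracy(predicted: str, observed: str) -> float:
    """Return the percentage of residues whose predicted state equals the observed state.

    Both arguments are strings with one state character per residue.
    The codes used here (H = helix, E = strand, L = other) are an assumed convention.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    # Count positions where the predicted state matches the observed state.
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


# Toy example: 8 of the 10 residues are assigned the observed state.
observed = "HHHHEEEELL"
predicted = "HHHEEEELLL"
print(three_state_accuracy(predicted, observed))  # 80.0
```

A per-residue measure like this treats all three states equally, so it says nothing about which state is mispredicted; that is what a confusion matrix would add.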

Citation Context

... profiles, and which fraction results from training on larger profiles? Using PHD from 1994 to separate the effects (8), we first compared a noniterative standard BLAST (53) search against SWISS-PROT (54) with one against SWISS-PROT + TrEMBL (54) + PDB (55). The larger database improves performance by about two percentage points (8). Second, we compared the standard BLAST against the large database wi...

SIFT: Predicting amino acid changes that affect protein function

by Pauline C. Ng, Steven Henikoff - Nucleic Acids Res , 2003
Cited by 163 (4 self)
Single nucleotide polymorphism (SNP) studies and random mutagenesis projects identify amino acid substitutions in protein-coding regions. Each substitution has the potential to affect protein function. SIFT (Sorting Intolerant From Tolerant) is a program that predicts whether an amino acid substitution affects protein function so that users can prioritize substitutions for further study. We have shown that SIFT can distinguish between functionally neutral and deleterious amino acid changes in mutagenesis studies and on human polymorphisms. SIFT is available at

Citation Context

...ymorphisms (3). Assuming that disease-causing amino acid substitutions are damaging to protein function, we applied SIFT to a database of missense substitutions associated with or involved in disease (4). SIFT predicted 69% to be damaging. When SIFT was applied to the non-synonymous SNPs in dbSNP (5), a database of putative SNPs, 25% of the variants were predicted to be deleterious. This was similar ...
