The SUPERFAMILY database in 2007: families and functions
Nucleic Acids Res, 2007
Cited by 8 (5 self)
The SUPERFAMILY database provides protein domain assignments, at the SCOP ‘superfamily’ level, for the predicted protein sequences in over 400 completed genomes. A superfamily groups together domains of different families which have a common evolutionary ancestor based on structural, functional and sequence data. SUPERFAMILY domain assignments are generated using an expert-curated set of profile hidden Markov models. All models and structural assignments are available for browsing and download from
VMD: a community annotation database for oomycetes and microbial genomes
Nucleic Acids Res, 2006
Cited by 2 (0 self)
The VBI Microbial Database (VMD) is a database system designed to host a range of microbial genome sequences. At present, the database contains genome sequence and annotation data of two plant pathogens, Phytophthora sojae and Phytophthora ramorum. With the completion of the draft genome sequences of these pathogens in collaboration with the DOE Joint Genome Institute (JGI), we have created this resource to make the sequences publicly available. The genome sequences (95 MB for P. sojae and 65 MB for P. ramorum) were annotated with 19 000 and 16 000 gene models, respectively. We used two different statistical methods to validate these gene models: Fickett’s method and a log-likelihood method. Functional annotation of the gene models is based on results from BlastX and InterProScan screens. From the InterProScan results, we could assign putative functions to 17 694 genes in P. sojae and 14 700 genes in P. ramorum. We created an easy-to-use genome browser to view the genome sequence data, which opens to detailed annotation pages for each gene model. A community annotation interface is available for registered community members to add or edit annotations. There are 1600 gene models for P. sojae and 700 models for P. ramorum that have already been manually curated. A toolkit is provided as an additional resource for users to perform a variety of sequence analysis jobs. The database is publicly available at
The DIMA web resource—exploring the protein domain network
doi:10.1093/bioinformatics/btl050
Mining sequence annotation databanks for association patterns
2005
Cited by 1 (1 self)
Data and text mining, Vol. 21 Suppl. 3, 2005, pages iii49–iii57. doi:10.1093/bioinformatics/bti1206
Domain Architecture Comparison for Multidomain Homology Identification
2007
Cited by 1 (0 self)
Homology identification is the first step for many genomic studies. Current methods, based on sequence comparison, can result in a substantial number of mis-assignments due to the similarity of homologous domains in otherwise unrelated sequences. Here we propose methods to detect homologs through explicit comparison of protein domain content. We developed several schemes for scoring the homology of a pair of protein sequences based on methods used in the field of information retrieval. We evaluate the proposed methods and methods used in the literature using a benchmark of fifteen sequence families of known evolutionary history. The results of these studies demonstrate the effectiveness of comparing domain architectures using these similarity measures. We also demonstrate the importance of both weighting promiscuous domains and of compensating for the statistical effect of having a large number of domains in a protein. Using logistic regression we demonstrate the benefit of combining similarity measures based on domain content with sequence similarity measures.
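The abstract above mentions information-retrieval-style scoring, down-weighting of promiscuous domains, and compensation for proteins with many domains. One common way to realize those three ideas is inverse-document-frequency (IDF) weighting with cosine similarity. The sketch below is illustrative only and is not claimed to be the scoring scheme used in the paper; the domain names, the toy corpus, and all function names are hypothetical.

```python
import math
from collections import Counter

def idf_weights(architectures):
    """Inverse document frequency per domain.

    A domain found in every protein (a 'promiscuous' domain) gets weight 0;
    rarer domains get larger weights.
    """
    n = len(architectures)
    doc_freq = Counter(d for arch in architectures for d in set(arch))
    return {d: math.log(n / count) for d, count in doc_freq.items()}

def weighted_vector(arch, idf):
    """Map a domain architecture to {domain: copy count * IDF weight}."""
    return {d: c * idf.get(d, 0.0) for d, c in Counter(arch).items()}

def cosine_similarity(arch_a, arch_b, idf):
    """Cosine of the angle between two weighted domain vectors.

    Dividing by the vector norms compensates for proteins that simply
    contain many domains.
    """
    va, vb = weighted_vector(arch_a, idf), weighted_vector(arch_b, idf)
    dot = sum(w * vb.get(d, 0.0) for d, w in va.items())
    norm_a = math.sqrt(sum(w * w for w in va.values()))
    norm_b = math.sqrt(sum(w * w for w in vb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical toy corpus: each protein is a list of domain identifiers.
corpus = [
    ["SH3", "Kinase"],
    ["SH3", "SH2", "Kinase"],
    ["SH3", "PDZ"],
    ["SH3", "Ig", "Ig"],
]
idf = idf_weights(corpus)
sim_kinase_pair = cosine_similarity(corpus[0], corpus[1], idf)
sim_sh3_only = cosine_similarity(corpus[0], corpus[2], idf)
```

In this toy corpus, SH3 occurs in every protein and therefore carries zero weight, so two proteins that share only SH3 score 0, while the two kinase-containing proteins score above 0.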
Automated Improvement of Domain ANnotations using context
Motivation: Since protein domains are the units of evolution, databases of domain signatures such as ProDom or Pfam enable both a sensitive and selective sequence analysis. However, manually curated databases have a low coverage and automatically generated ones often miss relationships which have not yet been discovered between domains or cannot display similarities between domains which have drifted apart. Methods: We present a tool which makes use of the fact that overall domain arrangements are often conserved. AIDAN (Automated Improvement of Domain ANnotations) identifies potential annotation artifacts and domains which have drifted apart. The underlying database supplements ProDom and is interfaced by a graphical tool allowing the localization of single domain deletions or annotations which have been falsely made by the automated procedure.
Novel Families of Toxin-like Peptides in Insects and Mammals: A Computational Approach
Most animal toxins are short proteins that appear in venom and vary in sequence, structure and function. A common characteristic of many such toxins is their apparent structural stability. Sporadic instances of endogenous toxin-like proteins that function in a non-venom context have been reported. We have utilized machine learning methodology, based on sequence-derived features and guided by the notion of structural stability, in order to conduct a large-scale search for toxin and toxin-like proteins. Application of the method to insect and mammalian sequences revealed novel families of toxin-like proteins. One of these proteins shows significant similarity to ion channel inhibitors that are expressed in cone snail and assassin bug venom, and is surprisingly expressed in the bee brain. A toxicity assay in which the protein was injected into fish induced a strong yet reversible paralytic effect. We suggest that the protein may function as an endogenous modulator of voltage-gated Ca2+ channels. Additionally, we have identified a novel mammalian cluster of toxin-like proteins that are expressed in the testis. We suggest that these proteins might be involved in regulation of nicotinic acetylcholine receptors that affect the acrosome reaction and sperm motility. Finally, we highlight a possible evolutionary link between venom toxins and antibacterial proteins. We expect our methodology to enhance the discovery of additional novel protein families.
Fungal cytochrome P450 database
BMC Genomics, 2008
© 2008 Park et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License

