Rfam: An RNA family database
Nucleic Acids Res, 2003
Cited by 114 (1 self)
Rfam is a collection of multiple sequence alignments and covariance models representing non-coding RNA families. Rfam is available on the web in the UK at http://www.sanger.ac.uk/Software/Rfam/ and in the US at http://rfam.wustl.edu/. These websites allow the user to search a query sequence against a library of covariance models, and view multiple sequence alignments and family annotation. The database can also be downloaded in flatfile form and searched locally using the INFERNAL package (http://infernal.wustl.edu/). The first release of Rfam (1.0) contains 25 families, which annotate over 50000 non-coding RNA genes in the taxonomic divisions of the EMBL nucleotide database.
RSEARCH: Finding homologs of single structured RNA sequences
BMC Bioinformatics, 2003
Cited by 83 (0 self)
Background: Many trans-acting noncoding RNA genes and cis-acting RNA regulatory elements conserve secondary structure rather than primary sequence. Most homology search tools only look at the primary sequence level, however.
A Distributed Annotation System
2001
Cited by 75 (6 self)
One goal of any genome project is the elucidation of the primary sequence of DNA contained within a given species. While the availability of the primary sequence itself is valuable, it does not reach its full potential until it has been annotated. Generally defined, annotation is descriptive information or commentary added to text, in this case genomic sequence. Without a mechanism for collecting, recording, and disseminating community-based annotation, a valuable source of information is severely diminished. In this report I outline the design and implementation of a Distributed Annotation System (DAS). DAS allows sequence annotations to be decentralized among multiple third-party annotators and integrated on an as-needed basis by client-side software. A single server, designated the "reference server," provides essential structural information about the genome: the physical map which relates one sequence to another, the DNA sequence for each entry, and the authorship information. Multiple sites then act as third-party "annotation servers." Using a web browser-like client application, researchers can interrogate one or more annotation servers to retrieve features in a region of interest. The servers return the results using a standard data format, allowing the annotation browser to integrate the information and display a graphical representation of the data. When an annotation is particularly interesting, they can easily return to the originating database for additional information. In short, DAS provides an indexing mechanism between sequence databases.
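The client-side integration step described in this abstract can be illustrated with a minimal sketch. It assumes each annotation server is modeled as an in-memory list of features on a shared reference coordinate system; the `Feature` type and the functions `features_in_region` and `integrate` are hypothetical names chosen for illustration, not part of the DAS specification or its wire format.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Feature:
    source: str   # name of the annotation server that supplied the feature
    start: int    # 1-based inclusive start on the reference sequence
    end: int      # 1-based inclusive end on the reference sequence
    label: str    # free-text description of the feature


def features_in_region(features: List[Feature], start: int, end: int) -> List[Feature]:
    """Return the features that overlap the closed interval [start, end]."""
    return [f for f in features if f.start <= end and f.end >= start]


def integrate(servers: Dict[str, List[Feature]], start: int, end: int) -> List[Feature]:
    """Query every (simulated) annotation server and merge the answers.

    A real client would issue network requests here; this sketch only
    filters in-memory lists, which is an assumption made for brevity.
    """
    merged: List[Feature] = []
    for name in sorted(servers):
        merged.extend(features_in_region(servers[name], start, end))
    # Order by position so a browser could draw the features left to right.
    return sorted(merged, key=lambda f: (f.start, f.end, f.source))


if __name__ == "__main__":
    servers = {
        "genes": [Feature("genes", 100, 200, "geneA"),
                  Feature("genes", 900, 950, "geneB")],
        "repeats": [Feature("repeats", 150, 160, "Alu")],
    }
    for feature in integrate(servers, 120, 300):
        print(feature)
```

Because every server reports coordinates against the same reference sequence, the merge needs no knowledge of what each server annotates.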
Local similarity in RNA secondary structures
2003
Cited by 46 (1 self)
We present a systematic treatment of alignment distance and local similarity algorithms on trees and forests. We build upon the tree alignment algorithm for ordered trees given by Jiang et al. (1995) and extend it to calculate local forest alignments, which is essential for finding locally similar regions in RNA secondary structures. The time complexity of our algorithm is O(|F1| · |F2| · deg(F1) · deg(F2) · (deg(F1) + deg(F2))), where |Fi| is the number of nodes in forest Fi and deg(Fi) is the degree of Fi. We provide carefully engineered dynamic programming implementations using dense, two-dimensional tables, which considerably reduce the space requirement. We suggest a new representation of RNA secondary structures as forests that allows reasonable scoring of edit operations on RNA secondary structures. The comparison of RNA secondary structures is facilitated by a new visualization technique for RNA secondary structure alignments. Finally, we show how potential regulatory motifs can be discovered solely by their structural preservation, independent of their sequence conservation and position.
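To make the quantities in the complexity bound concrete, the following sketch parses a dot-bracket string into an ordered forest and evaluates |F|, deg(F), and the product from the bound. The forest encoding is a simple generic one (one node per base pair, one leaf per unpaired base) and is an assumption, not the representation proposed by the authors. Taking deg(F) as the largest number of siblings at any level, including the top-level roots, is likewise an assumption. All function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    label: str                                   # "P" = base pair, "B" = unpaired base
    children: List["Node"] = field(default_factory=list)


def parse_dot_bracket(structure: str) -> List[Node]:
    """Turn a dot-bracket string into an ordered forest (list of root nodes)."""
    roots: List[Node] = []
    stack: List[List[Node]] = [roots]            # child lists that are still open
    for ch in structure:
        if ch == "(":
            node = Node("P")
            stack[-1].append(node)
            stack.append(node.children)
        elif ch == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            stack.pop()
        elif ch == ".":
            stack[-1].append(Node("B"))
        else:
            raise ValueError(f"unexpected character {ch!r}")
    if len(stack) != 1:
        raise ValueError("unbalanced '('")
    return roots


def size(forest: List[Node]) -> int:
    """Number of nodes in the forest, written |F| in the abstract."""
    return sum(1 + size(n.children) for n in forest)


def degree(forest: List[Node]) -> int:
    """Largest number of siblings at any level (assumed meaning of deg(F))."""
    return max([len(forest)] + [degree(n.children) for n in forest])


def alignment_bound(f1: List[Node], f2: List[Node]) -> int:
    """Evaluate |F1| * |F2| * deg(F1) * deg(F2) * (deg(F1) + deg(F2))."""
    d1, d2 = degree(f1), degree(f2)
    return size(f1) * size(f2) * d1 * d2 * (d1 + d2)


if __name__ == "__main__":
    f1 = parse_dot_bracket("((..)).")
    f2 = parse_dot_bracket("(.)")
    print(size(f1), degree(f1), size(f2), degree(f2), alignment_bound(f1, f2))
```

The bound grows with the product of the forest sizes but also with the degrees, so structures with many siblings under one node (for example, large multiloops) are the expensive cases.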
Consensus Folding of Aligned Sequences as a New Measure for the Detection of Functional RNAs by Comparative Genomics
2004
Cited by 45 (12 self)
Facing the ever-growing list of newly discovered classes of functional RNAs, it can be expected that further types of functional RNAs are still hidden in recently completed genomes. The computational identification of such RNA genes is, therefore, of major importance. While most known functional RNAs have characteristic secondary structures, their free energies are generally not statistically significant enough to distinguish RNA genes from the genomic background. Additional information is required. Considering the wide availability of new genomic data of closely related species, comparative studies seem to be the most promising approach. Here we show that prediction of consensus structures of aligned sequences can be a significant measure to detect functional RNAs. We report a new method for testing multiple sequence alignments for the existence of an unusually structured and conserved fold. We show for alignments of six types of well-known functional RNA that an energy score consisting of free energy and a covariation term significantly improves sensitivity compared to single sequence predictions. We further test our method on a number of noncoding RNAs from C. elegans/C. briggsae and seven Saccharomyces species. Most RNAs can be detected with high significance. We provide a Perl implementation which can be readily used to score single alignments and discuss how the methods described here can be extended to allow for efficient genome-wide screens.
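The idea behind a covariation term can be illustrated with a small sketch. It scores one pair of alignment columns by the average Hamming distance between the base pairs formed in each pair of sequences, counting only cases where both sequences form a canonical or G-U pair. This is a simplified measure chosen for illustration and should not be read as the exact term used by the authors. The function name `covariation` and the gap-free toy alignments are assumptions.

```python
from itertools import combinations
from typing import List

# Watson-Crick pairs plus the G-U wobble pair, in both orientations.
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def covariation(alignment: List[str], i: int, j: int) -> float:
    """Average pairwise Hamming distance between the base pairs at columns i and j.

    A pair of sequences contributes only if both form a pair from CANONICAL.
    Consistent mutations (e.g. G-C to A-U) raise the score, while identical
    pairs contribute zero.
    """
    pairs = [(seq[i], seq[j]) for seq in alignment]
    n = len(pairs)
    if n < 2:
        return 0.0
    total = 0
    for p, q in combinations(pairs, 2):
        if p in CANONICAL and q in CANONICAL:
            total += (p[0] != q[0]) + (p[1] != q[1])
    return total / (n * (n - 1) // 2)


if __name__ == "__main__":
    compensatory = ["GAAAC", "CAAAG", "AAAAU"]   # columns 0 and 4 always pair
    conserved = ["GAAAC", "GAAAC"]               # same pair, no covariation
    print(covariation(compensatory, 0, 4), covariation(conserved, 0, 4))
```

A fully conserved pair scores zero here even though it is compatible with the structure, which is why such a term complements the folding energy and does not replace it.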
The microRNAs of Caenorhabditis elegans
Genes Dev, 2003
Cited by 42 (4 self)
MicroRNAs (miRNAs) are an abundant class of tiny RNAs thought to regulate the expression of protein-coding genes in plants and animals. In the present study, we describe a computational procedure to identify miRNA genes conserved in more than one genome. Applying this program, known as MiRscan, together with molecular identification and validation methods, we have identified most of the miRNA genes in the nematode Caenorhabditis elegans. The total number of validated miRNA genes stands at 88, with no more than 35 genes remaining to be detected or validated. These 88 miRNA genes represent 48 gene families; 46 of these families (comprising 86 of the 88 genes) are conserved in Caenorhabditis briggsae, and 22 families are conserved in humans. More than a third of the worm miRNAs, including newly identified members of the lin-4 and let-7 gene families, are differentially expressed during larval development, suggesting a role for these miRNAs in mediating larval developmental transitions. Most are present at very high steady-state levels (more than 1000 molecules per cell, with some exceeding 50,000 molecules per cell). Our census of the worm miRNAs and their expression patterns helps define this class of noncoding RNAs, lays the groundwork for functional studies, and provides the tools for more comprehensive analyses of miRNA genes in other species. [Keywords: miRNA; noncoding RNA; computational gene identification; Dicer] Supplemental material is available at
Functional and structural genomics using PEDANT
2001
Cited by 32 (7 self)
Motivation: Enormous demand for fast and accurate analysis of biological sequences is fuelled by the pace of genome analysis efforts. There is also an acute need for reliable, up-to-date genomic databases integrating both functional and structural information. Here we describe the current status of the PEDANT software system for high-throughput analysis of large biological sequence sets and the genome analysis server associated with it.
Exploring the repertoire of RNA secondary motifs using graph theory; implications for RNA design
Nucleic Acids Res, 2003
Cited by 29 (7 self)
Understanding the structural repertoire of RNA is crucial for RNA genomics research. Yet current methods for finding novel RNAs are limited to small or known RNA families. To expand known RNA structural motifs, we develop a two-dimensional graphical representation approach for describing and estimating the size of RNA's secondary structural repertoire, including naturally occurring and other possible RNA motifs. We employ tree graphs to describe RNA tree motifs and more general (dual) graphs to describe both RNA tree and pseudoknot motifs. Our estimates of RNA's structural space are vastly smaller than the nucleotide sequence space, suggesting a new avenue for finding novel RNAs. Specifically, our survey shows that known RNA trees and pseudoknots represent only a small subset of all possible motifs, implying that some of the 'missing' motifs may represent novel RNAs. To help pinpoint RNA-like motifs, we show that the motifs of existing functional RNAs are clustered in a narrow range of topological characteristics. We also illustrate the applications of our approach to the design of novel RNAs and automated comparison of RNA structures; we report several occurrences of RNA motifs within larger RNAs. Thus, our graph theory approach to RNA structures has implications for RNA genomics, structure analysis and design.
Hairpins in a Haystack: recognizing microRNA precursors in comparative genomics data
Bioinformatics, 2006
doi:10.1093/bioinformatics/btl257

