Statistics of local multiple alignments
- BIOINFORMATICS
, 2005
Cited by 10 (2 self)
Summary: BLAST statistics have been shown to be extremely useful for searching for significant similarity hits, for amino acid and nucleotide sequences. Although these statistics are well understood for pairwise comparisons, there has been little success developing statistical scores for multiple alignments. In particular, there is no score for multiple alignment that is well founded and treated as a standard. We extend the BLAST theory to multiple alignments. Following some simple assumptions, we present and justify a significance score for multiple segments of a local multiple alignment. We demonstrate its usefulness in distinguishing high and moderate quality multiple alignments from low quality ones, with supporting experiments on orthologous vertebrate promoter sequences.
SCARNA: fast and accurate structural alignment of RNA sequences by matching fixed-length stem fragments
- BIOINFORMATICS
, 2006
M-Coffee: combining multiple sequence alignment methods with T-Coffee
- Nucleic Acids Res
, 2006
Cited by 10 (4 self)
We introduce M-Coffee, a meta-method for assembling multiple sequence alignments (MSA) by combining the output of several individual methods into one single MSA. M-Coffee is an extension of T-Coffee and uses consistency to estimate a consensus alignment. We show that the procedure is robust to variations in the choice of constituent methods and reasonably tolerant to duplicate MSAs. We also show that performance can be improved by carefully selecting the constituent methods. M-Coffee outperforms all the individual methods on three major reference datasets: HOMSTRAD, Prefab and Balibase. We also show that, on a case-by-case basis, M-Coffee is twice as likely to deliver the best alignment as any individual method. Given a collection of pre-computed MSAs, M-Coffee has similar CPU requirements to the original T-Coffee. M-Coffee is a freeware open-source package available from
Public web-based services from the European Bioinformatics Institute
- Nucleic Acids Res
, 2004
Cited by 6 (1 self)
The mission of the European Bioinformatics Institute (EBI), an outstation of the European Molecular Biology Laboratory (EMBL) in Heidelberg, is to ensure that the growing body of information from molecular biology and genome research is placed in the public domain and is accessible freely to all parts of the scientific community in ways that promote scientific progress. To fulfil this mission, the EBI provides a wide variety of free, publicly available bioinformatics services. These can be divided into data submissions processing; access to query, analysis and retrieval systems and tools; ftp downloads of software and databases; training and education; and user support. All of these services are available at the EBI website:
Comparative Proteogenomics: Combining Mass Spectrometry and Comparative Genomics to Analyze Multiple Genomes
Cited by 4 (2 self)
Mass spectrometry recently emerged as a valuable technique for proteogenomic annotations that improve on the state of the art in predicting genes and other features. However, previous proteogenomic approaches were limited to a single genome and did not take advantage of analyzing mass spectrometry data from multiple genomes at once. We show that such a comparative proteogenomics approach (similarly to comparative genomics approaches) allows one to address problems that remained beyond the reach of the traditional “single proteome” approach in mass spectrometry. In particular, we show how comparative proteogenomics addresses the notoriously difficult problem of “one-hit-wonders” in proteomics and improves on the existing gene prediction tools in genomics.
SLiMDisc: short, linear motif discovery, correcting for common evolutionary descent
, 2006
Large Grain Size Stochastic Optimization Alignment
- in Proceedings of the Sixth IEEE Symposium on BioInformatics and BioEngineering (BIBE’06). IEEE Computer Society
, 2006
Cited by 3 (3 self)
DNA sequence alignment is a critical step in identifying homology between organisms. The most widely used alignment program, ClustalW, is known to suffer from the local minima problem, where suboptimal guide trees produce incorrect gap insertions. The optimization alignment approach has been shown to be effective in combining alignment and phylogenetic search in order to avoid the problems associated with poor guide trees. The optimization alignment algorithm operates at a small grain size, aligning each tree found, and so wastes time producing multiple sequence alignments for suboptimal trees. This research develops and analyzes a large grain size algorithm for optimization alignment that iterates through steps of alignment and phylogeny search, thus improving the quality of guide trees used for computation of multiple sequence alignments and eliminating computation of multiple sequence alignments for suboptimal guide trees. Local minima are avoided by the use of stochastic search methods. Large Grain Size Stochastic Optimization Alignment (LGA) exploits the relationship between phylogenies and multiple sequence alignments, and in so doing achieves improved alignment accuracy. LGA is licensed under the GNU General Public License. Source code and data sets are publicly available at
Computational neurogenetic modelling: a pathway to new discoveries in genetic neuroscience
- Int. J. Neural Systems
Cited by 2 (2 self)
The paper presents a methodology for using computational neurogenetic modelling (CNGM) to bring new, original insights into how genes influence the dynamics of brain neural networks. CNGM is a novel computational approach to brain neural network modelling that integrates dynamic gene networks with an artificial neural network (ANN) model. Interaction of genes in neurons affects the dynamics of the whole ANN model through neuronal parameters, which are no longer constant but change as a function of gene expression. Through optimization of interactions within the internal gene regulatory network (GRN), initial gene/protein expression values and ANN parameters, particular target states of the neural network behaviour can be achieved, and statistics about gene interactions can be extracted. In this way, we have obtained an abstract GRN that contains predictions about particular gene interactions in neurons for subunit genes of AMPA, GABAA and NMDA neuro-receptors. The extent of sequence conservation for 20 subunit proteins of all these receptors was analysed using standard bioinformatics multiple alignment procedures. We have observed an abundance of conserved residues, but the most interesting observation has been the consistent conservation of phenylalanine (F at position 269) and leucine (L at position 353) in all 20 proteins with no mutations. We hypothesise that these regions can be the basis for mutual interactions. Existing knowledge on the evolutionary linkage of their protein families and analysis at the molecular level indicate that the expression of these individual subunits should be coordinated, which provides the biological justification for our optimized GRN.
VirGen: a comprehensive viral genome resource
- Nucleic Acids Res
, 2004
Cited by 2 (0 self)
VirGen is a comprehensive viral genome resource that organizes the 'sequence space' of viral genomes in a structured fashion. It has been developed with the objective of serving as an annotated and curated database comprising complete genome sequences of viruses, value-added derived data and data mining tools. The current release (v1.1) contains 559 complete genomes in addition to 287 putative genomes of viruses belonging to eight viral families for which the host range includes animals and plants. Viral genomes in VirGen are annotated using sequence-based Bioinformatics approaches. The genomic data is also curated to identify 'alternate names' of viral proteins, where available. VirGen archives the results of comparisons of genomes, proteomes and individual proteins within and between viral species. It is the first resource to provide phylogenetic trees of viral species computed using whole-genome sequence data. The module of predicted B-cell antigenic determinants in VirGen is an attempt to link the genome to its vaccinome. Comparative genome analysis data facilitate the study of genome organization and evolution of viruses, which would have implications in applied research to identify candidates for the design of vaccines and antiviral drugs. VirGen is a relational database and is available at
PATRIC: The VBI PathoSystems Resource Integration Center
- doi:10.1093/nar/gkl858
, 2006
Cited by 2 (0 self)
The PathoSystems Resource Integration Center (PATRIC) is one of eight Bioinformatics Resource Centers (BRCs) funded by the National Institute of Allergy and Infectious Diseases (NIAID) to create a data and analysis resource for selected NIAID priority pathogens, specifically proteobacteria of the genera Brucella, Rickettsia and Coxiella, and corona-, calici- and lyssaviruses and viruses associated with hepatitis A and E. The goal of the project is to provide a comprehensive bioinformatics resource for these pathogens, including consistently annotated genome, proteome and metabolic pathway data to facilitate research into counter-measures, including drugs, vaccines and diagnostics. The project’s curation

