OrthoMCL-DB: querying a comprehensive multi-species collection of ortholog groups
- Nucleic Acids Res
, 2006
"... of ortholog groups ..."
The NN^k technique for image searching and browsing
, 2005
Cited by 9 (4 self)
Retrieval of images from large image archives based solely on their visual similarity to a query image provides an exciting alternative to conventional text-based search. For content-based retrieval, images are represented in terms of visual features. The question of how to combine these for similarity computation is typically addressed by eliciting relevance feedback from the user on the retrieved images. We argue in this thesis that the prevailing approach to relevance feedback suffers from three significant shortcomings: firstly, it leaves unsolved the question of how to combine features for the first retrieval; secondly, the advantage of automated content-extraction over manual annotation is greatest for large collections but if the query image is not constrained to come from the indexed collection, content-based retrieval entails imagewise comparisons leading to prohibitive response times; thirdly, users may only have vaguely defined information needs or may change their needs in the course of the interaction. The large majority of relevance feedback techniques are ill-suited for such undirected exploration. We propose a new framework of user interaction that addresses these limitations. It is centred on what we call the NN^k idea. The NN^k of an image are all those images that are most similar to it under some combination of features. They can be viewed as representatives of the possible
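The neighbourhood definition in this abstract can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes that "some combination of features" means a convex combination of per-feature distances, and it samples the weights on a regular grid. The function name `nn_k_neighbours` and the `steps` parameter are hypothetical.

```python
from itertools import product


def nn_k_neighbours(query_dists, steps=4):
    """Return indices of images that rank first under at least one weighting.

    query_dists[f][i] is the distance from the query to image i under
    feature f. Weights are sampled on a regular grid over the simplex
    (non-negative, summing to 1); this sampling is an assumption of the sketch.
    """
    n_features = len(query_dists)
    n_images = len(query_dists[0])
    winners = set()
    for ticks in product(range(steps + 1), repeat=n_features):
        if sum(ticks) != steps:
            continue  # keep only weight vectors that sum to 1
        weights = [t / steps for t in ticks]
        # Weighted sum of the per-feature distances for every image.
        combined = [
            sum(w * query_dists[f][i] for f, w in enumerate(weights))
            for i in range(n_images)
        ]
        # The image with the smallest combined distance wins this weighting.
        winners.add(min(range(n_images), key=combined.__getitem__))
    return winners
```

With two features and distances `[[0.1, 0.9, 0.4, 0.8], [0.9, 0.1, 0.4, 0.8]]`, an image that is only moderately close under each feature can still enter the set when the weights are balanced, while an image that is far under every feature never does.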
A parsimony approach to genome-wide ortholog assignment
- in Research in Computational Molecular Biology, 10th Annual International Conference, RECOMB 2006
, 2006
Cited by 6 (1 self)
Abstract. The assignment of orthologous genes between a pair of genomes is a fundamental and challenging problem in comparative genomics, since many computational methods for solving various biological problems critically rely on bona fide orthologs as input. While it is usually done using sequence similarity search, we recently proposed a new combinatorial approach that combines sequence similarity and genome rearrangement. This paper continues the development of the approach and unites genome rearrangement events and (post-speciation) duplication events in a single framework under the parsimony principle. In this framework, orthologous genes are assumed to correspond to each other in the most parsimonious evolutionary scenario involving both genome rearrangement and (post-speciation) gene duplication. Besides several original algorithmic contributions, the enhanced method allows for the detection of inparalogs. Following this approach, we have implemented a high-throughput system for ortholog assignment on a genome scale, called MSOAR, and applied it to the genomes of human and mouse. As the result will show, MSOAR is able to find 99 more true orthologs than the INPARANOID program did. We have also compared MSOAR with the iterated exemplar algorithm on simulated data and found that MSOAR performed very well in terms of assignment accuracy. These test results indicate that our approach is very promising for genome-wide ortholog assignment.
Hierarchical clustering algorithm for comprehensive orthologous-domain classification in multiple genomes
, 2006
Procrastination leads to efficient filtration for local multiple alignment
- PROCEEDINGS OF THE 6TH INTERNATIONAL WORKSHOP ON ALGORITHMS IN BIOINFORMATICS (WABI)
, 2006
Cited by 3 (3 self)
Abstract. We describe an efficient local multiple alignment filtration heuristic for identification of conserved regions in one or more DNA sequences. The method incorporates several novel ideas: (1) palindromic spaced seed patterns to match both DNA strands simultaneously, (2) seed extension (chaining) in order of decreasing multiplicity, and (3) procrastination when low multiplicity matches are encountered. The resulting local multiple alignments may have nucleotide substitutions and internal gaps as large as w characters in any occurrence of the motif. The algorithm consumes O(wN) memory and O(wN log wN) time where N is the sequence length. We score the significance of multiple alignments using entropy-based motif scoring methods. We demonstrate the performance of our filtration method on Alu-repeat rich segments of the human genome and a large set of Hepatitis C virus genomes. The GPL implementation of our algorithm in C++ is called procrastAligner and is freely available from http://gel.ahabs.wisc.edu/procrastination
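Idea (1) in this abstract can be made concrete with a short sketch. It is not taken from procrastAligner. It only shows why a palindromic spaced seed lets one lookup key serve both strands: masking the reverse complement of a window yields the reverse complement of the masked window. The seed string, the helper names and the choice of the lexicographically smaller string as the shared key are all assumptions of this example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def masked(window, seed):
    """Keep the bases at '1' positions of the seed; '0' positions are wildcards."""
    return "".join(base for base, s in zip(window, seed) if s == "1")


def strand_neutral_key(window, seed):
    """Return one key shared by a window and its reverse complement."""
    if seed != seed[::-1]:
        raise ValueError("seed pattern must be palindromic")
    if len(window) != len(seed):
        raise ValueError("window and seed must have the same length")
    forward = masked(window, seed)
    # For a palindromic seed, masking the reverse-complemented window gives
    # reverse_complement(forward), so the smaller of the two strings is the
    # same no matter which strand the window was read from.
    return min(forward, reverse_complement(forward))
```

For example, `strand_neutral_key("ACGTTGA", "1101011")` and `strand_neutral_key(reverse_complement("ACGTTGA"), "1101011")` return the same key, so a single hash table of keys would find matches on either strand.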
The Iccare web server: an attempt to merge sequence and mapping information for plant and animal species
- Nucleic Acids Res
, 2004
"... The Iccare web server, ..."
MSOAR: A high-throughput ortholog assignment system based on genome rearrangement
- Journal of Computational Biology
Cited by 2 (0 self)
The assignment of orthologous genes between a pair of genomes is a fundamental and challenging problem in comparative genomics, since many computational methods for solving various biological problems critically rely on bona fide orthologs as input. While it is usually done using sequence similarity search, we recently proposed a new combinatorial approach that combines sequence similarity and genome rearrangement. This paper continues the development of the approach and unites genome rearrangement events and (post-speciation) duplication events in a single framework under the parsimony principle. In this framework, orthologous genes are assumed to correspond to each other in the most parsimonious evolutionary scenario involving both genome rearrangement and (post-speciation) gene duplication. Besides several original algorithmic contributions, the enhanced method allows for the detection of inparalogs. Following this approach, we have implemented a high-throughput system for ortholog assignment on a genome scale, called MSOAR, and applied it to human and mouse genomes. As the result will show, MSOAR is able to find 99 more true orthologs than the INPARANOID program did. In comparison to the iterated exemplar algorithm on simulated data, MSOAR performed favorably in terms of assignment accuracy. We also validated our predicted main ortholog pairs between human and mouse using public ortholog assignment datasets, synteny information, and gene function classification. These test results indicate that our approach is very promising for genome-wide ortholog assignment. Supplemental material and MSOAR program are available at
Clustering of Main Orthologs for Multiple Genomes
, 2007
Cited by 1 (1 self)
The identification of orthologous genes shared by multiple genomes is critical for both functional and evolutionary studies in comparative genomics. While it is usually done by sequence similarity search and reconciled tree construction in practice, recently a new combinatorial approach and a high-throughput system MSOAR for ortholog identification between closely related genomes based on genome rearrangement and gene duplication have been proposed in [1]. MSOAR assumes that orthologous genes correspond to each other in the most parsimonious evolutionary scenario minimizing the number of genome rearrangement and (post-speciation) gene duplication events. However, the parsimony approach used by MSOAR limits it to pairwise genome comparisons. In this paper, we extend MSOAR to multiple (closely related) genomes and propose an ortholog clustering method, called MultiMSOAR, to infer main orthologs in multiple genomes. As a preliminary experiment, we apply MultiMSOAR to rat, mouse and human genomes, and validate our results using gene annotations and gene function classifications in the public databases. We further compare our results to the ortholog clusters predicted by MultiParanoid, which is an extension of the well-known program Inparanoid for pairwise genome comparisons. The comparison reveals that MultiMSOAR gives more detailed and accurate orthology information since it can effectively distinguish main orthologs from inparalogs.
MSOAR 2.0: INCORPORATING TANDEM DUPLICATIONS INTO ORTHOLOG ASSIGNMENT BASED ON GENOME REARRANGEMENT
Cited by 1 (1 self)
Ortholog assignment is a critical and fundamental problem in comparative genomics, since orthologs are considered to be functional counterparts in different species and can be used to infer molecular functions of one species from those of other species. MSOAR is a recently developed high-throughput system for assigning orthologs between closely related species on a genome scale. It attempts to reconstruct the evolutionary history of input genomes in terms of genome rearrangement and gene duplication events. It assumes that a gene duplication event inserts a duplicated gene into the genome of interest at a random location (i.e., the random duplication model). However, in practice, biologists believe that genes are often duplicated by tandem duplications, where a duplicated gene is located next to the original copy (i.e., the tandem duplication model). In this paper, we develop MSOAR 2.0, an improved system for ortholog assignment. For a pair of input genomes, the system first focuses on the tandemly duplicated genes of each genome and tries to identify among them those that were duplicated after the speciation (i.e., the so-called inparalogs), using a simple phylogenetic tree reconciliation method. For each such set of tandemly duplicated inparalogs, all but one gene will be deleted from the concerned genome (because they cannot possibly appear in any ortholog pairs), and MSOAR is invoked. Using both simulated and real data experiments, we show that MSOAR 2.0 is able to achieve a better sensitivity and specificity than MSOAR. In comparison with two well-known genome-scale ortholog assignment tools, the InParanoid program and the Ensembl ortholog database, MSOAR 2.0 shows the highest sensitivity. Although the

