Internet Browsing and Searching: User Evaluations of Category Map and Concept Space Techniques
- Journal of the American Society for Information Science, 1998
A Concept Space Approach to Addressing the Vocabulary Problem in Scientific Information Retrieval: An Experiment on the Worm Community System
- Journal of the American Society for Information Science, 1997
Cited by 56 (14 self)
This research presents an algorithmic approach to addressing the vocabulary problem in scientific information retrieval and information sharing, using the molecular biology domain as an example. We first present a literature review of cognitive studies related to the vocabulary problem and vocabulary-based search aids (thesauri) and then discuss techniques for building robust and domain-specific thesauri to assist in cross-domain scientific information retrieval. Using a variation of the automatic thesaurus generation techniques, which we refer to as the concept space approach, we recently conducted an experiment in the molecular biology domain in which we created a C. elegans worm thesaurus of 7,657 worm-specific terms and a Drosophila fly thesaurus of 15,626 terms. About 30% of these terms overlapped, which created vocabulary paths
The Application of Stochastic Context-Free Grammars to Folding, Aligning and Modeling Homologous RNA Sequences
1993
Cited by 9 (2 self)
Stochastic context-free grammars (SCFGs) are applied to the problems of folding, aligning and modeling families of homologous RNA sequences. SCFGs capture the sequences' common primary and secondary structure and generalize the hidden Markov models (HMMs) used in related work on protein and DNA. The novel aspect of this work is that SCFG parameters are learned automatically from unaligned, unfolded training sequences. A generalization of the HMM forward-backward algorithm is introduced to do this. The new algorithm, Tree-Grammar EM, based on tree grammars and faster than the previously proposed SCFG inside-outside training algorithm, produced a model that we tested on the transfer RNA (tRNA) family. Results show that after having been trained on as few as 20 tRNA sequences from only two tRNA subfamilies (mitochondrial and cytoplasmic), the model can discern general tRNA from similar-length RNA sequences of other kinds, can find secondary structure of new tRNA sequences, and c...
Efficient Processing of Queries Containing User-Defined Predicates
1995
Cited by 3 (2 self)
Query optimizers often have difficulties with the treatment of user-defined functions. In particular, the optimization of joins involving complex, user-defined predicates rather than just simple arithmetic comparisons may lead to query plans of poor quality. In this paper, we identify a class of user-defined functions that can be included in queries in such a way that efficient query optimization is still possible. For this class of functions, a join between two sets R and S, for example, could still be computed in time significantly less than O(|R| · |S|). To achieve this goal we introduce the concept of the φ-function, an operator to process each set separately with respect to the user-defined function(s) being used. These φ-functions dynamically derive information to constrain subsequent processing steps. The derived information allows in particular the application of an existing index or some other traditional join strategy. After demonstrating this technique on various ex...
Object-Oriented Modelling in Molecular Biology
- Proceedings of the Artificial Intelligence and Genome Workshop, IJCAI, 1993
d of modelling in molecular biology and we have developed various tools to handle and study genomic sequences. We were among the first to propose genomic data bases (Gautier et al., 1981), then to develop a Data Base Management System (DBMS) dedicated to biological sequences: ACNUC (Gouy et al., 1985). In association with this data base, we have built the Analseq package (Jacobzone and Gautier, 1989) for sequence analysis. These two software packages are examples of systems in which there exists a strong separation between interrogation and analysis of the biological sequences. More recently, we have developed tools that integrate both biological and methodological knowledge. ColiGene is a model of E. coli genetics devoted to the analysis of relationships between genomic sequences and gene expressivity, and MultiMap implements a formalization of genome maps allowing manipulation of localization information with homology modelling in man and mouse. Modelling of biological knowledge

