Principles of Programming with Complex Objects and Collection Types
Theoretical Computer Science, 1995
Cited by 111 (28 self)
We present a new principle for the development of database query languages that the primitive operations should be organized around types. Viewing a relational database as consisting of sets of records, this principle dictates that we should investigate separately operations for records and sets. There are two immediate advantages of this approach, which is partly inspired by basic ideas from category theory. First, it provides a language for structures in which record and set types may be freely combined: nested relations or complex objects. Second, the fundamental operations for sets are closely related to those for other "collection types" such as bags or lists, and this suggests how database languages may be uniformly extended to these new types. The most general operation on sets, that of structural recursion, is one in which not all programs are well-defined. In looking for limited forms of this operation that always give rise to well-defined operations, we find a number of close ...
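As a rough illustration of the well-definedness issue raised in the abstract above, the following Python sketch (not taken from the paper) folds an insert-style operation over several list representations of the same set. The names `sri` and `results_over_representations` are hypothetical. The sketch assumes the insert presentation of structural recursion, where a result is meaningful only if it does not depend on the order or repetition of elements in the chosen representation.

```python
from functools import reduce
from itertools import permutations


def sri(insert, empty, elements):
    """Fold `insert` over a sequence standing in for a set (hypothetical helper)."""
    return reduce(lambda acc, x: insert(x, acc), elements, empty)


def results_over_representations(insert, empty, representations):
    """Collect the distinct results obtained from different listings of one set."""
    return {sri(insert, empty, rep) for rep in representations}


# Different listings of the same set {1, 2, 3}: all reorderings plus one duplicate.
representations = [list(p) for p in permutations([1, 2, 3])] + [[1, 1, 2, 3]]

# max is insensitive to order and repetition, so every listing agrees.
well_defined = results_over_representations(max, 0, representations)

# Addition is insensitive to order but not to repetition,
# so the listing with a duplicate yields a different answer.
ill_defined = results_over_representations(lambda x, acc: x + acc, 0, representations)

print(well_defined)  # a single result
print(ill_defined)   # more than one result
```

A single result across all listings suggests the operation is safe to use on sets. More than one result shows that the "program" depends on the representation and is therefore not well-defined as a set operation.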
K2/Kleisli and GUS: Experiments in Integrated Access to Genomic Data Sources
2000
Cited by 52 (4 self)
The integration of heterogeneous data sources and software systems is a major issue in the biomedical community and several approaches have been explored: linking databases, "on-the-fly" integration through views, and integration through warehousing. In this paper we report on our experiences with two systems that were developed at the University of Pennsylvania: an integration system called K2, which has primarily been used to provide views over multiple external data sources and software systems; and a data warehouse called GUS which downloads, cleans, integrates and annotates data from multiple external data sources. Although the view and warehouse approaches each have their advantages, there is no clear "winner". Therefore, users must consider how the data is to be used, what the performance guarantees must be, and how much programmer time and expertise is available to choose the best strategy for a particular application.
Kleisli, a Functional Query System
J. Funct. Prog., 1998
Cited by 13 (1 self)
Kleisli is a modern data integration system that has made a significant impact on bioinformatics data integration. This paper contains a brief introduction to the Kleisli system and an example to illustrate its uses in the bioinformatics arena. The primary query language provided by Kleisli is called CPL, which is a functional query language whose surface syntax is based on the comprehension syntax. Kleisli is itself implemented using the functional language SML. So this paper also describes the influence of functional programming research that benefits the Kleisli system, especially the less obvious ones at the implementation level.
Availability. Kleisli has been commercialized under the name "KRIS". It is available from Kris Technology Inc., 713 Santa Cruz Ave, #2, Menlo Park, CA 94025. Direct email to info@kris-inc.com and web browser to http://www.kris-inc.com.
1 Introduction
The Kleisli system (Davidson et al., 1997) is an advanced broad-scale integration technology that has pro...
THE Kleisli/CPL EXTENSIBLE QUERY OPTIMIZER - Programmer Guide
1996
Cited by 4 (4 self)
In this report, the focus is on the optimizer of the Kleisli/CPL system. As mentioned earlier, the system is being applied to query heterogeneous biological data sources. This application environment has the following characteristics.
Extracting Kozak Consensus Sequence Using Kleisli
Proceedings of 1st International Conference on Bioinformatics of Genome Regulation and Structure, 1998
Cited by 2 (1 self)
Consensus sequence for the context of translation initiation codon is useful to many molecular biologists in identifying the possible translation initiation codon in new genes. We have two objectives in this paper. First, we want to build a system that can extract context sequences of translation initiation codon from large DNA databases and compute a consensus sequence for these context sequences. Second, we want to demonstrate how the general bioinformatics database integration system called Kleisli can help build such a system easily. We achieve these two objectives by showing that short and clear programs can be written in Kleisli, using its high-level query language CPL, to build such a system by integrating WU-BLAST2.0, Entrez, and database-style indices.
1 Introduction
The recent explosion of genomic information, as gleaned from the Human Genome Project and other similar efforts, has been fueled by engineering and technological advances. However, as the amount of information gr...
Some MEDLINE Queries Powered by Kleisli
In ACCESS, 1998
Cited by 1 (0 self)
this article is devoted to demonstrating that Kleisli is such an enabling system for broad-scale integration. I will use the following query to illustrate: "Find those articles about SUBJ of organisms in the same CAT as ORG, focus especially on those that have associated protein sequences." This query is a mix of requirements from the first and second examples. (It is also possible to incorporate the third example, but I do not have space in this short article to do so.) This query requires that an external taxonomy data source be looked up to find closely related organisms and a second external source be looked up to ensure the presence of associated protein sequences.
3 Implementation
Pruning Nested Data Values Using Branch Expressions With Wildcards
Branch expressions are presented as a means of expressing structural queries over nested data contained in data exchange formats. We demonstrate their utility in pruning large data structures by using them to specify a form of parse optimization; and we show that their evaluation can be done in linear time with a constant amount of memory. Wildcards that range over subtrees of a data structure are introduced and a method for eliminating wildcards is described. We then demonstrate how we have embedded branch expressions into a more general system to express a richer class of queries. Finally, optimizations for migrating operations from the general system into the more efficient branch expression system are described.
1 Introduction
In the biomedical community, a vast amount of public data continues to be stored, queried, transmitted, and viewed using data exchange formats (e.g. ASN.1, ACE, EMBL, PIR, and PDB). These formats have varying degrees of implicit or explicit syntactic structu...
A Strategy for Database Interoperation
Journal of Computational Biology, 1995
To realize the full potential of biological databases (DBs) requires more than the interactive, hypertext flavor of database interoperation that is now so popular in the bioinformatics community. Interoperation based on declarative queries to multiple network-accessible databases will support analyses and investigations that are orders of magnitude faster and more powerful than what can be accomplished through interactive navigation. I present a vision of the capabilities that a query-based interoperation infrastructure should provide, and identify assumptions underlying, and requirements of, this vision. I then propose an architecture for query-based interoperation that includes a number of novel components of an information infrastructure for molecular biology. These components include: a knowledge base that describes relationships among the conceptualizations used in different biological databases; a module that can determine the DBs that are relevant to a particular query; a module...

