K2/Kleisli and GUS: Experiments in Integrated Access to Genomic Data Sources
, 2000
Cited by 52 (4 self)
The integration of heterogeneous data sources and software systems is a major issue in the biomedical community and several approaches have been explored: linking databases, "on-the-fly" integration through views, and integration through warehousing. In this paper we report on our experiences with two systems that were developed at the University of Pennsylvania: an integration system called K2, which has primarily been used to provide views over multiple external data sources and software systems; and a data warehouse called GUS which downloads, cleans, integrates and annotates data from multiple external data sources. Although the view and warehouse approaches each have their advantages, there is no clear "winner". Therefore, users must consider how the data is to be used, what the performance guarantees must be, and how much programmer time and expertise is available to choose the best strategy for a particular application.
Links: web programming without tiers
- In FMCO 2006, volume 4709 of LNCS
, 2007
Cited by 48 (3 self)
Links is a programming language for web applications that generates code for all three tiers of a web application from a single source, compiling into JavaScript to run on the client and into SQL to run on the database. Links supports rich clients running in what has been dubbed ‘Ajax’ style, and supports concurrent processes with statically-typed message passing. Links is scalable in the sense that session state is preserved in the client rather than the server, in contrast to other approaches such as Java Servlets or PLT Scheme. Client-side concurrency in JavaScript and transfer of computation between client and server are both supported by translation into continuation-passing style.
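To make the last sentence of this abstract concrete, here is a minimal sketch of continuation-passing style in Python. It is not Links code and not the Links translation; the function names (`square_cps`, `add_cps`, `sum_of_squares_cps`) are hypothetical and chosen only for illustration. Each function receives an extra argument `k`, "the rest of the computation", and passes its result to `k` instead of returning it to a caller.

```python
def square_cps(x, k):
    # Pass the square of x to the continuation k instead of returning it.
    return k(x * x)


def add_cps(x, y, k):
    # Pass the sum of x and y to the continuation k.
    return k(x + y)


def sum_of_squares_cps(a, b, k):
    # Each step names its result and hands control to the next step.
    def after_first(a2):
        def after_second(b2):
            return add_cps(a2, b2, k)
        return square_cps(b, after_second)
    return square_cps(a, after_first)


# The identity function serves as the final continuation.
result = sum_of_squares_cps(3, 4, lambda value: value)
print(result)  # 25
```

Because the remaining work is an ordinary value at every step, a runtime could in principle store it or hand it to another party, which is the general property that makes this style a plausible basis for scheduling and for moving a computation between hosts.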
GeneWays: a system for extracting, analyzing, visualizing, and integrating molecular pathway data
- Journal of Biomedical Informatics
, 2004
Cited by 34 (1 self)
The immense growth in the volume of research literature and experimental data in the field of molecular biology calls for efficient automatic methods to capture and store information. In recent years, several groups have worked on specific problems in this area, such as automated selection of articles pertinent to molecular biology, or automated extraction of information using natural-language processing, information visualization, and generation of specialized knowledge bases for molecular biology. GeneWays is an integrated system that combines several such subtasks. It analyzes interactions between molecular substances, drawing on multiple sources of information to infer a consensus view of molecular networks. GeneWays is designed as an open platform, allowing researchers to query, review, and critique stored information.
Modal Types for Mobile Code
, 2008
Cited by 13 (0 self)
In this dissertation I argue that modal type systems provide an elegant and practical means for controlling local resources in spatially distributed computer programs. A distributed program is one that executes in multiple physical or logical places. It usually does so because those places have local resources that can only be used in those locations. Such resources can include processing power, proximity to data, hardware, or the physical presence of a user. Programmers that write distributed applications therefore need to be able to reason about the places in which their programs will execute. This work provides an elegant and practical way to think about such programs in the form of a type system derived from modal logic. Modal logic allows for reasoning about truth from multiple simultaneous perspectives. These perspectives, called "worlds," are identified with the locations in the distributed program. This enables the programming language to be simultaneously aware of the various hosts involved in a program, their
Interprocedural query extraction for transparent persistence
- In Proc. of ACM Conf. on Object-Oriented Programming, Systems, Languages and Applications (OOPSLA)
, 2008
Cited by 11 (3 self)
Transparent persistence promises to integrate programming languages and databases by allowing procedural programs to access persistent data with the same ease as non-persistent data. Transparent persistence is more likely to be adopted if it leverages the performance and transaction management of relational databases. Since creating good relational queries from procedural programs is hard, most practical systems compromise transparency to achieve performance. In this work we demonstrate the practical feasibility of a technique for extracting relational queries from object-oriented programs. A program analysis derives query structure and conditions across methods that access persistent data. The system combines static analysis and runtime query composition to handle procedures that return persistent values. Our prototype Java compiler implements the analysis, and handles recursion and parameterized queries. We evaluate the effectiveness of the optimization on the OO7 and TORPEDO benchmarks, showing that automatic optimizations are in some cases as efficient as hand-tuned code.
The Functional Guts of the Kleisli Query System
- SIGPLAN Notices
, 2000
Cited by 10 (2 self)
Kleisli is a modern data integration system that has made a significant impact on bioinformatics data integration. The primary query language provided by Kleisli is called CPL, which is a functional query language whose surface syntax is based on the comprehension syntax. Kleisli is itself implemented using the functional language SML. This paper describes the influence of functional programming research that benefits the Kleisli system, especially the less obvious ones at the implementation level.
1 Introduction
The Kleisli system [14] is an advanced broad-scale integration technology that has proved useful in the bioinformatics arena. Many bioinformatics problems require access to data sources that are high in volume, highly heterogeneous and complex, constantly evolving, and geographically dispersed. Solutions to these problems usually involve multiple carefully sequenced steps and require information to be passed smoothly between the steps. Kleisli is designed to handle these req...
InfoGrid: Providing Information Integration for Knowledge Discovery
- Information Sciences
, 2003
Cited by 9 (1 self)
Many scientific experiments produce large amounts of data using high-throughput devices. In order to analyse this type of data, Knowledge Discovery systems are required. However, generic laboratory systems do not provide any contextual information about the system that is being studied. In these situations, Knowledge Discovery can be aided and validated by the use of information integration tools. In this paper, we introduce InfoGrid, a data integration middleware engine designed to operate under a Grid framework. It focuses on providing information access services and offers all users a query system which is able to retain the familiarity with their specific scientific applications while being diverse, flexible and open at the same time. The assumption there is that defining a common language for all queries is not desirable.
Technologies for Integrating Biological Data
- Briefings in Bioinformatics
, 2002
Cited by 7 (0 self)
The process of building a new database relevant to some field of study in biomedicine involves transforming, integrating, and cleansing multiple data sources, as well as adding new material and annotations. We review in this paper some of the requirements of a general solution to this data integration problem. We survey several representative technologies and approaches to data integration in biomedicine. Then we highlight some interesting features that separate the more general data integration technologies from the more specialized ones.
Comprehensive comprehensions: comprehensions with “order by” and “group by”
, 2007
Cited by 7 (1 self)
We propose an extension to list comprehensions that makes it easy to express the kind of queries one would write in SQL using ORDER BY, GROUP BY, and LIMIT. Our extension adds expressive power to comprehensions, and generalises the SQL constructs that inspired it. Moreover, it is easy to implement, using simple desugaring rules.
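For readers unfamiliar with the problem this abstract addresses, the sketch below shows what a GROUP BY / ORDER BY / LIMIT query looks like when written with ordinary Python comprehensions and standard-library helpers. It is not the extension proposed in the paper; it only illustrates how grouping, ordering, and limiting end up outside the comprehension in a language without such an extension. The `employees` data and the function name `total_salary_by_dept` are hypothetical.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample rows: (name, department, salary).
employees = [
    ("Ada", "Eng", 120),
    ("Bob", "Ops", 80),
    ("Cy", "Eng", 100),
    ("Di", "Ops", 90),
    ("Eve", "Sales", 70),
]


def total_salary_by_dept(rows, limit):
    # GROUP BY department: itertools.groupby only merges adjacent items,
    # so the rows are sorted on the grouping key first.
    by_dept = sorted(rows, key=itemgetter(1))
    totals = [
        (dept, sum(salary for _, _, salary in group))
        for dept, group in groupby(by_dept, key=itemgetter(1))
    ]
    # ORDER BY total DESC, then LIMIT.
    return sorted(totals, key=itemgetter(1), reverse=True)[:limit]


top_two = total_salary_by_dept(employees, 2)
print(top_two)  # [('Eng', 220), ('Ops', 170)]
```

Only the aggregation step is a comprehension here; the sort, the grouping, and the slice are separate calls wrapped around it, which is the kind of awkwardness an in-comprehension "order by" and "group by" is meant to remove.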
The Kleisli Approach to Data Transformation and Integration
, 2001
Cited by 2 (0 self)
Kleisli is a data transformation and integration system that can be used for any application where the data is typed, but has proven especially useful for bioinformatics applications. It extends the conventional flat relational data model supported by the query language SQL to a complex object data model supported by the collection programming language CPL. It also opens up the closed nature of commercial relational data management systems to an easily extensible system that performs complex transformations on autonomous data sources that are heterogeneous and geographically dispersed. This paper describes some implementation details and example applications of Kleisli.
1 Introduction
The Kleisli system [14, 32, 33] is an advanced broad-scale integration technology that has proven very useful in the bioinformatics arena. Many bioinformatics problems require access to data sources that are large, highly heterogeneous and complex, constantly evolving, and geographically dispersed...
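The contrast between a flat relational model and a complex object model can be illustrated with a small Python sketch. This is not CPL and does not use any Kleisli API; the records, field names, and the function `links_for` are hypothetical. Each record carries a collection-valued field directly, where a flat relational schema would need a second table and a join, and a comprehension queries the nested structure in one expression.

```python
# Hypothetical nested records: each gene holds a list of cross-database
# links as a field of the record itself.
genes = [
    {"symbol": "G1", "organism": "human",
     "links": [{"db": "A", "id": "a1"}, {"db": "B", "id": "b7"}]},
    {"symbol": "G2", "organism": "mouse",
     "links": [{"db": "A", "id": "a2"}]},
    {"symbol": "G3", "organism": "human",
     "links": []},
]


def links_for(records, organism, db):
    # Outer generator ranges over records, inner generator over the
    # nested collection; the filters play the role of WHERE conditions.
    return [
        (gene["symbol"], link["id"])
        for gene in records
        if gene["organism"] == organism
        for link in gene["links"]
        if link["db"] == db
    ]


print(links_for(genes, "human", "A"))  # [('G1', 'a1')]
```

The same comprehension shape works whether the nested collection comes from a local structure, as here, or from a wrapper around an external source, which is one reason comprehension-based query languages suit heterogeneous data.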

