Combinators for bi-directional tree transformations: A linguistic approach to the view update problem
- In ACM SIGPLAN–SIGACT Symposium on Principles of Programming Languages (POPL)
, 2005
Cited by 94 (13 self)
We propose a novel approach to the view update problem for tree-structured data: a domain-specific programming language in which all expressions denote bi-directional transformations on trees. In one direction, these transformations, dubbed lenses, map a “concrete” tree into a simplified “abstract view”; in the other, they map a modified abstract view, together with the original concrete tree, to a correspondingly modified concrete tree. Our design emphasizes both robustness and ease of use, guaranteeing strong well-behavedness and totality properties for well-typed lenses. We identify a natural mathematical space of well-behaved bi-directional transformations over arbitrary structures, study definedness and continuity in this setting, and state a precise connection with the classical theory of “update translation under a constant complement” from databases. We then instantiate this semantic framework in the form of a collection of lens combinators that can be assembled to describe transformations on trees. These combinators include familiar constructs from functional programming (composition, mapping, projection, conditionals, recursion) together with some novel primitives for manipulating trees (splitting, pruning, copying, merging, etc.). We illustrate the expressiveness of these combinators by developing a number of bi-directional list-processing transformations as derived forms. An extended example shows how our combinators can be used to define a lens that translates between a native HTML representation of browser bookmarks and a generic abstract bookmark format.
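The two directions of a lens can be illustrated with a small Python sketch. This is not the paper's language or its combinator library: trees are modeled as nested dicts, and the names `Lens`, `child`, `get_put_holds`, and `put_get_holds` are hypothetical, chosen only for illustration. The two checks correspond to the round-trip laws usually called GetPut and PutGet, which is one common way of stating well-behavedness.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Lens:
    """A 'get' (concrete -> abstract) paired with a 'putback'
    ((modified abstract, original concrete) -> modified concrete)."""
    get: Callable[[Any], Any]
    putback: Callable[[Any, Any], Any]

    def compose(self, inner: "Lens") -> "Lens":
        """Sequential composition: self is applied first, then inner."""
        return Lens(
            get=lambda c: inner.get(self.get(c)),
            putback=lambda a, c: self.putback(inner.putback(a, self.get(c)), c),
        )


def child(key: str) -> Lens:
    """Hypothetical primitive: the view is the subtree stored under `key`."""
    return Lens(
        get=lambda c: c[key],
        putback=lambda a, c: {**c, key: a},  # rebuild c with the new subtree
    )


def get_put_holds(lens: Lens, c: Any) -> bool:
    # Putting back an unmodified view must leave the concrete tree unchanged.
    return lens.putback(lens.get(c), c) == c


def put_get_holds(lens: Lens, a: Any, c: Any) -> bool:
    # The view of the updated concrete tree must be the view that was put back.
    return lens.get(lens.putback(a, c)) == a


# Made-up example data: a concrete tree with a detail hidden from the view.
bookmarks = {"folder": {"name": "news", "url": "http://example.org"},
             "meta": {"added": 2005}}
name_lens = child("folder").compose(child("name"))
updated = name_lens.putback("daily", bookmarks)
```

Here `name_lens.get(bookmarks)` yields only the folder name, while `putback` writes a changed name into a copy of the original tree and leaves the `url` and `meta` entries untouched.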
Cache-and-Query for Wide Area Sensor Databases
, 2003
Cited by 72 (18 self)
Webcams, microphones, pressure gauges and other sensors provide exciting new opportunities for querying and monitoring the physical world. In this paper we focus on querying wide area sensor databases, containing (XML) data derived from sensors spread over tens to thousands of miles. We present the first scalable system for executing XPATH queries on such databases. The system maintains the logical view of the data as a single XML document, while physically the data is fragmented across any number of host nodes. For scalability, sensor data is stored close to the sensors, but can be cached elsewhere as dictated by the queries (auto-tuning). Our design enables self-starting distributed queries that jump directly to the lowest common ancestor of the query result, dramatically reducing query response times. We present a novel query-evaluate-gather technique (using XSLT) for detecting (1) which data in a local database fragment is part of the query result, and (2) how to gather the missing parts. We define partitioning and cache invariants that ensure that even partial matches on cached data are exploited and that correct answers are returned, despite our dynamic query-driven caching. Experimental results demonstrate that our techniques dramatically increase query throughputs and decrease query response times in wide area sensor databases.
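The lowest-common-ancestor idea can be illustrated in isolation. The sketch below assumes each result node is identified by its path of element IDs from the document root, so the lowest common ancestor is simply the longest common prefix of those paths. The representation, the function name, and the place names are hypothetical; the sketch does not reproduce how the system itself derives the starting node from a query.

```python
from typing import Sequence, Tuple

Path = Tuple[str, ...]  # element IDs from the root down to a node


def lowest_common_ancestor(paths: Sequence[Path]) -> Path:
    """Return the longest common prefix of the given root-to-node paths."""
    if not paths:
        return ()
    prefix = []
    # zip stops at the shortest path, so the prefix never overruns any path.
    for steps in zip(*paths):
        if all(step == steps[0] for step in steps):
            prefix.append(steps[0])
        else:
            break
    return tuple(prefix)


# Made-up example: two result nodes in different neighborhoods of one city.
results = [
    ("usa", "pa", "pittsburgh", "oakland", "block1"),
    ("usa", "pa", "pittsburgh", "shadyside", "block7"),
]
start_node = lowest_common_ancestor(results)
```

A query whose results all lie below `start_node` could begin evaluation at that node rather than at the root, which is the intuition behind starting a distributed query at the lowest common ancestor.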
Query Rewriting for Semistructured Data
Cited by 60 (9 self)
We address the problem of query rewriting for TSL, a language for querying semistructured data. We develop and present an algorithm that, given a semistructured query q and a set of semistructured views V, finds rewriting queries, i.e., queries that access the views and produce the same result as q. Our algorithm is based on appropriately generalizing containment mappings, the chase, and query composition (techniques that were developed for structured, relational data). We also develop an algorithm for equivalence checking of TSL queries. We show that the algorithm is sound and complete for TSL, i.e., it always finds every non-trivial TSL rewriting query of q, and we discuss its complexity. We extend the rewriting algorithm to use some forms of structural constraints (such as DTDs) and find more opportunities for query rewriting.
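For readers unfamiliar with containment mappings, the following Python sketch shows the classical relational notion that the abstract says is generalized. It does not implement the TSL algorithm. It assumes conjunctive queries written as a head plus a list of body atoms, with variables marked by a leading `?`; this encoding and the function names are hypothetical. A containment mapping from one query to another sends variables to terms so that the head maps to the head and every body atom maps to some body atom of the target, and its existence implies that the target query is contained in the source query.

```python
from itertools import product
from typing import Dict, List, Optional, Tuple

Atom = Tuple[str, Tuple[str, ...]]          # (predicate, arguments)
Query = Tuple[Tuple[str, ...], List[Atom]]  # (head arguments, body atoms)


def is_var(term: str) -> bool:
    """Assumed convention: variables start with '?', everything else is a constant."""
    return term.startswith("?")


def find_containment_mapping(q_from: Query, q_to: Query) -> Optional[Dict[str, str]]:
    """Brute-force search for a containment mapping from q_from to q_to.

    If one exists, q_to is contained in q_from. The search is exponential
    in the number of variables, so this is only suitable for tiny queries.
    """
    head_from, body_from = q_from
    head_to, body_to = q_to
    variables = sorted(
        {t for _, args in body_from for t in args if is_var(t)}
        | {t for t in head_from if is_var(t)}
    )
    targets = sorted({t for _, args in body_to for t in args} | set(head_to))
    atoms_to = set(body_to)
    for image in product(targets, repeat=len(variables)):
        h = dict(zip(variables, image))
        # Constants are not in h, so they map to themselves.
        mapped_head = tuple(h.get(t, t) for t in head_from)
        if mapped_head != tuple(head_to):
            continue
        if all((pred, tuple(h.get(t, t) for t in args)) in atoms_to
               for pred, args in body_from):
            return h
    return None


# q1: nodes that start a path of length two; q2: nodes with any outgoing edge.
q1: Query = (("?x",), [("edge", ("?x", "?y")), ("edge", ("?y", "?z"))])
q2: Query = (("?x",), [("edge", ("?x", "?y"))])
```

With these two queries, a mapping from `q2` to `q1` exists, so every answer of `q1` is also an answer of `q2`; no mapping exists in the other direction.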
K2/Kleisli and GUS: Experiments in Integrated Access to Genomic Data Sources
, 2000
Cited by 52 (4 self)
The integration of heterogeneous data sources and software systems is a major issue in the biomedical community and several approaches have been explored: linking databases, "on-the-fly" integration through views, and integration through warehousing. In this paper we report on our experiences with two systems that were developed at the University of Pennsylvania: an integration system called K2, which has primarily been used to provide views over multiple external data sources and software systems; and a data warehouse called GUS which downloads, cleans, integrates and annotates data from multiple external data sources. Although the view and warehouse approaches each have their advantages, there is no clear "winner". Therefore, users must consider how the data is to be used, what the performance guarantees must be, and how much programmer time and expertise is available to choose the best strategy for a particular application.
Views in a Large Scale XML Repository
- In Proc. 27th Int. Conf. on Very Large Databases
, 2001
Cited by 49 (6 self)
We are interested in defining and querying views in a huge and highly heterogeneous XML repository (Web scale). In this context, view definitions are very large and there is no apparent limitation to their size. This raises interesting problems that we address in the paper: (i) how to distribute views over several machines without having a negative impact on the query translation process; (ii) how to quickly select the relevant part of a view given a query; (iii) how to minimize the cost of communicating potentially large queries to the machines where they will be evaluated.
Active Views for Electronic Commerce
, 1999
Cited by 47 (8 self)
Electronic commerce is emerging as a major Web-supported application. In this paper we argue that database technology can, and should, provide the backbone for a wide range of such applications. More precisely, we present here the ActiveViews system, which, relying on an extensive use of database features including views, active rules (triggers), and enhanced mechanisms for notification, access control and logging/tracing of users' activities, provides the needed basis for electronic commerce. Based on
Query Optimization for Semistructured Data
, 1997
Cited by 23 (7 self)
With the emerging prevalence of semistructured data -- data that may be irregular or incomplete -- it is important to develop efficient query processing techniques for such data. This paper describes the query processor of Lore, a DBMS for semistructured data, and focuses particularly on the cost-based query optimization techniques we have developed and implemented for a semistructured environment. While all of the usual problems associated with cost-based query optimization apply to semistructured data as well, a number of additional problems arise, such as vastly different query execution strategies for different semistructured databases, more complicated notions of database statistics, and novel uses of indexing. We introduce very flexible logical query plans that can be transformed into a wide variety of physical plans, define appropriate database statistics and a cost model, and describe plan enumeration including heuristics for reducing the search space. Our optimizer is fully implemented for most of the Lore query language, and preliminary performance results are reported.

