Results 1 - 10
of
38
Answering Queries Using Views: A Survey
, 2000
"... The problem of answering queries using views is to find efficient methods of answering a query using a set of previously defined materialized views over the database, rather than accessing the database relations. The problem has recently received significant attention because of its relevance to a w ..."
Abstract
-
Cited by 395 (27 self)
- Add to MetaCart
The problem of answering queries using views is to find efficient methods of answering a query using a set of previously defined materialized views over the database, rather than accessing the database relations. The problem has recently received significant attention because of its relevance to a wide variety of data management problems. In query optimization, finding a rewriting of a query using a set of materialized views can yield a more efficient query execution plan. To support the separation of the logical and physical views of data, a storage schema can be described using views over the logical schema. As a result, finding a query execution plan that accesses the storage amounts to solving the problem of answering queries using views. Finally, the problem arises in data integration systems, where data sources can be described as precomputed views over a mediated schema. This article surveys the state of the art on the problem of answering queries using views, and synthesizes the disparate works into a coherent framework. We describe the different applications of the problem, the algorithms proposed to solve it and the relevant theoretical results.
Eddies: Continuously Adaptive Query Processing
- In SIGMOD
, 2000
"... In large federated and shared-nothing databases, resources can exhibit widely fluctuating characteristics. Assumptions made at the time a query is submitted will rarely hold throughout the duration of query processing. As a result, traditional static query optimization and execution techniques are i ..."
Abstract
-
Cited by 300 (19 self)
- Add to MetaCart
In large federated and shared-nothing databases, resources can exhibit widely fluctuating characteristics. Assumptions made at the time a query is submitted will rarely hold throughout the duration of query processing. As a result, traditional static query optimization and execution techniques are ineffective in these environments. In this paper we introduce a query processing mechanism called an eddy, which continuously reorders operators in a query plan as it runs. We characterize the moments of symmetry during which pipelined joins can be easily reordered, and the synchronization barriers that require inputs from different sources to be coordinated. By combining eddies with appropriate join algorithms, we merge the optimization and execution phases of query processing, allowing each tuple to have a flexible ordering of the query operators. This flexibility is controlled by a combination of fluid dynamics and a simple learning algorithm. Our initial implementation demonstrates prom...
Tinydb: An acquisitional query processing system for sensor networks
- ACM Trans. Database Syst
, 2005
"... We discuss the design of an acquisitional query processor for data collection in sensor networks. Acquisitional issues are those that pertain to where, when, and how often data is physically acquired (sampled) and delivered to query processing operators. By focusing on the locations and costs of acq ..."
Abstract
-
Cited by 295 (7 self)
- Add to MetaCart
We discuss the design of an acquisitional query processor for data collection in sensor networks. Acquisitional issues are those that pertain to where, when, and how often data is physically acquired (sampled) and delivered to query processing operators. By focusing on the locations and costs of acquiring data, we are able to significantly reduce power consumption over traditional passive systems that assume the a priori existence of data. We discuss simple extensions to SQL for controlling data acquisition, and show how acquisitional issues influence query optimization, dissemination, and execution. We evaluate these issues in the context of TinyDB, a distributed query processor for smart sensor devices, and show how acquisitional techniques can provide significant reductions in power consumption on our sensor devices. Categories and Subject Descriptors: H.2.3 [Database Management]: Languages—Query languages; H.2.4 [Database Management]: Systems—Distributed databases; query processing
Continuously Adaptive Continuous Queries over Streams
- In SIGMOD
, 2002
"... We present a continuously adaptive, continuous query (CACQ) implementation based on the eddy query processing framework. We show that our design provides significant performance benefits over existing approaches to evaluating continuous queries, not only because of its adaptivity, but also because o ..."
Abstract
-
Cited by 204 (6 self)
- Add to MetaCart
We present a continuously adaptive, continuous query (CACQ) implementation based on the eddy query processing framework. We show that our design provides significant performance benefits over existing approaches to evaluating continuous queries, not only because of its adaptivity, but also because of the aggressive crossquery sharing of work and space that it enables. By breaking the abstraction of shared relational algebra expressions, our Telegraph CACQ implementation is able to share physical operators -- both selections and join state -- at a very fine grain. We augment these features with a grouped-filter index to simultaneously evaluate multiple selection predicates. We include measurements of the performance of our core system, along with a comparison to existing continuous query approaches.
Adaptive Query Processing: Technology in Evolution
- IEEE DATA ENGINEERING BULLETIN
, 2000
"... As query engines are scaled and federated, they must cope with highly unpredictable and changeable environments. In the Telegraph project, we are attempting to architect and implement a continuously adaptive query engine suitable for global-area systems, massive parallelism, and sensor networks. To ..."
Abstract
-
Cited by 73 (9 self)
- Add to MetaCart
As query engines are scaled and federated, they must cope with highly unpredictable and changeable environments. In the Telegraph project, we are attempting to architect and implement a continuously adaptive query engine suitable for global-area systems, massive parallelism, and sensor networks. To set the stage for our research, we present a survey of prior work on adaptive query processing, focusing on three characterizations of adaptivity: the frequency of adaptivity, the effects of adaptivity, and the extent of adaptivity. Given this survey, we sketch directions for research in the Telegraph project.
Using State Modules for Adaptive Query Processing
- In ICDE
, 2003
"... We present a query architecture in which join operators are decomposed into their constituent data structures (State Modules, or SteMs), and dataflow among these SteMs is managed adaptively by an Eddy routing operator. Breaking the encapsulation of joins serves two purposes. First, it allows the Edd ..."
Abstract
-
Cited by 44 (9 self)
- Add to MetaCart
We present a query architecture in which join operators are decomposed into their constituent data structures (State Modules, or SteMs), and dataflow among these SteMs is managed adaptively by an Eddy routing operator. Breaking the encapsulation of joins serves two purposes. First, it allows the Eddy to observe multiple physical operations embedded in a join algorithm, allowing for better calibration and control of these operations. Second, the SteM on a relation serves as a shared materialization point, enabling multiple competing access methods to share results, which can be leveraged by multiple competing join algorithms. Our architecture extends prior work significantly, allowing continuously adaptive decisions for most major aspects of traditional query optimization: choice of access methods and join algorithms, ordering of operators, and choice of a query spanning tree. SteMs introduce...
Operator Scheduling in Data Stream Systems
- THE VLDB JOURNAL
, 2003
"... In many applications involving continuous data streams, data arrival is bursty and data rate fluctuates over time. Systems that seek to give rapid or real-time query responses in such an environment must be prepared to deal gracefully with bursts in data arrival without compromising system performan ..."
Abstract
-
Cited by 29 (1 self)
- Add to MetaCart
In many applications involving continuous data streams, data arrival is bursty and data rate fluctuates over time. Systems that seek to give rapid or real-time query responses in such an environment must be prepared to deal gracefully with bursts in data arrival without compromising system performance. We discuss one strategy for processing bursty streams --- adaptive, load-aware scheduling of query operators to minimize resource consumption during times of peak load. We show that the choice of an operator scheduling strategy can have significant impact on the run-time system memory usage as well as output latency. Our aim is to design a scheduling strategy that minimizes the maximum run-time system memory, while maintaining the output latency within prespecified bounds. We first present Chain scheduling, an operator scheduling strategy for data stream systems that is near-optimal in minimizing run-time memory usage for any collection of single-stream queries involving selections, projections, and foreignkey joins with stored relations. Chain scheduling also performs well for queries with sliding-window joins over multiple streams, and multiple queries of the above types. However, during bursts in input streams, when there is a buildup of unprocessed tuples, Chain scheduling may lead to high output latency. We study the online problem of minimizing maximum run-time memory, subject to a constraint on maximum latency. We present preliminary observations, negative results, and heuristics for this problem. A thorough
Partial Results for Online Query Processing
, 2002
"... Traditional query processors generate full, accurate query results, either in batch or in pipelined fashion. We argue that this strict model is too rigid for exploratory queries over diverse and distributed data sources, such as sources on the Internet. Instead, we propose a looser model of querying ..."
Abstract
-
Cited by 28 (3 self)
- Add to MetaCart
Traditional query processors generate full, accurate query results, either in batch or in pipelined fashion. We argue that this strict model is too rigid for exploratory queries over diverse and distributed data sources, such as sources on the Internet. Instead, we propose a looser model of querying in which a user submits a broad initial query outline, and the system continually generates partial result tuples that may contain values for only some of the output fields. The user can watch these partial results accumulate at the user interface, and accordingly refine the query by specifying their interest in di#erent kinds of partial results. After
Learning Response Time for WebSources using Query Feedback and Application in Query Optimization
- VLDB Journal
, 1999
"... The rapid growth of the Internet and support for interoperability protocols has increased the number of Web accessible sources, WebSources. Current optimization technology for wrapper mediator architectures needs to be extended to estimate the response time (delays) to access WebSources and to use t ..."
Abstract
-
Cited by 24 (6 self)
- Add to MetaCart
The rapid growth of the Internet and support for interoperability protocols has increased the number of Web accessible sources, WebSources. Current optimization technology for wrapper mediator architectures needs to be extended to estimate the response time (delays) to access WebSources and to use this delay in query optimization. In this paper, we present a Multi-Dimensional Table (MDT), a tool that is based on learning using query feedback from WebSources. We describe the MDT learning algorithms, and report on the MDT learning for WebSources. The MDT uses dimensions Time of day, Day, and Quantity of data, to learn response times from a particular WebSource, and to predict the expected response time (delay), and a confidence in this prediction, for some query. Experiment data was collected from several WebSources and analyzed, to determine those dimensions that were significant in estimating the response time for particular WebSources. Our research shows that we can improve the qualit...

