TelegraphCQ: Continuous Dataflow Processing for an Uncertain World
2003
Cited by 328 (18 self)
Increasingly pervasive networks are leading towards a world where data is constantly in motion. In such a world, conventional techniques for query processing, which were developed under the assumption of a far more static and predictable computational environment, will not be sufficient. Instead, query processors based on adaptive dataflow will be necessary. The Telegraph project has developed a suite of novel technologies for continuously adaptive query processing. The next generation Telegraph system, called TelegraphCQ, is focused on meeting the challenges that arise in handling large streams of continuous queries over high-volume, highly-variable data streams. In this paper, we describe the system architecture and its underlying technology, and report on our ongoing implementation effort, which leverages the PostgreSQL open source code base. We also discuss open issues and our research agenda.
Online Aggregation
1997
Cited by 284 (45 self)
Aggregation in traditional database systems is performed in batch mode: a query is submitted, the system processes a large volume of data over a long period of time, and, eventually, the final answer is returned. This archaic approach is frustrating to users and has been abandoned in most other areas of computing. In this paper we propose a new online aggregation interface that permits users to both observe the progress of their aggregation queries and control execution on the fly. After outlining usability and performance requirements for a system supporting online aggregation, we present a suite of techniques that extend a database system to meet these requirements. These include methods for returning the output in random order, for providing control over the relative rate at which different aggregates are computed, and for computing running confidence intervals. Finally, we report on an initial implementation of online aggregation in POSTGRES.
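The idea of a running confidence interval can be illustrated with a short sketch. This is not the paper's implementation: it assumes the values arrive in random order, uses a large-sample (normal approximation) interval for AVG, and ignores any finite-population correction. The function name `running_avg_with_ci` and the parameter `z` are hypothetical.

```python
import math


def running_avg_with_ci(values, z=1.96):
    """Yield (n, mean, half_width) after each value is consumed.

    Assumes `values` is delivered in random order. The half width is
    z * (sample standard error), a large-sample approximation; z=1.96
    corresponds to roughly 95% confidence.
    """
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations (Welford's update)
    for x in values:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
        if n > 1:
            sample_variance = m2 / (n - 1)
            half_width = z * math.sqrt(sample_variance / n)
        else:
            half_width = float("inf")  # no variance estimate from one value
        yield n, mean, half_width
```

A caller could stop consuming the generator as soon as the half width drops below a tolerance it considers acceptable, which mirrors the "observe progress and control execution" interface described in the abstract.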
Ripple Joins for Online Aggregation
Cited by 145 (11 self)
We present a new family of join algorithms, called ripple joins, for online processing of multi-table aggregation queries in a relational database management system (DBMS). Such queries arise naturally in interactive exploratory decision-support applications. Traditional offline join algorithms are designed to minimize the time to completion of the query. In contrast, ripple joins are designed to minimize the time until an acceptably precise estimate of the query result is available, as measured by the length of a confidence interval. Ripple joins are adaptive, adjusting their behavior during processing in accordance with the statistical properties of the data. Ripple joins also permit the user to dynamically trade off the two key performance factors of online aggregation: the time between successive updates of the running aggregate, and the amount by which the confidence-interval length decreases at each update. We show how ripple joins can be implemented in an existing DBMS using iterators, and we give an overview of the methods used to compute confidence intervals and to adaptively optimize the ripple join "aspect-ratio" parameters. In experiments with an initial implementation of our algorithms in the POSTGRES DBMS, the time required to produce reasonably precise online estimates was up to two orders of magnitude smaller than the time required for the best offline join algorithms to produce exact answers.
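A minimal sketch of the simplest ("square", aspect ratio 1:1) variant may help make the idea concrete. It is an illustration only, not the paper's iterator implementation: it assumes both inputs are in-memory sequences of equal length, already in random order, and it omits the confidence-interval computation and the adaptive aspect-ratio tuning. The names `square_ripple_join` and `match` are hypothetical.

```python
def square_ripple_join(r, s, match):
    """Yield matching (r_tuple, s_tuple) pairs incrementally.

    At each step one new tuple is drawn from each input. The new R tuple
    is joined with every S tuple seen so far, and the new S tuple with
    every R tuple seen so far, so after k steps exactly the k x k
    "square" of pairs has been examined once.
    """
    seen_r, seen_s = [], []
    for new_r, new_s in zip(r, s):
        # New R tuple against previously seen S tuples.
        for old_s in seen_s:
            if match(new_r, old_s):
                yield new_r, old_s
        # Previously seen R tuples against the new S tuple.
        for old_r in seen_r:
            if match(old_r, new_s):
                yield old_r, new_s
        # The corner of the square: the two new tuples together.
        if match(new_r, new_s):
            yield new_r, new_s
        seen_r.append(new_r)
        seen_s.append(new_s)
```

Because results are produced after every step, a running aggregate over the pairs seen so far can be refreshed continuously instead of waiting for the full join to complete.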
Query optimization in the presence of limited access patterns
In Proc. of ACM SIGMOD Conf. on Management of Data, 1999
Cited by 77 (10 self)
We consider the problem of query optimization in the presence of limitations on access patterns to the data (i.e., when one must provide values for one of the attributes of a relation in order to obtain tuples). We show that in the presence of limited access patterns we must search a space of annotated query plans, where the annotations describe the inputs that must be given to the plan. We describe a theoretical and experimental analysis of the resulting search space and a novel query optimization algorithm that is designed to perform well under the different conditions that may arise. The algorithm searches the set of annotated query plans, pruning invalid and non-viable plans as early as possible in the search space, and it also uses a best-first search strategy in order to produce a first complete plan early in the search. We describe experiments to illustrate the performance of our algorithm.
Optimization techniques for queries with expensive methods
ACM Transactions on Database Systems (TODS), 1998
Cited by 53 (3 self)
Object-Relational database management systems allow knowledgeable users to define new data types, as well as new methods (operators) for the types. This flexibility produces an attendant complexity, which must be handled in new ways for an Object-Relational database management system to be efficient. In this paper we study techniques for optimizing queries that contain time-consuming methods. The focus of traditional query optimizers has been on the choice of join methods and orders; selections have been handled by "pushdown" rules. These rules apply selections in an arbitrary order before as many joins as possible, using the assumption that selection takes no time. However, users of Object-Relational systems can embed complex methods in selections. Thus selections may take significant amounts of time, and the query optimization model must be enhanced. In this paper, we carefully define a query cost framework that incorporates both selectivity and cost estimates for selections. We develop an algorithm called Predicate Migration, and prove that it produces optimal plans for queries with expensive methods. We then describe our implementation of Predicate Migration in the commercial Object-Relational database management system Illustra, and discuss practical issues that affect our earlier assumptions. We compare Predicate Migration to a variety of simpler optimization techniques, and demonstrate that Predicate Migration is the best general solution to date. The alternative techniques we present may be useful for constrained workloads.
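To see why both selectivity and cost matter when ordering selections, consider the following sketch. It covers only the simpler sub-problem of ordering independent selections on a single input, not the full Predicate Migration algorithm, which also places selections relative to joins. It assumes each predicate has a known selectivity and a positive per-tuple cost, and it orders predicates by the well-known rank metric (selectivity - 1) / cost. The class `Predicate` and the functions `rank`, `expected_cost`, and `order_by_rank` are hypothetical names.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Predicate:
    name: str
    selectivity: float  # fraction of input tuples that pass, in [0, 1]
    cost: float         # per-tuple evaluation cost, assumed > 0


def rank(p):
    """Rank metric: more negative means 'apply earlier'."""
    return (p.selectivity - 1.0) / p.cost


def expected_cost(order):
    """Expected per-input-tuple cost of applying predicates in this order."""
    total, surviving = 0.0, 1.0
    for p in order:
        total += surviving * p.cost   # only surviving tuples pay this cost
        surviving *= p.selectivity    # assumes independent predicates
    return total


def order_by_rank(preds):
    """Return the predicates sorted by ascending rank."""
    return sorted(preds, key=rank)


preds = [
    Predicate("cheap_weak", 0.9, 1.0),
    Predicate("costly_strong", 0.1, 100.0),
    Predicate("cheap_strong", 0.2, 2.0),
]
best = order_by_rank(preds)
brute_force_min = min(expected_cost(o) for o in permutations(preds))
```

In this example the rank ordering places `costly_strong` last even though it is the most selective predicate, because its per-tuple cost outweighs the tuples it eliminates; the brute-force minimum over all orderings has the same expected cost.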
WSQ/DSQ: A Practical Approach for Combined Querying of Databases and the Web
In SIGMOD, 2000
Cited by 52 (2 self)
We present WSQ/DSQ (pronounced "wisk-disk"), a new approach for combining the query facilities of traditional databases with existing search engines on the Web. WSQ, for Web-Supported (Database) Queries, leverages results from Web searches to enhance SQL queries over a relational database. DSQ, for Database-Supported (Web) Queries, uses information stored in the database to enhance and explain Web searches. This paper focuses primarily on WSQ, describing a simple, low-overhead way to support WSQ in a relational DBMS, and demonstrating the utility of WSQ with a number of interesting queries and results. The queries supported by WSQ are enabled by two virtual tables, whose tuples represent Web search results generated dynamically during query execution. WSQ query execution may involve many high-latency calls to one or more search engines, during which the query processor is idle. We present a lightweight technique called asynchronous iteration that can be integrated easily into a standard sequential query processor to enable concurrency between query processing and multiple Web search requests. Asynchronous iteration has broader applications than WSQ alone, and it opens up many interesting query optimization issues. We have developed a prototype implementation of WSQ by extending a DBMS with virtual tables and asynchronous iteration; performance results are reported.
Form-Based Proxy Caching for Database-Backed Web Sites: Keywords and Functions
2008
Cited by 48 (3 self)
Web caching proxy servers are essential for improving web performance and scalability, and recent research has focused on making proxy caching work for database-backed web sites. In this paper, we explore a new proxy caching framework that exploits the query semantics of HTML forms. We identify two common classes of form-based queries from real-world database-backed web sites, namely, keyword-based queries and function-embedded queries. Using typical examples of these queries, we study two representative caching schemes within our framework: (i) traditional passive query caching, and (ii) active query caching, in which the proxy cache can service a request by evaluating a query over the contents of the cache. Results from our experimental implementation show that our form-based proxy is a general and flexible approach that efficiently enables active caching schemes for database-backed web sites. Furthermore, handling query containment at the proxy yields significant performance advantages over passive query caching, but extending the power of the active cache to do full semantic caching appears to be less generally effective.
Online Dynamic Reordering for Interactive Data Processing
In VLDB, 1999
Cited by 33 (11 self)
We present a pipelining, dynamically user-controllable reorder operator, for use in data-intensive applications. Allowing the user to reorder the data delivery on the fly increases the interactivity in several contexts such as online aggregation and large-scale spreadsheets; it allows the user to control the processing of data by dynamically specifying preferences for different data items based on prior feedback, so that data of interest is prioritized for early processing.
Reverse query processing
in the IDEA System". Int. Symp. on Advanced Database Technologies and Their Integration, 1994
Cited by 30 (13 self)
Traditionally, query processing gets a query and a database instance as input and returns the result of the query for that particular database instance. Reverse query processing (RQP) gets a query and a result as input and returns a possible database instance that could have produced that result for that query. Rather than making a closed world assumption, RQP makes an open world assumption. There are several applications for RQP; most notably, testing database applications and debugging database applications. This paper describes the formal framework of RQP and the design of a system, called SPQR (System for Processing Queries Reversely) that implements a reverse query processor for SQL.
Full-fledged algebraic XPath processing in Natix
In 21st International Conference on Data Engineering (ICDE’05). IEEE Computer Society, 2005
Cited by 25 (11 self)
We present the first complete translation of XPath into an algebra, paving the way for a comprehensive, state-of-the-art XPath (and later on, XQuery) compiler based on algebraic optimization techniques. Our translation includes all XPath features such as nested expressions, position-based predicates and node-set functions. The translated algebraic expressions can be executed using the proven, scalable, iterator-based approach, as we demonstrate in the form of a corresponding physical algebra in our native XML DBMS Natix. A first glance at performance results shows that even without further optimization of the expressions, we provide a competitive evaluation technique for XPath queries.

