Incremental Maintenance of Views with Duplicates
Abstract

We study the problem of efficient maintenance of materialized views that may contain duplicates. This problem is particularly important when queries against such views involve aggregate functions, which need duplicates to produce correct results. Unlike most work on the view maintenance problem that is based on an algorithmic approach, our approach is algebraic and based on equational reasoning. This approach has a number of advantages: it is robust and easily extendible to new language constructs, it produces output that can be used by query optimizers, and it simplifies correctness proofs. We use a natural extension of the relational algebra operations to bags (multisets) as our basic language. We present an algorithm that propagates changes from base relations to materialized views. This algorithm is based on reasoning about equivalence of bag-valued expressions. We prove that it is correct and preserves a certain notion of minimality that ensures that no unnecessary tuples are computed. Although it is generally only a heuristic that computing changes to the view rather than recomputing the view from scratch is more efficient, we prove results saying that under normal circumstances one should expect the change propagation algorithm to be significantly faster and more space efficient than complete recomputing of the view. We also show that our approach interacts nicely with aggregate functions, allowing their correct evaluation on views that change.
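The idea of propagating changes instead of recomputing a bag-valued view can be illustrated with a minimal sketch. It is not the algorithm from the paper: it assumes a view defined by a single selection over one base bag, models bags with Python's `collections.Counter`, and assumes the deleted tuples form a sub-bag of the base relation. The names `select`, `propagate_selection`, and `apply_change` are hypothetical.

```python
from collections import Counter

def select(pred, bag):
    """Bag selection: keep every tuple satisfying pred, with its multiplicity."""
    return Counter({t: n for t, n in bag.items() if pred(t)})

def propagate_selection(pred, deleted, inserted):
    """Changes to the view select(pred, R), given changes to R.

    Selection distributes over bag difference and additive union, so the
    view's change is just the selection applied to the base change.
    """
    return select(pred, deleted), select(pred, inserted)

def apply_change(bag, deleted, inserted):
    """Counter '-' drops non-positive counts (monus); '+' adds multiplicities."""
    return (bag - deleted) + inserted

# Base bag R with duplicates, and a materialized view over it.
R = Counter({("a", 1): 2, ("b", 5): 1, ("c", 7): 3})
pred = lambda t: t[1] > 3
view = select(pred, R)

# A change to R: one copy each of two tuples deleted, two copies of a new tuple inserted.
deleted = Counter({("c", 7): 1, ("a", 1): 1})
inserted = Counter({("d", 9): 2})

# Incremental path: touch only the change, not the whole base bag.
view_del, view_ins = propagate_selection(pred, deleted, inserted)
incremental = apply_change(view, view_del, view_ins)

# Reference path: recompute the view from scratch.
recomputed = select(pred, apply_change(R, deleted, inserted))
assert incremental == recomputed
```

Because multiplicities are preserved, an aggregate such as a sum over the second component gives the same answer on `incremental` as on `recomputed`.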
Principles of Programming with Complex Objects and Collection Types
Theoretical Computer Science, 1995
Abstract

We present a new principle for the development of database query languages: that the primitive operations should be organized around types. Viewing a relational database as consisting of sets of records, this principle dictates that we should investigate separately operations for records and sets. There are two immediate advantages of this approach, which is partly inspired by basic ideas from category theory. First, it provides a language for structures in which record and set types may be freely combined: nested relations or complex objects. Second, the fundamental operations for sets are closely related to those for other "collection types" such as bags or lists, and this suggests how database languages may be uniformly extended to these new types. The most general operation on sets, that of structural recursion, is one in which not all programs are well-defined. In looking for limited forms of this operation that always give rise to well-defined operations, we find a number of close ...
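Why structural recursion on sets is not always well-defined can be seen in a small sketch. It assumes the "insert" presentation of sets, where a set is built by inserting elements one at a time into the empty set, and it models a presentation as a Python list. The helper `sri` and the two example operators are hypothetical names chosen for illustration.

```python
from functools import reduce

def sri(e, i, presentation):
    """Structural recursion on insert: i(x1, i(x2, ... i(xn, e)))."""
    return reduce(lambda acc, x: i(x, acc), reversed(presentation), e)

# Two different insertion sequences that denote the same set {1, 2}.
p1 = [1, 2]
p2 = [2, 1, 1]
assert set(p1) == set(p2)

# "Add one per insertion" is neither idempotent nor safe under repeated inserts,
# so the result depends on the presentation rather than on the set.
count = lambda x, acc: acc + 1
count_results = (sri(0, count, p1), sri(0, count, p2))

# "Keep the larger value" is commutative and idempotent,
# so every presentation of the same set gives the same answer.
largest = lambda x, acc: max(x, acc)
max_results = (sri(0, largest, p1), sri(0, largest, p2))

assert count_results[0] != count_results[1]
assert max_results[0] == max_results[1]
```

The second operator is the kind of restricted form that always yields a well-defined operation on sets; on lists or bags the first operator would be unproblematic.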
Towards Tractable Algebras for Bags
1993
Abstract

Bags, i.e. sets with duplicates, are often used to implement relations in database systems. In this paper, we study the expressive power of algebras for manipulating bags. The algebra we present is a simple extension of the nested relation algebra. Our aim is to investigate how the use of bags in the language extends its expressive power, and increases its complexity. We consider two main issues, namely (i) the impact of the depth of bag nesting on the expressive power, and (ii) the complexity and the expressive power induced by the algebraic operations. We show that the bag algebra is more expressive than the nested relation algebra (at all levels of nesting), and that the difference may be subtle. We establish a hierarchy based on the structure of algebra expressions. This hierarchy is shown to be highly related to the properties of the powerset operator. Invited to a special issue of the Journal of Computer and System Sciences selected from ACM Princ. of Database Systems,...
Counting Quantifiers, Successor Relations, and Logarithmic Space
Journal of Computer and System Sciences
Abstract

Given a successor relation S (i.e., a directed line graph), and given two distinguished points s and t, the problem ORD is to determine whether s precedes t in the unique ordering defined by S. We show that ORD is L-complete (via quantifier-free projections). We then show that first-order logic with counting quantifiers, a logic that captures TC^0 ([BIS90]) over structures with a built-in total ordering, cannot express ORD. Our original proof of this in the conference version of this paper ([Ete95]) employed an Ehrenfeucht-Fraïssé game for first-order logic with counting ([IL90]). Here we show how the result follows from a more general one obtained independently by Nurmonen [Nur96]. We then show that an appropriately modified version of the EF game is "complete" for the logic with counting in the sense that it provides a necessary and sufficient condition for expressibility in the logic. We observe that the L-complete problem ORD is essentially sparse if we ignore reorderings of v...
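The problem ORD itself is easy to state in code. The sketch below assumes the successor relation is given as a Python dict mapping each point to its successor (the last point has no entry), and it reads "precedes" as strict; the function name `precedes` is hypothetical. It only illustrates the problem statement, not the completeness or inexpressibility arguments.

```python
def precedes(succ, s, t):
    """Return True if s strictly precedes t in the ordering defined by succ.

    Walks the line graph from s using a single 'current point' pointer,
    which is why the problem can be decided in logarithmic space.
    """
    node = s
    while node in succ:
        node = succ[node]
        if node == t:
            return True
    return False

# Successor relation for the ordering c, a, d, b (stored in no particular order).
succ = {"d": "b", "c": "a", "a": "d"}

a_before_b = precedes(succ, "a", "b")
b_before_a = precedes(succ, "b", "a")
a_before_a = precedes(succ, "a", "a")
```

The walk terminates because a successor relation on a finite line graph has no cycles.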
The Power of Languages for the Manipulation of Complex Values
VLDB Journal, 1995
Abstract

Various models and languages for describing and manipulating hierarchically structured data have been proposed. Algebraic, calculus-based, and logic-programming oriented languages have all been considered. This article presents a general model for complex values (i.e., values with hierarchical structures), and languages for it based on the three paradigms. The algebraic language generalizes those presented in the literature; it is shown to be related to the functional style of programming advocated by Backus (1978). The notion of domain independence (from relational databases) is defined, and syntactic restrictions (referred to as safety conditions) on calculus queries are formulated to guarantee domain independence. The main results are: The domain-independent calculus, the safe calculus, the algebra, and the logic-programming oriented language have equivalent expressive power. In particular, recursive queries, such as the transitive closure, can be expressed in each of the languages. For this result, the algebra needs the powerset operation. A more restricted version of safety is presented, such that the restricted safe calculus is equivalent to the algebra without the powerset. The results are extended to the case where arbitrary functions and predicates are used in the languages. Key Words. Database, query language, complex value, complex object, database model.
Local Properties of Query Languages
Abstract

... predetermined portion of the input. Examples include all relational calculus queries. We start by proving a general result describing outputs of local queries. This result leads to many easy inexpressibility proofs for local queries. We then consider a closely related property, namely, the bounded degree property. It describes the outputs of local queries on structures that locally look "simple." Every query that is local is shown to have the bounded degree property. Since every relational calculus (first-order) query is local, the general results proved for local queries can be viewed as "off-the-shelf" strategies for proving inexpressibility results, which are often easier to apply than Ehrenfeucht-Fraïssé games. We also show that some generalizations of the bounded degree property that were conjectured to hold, fail for relational calculus. We then prove that the language obtained from relational calculus by adding grouping and aggregates, which is essentially plain SQL, has the bounded degree property, thus answering a question that has been open for several years. Consequently, first-order queries with Härtig or Rescher quantifiers also have the bounded degree property. Finally, we apply our results to incremental maintenance of views, and show that SQL and relational calculus are incapable of maintaining the ...
Normal Forms And Conservative Extension Properties For Query Languages Over Collection Types
Journal of Computer and System Sciences, 1995
Abstract

Strong normalization results are obtained for a general language for collection types. An induced normal form for sets and bags is then used to show that the class of functions whose input has height (that is, the maximal depth of nestings of sets/bags/lists in the complex object) at most i and output has height at most o definable in a nested relational query language without the powerset operator is independent of the height of intermediate expressions used. Our proof holds regardless of whether the language is used for querying sets, bags, or lists, even in the presence of variant types. Moreover, the normal forms are useful in a general approach to query optimization. Paredaens and Van Gucht proved a similar result for the special case when i = o = 1. Their result is complemented by Hull and Su, who demonstrated the failure of independence when the powerset operator is present and i = o = 1. The theorem of Hull and Su was generalized to all i and o by Grumbach and Vianu. Our result generali...
Constraint Databases: A Survey
Semantics in Databases, number 1358 in LNCS, 1998
Abstract

Constraint databases generalize relational databases by finitely representable infinite relations. This paper surveys the state of the art in constraint databases: known results, remaining open problems and current research directions. The paper also describes a new algebra for databases with integer order constraints and a complexity analysis of evaluating queries in this algebra. In memory of Paris C. Kanellakis. 1 Introduction. There is a growing interest in recent years among database researchers in constraint databases, which are a generalization of relational databases by finitely representable infinite relations. Constraint databases are parametrized by the type of constraint domains and constraint used. The good news is that for many parameters constraint databases leave intact most of the fundamental assumptions of the relational database framework proposed by Codd. In particular, 1. Constraint databases can be queried by constraint query languages that (a) have a semantics ba...
Verifiable Properties of Database Transactions
Information and Computation, 1998
Abstract

Michael Benedikt, Timothy Griffin, Leonid Libkin. Bell Laboratories, 600 Mountain Avenue, Murray Hill, NJ 07974, USA. Email: {benedikt, griffin, libkin}@research.att.com. It is often necessary to ensure that database transactions preserve integrity constraints that specify valid database states. While it is possible to monitor for violations of constraints at runtime, rolling back transactions when violations are detected, it is preferable to verify correctness statically, before transactions are executed. This can be accomplished if we can verify transaction safety with respect to a set of constraints by means of calculating weakest preconditions. We study properties o...