Incremental Maintenance of Views with Duplicates
Cited by 164 (8 self)
We study the problem of efficient maintenance of materialized views that may contain duplicates. This problem is particularly important when queries against such views involve aggregate functions, which need duplicates to produce correct results. Unlike most work on the view maintenance problem that is based on an algorithmic approach, our approach is algebraic and based on equational reasoning. This approach has a number of advantages: it is robust and easily extendible to new language constructs, it produces output that can be used by query optimizers, and it simplifies correctness proofs. We use a natural extension of the relational algebra operations to bags (multisets) as our basic language. We present an algorithm that propagates changes from base relations to materialized views. This algorithm is based on reasoning about equivalence of bag-valued expressions. We prove that it is correct and preserves a certain notion of minimality that ensures that no unnecessary tuples are computed. Although it is generally only a heuristic that computing changes to the view rather than recomputing the view from scratch is more efficient, we prove results saying that under normal circumstances one should expect the change propagation algorithm to be significantly faster and more space efficient than complete recomputing of the view. We also show that our approach interacts nicely with aggregate functions, allowing their correct evaluation on views that change.
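As a rough illustration of change propagation over bags, the sketch below maintains a view defined as a projection of a selection over one base relation. It is not the paper's algorithm: the helper names `select_project` and `propagate` are hypothetical, bags are modelled with Python's `collections.Counter`, and the deleted bag is assumed to be a sub-bag of the old base relation.

```python
from collections import Counter

def select_project(bag, pred, proj):
    """Evaluate the view definition: project every tuple that passes pred,
    keeping multiplicities (bag semantics)."""
    out = Counter()
    for tup, count in bag.items():
        if pred(tup):
            out[proj(tup)] += count
    return out

def propagate(view, deleted, inserted, pred, proj):
    """Update a materialized view from base-relation changes only.

    Assumption: `deleted` is a sub-bag of the old base relation, so
    selection and projection distribute over the bag difference.
    """
    # Counter subtraction drops non-positive counts, i.e. it is bag difference.
    new_view = view - select_project(deleted, pred, proj)
    # Counter.update adds multiplicities, i.e. it is additive bag union.
    new_view.update(select_project(inserted, pred, proj))
    return new_view
```

Only the changed tuples are pushed through the view definition, so the work is proportional to the size of the change rather than the size of the base relation. Recomputing the view from the updated base relation should give the same bag.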
New Techniques for Studying Set Languages, Bag Languages and Aggregate Functions
, 1994
Cited by 42 (25 self)
We provide new techniques for the analysis of the expressive power of query languages for nested collections. These languages may use set or bag semantics and may be further complicated by the presence of aggregate functions. We exhibit certain classes of graphs and prove that the properties of these graphs that can be tested in such languages are either finite or cofinite. This result settles the conjectures of Grumbach, Milo, and Paredaens that parity test, transitive closure, and balanced binary tree test are not expressible in bag languages like the PTIME fragment of BALG of Grumbach and Milo and BQL of Libkin and Wong. Moreover, it implies that many recursive queries, including simple ones like the test for a chain, cannot be expressed in a nested relational language even when aggregate functions are available. In an attempt to generalize the finite-cofiniteness result, we study the bounded degree property which says that the number of distinct in- and out-degrees in the output of...
Some Properties of Query Languages for Bags
 In Proceedings of the 4th International Workshop on Database Programming Languages
, 1993
Cited by 40 (27 self)
In this paper we study the expressive power of query languages for nested bags. We define the ambient bag language by generalizing the constructs of the relational language of Breazu-Tannen, Buneman and Wong, which is known to have precisely the power of the nested relational algebra. Relative strength of additional polynomial constructs is studied, and the ambient language endowed with the strongest combination of those constructs is chosen as a candidate for the basic bag language, which is called BQL (Bag Query Language). We prove that achieving the power of BQL in the relational language amounts to adding simple arithmetic to the latter. We show that BQL has shortcomings of the relational algebra: it cannot express recursive queries. In particular, parity test is not definable in BQL. We consider augmenting BQL with powerbag and structural recursion to overcome this deficiency. In contrast to the relational case, where powerset and structural recursion are equivalent...
On the Complexity of Nonrecursive XQuery and Functional Query Languages on Complex Values
 In Proc. PODS’05
Cited by 40 (1 self)
This article studies the complexity of evaluating functional query languages for complex values such as monad algebra and the recursion-free fragment of XQuery. We show that monad algebra with equality restricted to atomic values is complete for the class TA[2^O(n), O(n)] of problems solvable in linear exponential time with a linear number of alternations. The monotone fragment of monad algebra with atomic value equality but without negation is complete for nondeterministic exponential time. For monad algebra with deep equality, we establish TA[2^O(n), O(n)] lower and exponential-space upper bounds. We also study a fragment of XQuery, Core XQuery, that seems to incorporate all the features of a query language on complex values that are traditionally deemed essential. A close connection between monad algebra on lists and Core XQuery (with “child” as the only axis) is exhibited, and it is shown that these languages are expressively equivalent up to representation issues. We show that Core XQuery is just as hard as monad algebra w.r.t. query and combined complexity, and that it is in TC0 if the query is assumed fixed. As Core XQuery is NEXPTIME-hard, it is commonly believed that any algorithm for evaluating Core XQuery has to require exponential amounts of working memory and doubly exponential time in the worst case. We present a property of queries (the lack of a certain form of composition) that virtually all real-world XQueries have and that allows for query evaluation in singly exponential time and polynomial space. Still, we are able to show for an important special case (Core XQuery with equality testing restricted to atomic values) that the composition-free language is just as expressive as the language with composition. Thus, under widely held complexity-theoretic assumptions, the composition-free language is an exponentially less succinct version of the language with composition.
Local Properties of Query Languages
Cited by 33 (21 self)
... predetermined portion of the input. Examples include all relational calculus queries. We start by proving a general result describing outputs of local queries. This result leads to many easy inexpressibility proofs for local queries. We then consider a closely related property, namely, the bounded degree property. It describes the outputs of local queries on structures that locally look "simple." Every query that is local is shown to have the bounded degree property. Since every relational calculus (first-order) query is local, the general results proved for local queries can be viewed as "off-the-shelf" strategies for proving inexpressibility results, which are often easier to apply than Ehrenfeucht-Fraïssé games. We also show that some generalizations of the bounded degree property that were conjectured to hold, fail for relational calculus. We then prove that the language obtained from relational calculus by adding grouping and aggregates, which is essentially plain SQL, has the bounded degree property, thus answering a question that has been open for several years. Consequently, first-order queries with Härtig or Rescher quantifiers also have the bounded degree property. Finally, we apply our results to incremental maintenance of views, and show that SQL and relational calculus are incapable of maintaining the ...
Sequences, Datalog and Transducers
, 1996
Cited by 24 (5 self)
This paper develops a query language for sequence databases, such as genome databases and text databases. The language, called Sequence Datalog, extends classical Datalog with interpreted function symbols for manipulating sequences. It has both a clear operational and declarative semantics, based on a new notion called the extended active domain of a database. The extended domain contains all the sequences in the database and all their subsequences. This idea leads to a clear distinction between safe and unsafe recursion over sequences: safe recursion stays inside the extended active domain, while unsafe recursion does not. By carefully limiting the amount of unsafe recursion, the paper develops a safe and expressive subset of Sequence Datalog. As part of the development, a new type of transducer is introduced, called a generalized sequence transducer. Unsafe recursion is allowed only within these generalized transducers. Generalized transducers extend ordinary transducers by allowing them to invoke other transducers as "subroutines." Generalized transducers can be implemented in Sequence Datalog in a straightforward way. Moreover, their introduction into the language leads to simple conditions that guarantee safety and finiteness. This paper develops two such conditions. The first condition expresses exactly the class of PTIME sequence functions; and the second expresses exactly the class of elementary sequence functions.
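The extended active domain described above can be sketched in a few lines. This is only an illustration under stated assumptions: sequences are modelled as Python strings, "subsequence" is taken to mean a contiguous subsequence, the empty sequence is included, and the function name `extended_active_domain` is hypothetical rather than taken from the paper.

```python
def extended_active_domain(sequences):
    """Return every sequence in the database together with all of its
    contiguous subsequences (assumed reading of "subsequence")."""
    domain = {""}  # assumption: the empty sequence belongs to the domain
    for s in sequences:
        for i in range(len(s)):
            for j in range(i + 1, len(s) + 1):
                domain.add(s[i:j])
    return domain

def stays_in_domain(value, domain):
    """A computed sequence is 'safe' in this sketch if it is already in the domain."""
    return value in domain
```

Under these assumptions, extracting a slice of a stored sequence always stays inside the domain, while concatenation can leave it. For a database containing only "ab", the slice "b" is in the domain but "ab" + "ab" is not, which mirrors the safe versus unsafe distinction in the abstract.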
On the Forms of Locality over Finite Models
 In Proc. 12th IEEE Symp. on Logic in Computer Science
, 1997
Cited by 18 (10 self)
Most proofs showing limitations of expressive power of first-order logic rely on Ehrenfeucht-Fraïssé games. Playing the game often involves a nontrivial combinatorial argument, so it was proposed to find easier tools for proving expressivity bounds. Most of those known for first-order logic are based on its "locality", that is defined in different ways. In this paper we characterize the relationship between those notions of locality. We note that Gaifman's locality theorem gives rise to two notions: one deals with sentences and one with open formulae. We prove that the former implies Hanf's notion of locality, which in turn implies Gaifman's locality for open formulae. Each of these implies the bounded degree property, which is one of the easiest tools for proving expressivity bounds. These results apply beyond the first-order case. We use them to derive expressivity bounds for first-order logic with unary quantifiers and counting. Finally, we apply these results to relational database...
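A small experiment can make the bounded degree property concrete. The sketch below is illustrative only, with hypothetical helper names: it counts the distinct in- and out-degrees of a graph and compares a chain with its transitive closure. The chain uses only degrees 0 and 1, while the closure of an n-node chain realizes n distinct degrees, which is the usual kind of argument for why transitive closure cannot be a query with this property.

```python
def distinct_degrees(nodes, edges):
    """Set of all in-degrees and out-degrees that occur in the graph."""
    out_deg = {v: 0 for v in nodes}
    in_deg = {v: 0 for v in nodes}
    for a, b in edges:
        out_deg[a] += 1
        in_deg[b] += 1
    return set(out_deg.values()) | set(in_deg.values())

def chain(n):
    """Edges of the chain 0 -> 1 -> ... -> n-1."""
    return [(i, i + 1) for i in range(n - 1)]

def transitive_closure(edges):
    """Naive fixpoint computation of the transitive closure."""
    closure = set(edges)
    while True:
        new = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if new <= closure:
            return closure
        closure |= new
```

For any chain length the input has two distinct degrees, yet the number of distinct degrees in the output grows with n, so no bound depending only on the input degrees can exist.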
An Algebra for Pomsets
, 1995
Cited by 17 (3 self)
We study languages for manipulating partially ordered structures with duplicates (e.g. trees, lists). As a general framework, we consider the pomset (partially ordered multiset) data type. We introduce an algebra for pomsets, which generalizes traditional algebras for (nested) sets, bags and lists. This paper is motivated by the study of the impact of different language primitives on the expressive power. We show that the use of partially ordered types increases the expressive power significantly. Surprisingly, it turns out that the algebra, when restricted to both unordered (bags) and totally ordered (lists) intermediate types, yields the same expressive power as fixpoint logic with counting on relational databases. It therefore constitutes a rather robust class of relational queries. On the other hand, we obtain a characterization of PTIME queries on lists by considering only totally ordered types.
Top-Down Induction of Decision Trees Classifiers - A Survey
, 2002
Cited by 17 (3 self)
Decision Trees are considered to be one of the most popular approaches for representing classifiers. Researchers from various disciplines such as statistics, machine learning, pattern recognition, and data mining considered the issue of growing a decision tree from available data. This paper presents an updated survey of current methods for constructing decision tree classifiers in a top-down manner. The paper suggests a unified algorithmic framework for presenting these algorithms and provides profound descriptions of the various splitting criteria and pruning methodology.
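To make the idea of a splitting criterion concrete, the sketch below computes information gain, one widely used criterion, and picks the best attribute for a single top-down split. It is not the survey's unified framework: the function names are hypothetical, rows are assumed to be dictionaries of categorical attribute values, and pruning is not shown.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Entropy reduction obtained by partitioning the data on attr."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g)
                    for g in groups.values())
    return entropy(labels) - remainder

def best_split(rows, labels, attrs):
    """Attribute with the highest information gain at this node."""
    return max(attrs, key=lambda a: information_gain(rows, labels, a))
```

A top-down inducer would call `best_split` at the root, partition the rows by the chosen attribute, and recurse on each partition until a stopping condition holds. Other criteria such as the Gini index could be substituted for `information_gain` without changing that outer loop.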