Querying graph databases with XPath
, 2013
Cited by 16 (3 self)
XPath plays a prominent role as an XML navigational language due to several factors, including its ability to express queries of interest, its close connection to yardstick database query languages (e.g., first-order logic), and the low complexity of query evaluation for many fragments. Another common database model, graph databases, also requires heavy use of navigation in queries; yet it largely adopts a different approach to querying, relying on reachability patterns expressed with regular constraints. Our goal here is to investigate the behavior and applicability of XPath-like languages for querying graph databases, concentrating on their expressiveness and complexity of query evaluation. We are particularly interested in a model of graph data that combines navigation through graphs with querying data held in the nodes, such as, for example, in a social network scenario. As navigational languages, we use analogs of core and regular XPath and augment them with various tests on data values. We relate these languages to first-order logic, its transitive closure extensions, and finite-variable fragments thereof, proving several capture results. In addition, we describe their relative expressive power. We then show that they behave very well computationally: they have a low-degree polynomial combined complexity, which becomes linear for several fragments. Furthermore, we introduce new types of tests for XPath languages that let them capture first-order logic with data comparisons and prove that the low complexity bounds continue to apply to such extended languages. Therefore, XPath-like languages seem to be very well-suited to query graphs.
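To make the idea of combining navigation with data tests concrete, here is a minimal sketch in Python. It is an illustration only, not the language defined in the paper: the edge-list encoding, the function names, and the sample social network are all assumptions. It evaluates a query of the shape "follow one or more `friend` edges, then keep the nodes whose data value equals that of the start node".

```python
def successors(edges, node, label):
    """Nodes reached from `node` by a single edge carrying `label`."""
    return {v for (u, l, v) in edges if u == node and l == label}

def reach_plus(edges, src, label):
    """Nodes reachable from `src` by one or more `label` edges."""
    seen, frontier = set(), successors(edges, src, label)
    while frontier:
        seen |= frontier
        frontier = {w for v in frontier
                    for w in successors(edges, v, label)} - seen
    return seen

def same_data_reachable(edges, data, src, label):
    """Navigation plus a data test: reachable nodes sharing src's data value."""
    return {v for v in reach_plus(edges, src, label) if data[v] == data[src]}

# Hypothetical social network: edges are (source, label, target) triples,
# and each node carries one data value (here, a city).
edges = [("ann", "friend", "bob"), ("bob", "friend", "cat")]
data = {"ann": "Paris", "bob": "Rome", "cat": "Paris"}
result = same_data_reachable(edges, data, "ann", "friend")
```

Each frontier expansion scans the edge list once, so this naive version is polynomial but not tuned; it is not meant to reflect the complexity bounds stated in the abstract.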
Efficient Reasoning about Data Trees via Integer Linear Programming
, 2011
Cited by 11 (2 self)
Data trees provide a standard abstraction of XML documents with data values: they are trees whose nodes, in addition to the usual labels, can carry labels from an infinite alphabet (data). Therefore, one is interested in decidable formalisms for reasoning about data trees. While some are known, such as the two-variable logic, they tend to be of very high complexity, and most decidability proofs are highly nontrivial. We are therefore interested in reasonable complexity formalisms as well as better techniques for proving decidability. Here we show that many decidable formalisms for data trees are subsumed, fully or partially, by the power of tree automata together with set constraints and linear constraints on cardinalities of various sets of data values. All these constraints can be translated into instances of integer linear programming, giving us an NP bound on the complexity of the reasoning tasks. We prove that this bound, as well as the key encoding technique, remains very robust and allows the addition of features such as counting of paths and patterns, and even a concise encoding of constraints, without increasing the complexity. We also relate our results to several reasoning tasks over XML documents, such as satisfiability of schemas and data dependencies and satisfiability of the two-variable logic.
Temporal logics for concurrent recursive programs: Satisfiability and model checking
 In MFCS’11, volume 6907 of LNCS
, 2011
Cited by 9 (3 self)
We develop a general framework for the design of temporal logics for concurrent recursive programs. A program execution is modeled as a partial order with multiple nesting relations. To specify properties of executions, we consider any temporal logic whose modalities are definable in monadic second-order logic and that, in addition, allows PDL-like path expressions. This captures, in a unifying framework, a wide range of logics defined for ranked and unranked trees, nested words, and Mazurkiewicz traces that have been studied separately. We show that satisfiability and model checking are decidable in EXPTIME and 2EXPTIME, depending on the precise path modalities.
View-based Query Answering in Description Logics: Semantics and Complexity
, 2010
Cited by 8 (0 self)
View-based query answering is the problem of answering a query based only on the precomputed answers to a set of views. While this problem has been widely investigated in databases, it is largely unexplored in the context of Description Logic ontologies. Unlike traditional databases, Description Logics may express several forms of incomplete information, and this poses challenging problems in characterizing the semantics of views. In this paper, we first present a general framework for view-based query answering, where we address the above semantic problems by providing two notions of view-based query answering over ontologies, both based on the idea that the precomputed answers to views are the certain answers to the corresponding queries. We also relate such notions to privacy-aware access to ontologies. Then, we provide decidability results, algorithms, and data complexity characterizations for view-based query answering in several Description Logics, ranging from those with limited modeling capability to highly expressive ones.
Node Selection Query Languages for Trees
Cited by 7 (0 self)
The study of node-selection query languages for (finite) trees has been a major topic in the recent research on query languages for Web documents. On the one hand, there has been an extensive study of XPath and its various extensions. On the other hand, query languages based on classical logics, such as first-order logic (FO) or monadic second-order logic (MSO), have been considered. Results in this area typically relate an XPath-based language to a classical logic. What has yet to emerge is an XPath-related language that is as expressive as MSO and at the same time enjoys the computational properties of XPath, which are linear query evaluation and an exponential query-containment test. In this paper we propose µXPath, which is the alternation-free fragment of XPath extended with fixpoint operators. Using two-way alternating automata, we show that this language does combine the desired expressiveness and computational properties, placing it as an attractive candidate as the definite query language for trees.
Query Reasoning on Trees with Types, Interleaving, and Counting
 PROCEEDINGS OF THE TWENTY-SECOND INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE
, 2011
Cited by 4 (4 self)
A major challenge of query language design is the combination of expressivity with effective static analyses such as query containment. In the setting of XML, documents are seen as finite trees, whose structure may additionally be constrained by type constraints such as those described by an XML schema. We consider the problem of query containment in the presence of type constraints for a class of regular path queries extended with counting and interleaving operators. The counting operator restricts the number of occurrences of children nodes satisfying a given logical property. The interleaving operator provides a succinct notation for describing the absence of order between nodes satisfying a logical property. We provide a logic-based framework supporting these operators, which can be used to solve common query reasoning problems such as satisfiability and containment of queries in exponential time.
Efficiently Deciding µ-calculus with Converse over Finite Trees
, 2013
Cited by 4 (2 self)
We present a sound and complete satisfiability-testing algorithm and its effective implementation for an alternation-free modal µ-calculus with converse, where formulas are cycle-free, and which is interpreted over finite ordered trees. The time complexity of the satisfiability-testing algorithm is $2^{O(n)}$ in terms of formula size $n$. The algorithm is implemented using symbolic techniques (BDD). We present crucial implementation techniques and heuristics that we used to make the algorithm as fast as possible in practice. Our implementation is available online, and can be used to solve logical formulas of practically significant size.
View update translation for XML
 in "14th International Conference on Database Theory (ICDT
Cited by 3 (0 self)
We study the problem of update translation for views on XML documents. More precisely, given an XML view definition and a user-defined view update program, find a source update program that translates the view update without side effects on the view. Additionally, we require the translation to be defined on all possible source documents; this corresponds to Hegner’s notion of uniform translation. The existence of such a translation would allow XML views to be updated without the need for materialization. The class of views we consider can remove parts of the document and rename nodes. Our update programs define the simultaneous application of a collection of atomic update operations among insertion/deletion of a subtree and node renaming. Such update programs are compatible with the XQuery Update Facility (XQUF) snapshot semantics. Both views and update programs are represented by recognizable tree languages. We present as a proof of concept a small fragment of XQUF that can be expressed by our update programs and thus allows for update propagation. Two settings for the update problem are studied: without source constraints, where all source updates are allowed, and with source constraints, where there is a restricted set of authorized source updates. Using tree automata techniques, we establish that without constraints, all view updates are uniformly translatable and the translation is tractable. In the presence of constraints, not all view updates are uniformly translatable. However, we introduce a reasonable restriction on update programs for which uniform translation with constraints becomes possible. All authors are members of MOSTRARE, joint team of
A Trichotomy for Regular Simple Path Queries on Graphs
 IN SYMPOSIUM ON PRINCIPLES OF DATABASE SYSTEMS (PODS). ACM
, 2012
Cited by 3 (1 self)
Regular path queries (RPQs) select nodes connected by some path in a graph. The edge labels of such a path have to form a word that matches a given regular expression. We investigate the evaluation of RPQs with an additional constraint that prevents multiple traversals of the same nodes. Those regular simple path queries (RSPQs) find several applications in practice, yet they quickly become intractable, even for basic languages such as (aa)∗ or a∗ba∗. In this paper, we establish a comprehensive classification of regular languages with respect to the complexity of the corresponding regular simple path query problem. More precisely, we identify the fragment that is maximal in the following sense: regular simple path queries can be evaluated in polynomial time for every regular language L that belongs to this fragment, and evaluation is NP-complete for languages outside this fragment. We thus fully characterize the frontier between tractability and intractability for RSPQs, and we refine our results to show the following trichotomy: evaluation of RSPQs is either in AC0, NL-complete, or NP-complete in data complexity, depending on the regular language L. The fragment identified also admits a simple characterization in terms of regular expressions. Finally, we also discuss the complexity of the following decision problem: decide, given a language L, whether finding a regular simple path for L is tractable. We consider several alternative representations of L: DFAs, NFAs, or regular expressions, and prove that this problem is NL-complete for the first representation and PSPACE-complete for the other two. As a conclusion, we extend our results from edge-labeled graphs to vertex-labeled graphs and vertex-edge-labeled graphs.
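The gap between arbitrary-path and simple-path semantics can be seen on a tiny instance. The sketch below is in Python and is illustrative only: the function names, the dictionary encoding of the DFA, and the three-node cycle are assumptions, not taken from the paper. The first function checks an RPQ by breadth-first search over (node, DFA state) pairs, which takes polynomial time; the second checks the RSPQ by naive backtracking, which can take exponential time and is not the classification algorithm of the paper.

```python
from collections import deque

def rpq_exists(edges, dfa, start, accepting, src, dst):
    """Arbitrary-path semantics: BFS over (node, DFA state) pairs."""
    seen = {(src, start)}
    queue = deque(seen)
    while queue:
        node, state = queue.popleft()
        if node == dst and state in accepting:
            return True
        for (u, label, v) in edges:
            if u == node and (state, label) in dfa:
                nxt = (v, dfa[(state, label)])
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return False

def rspq_exists(edges, dfa, start, accepting, src, dst):
    """Simple-path semantics: backtracking search that never revisits a node."""
    def dfs(node, state, visited):
        if node == dst and state in accepting:
            return True
        for (u, label, v) in edges:
            if u == node and v not in visited and (state, label) in dfa:
                if dfs(v, dfa[(state, label)], visited | {v}):
                    return True
        return False
    return dfs(src, start, {src})

# Directed cycle 0 -> 1 -> 2 -> 0 with every edge labeled 'a'.
cycle = [(0, "a", 1), (1, "a", 2), (2, "a", 0)]
# DFA for (aa)*: state 0 = even number of a's read (accepting), state 1 = odd.
even_a = {(0, "a"): 1, (1, "a"): 0}

any_path = rpq_exists(cycle, even_a, 0, {0}, 0, 1)      # walk 0,1,2,0,1 has length 4
simple_path = rspq_exists(cycle, even_a, 0, {0}, 0, 1)  # only simple path has length 1
```

On this cycle the RPQ holds because a walk may go around the cycle once, while the RSPQ fails because the only simple path from 0 to 1 has odd length.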
Fusing Effectful Comprehensions
Cited by 1 (1 self)
List comprehensions provide a powerful abstraction mechanism for expressing computations over ordered collections of data declaratively without having to use explicit iteration constructs. This paper puts forth effectful comprehensions as an elegant way to describe list comprehensions that incorporate loop-carried state. This is motivated by operations such as compression/decompression and serialization/deserialization that are common in log/data processing pipelines and require loop-carried state when processing an input stream of data. We build on the underlying theory of symbolic transducers to fuse pipelines of effectful comprehensions into a single representation, from which efficient code can be generated. Using background theory reasoning with an SMT solver, our fusion and subsequent reachability-based branch elimination algorithms can significantly reduce the complexity of the fused pipelines. Our implementation shows significant speedups over reasonable handwritten code (3×, on average) and a LINQ implementation of the pipeline (5×, on average) for a variety of examples, including scenarios for extracting fields with regular expressions, processing XML with XPath, and running queries over encoded data. Finally, we formalize the semantics of symbolic transducers and their compositions as a transduction monad, which provides a link between the automata-theoretic view and a monadic view of symbolic transducers.
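As a rough illustration of what fusing stages with loop-carried state means, the Python sketch below writes a two-stage pipeline and then a hand-fused single loop that carries both pieces of state. This is a manual rewrite for intuition only; it does not use symbolic transducers or an SMT solver, and the stage names and sample input are assumptions.

```python
def delta_decode(deltas):
    """Stage 1: running sum; the loop-carried state is `total`."""
    total = 0
    for d in deltas:
        total += d
        yield total

def new_maxima(values):
    """Stage 2: emit a value only when it exceeds every earlier value;
    the loop-carried state is `best`."""
    best = None
    for v in values:
        if best is None or v > best:
            best = v
            yield v

def fused(deltas):
    """Hand-fused pipeline: one loop carrying the state of both stages,
    with no intermediate stream between them."""
    total, best = 0, None
    for d in deltas:
        total += d
        if best is None or total > best:
            best = total
            yield total

sample = [3, -1, 4, 0, 2]           # decodes to 3, 2, 6, 6, 8
pipelined = list(new_maxima(delta_decode(sample)))
single_pass = list(fused(sample))
```

Both versions produce the same output on the sample; the fused loop simply avoids the intermediate generator between the two stages.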