Results 1 - 10 of 26
Query Optimization of Distributed Pattern Matching
Cited by 4 (0 self)
Greedy algorithms for subgraph pattern matching operations are often sufficient when the graph data set can be held in memory on a single machine. However, as graph data sets increasingly expand and require external storage and partitioning across a cluster of machines, more sophisticated query optimization techniques become critical to avoid explosions in query latency. In this paper, we introduce several query optimization techniques for distributed graph pattern matching. These techniques include (1) a System-R-style dynamic-programming-based optimization algorithm that considers both linear and bushy plans, (2) a cycle-detection-based algorithm that leverages cycles to reduce intermediate result set sizes, and (3) a computation-reusing technique that eliminates redundant query execution and data transfer over the network. Experimental results show that these algorithms can lead to an order of magnitude improvement in query performance.
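To make technique (1) concrete, here is a minimal sketch of System-R-style dynamic programming over subsets of query edges that considers both linear and bushy splits. It is not the paper's algorithm: the function name `best_join_plan`, the cost model (sum of estimated intermediate result sizes), and the cardinalities and selectivities are all assumptions chosen for illustration, and network transfer cost is ignored.

```python
from itertools import combinations

def best_join_plan(cards, sel):
    """Return (cost, plan) for joining every relation in `cards`.

    cards: dict name -> estimated cardinality (hypothetical numbers).
    sel:   dict frozenset({a, b}) -> join selectivity; missing pairs
           default to 1.0, i.e. a cross product.
    Cost is the sum of estimated intermediate result sizes (an assumption).
    """
    names = sorted(cards)
    # Base case: a single relation costs nothing to "join".
    best = {frozenset([n]): (0.0, n) for n in names}
    for k in range(2, len(names) + 1):
        for subset in combinations(names, k):
            whole = frozenset(subset)
            # The estimated output size does not depend on the join order.
            size = 1.0
            for n in subset:
                size *= cards[n]
            for a, b in combinations(subset, 2):
                size *= sel.get(frozenset((a, b)), 1.0)
            choice = None
            # r == 1 gives linear splits; r > 1 gives bushy splits.
            for r in range(1, k // 2 + 1):
                for left in combinations(subset, r):
                    lset = frozenset(left)
                    rset = whole - lset
                    cost = best[lset][0] + best[rset][0] + size
                    if choice is None or cost < choice[0]:
                        choice = (cost, (best[lset][1], best[rset][1]))
            best[whole] = choice
    return best[frozenset(names)]

# Hypothetical chain query A - B - C - D with selective outer joins.
cards = {"A": 1000, "B": 1000, "C": 1000, "D": 1000}
sel = {frozenset("AB"): 0.001, frozenset("BC"): 0.1, frozenset("CD"): 0.001}
cost, plan = best_join_plan(cards, sel)
print(plan)
```

With these made-up statistics the bushy plan that joins A with B and C with D first is cheaper than any linear plan, which is the situation in which considering bushy plans pays off.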
A Context-Based Semantics for SPARQL Property Paths over the Web
In Proceedings of the 12th Extended Semantic Web Conference (ESWC), 2015
Cited by 3 (1 self)
As of today, there exists no standard language for querying Linked Data on the Web, where navigation across distributed data sources is a key feature. A natural candidate seems to be SPARQL, which recently has been enhanced with navigational capabilities thanks to the introduction of property paths (PPs). However, the semantics of SPARQL restricts the scope of navigation via PPs to single RDF graphs. This restriction limits the applicability of PPs on the Web. To fill this gap, in this paper we provide formal foundations for evaluating PPs on the Web, thus contributing to the definition of a query language for Linked Data. In particular, we introduce a query semantics for PPs that couples navigation at the data level with navigation on the Web graph. Given this semantics, we find that for some PP-based SPARQL queries a complete evaluation on the Web is not feasible. To enable systems to identify queries that can be evaluated completely, we establish a decidable syntactic property of such queries.
A Trichotomy for Regular Simple Path Queries on Graphs
In Symposium on Principles of Database Systems (PODS). ACM, 2012
Cited by 3 (1 self)
Regular path queries (RPQs) select nodes connected by some path in a graph. The edge labels of such a path have to form a word that matches a given regular expression. We investigate the evaluation of RPQs with an additional constraint that prevents multiple traversals of the same nodes. Those regular simple path queries (RSPQs) find several applications in practice, yet they quickly become intractable, even for basic languages such as (aa)* or a*ba*. In this paper, we establish a comprehensive classification of regular languages with respect to the complexity of the corresponding regular simple path query problem. More precisely, we identify the fragment that is maximal in the following sense: regular simple path queries can be evaluated in polynomial time for every regular language L that belongs to this fragment, and evaluation is NP-complete for languages outside this fragment. We thus fully characterize the frontier between tractability and intractability for RSPQs, and we refine our results to show the following trichotomy: evaluation of RSPQs is either in AC0, NL-complete, or NP-complete in data complexity, depending on the regular language L. The fragment identified also admits a simple characterization in terms of regular expressions. Finally, we also discuss the complexity of the following decision problem: decide, given a language L, whether finding a regular simple path for L is tractable. We consider several alternative representations of L: DFAs, NFAs, or regular expressions, and prove that this problem is NL-complete for the first representation and PSPACE-complete for the other two. As a conclusion, we extend our results from edge-labeled graphs to vertex-labeled graphs and vertex-edge-labeled graphs.
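The difference between walk semantics and simple-path semantics for a language such as (aa)* can be seen on a three-node cycle. The sketch below is illustrative only and is not taken from the paper: the function names, the adjacency-dict graph encoding, and the hand-written DFA for (aa)* are assumptions. The walk version is a breadth-first search over the product of graph and automaton, while the simple-path version is a plain backtracking search that may take exponential time.

```python
from collections import deque

def rpq_walk(edges, dfa, start_state, accepting, src, dst):
    """True if some walk (nodes may repeat) from src to dst spells an accepted word."""
    seen = {(src, start_state)}
    queue = deque([(src, start_state)])
    while queue:
        node, state = queue.popleft()
        if node == dst and state in accepting:
            return True
        for label, nxt in edges.get(node, []):
            nstate = dfa.get((state, label))
            if nstate is not None and (nxt, nstate) not in seen:
                seen.add((nxt, nstate))
                queue.append((nxt, nstate))
    return False

def rpq_simple(edges, dfa, start_state, accepting, src, dst):
    """True if some simple path (no repeated node) from src to dst is accepted.

    Backtracking search; exponential in the worst case.
    """
    def dfs(node, state, visited):
        if node == dst and state in accepting:
            return True
        for label, nxt in edges.get(node, []):
            nstate = dfa.get((state, label))
            if nstate is not None and nxt not in visited:
                if dfs(nxt, nstate, visited | {nxt}):
                    return True
        return False
    return dfs(src, start_state, {src})

# Directed cycle 0 -> 1 -> 2 -> 0, every edge labeled 'a'.
cycle = {0: [("a", 1)], 1: [("a", 2)], 2: [("a", 0)]}
# DFA for (aa)*: state 0 = even number of a's (accepting), state 1 = odd.
even_a = {(0, "a"): 1, (1, "a"): 0}

print(rpq_walk(cycle, even_a, 0, {0}, 0, 1))    # walk 0,1,2,0,1 has length 4
print(rpq_simple(cycle, even_a, 0, {0}, 0, 1))  # the only simple path has length 1
```

Under walk semantics node 1 is reachable from node 0 by going once around the cycle, whereas the only simple path from 0 to 1 has odd length, so the two semantics give different answers on the same graph.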
Walk logic as a framework for path query languages on graph databases
In Proceedings of the 16th International Conference on Database Theory, ICDT ’13, 2013
Cited by 3 (1 self)
Motivated by the current interest in languages for expressing path queries to graph databases, this paper proposes to investigate Walk Logic (WL): the extension of first-order logic on finite graphs with the possibility to explicitly quantify over walks. WL can serve as a unifying framework for path query languages. To support this claim, WL is compared in expressive power with various established query languages for graphs, such as first-order logic extended with reachability; the monadic second-order logic of graphs; hybrid computation tree logic; and regular path queries. WL also serves as a framework to investigate the following natural questions: Is quantifying over walks more powerful than quantifying over paths (walks without repeating nodes) only? Is quantifying over infinite walks more powerful than quantifying over finite walks only? WL model checking is decidable, but determining the precise complexity remains an open problem.
On Implementing Provenance-Aware Regular Path Queries with Relational Query Engines
Cited by 3 (0 self)
Use of graphs is growing rapidly in social networks, semantic web, biological databases, scientific workflow provenance, and other areas. Regular Path Queries (RPQs) can be seen as a core graph query language to answer pattern-based reachability queries. Unfortunately, the number of freely available systems for querying graphs using RPQs is rather limited, and available implementations do not provide direct support for a number of desirable variants of RPQs, e.g., to return those edges that are contained in some (or all) paths that match the given regular expression R. Thus, by returning not just a pair (x, y) of end points of paths that match R, but also “witness edges” (u, v) in between, our RPQ variants can be understood as returning additional provenance information about the answer (x, y), i.e., those edges (u, v) that are in some (or all) paths from x to y matching R. We propose a number of such RPQ variants and show how they can be implemented using either Datalog or a suitable RDBMS. Our initial experimental results indicate that RPQs and our provenance-aware variants (RPQProv), when implemented using conventional relational technologies, yield reasonable performance even for relatively large graphs. On the other hand, the overhead associated with some of these variants also makes efficient handling of provenance-aware graph queries an interesting challenge for future research.
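The "some paths" variant of witness edges can be sketched as two reachability passes over the product of the graph and an automaton for R. This is an illustrative Python sketch under stated assumptions, not the Datalog or RDBMS implementation the paper describes: the function name `witness_edges`, the triple-set graph encoding, and the hand-written DFA are hypothetical, and paths are allowed to repeat nodes.

```python
def witness_edges(edges, dfa, start, accepting, x, y):
    """Edges lying on at least one R-matching path from x to y.

    edges: set of (u, label, v) triples.
    dfa:   dict (state, label) -> next state, a DFA for R (assumed given).
    """
    # Forward pass: product states (node, state) reachable from (x, start).
    fwd = {(x, start)}
    stack = [(x, start)]
    while stack:
        node, state = stack.pop()
        for (u, label, v) in edges:
            if u == node and (state, label) in dfa:
                nxt = (v, dfa[(state, label)])
                if nxt not in fwd:
                    fwd.add(nxt)
                    stack.append(nxt)
    # Backward pass: product states from which (y, accepting state) is reachable.
    bwd = {(y, f) for f in accepting}
    changed = True
    while changed:
        changed = False
        for (u, label, v) in edges:
            for (state, sym), nstate in dfa.items():
                if sym == label and (v, nstate) in bwd and (u, state) not in bwd:
                    bwd.add((u, state))
                    changed = True
    # An edge is a witness if it connects a forward-reachable product state
    # to a backward-reachable one.
    return {
        (u, label, v)
        for (u, label, v) in edges
        for (state, sym), nstate in dfa.items()
        if sym == label and (u, state) in fwd and (v, nstate) in bwd
    }

graph = {(0, "a", 1), (1, "b", 2), (1, "a", 3), (0, "b", 4)}
ab = {(0, "a"): 1, (1, "b"): 2}  # DFA for the expression "a b"
print(sorted(witness_edges(graph, ab, 0, {2}, 0, 2)))
```

The two passes mirror how the same computation could be phrased as a pair of recursive reachability rules, which is one reason a relational formulation is natural here.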
From DBpedia to Wikipedia: Filling the gap by discovering Wikipedia conventions
In 2012 IEEE/WIC/ACM International Conference on Web Intelligence (WI’12), 2012
Cited by 2 (2 self)
Many relations existing in DBpedia are missing in Wikipedia, yielding an information gap between the semantic web and the social web. Inserting these missing relations requires automatically discovering Wikipedia conventions. From pairs linked by a property p in DBpedia, we find path queries that link the same pairs in Wikipedia. We make the hypothesis that the shortest path query with maximal containment captures the Wikipedia convention for p. We computed missing links and conventions for different DBpedia queries. Next, we inserted some missing links according to computed conventions in Wikipedia and evaluated Wikipedians' feedback. Nearly all contributions have been accepted. In this paper, we detail the path indexing algorithms and the results of evaluations, and give some details about social feedback. Keywords: Wikipedia Conventions; DBpedia; Wikipedia.
Conjunctive Context-Free Path Queries
2014
Cited by 2 (0 self)
In graph query languages, regular expressions are commonly used to specify the labeling of paths. A natural step in increasing the expressive power of these query languages is replacing regular expressions by context-free grammars. With the Conjunctive Context-Free Path Queries (CCFPQ) we introduce such a language based on the well-known Conjunctive Regular Path Queries (CRPQ). First, we show that query evaluation of CCFPQ has polynomial time data complexity. Secondly, we look at the generalization of regular expressions, as used in CRPQ, to regular relations and show how similar generalizations can be applied to context-free grammars, as used in CCFPQ. Thirdly, we investigate the relations between the expressive power of CRPQ, CCFPQ, and their generalizations. In several cases we show that replacing regular expressions by context-free grammars does increase expressive power. Finally, we look at including context-free grammars in more powerful logics than conjunctive queries. We do so by adding negation and provide expressivity relations between the obtained languages.
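A single context-free path atom can be evaluated in time polynomial in the graph size by a CYK-style fixpoint over triples (nonterminal, source, target). The sketch below is an assumption-laden illustration rather than the paper's construction: the function name `cfpq`, the rule encoding, and the requirement that the grammar be given in a Chomsky-normal-form-like shape (rules A -> label and A -> B C only) are choices made here for brevity.

```python
def cfpq(edges, terminal_rules, binary_rules):
    """Return all (A, u, v) such that some path from u to v spells a word derived from A.

    edges:          set of (u, label, v) triples.
    terminal_rules: dict label -> set of nonterminals A with a rule A -> label.
    binary_rules:   dict (B, C) -> set of nonterminals A with a rule A -> B C.
    """
    facts = {
        (A, u, v)
        for (u, label, v) in edges
        for A in terminal_rules.get(label, ())
    }
    changed = True
    while changed:  # naive fixpoint; at most |N| * |V|^2 facts can ever be added
        changed = False
        snapshot = list(facts)
        for (B, u, w) in snapshot:
            for (C, w2, v) in snapshot:
                if w != w2:
                    continue
                for A in binary_rules.get((B, C), ()):
                    if (A, u, v) not in facts:
                        facts.add((A, u, v))
                        changed = True
    return facts

# Grammar for a^n b^n (n >= 1): S -> A B | A T, T -> S B, A -> a, B -> b.
terminal_rules = {"a": {"A"}, "b": {"B"}}
binary_rules = {("A", "B"): {"S"}, ("A", "T"): {"S"}, ("S", "B"): {"T"}}
# Path graph 0 -a-> 1 -a-> 2 -b-> 3 -b-> 4.
edges = {(0, "a", 1), (1, "a", 2), (2, "b", 3), (3, "b", 4)}

result = cfpq(edges, terminal_rules, binary_rules)
print(sorted((u, v) for (A, u, v) in result if A == "S"))
```

The language a^n b^n is not regular, so the pairs selected for S here could not be expressed by a regular path query, which is the kind of gain in expressive power the abstract refers to.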
Federation and Navigation in SPARQL 1.1
In Proceedings of Reasoning Web, 2012
Cited by 1 (1 self)
SPARQL is now widely used as the standard query language for RDF. Since the release of its first version in 2008, the W3C group in charge of the standard has been working on extensions of the language to be included in the new version, SPARQL 1.1. These extensions include several interesting and very useful features for querying RDF. In this paper, we survey two key features of SPARQL 1.1: federation and navigation capabilities. We first introduce the SPARQL standard, presenting its syntax and formal semantics. We then focus on the formalization of federation and navigation in SPARQL 1.1. We analyze some classical theoretical problems such as expressiveness and complexity, and discuss algorithmic properties. Moreover, we present some important recently discovered issues regarding the normative semantics of federation and navigation in SPARQL 1.1, specifically, the impossibility of answering some unbounded federated queries and the high computational complexity of the evaluation problem for queries including navigation functionalities. Finally, we discuss possible alternatives to overcome these issues and their implications for the adoption of the standard.