Regular path queries on graphs with data
- In ICDT’12
Cited by 16 (6 self)
Graph data models received much attention lately due to applications in social networks, semantic web, biological databases and other areas. Typical query languages for graph databases retrieve their topology, while actual data stored in them is usually queried using standard relational mechanisms. Our goal is to develop techniques that combine these two modes of querying, and give us query languages that can ask questions about both data and topology. As the basic querying mechanism we consider regular path queries, with the key difference that conditions on paths between nodes now talk not only about labels but also specify how data changes along the path. Paths that combine edge labels with data values are closely related to data words, so for stating conditions in queries, we look at several data-word formalisms developed recently. We show that many of them immediately lead to intractable data complexity for graph queries, with the notable exception of register automata, which can specify many properties of interest, and have NLOGSPACE data and PSPACE combined complexity. As register automata themselves are not easy to use in querying, we define two types of extensions of regular expressions that are more user-friendly, and develop query evaluation techniques for them. For one class, regular expressions with memory, we achieve the same bounds as for automata, and for the other class, regular expressions with equality, we also obtain tractable combined complexity of query evaluation. In addition, we show that our results extend to analogs of conjunctive regular path queries.
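To make the idea of a path condition that mentions data more concrete, here is a minimal Python sketch. It is not the evaluation algorithm from the paper. It assumes a hypothetical graph encoding (a list of `(source, label, target)` triples plus a `data` dictionary mapping each node to its data value) and hard-codes one condition: follow one or more edges with a given label while never visiting a node whose data value equals the value stored at the start node. The single variable `register` plays the role of the one memory cell that a register automaton would use.

```python
from collections import deque

def reach_with_fresh_data(edges, data, start, label):
    """Nodes reachable from `start` by one or more `label` edges such that
    every node visited after `start` carries a data value different from
    the data value of `start`.

    edges: iterable of (source, label, target) triples (assumed encoding)
    data:  dict mapping node -> data value (assumed encoding)
    """
    register = data[start]  # remember the data value of the first node

    # Adjacency list keyed by source node.
    adj = {}
    for u, a, v in edges:
        adj.setdefault(u, []).append((a, v))

    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for a, nxt in adj.get(node, []):
            # Edge label must match and the data test (value != register) must hold.
            if a == label and data[nxt] != register and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

# Hypothetical example graph.
example_edges = [(1, "a", 2), (2, "a", 3), (3, "a", 4), (1, "a", 5), (5, "a", 4)]
example_data = {1: "x", 2: "y", 3: "x", 4: "z", 5: "y"}
result = reach_with_fresh_data(example_edges, example_data, 1, "a")
```

Because the register is written once and never updated, the search state is just the current node, so the traversal is a plain breadth-first search. Conditions that overwrite registers along the path would need the register contents as part of the search state.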
Schema mappings and data exchange for graph databases
- In Intl. Conf. on Database Theory (ICDT), 2013
Cited by 5 (2 self)
Data exchange and schema mapping management have received little attention so far in the graph database scenario, and tools developed in this context for relational databases have significant drawbacks in the context of graph-structured data. In this paper we embark on the study of interoperability issues for graph databases, including schema mappings, data exchange and certain answers computation. We start by analyzing different possibilities for specifying mappings in graph databases. Our mapping languages are based on the most typical graph database queries, ranging from regular path queries to conjunctions of nested regular expressions. They subsume all previously considered mapping languages, and let one express many data exchange scenarios in the graph database context. We study the problems of materializing solutions and query answering, in particular, the problem of computing universal representatives and certain answers for various classes of mappings. We show that both problems are difficult with respect to combined complexity, and that for the latter problem, even data complexity is high for some very simple mappings and queries. We then identify relevant classes of mappings and queries for which the problems of materializing solutions and query answering can be solved efficiently.
Efficient Data Partitioning Model for Heterogeneous Graphs in the Cloud
- In ACM/IEEE SC, 2013
Cited by 4 (2 self)
As the size and variety of information networks continue to grow in many scientific and engineering domains, we witness a growing demand for efficient processing of large heterogeneous graphs using a cluster of compute nodes in the Cloud. One open issue is how to effectively partition a large graph to process complex graph operations efficiently. In this paper, we present VB-Partitioner, a distributed data partitioning model and algorithms for efficient processing of graph operations over large-scale graphs in the Cloud. Our VB-Partitioner has three salient features. First, it introduces vertex blocks (VBs) and extended vertex blocks (EVBs) as the building blocks for semantic partitioning of large graphs. Second, VB-Partitioner utilizes vertex block grouping algorithms to place those vertex blocks that have high correlation in graph structure into the same partition. Third, VB-Partitioner employs a VB-partition guided query partitioning model to speed up the parallel processing of graph pattern queries by reducing the amount of inter-partition query processing. We conduct extensive experiments on several real-world graphs with millions of vertices and billions of edges. Our results show that VB-Partitioner significantly outperforms the popular random block-based data partitioner in terms of query latency and scalability over large-scale graphs.
Schemaless and Structureless Graph Querying
Cited by 4 (1 self)
Querying complex graph databases such as knowledge graphs is a challenging task for non-professional users. Due to their complex schemas and variational information descriptions, it becomes very hard for users to formulate a query that can be properly processed by the existing systems. We argue that for a user-friendly graph query engine, it must support various kinds of transformations such as synonym, abbreviation, and ontology. Furthermore, the derived query results must be ranked in a principled manner. In this paper, we introduce a novel framework enabling schemaless and structureless graph querying (SLQ), where a user need not describe queries precisely as required by most databases. The query engine is built on a set of transformation functions that automatically map keywords and linkages from a query to their matches in a graph. It automatically learns an effective ranking model, without assuming manually labeled training examples, and can efficiently return top ranked matches using graph sketch and belief propagation. The architecture of SLQ is elastic for “plug-in” new transformation functions and query logs. Our experimental results show that this new graph querying paradigm
A Trichotomy for Regular Simple Path Queries on Graphs
- In Symposium on Principles of Database Systems (PODS). ACM, 2012
Cited by 3 (1 self)
Regular path queries (RPQs) select nodes connected by some path in a graph. The edge labels of such a path have to form a word that matches a given regular expression. We investigate the evaluation of RPQs with an additional constraint that prevents multiple traversals of the same nodes. Those regular simple path queries (RSPQs) find several applications in practice, yet they quickly become intractable, even for basic languages such as (aa)∗ or a∗ba∗. In this paper, we establish a comprehensive classification of regular languages with respect to the complexity of the corresponding regular simple path query problem. More precisely, we identify the fragment that is maximal in the following sense: regular simple path queries can be evaluated in polynomial time for every regular language L that belongs to this fragment and evaluation is NP-complete for languages outside this fragment. We thus fully characterize the frontier between tractability and intractability for RSPQs, and we refine our results to show the following trichotomy: evaluation of RSPQs is either in AC0, NL-complete or NP-complete in data complexity, depending on the regular language L. The fragment identified also admits a simple characterization in terms of regular expressions. Finally, we also discuss the complexity of the following decision problem: decide, given a language L, whether finding a regular simple path for L is tractable. We consider several alternative representations of L: DFAs, NFAs or regular expressions, and prove that this problem is NL-complete for the first representation and PSPACE-complete for the other two. As a conclusion, we extend our results from edge-labeled graphs to vertex-labeled graphs and vertex-edge labeled graphs.
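The difference between the two semantics can be seen in a small Python sketch. It is an illustration under stated assumptions, not the algorithm of the paper: the graph is assumed to be a list of `(source, label, target)` triples, the regular language is assumed to be given as an NFA in a hypothetical dictionary format, and "simple" is taken to mean that no node occurs twice on the path. `eval_rpq` uses the textbook search over pairs (node, automaton state), which is polynomial; `eval_rspq` is a brute-force search that must remember the whole set of visited nodes and can therefore take exponential time.

```python
from collections import deque

def build_adjacency(edges):
    """Group (source, label, target) triples by source node."""
    adj = {}
    for u, a, v in edges:
        adj.setdefault(u, []).append((a, v))
    return adj

def eval_rpq(edges, nfa, start_node):
    """Nodes reachable from start_node by an arbitrary path whose label word the NFA accepts."""
    adj = build_adjacency(edges)
    seen = {(start_node, nfa["start"])}
    queue = deque(seen)
    answers = set()
    while queue:
        node, state = queue.popleft()
        if state in nfa["accept"]:
            answers.add(node)
        for label, nxt in adj.get(node, []):
            for q in nfa["delta"].get((state, label), ()):
                if (nxt, q) not in seen:
                    seen.add((nxt, q))
                    queue.append((nxt, q))
    return answers

def eval_rspq(edges, nfa, start_node):
    """Same query, but only paths that never repeat a node are allowed (brute force)."""
    adj = build_adjacency(edges)
    answers = set()

    def dfs(node, state, visited):
        if state in nfa["accept"]:
            answers.add(node)
        for label, nxt in adj.get(node, []):
            if nxt in visited:
                continue  # a simple path may not revisit a node
            for q in nfa["delta"].get((state, label), ()):
                dfs(nxt, q, visited | {nxt})

    dfs(start_node, nfa["start"], frozenset([start_node]))
    return answers

# Hypothetical NFA for the language a*ba* and a small example graph.
nfa_a_star_b_a_star = {
    "start": 0,
    "accept": {1},
    "delta": {(0, "a"): {0}, (0, "b"): {1}, (1, "a"): {1}},
}
example_edges = [(1, "a", 2), (2, "b", 3), (3, "a", 1), (3, "a", 4)]
```

On this example graph, starting from node 1, the arbitrary-path version can loop through the cycle 1, 2, 3, 1 and so reports more nodes than the simple-path version, which is cut off as soon as it would return to node 1.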
Containment of Data Graph Queries
Cited by 3 (2 self)
The graph database model is currently one of the most popular paradigms for storing data, used in applications such as social networks, biological databases and the Semantic Web. Despite the popularity of this model, the development of graph database management systems is still in its infancy, and there are several fundamental issues regarding graph databases that are not fully understood. Indeed, while graph query languages that concentrate on topological properties are now well developed, not much is known about languages that can query both the topology of graphs and their underlying data. Our goal is to conduct a detailed study of static analysis problems for such languages. In this paper we consider the containment problem for several recently proposed classes of queries that manipulate both topology and data: regular queries with memory, regular queries with data tests, and graph XPath. Our results show that the problem is in general undecidable for all of these classes. However, by allowing only positive data comparisons we find natural fragments that enjoy much better static analysis properties: the containment problem is decidable, and its computational complexity ranges from PSPACE-complete to EXPSPACE-complete. We also propose extensions of regular queries with an inverse operator, and study query evaluation and query containment for them.
ProbTree: A Query-Efficient Representation of Probabilistic Graphs Technical Paper
Cited by 2 (1 self)
Information in many applications, such as mobile wireless systems, social networks, and road networks, is captured by graphs, in many cases uncertain. We study the problem of querying a probabilistic graph; in particular, we examine “source-to-target” queries, such as computing the shortest path between two vertices. Evaluating ST-queries over probabilistic graphs is #P-hard, as it requires examining an exponential number of “possible worlds”. Existing solutions to the ST-query problem, which sample possible worlds, have two downsides: (i) many samples are needed for reasonable accuracy, and (ii) a possible world can be very large. To tackle these issues, we study the ProbTree, a data structure that stores a succinct representation of the probabilistic graph. Existing ST-query solutions are executed on top of this structure, with the number of samples and possible world sizes reduced.
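The sampling baseline mentioned in the abstract can be sketched in a few lines of Python. This is a generic Monte Carlo estimator, not the ProbTree structure itself, and it rests on assumptions that are not stated in the abstract: edges are directed, each edge exists independently with its own probability, and the ST-query is plain reachability rather than shortest-path distance. All function names are hypothetical.

```python
import random

def sample_world(prob_edges, rng):
    """Draw one possible world: keep each edge (u, v, p) independently with probability p."""
    return [(u, v) for u, v, p in prob_edges if rng.random() < p]

def reachable(edges, source, target):
    """Depth-first search for target starting at source in a deterministic directed graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
    stack = [source]
    seen = {source}
    while stack:
        node = stack.pop()
        if node == target:
            return True
        for nxt in adj.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False

def estimate_reachability(prob_edges, source, target, samples=10000, seed=0):
    """Fraction of sampled possible worlds in which target is reachable from source."""
    rng = random.Random(seed)
    hits = sum(
        reachable(sample_world(prob_edges, rng), source, target)
        for _ in range(samples)
    )
    return hits / samples

# Hypothetical two-edge chain: the exact reachability probability is 0.5 * 0.5 = 0.25.
chain = [("s", "a", 0.5), ("a", "t", 0.5)]
estimate = estimate_reachability(chain, "s", "t", samples=20000)
```

Both downsides listed in the abstract are visible here: the accuracy of `estimate` depends on the number of samples, and every sample materializes and traverses a full possible world.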
Asymmetric structure-preserving subgraph query for large graphs
- In ICDE, 2015
Cited by 1 (1 self)
One fundamental type of query for graph databases is subgraph isomorphism queries (a.k.a. subgraph queries). Due to the computational hardness of subgraph queries coupled with the cost of managing massive graph data, outsourcing the query computation to a third-party service provider has been an economical and scalable approach. However, confidentiality is known to be an important attribute of Quality of Service (QoS) in Query as a Service (QaaS). In this paper, we propose the first practical private approach for subgraph query services, asymmetric structure-preserving subgraph query processing, where the data graph is publicly known and the query structure/topology is kept secret. Unlike other previous methods for subgraph queries, this paper proposes a series of novel optimizations that only exploit graph structures, not the queries. Further, we propose a robust query encoding and adopt the novel cyclic group based encryption so that query processing is transformed into a series of private matrix operations. Our experiments confirm that our techniques are efficient and the optimizations are effective.
The Chilean Database Group
During the last 15 years, the Chilean researchers on databases have built a strong and cohesive group with wide international visibility. In the present article, we briefly survey the history of the group and describe the research done by the team in five big areas: Semantic web databases, graph databases, data exchange, access control policies, and spatial databases. We also describe the international collaboration networks of the group and the participation of the group members in the organization of AMW, the most important regional event in data management. We finish the article by explaining the next steps for the group and the plans to achieve them.