Diversified Stress Testing of RDF Data Management Systems
Citations: 11 (1 self)
Citations
738 | Linked data - the story so far
- Bizer, Heath, et al.
Citation Context ...ads. Keywords: RDF, SPARQL, systems, benchmarking, workload diversity. 1 Introduction. With the proliferation of very large, heterogeneous RDF datasets such as those in the Linked Open Data (LOD) cloud =-=[6]-=-, there is increasing demand for high-performance RDF data management systems. A number of such systems have been developed; however, as queries executed on these systems become increasingly more dive... |
651 | DBpedia: A nucleus for a web of open data
- Auer, Bizer, et al.
- 2007
Citation Context ...arious SPARQL features such as union and optional graph patterns. – The DBpedia SPARQL Benchmark [17] (DBSB) uses queries that have been generated by mining actual query logs over the DBpedia dataset =-=[5]-=-. Thus, it contains a more “diverse set of queries” [17]. We assess the diversity of existing benchmarks using the structural and data-driven features presented in Section 2. In our evaluations of ben... |
543 | Sesame: A generic architecture for storing and querying RDF and RDF schema
- Broekstra, Kampman, et al.
- 2002
Citation Context ...f their data representations: (i) tabular and (ii) graph-based. For tabular implementations, one option is to represent data in a single large table. While earlier triplestores followed this approach =-=[8,9]-=-, it has been demonstrated that maintaining redundant copies with different sort orders and indexes can be much more effective [19]. Consequently, in our evaluations we include the popular prototype R... |
378 | LUBM: A Benchmark for OWL Knowledge Base Systems
- Guo, Pan, et al.
- 2005
Citation Context ...x)]G, respectively, then $\mathrm{SEL}^{F}_{G}(tp \mid x) = \frac{\left|\{\mu \in \Omega \mid \exists \mu' \in \Omega' : \mu \text{ and } \mu' \text{ are compatible}\}\right|}{|\Omega|}$ (4). 3 Evaluation of Existing SPARQL Benchmarks. Even though existing SPARQL benchmarks [7], =-=[12]-=-, [17], [20] offer valuable testing capabilities, we demonstrate in this section that they are not suitable for stress testing RDF systems. We consider the following 4 benchmarks: – The Lehigh Univers... |
261 | Jena: implementing the semantic web recommendations
- Carroll, Dickinson, et al.
- 2004
Citation Context ...f their data representations: (i) tabular and (ii) graph-based. For tabular implementations, one option is to represent data in a single large table. While earlier triplestores followed this approach =-=[8,9]-=-, it has been demonstrated that maintaining redundant copies with different sort orders and indexes can be much more effective [19]. Consequently, in our evaluations we include the popular prototype R... |
140 | The Berlin SPARQL Benchmark.
- Bizer, Schultz
- 2009
Citation Context ...[λF(Bx)]G, respectively, then $\mathrm{SEL}^{F}_{G}(tp \mid x) = \frac{\left|\{\mu \in \Omega \mid \exists \mu' \in \Omega' : \mu \text{ and } \mu' \text{ are compatible}\}\right|}{|\Omega|}$ (4). 3 Evaluation of Existing SPARQL Benchmarks. Even though existing SPARQL benchmarks =-=[7]-=-, [12], [17], [20] offer valuable testing capabilities, we demonstrate in this section that they are not suitable for stress testing RDF systems. We consider the following 4 benchmarks: – The Lehigh U... |
117 | The RDF-3X engine for scalable management of RDF data.
- Neumann, Weikum
- 2010
Citation Context ...ngle large table. While earlier triplestores followed this approach [8,9], it has been demonstrated that maintaining redundant copies with different sort orders and indexes can be much more effective =-=[19]-=-. Consequently, in our evaluations we include the popular prototype RDF-3x [19] (v0.3.7) that follows the latter approach. It has also been argued that grouping data can significantly improve performa... |
110 | SP2Bench: A SPARQL Performance Benchmark. In: Data Engineering
- Schmidt, Hornung, et al.
- 2009
Citation Context ...tively, then $\mathrm{SEL}^{F}_{G}(tp \mid x) = \frac{\left|\{\mu \in \Omega \mid \exists \mu' \in \Omega' : \mu \text{ and } \mu' \text{ are compatible}\}\right|}{|\Omega|}$ (4). 3 Evaluation of Existing SPARQL Benchmarks. Even though existing SPARQL benchmarks [7], [12], [17], =-=[20]-=- offer valuable testing capabilities, we demonstrate in this section that they are not suitable for stress testing RDF systems. We consider the following 4 benchmarks: – The Lehigh University Benchmar... |
84 | Scalable join processing on very large RDF graphs
- Neumann, Weikum
- 2009
Citation Context ... the overall “selectiveness” of a CBGP can be attributed to a single triple pattern in that CBGP. In that case, a system could use runtime optimization techniques such as sideways-information passing =-=[18]-=- to early-prune most of the intermediate results, which may not be possible in the original example (for a more detailed discussion refer to [2]). From a testing point of view, it is important to incl... |
76 | SPARQL basic graph pattern optimization using selectivity estimation.
- Stocker, Seaborne, et al.
- 2008
Citation Context ...query (execution) plan depends on the characteristics of the data as much as the query itself. For example, systems rely heavily on selectivity and cardinality estimations for query plan optimization =-=[23]-=-. Consider the following example: A system chooses to break down a BGP B = {tpA, tpB, tpC} into its triple patterns and to execute them in a specific order, namely, tpA, tpB and then tpC. The system p... |
73 | Column-store support for RDF data management: not all swans are white.
- Sidirourgos, Goncalves, et al.
- 2008
Citation Context ...evaluations we include the popular prototype RDF-3x [19] (v0.3.7) that follows the latter approach. It has also been argued that grouping data can significantly improve performance for some workloads =-=[22]-=-. Hence, a second option is to group data by RDF predicates, where data are explicitly partitioned into multiple tables (one table per predicate) and the tables are stored in a column-store [1]. We te... |
72 | SW-Store: A vertically partitioned DBMS for semantic web data management.
- Abadi, Marcus, et al.
- 2009
Citation Context ...loads [22]. Hence, a second option is to group data by RDF predicates, where data are explicitly partitioned into multiple tables (one table per predicate) and the tables are stored in a column-store =-=[1]-=-. We test the effectiveness of this approach on MonetDB [15] (v1.7), which is a state-of-the-art column-store. A third option is to natively represent RDF graph structure, for which we use the prototy... |
53 | 4store: The design and implementation of a clustered RDF store.
- Harris, Lamb, et al.
- 2009
Citation Context ... represent RDF graph structure, for which we use the prototype system gStore [24] (v0.2). We also test three industrial systems, namely, Virtuoso Open Source (VOS) [11] (v6.1.8 and v7.1.0) and 4Store =-=[13]-=- (v1.1.5). Both VOS and 4Store group and index data primarily based on RDF predicates. Furthermore, VOS 6.1 is a row-store and VOS 7.1 is a column-store. 5.2 Experimental Setup. In our experiments, ... |
33 | Apples and oranges: a comparison of RDF benchmarks and real RDF datasets.
- Duan, Kementsietsidis, et al.
- 2011
Citation Context ...s increasing demand for high-performance RDF data management systems. A number of such systems have been developed; however, as queries executed on these systems become increasingly more diverse [4], =-=[10]-=-, [16], these systems have started to display unpredictable behaviour, even to the extent that on some queries they time out (cf., Fig. 4). This behaviour is not captured by existing benchmarks. The p... |
32 | An empirical study of real-world SPARQL queries
- Arias, Fernández, et al.
Citation Context ...ere is increasing demand for high-performance RDF data management systems. A number of such systems have been developed; however, as queries executed on these systems become increasingly more diverse =-=[4]-=-, [10], [16], these systems have started to display unpredictable behaviour, even to the extent that on some queries they time out (cf., Fig. 4). This behaviour is not captured by existing benchmarks.... |
31 | MonetDB: Two decades of research in column-oriented database architectures
- Idreos, Groffen, et al.
Citation Context ...predicates, where data are explicitly partitioned into multiple tables (one table per predicate) and the tables are stored in a column-store [1]. We test the effectiveness of this approach on MonetDB =-=[15]-=- (v1.7), which is a state-of-the-art column-store. A third option is to natively represent RDF graph structure, for which we use the prototype system gStore [24] (v0.2). We also test three industrial ... |
29 | Semantics of SPARQL.
- Pérez, Arenas, et al.
- 2006
Citation Context ..., referred to as a constrained BGP (CBGP), where B is a finite set of triple patterns (i.e., a BGP) and F is a finite set of SPARQL filter expressions. Hence, by using the algebraic syntax for SPARQL =-=[3]-=-, a CBGP $\bar{B} = \langle B, F \rangle$ with $F = \{f_1, \ldots, f_n\}$ is equivalent to a SPARQL graph pattern P of the form $(\ldots(B \text{ FILTER } f_1)\ldots) \text{ FILTER } f_n$ (if $F = \emptyset$, then P is the BGP B). Consequently, the evaluation of... |
25 | gStore: Answering SPARQL queries via subgraph matching.
- Zou, Mo, et al.
- 2011
Citation Context ...ectiveness of this approach on MonetDB [15] (v1.7), which is a state-of-the-art column-store. A third option is to natively represent RDF graph structure, for which we use the prototype system gStore =-=[24]-=- (v0.2). We also test three industrial systems, namely, Virtuoso Open Source (VOS) [11] (v6.1.8 and v7.1.0) and 4Store [13] (v1.1.5). Both VOS and 4Store group and index data primarily based on RDF pr... |
6 | Virtuoso, a Hybrid RDBMS/Graph Column Store
- Erling
- 2012
Citation Context ...store. A third option is to natively represent RDF graph structure, for which we use the prototype system gStore [24] (v0.2). We also test three industrial systems, namely, Virtuoso Open Source (VOS) =-=[11]-=- (v6.1.8 and v7.1.0) and 4Store [13] (v1.1.5). Both VOS and 4Store group and index data primarily based on RDF predicates. Furthermore, VOS 6.1 is a row-store and VOS 7.1 is a column-store. 5.2 Exp... |
6 | From linked data to relevant data – time is the essence
- Kirchberg, Ko, et al.
Citation Context ...easing demand for high-performance RDF data management systems. A number of such systems have been developed; however, as queries executed on these systems become increasingly more diverse [4], [10], =-=[16]-=-, these systems have started to display unpredictable behaviour, even to the extent that on some queries they time out (cf., Fig. 4). This behaviour is not captured by existing benchmarks. The problem... |
6 | DBpedia SPARQL benchmark - performance assessment with real queries on real data
- Morsey, Lehmann, et al.
- 2011
Citation Context ...d structurally complex queries. Ideally, we would like an RDF system to execute simple queries extremely fast while scaling well with increasing number of triple patterns. In fact, DBpedia query logs =-=[17]-=- reveal that while in general most queries contain only a few triple patterns, users may issue (albeit infrequently) queries having more than 50 triple patterns. Join Vertex Count: This feature repres... |
4 | Workload Matters: Why RDF Databases Need a New Design.
- Aluc, Ozsu, et al.
- 2014
Citation Context ... in physical design. For example, in a previous work, we illustrate that the choice of physical design in an RDF system is very sensitive to the types of joins that the system can efficiently support =-=[2]-=-. Hence, we introduce a feature called “join vertex type”. Likewise, we note that a system’s performance depends on the characteristics of the data as much as the query itself. Consequently, we introd... |