Results 1 - 10
of
113
DBpedia: A Nucleus for a Web of Open Data
- In 6th Int’l Semantic Web Conference, Busan, Korea
, 2007
"... Abstract DBpedia is a community effort to extract structured information from Wikipedia and to make this information available on the Web. DBpedia allows you to ask sophisticated queries against datasets derived from Wikipedia and to link other datasets on the Web to Wikipedia data. We describe the ..."
Abstract
-
Cited by 203 (19 self)
- Add to MetaCart
Abstract DBpedia is a community effort to extract structured information from Wikipedia and to make this information available on the Web. DBpedia allows you to ask sophisticated queries against datasets derived from Wikipedia and to link other datasets on the Web to Wikipedia data. We describe the extraction of the DBpedia datasets, and how the resulting information is published on the Web for human- and machineconsumption. We describe some emerging applications from the DBpedia community and show how website authors can facilitate DBpedia content within their sites. Finally, we present the current status of interlinking DBpedia with other open datasets on the Web and outline how DBpedia could serve as a nucleus for an emerging Web of open data. 1
Piazza: Data Management Infrastructure for Semantic Web
, 2003
"... The Semantic Web envisions a World Wide Web in which data is described with rich semantics and applications can pose complex queries. To this point, researchers have defined new languages for specifying meanings for concepts and developed techniques for reasoning about them, using RDF as the data mo ..."
Abstract
-
Cited by 134 (11 self)
- Add to MetaCart
The Semantic Web envisions a World Wide Web in which data is described with rich semantics and applications can pose complex queries. To this point, researchers have defined new languages for specifying meanings for concepts and developed techniques for reasoning about them, using RDF as the data model. To flourish, the Semantic Web needs to be able to accommodate the huge amounts of existing data and the applications operating on them. To achieve this, we are faced with two problems. First, most of the world's data is available not in RDF but in XML; XML and the applications consuming it rely not only on the domain structure of the data, but also on its document structure. Hence, to provide interoperability between such sources, we must map between both their domain structures and their document structures. Second, data management practitioners often prefer to exchange data through local point-to-point data translations, rather than mapping to common mediated schemas or ontologies.
PeerDB: A P2P-based System for Distributed Data Sharing
, 2003
"... In this paper, we present the design and evaluation of PeerDB, a peer-to-peer (P2P) distributed data sharing system. PeerDB distinguishes itself from existing P2P systems in several ways. First, it is a full-fledge data management system that supports fine-grain content-based searching. Second, it f ..."
Abstract
-
Cited by 106 (10 self)
- Add to MetaCart
In this paper, we present the design and evaluation of PeerDB, a peer-to-peer (P2P) distributed data sharing system. PeerDB distinguishes itself from existing P2P systems in several ways. First, it is a full-fledge data management system that supports fine-grain content-based searching. Second, it facilitates sharing of data without shared schema. Third, it combines the power of mobile agents into P2P systems to perform operations at peers' sites. Fourth, PeerDB network is self-configurable, i.e., a node can dynamically optimize the set of peers that it can communicate directly with based on some optimization criterion. By keeping peers that provide most information or services in close proximity (i.e, direct communication), the network bandwidth can be better utilized and system performance can be optimized. We implemented and evaluated PeerDB on a cluster of 32 Pentium II PCs. Our experimental results show that PeerDB can effectively exploit P2P technologies for distributed data sharing.
Composing Mappings among Data Sources
- In VLDB
, 2003
"... Semantic mappings between data sources play a key role in several data sharing architectures. Mappings provide the relationships between data stored in different sources, and therefore enable answering queries that require data from other nodes in a data sharing network. Composing mappings is one of ..."
Abstract
-
Cited by 98 (7 self)
- Add to MetaCart
Semantic mappings between data sources play a key role in several data sharing architectures. Mappings provide the relationships between data stored in different sources, and therefore enable answering queries that require data from other nodes in a data sharing network. Composing mappings is one of the core problems that lies at the heart of several optimization methods in data sharing networks, such as caching frequently traversed paths and redundancy analysis.
Efficient Query Reformulation in Peer Data Management Systems
, 2004
"... Peer data management systems (PDMS) offer a flexible architecture for decentralized data sharing. In a PDMS, every peer is associated with a schema that represents the peer's domain of interest, and semantic relationships between peers are provided locally between pairs (or small sets) of peers. By ..."
Abstract
-
Cited by 92 (3 self)
- Add to MetaCart
Peer data management systems (PDMS) offer a flexible architecture for decentralized data sharing. In a PDMS, every peer is associated with a schema that represents the peer's domain of interest, and semantic relationships between peers are provided locally between pairs (or small sets) of peers. By traversing semantic paths of mappings, a query over one peer can obtain relevant data from any reachable peer in the network. Semantic paths are traversed by reformulating queries at a peer into queries on its neighbors. Naively following semantic paths...
Logical foundations of peer-to-peer data integration
- In Proc. of the 23rd ACM SIGACT SIGMOD SIGART Sym. on Principles of Database Systems (PODS-2004
, 2004
"... In peer-to-peer data integration, each peer exports data in terms of its own schema, and data interoperation is achieved by means of mappings among the peer schemas. Peers are autonomous systems and mappings are dynamically created and changed. One of the challenges in these systems is answering que ..."
Abstract
-
Cited by 77 (12 self)
- Add to MetaCart
In peer-to-peer data integration, each peer exports data in terms of its own schema, and data interoperation is achieved by means of mappings among the peer schemas. Peers are autonomous systems and mappings are dynamically created and changed. One of the challenges in these systems is answering queries posed to one peer taking into account the mappings. Obviously, query answering strongly depends on the semantics of the overall system. In this paper, we compare the commonly adopted approach of interpreting peerto-peer systems using a first-order semantics, with an alternative approach based on epistemic logic. We consider several central properties of peer-to-peer systems: modularity, generality, and decidability. We argue that the approach based on epistemic logic is superior with respect to all the above properties. In particular, we show that, in systems in which peers have decidable schemas and conjunctive mappings, but are arbitrarily interconnected, the first-order approach may lead to undecidability of query answering, while the epistemic approach always preserves decidability. This is a fundamental property, since the actual interconnections among peers are not under the control of any actor in the system. 1.
The Architecture of PIER: an Internet-Scale Query Processor
- In CIDR
, 2005
"... This paper presents the architecture of PIER , an Internetscale query engine we have been building over the last three years. PIER is the first general-purpose relational query processor targeted at a peer-to-peer (p2p) architecture of thousands or millions of participating nodes on the Internet. ..."
Abstract
-
Cited by 59 (5 self)
- Add to MetaCart
This paper presents the architecture of PIER , an Internetscale query engine we have been building over the last three years. PIER is the first general-purpose relational query processor targeted at a peer-to-peer (p2p) architecture of thousands or millions of participating nodes on the Internet. It supports massively distributed, database-style dataflows for snapshot and continuous queries. It is intended to serve as a building block for a diverse set of Internet-scale informationcentric applications, particularly those that tap into the standardized data readily available on networked machines, including packet headers, system logs, and file names
Update Exchange with Mappings and Provenance
- In Very Large Data Bases (VLDB
, 2007
"... We consider systems for data sharing among heterogeneous peers related by a network of schema mappings. Each peer has a locally controlled and edited database instance, but wants to ask queries over related data from other peers as well. To achieve this, every peer’s updates propagate along the mapp ..."
Abstract
-
Cited by 44 (25 self)
- Add to MetaCart
We consider systems for data sharing among heterogeneous peers related by a network of schema mappings. Each peer has a locally controlled and edited database instance, but wants to ask queries over related data from other peers as well. To achieve this, every peer’s updates propagate along the mappings to the other peers. However, this update exchange is filtered by trust conditions — expressing what data and sources a peer judges to be authoritative — which may cause a peer to reject another’s updates. In order to support such filtering, updates carry provenance information. These systems target scientific data sharing applications, and their general principles and architecture have been described in [20]. In this paper we present methods for realizing such systems. Specifically, we extend techniques from data integration, data exchange, and incremental view maintenance to propagate updates along mappings; we integrate a novel model for tracking data provenance, such that curators may filter updates based on trust conditions over this provenance; we discuss strategies for implementing our techniques in conjunction with an RDBMS; and we experimentally demonstrate the viability of our techniques in the ORCHES-TRA prototype system. 1.
Crossing the Structure Chasm
- IN CIDR
, 2003
"... It has frequently been observed that most of the world's data lies outside database systems. The reason is that database systems focus on structured data, leaving the unstructured realm to others. The world of unstructured data has several very appealing properties, such as ease of authoring, query ..."
Abstract
-
Cited by 42 (15 self)
- Add to MetaCart
It has frequently been observed that most of the world's data lies outside database systems. The reason is that database systems focus on structured data, leaving the unstructured realm to others. The world of unstructured data has several very appealing properties, such as ease of authoring, querying and data sharing. In contrast, authoring, querying and sharing structured data require significant effort, albeit with the benefit of rich query languages and exact answers. We argue
A Robust Logical and Computational Characterisation of Peer-to-Peer Database Systems
, 2003
"... In this paper we give a robust logical and computational characterisation of peer-to-peer database systems. We rst de ne a precise model-theoretic semantics of a peer-to-peer system, which allows for local inconsistency handling. We then characterise the general computational properties for th ..."
Abstract
-
Cited by 38 (9 self)
- Add to MetaCart
In this paper we give a robust logical and computational characterisation of peer-to-peer database systems. We rst de ne a precise model-theoretic semantics of a peer-to-peer system, which allows for local inconsistency handling. We then characterise the general computational properties for the problem of answering queries to such a peer-topeer system. Finally, we devise tight complexity bounds and distributed procedures for the problem of answering queries in few relevant special cases.

