Update Exchange with Mappings and Provenance
- In Very Large Data Bases (VLDB)
, 2007
Cited by 44 (25 self)
We consider systems for data sharing among heterogeneous peers related by a network of schema mappings. Each peer has a locally controlled and edited database instance, but wants to ask queries over related data from other peers as well. To achieve this, every peer’s updates propagate along the mappings to the other peers. However, this update exchange is filtered by trust conditions, which express what data and sources a peer judges to be authoritative and which may cause a peer to reject another’s updates. In order to support such filtering, updates carry provenance information. These systems target scientific data sharing applications, and their general principles and architecture have been described in [20]. In this paper we present methods for realizing such systems. Specifically, we extend techniques from data integration, data exchange, and incremental view maintenance to propagate updates along mappings; we integrate a novel model for tracking data provenance, such that curators may filter updates based on trust conditions over this provenance; we discuss strategies for implementing our techniques in conjunction with an RDBMS; and we experimentally demonstrate the viability of our techniques in the ORCHESTRA prototype system.
DL-Lite in the light of first-order logic
- In Proc. of the 22nd Conf. on AI (AAAI-07)
, 2007
Cited by 37 (23 self)
The use of ontologies in various application domains, such as Data Integration, the Semantic Web, or ontology-based data management, where ontologies provide the access to large amounts of data, is posing challenging requirements w.r.t. a trade-off between expressive power of a DL and efficiency of reasoning. The logics of the DL-Lite family were specifically designed to meet such requirements and optimized w.r.t. the data complexity of answering complex types of queries. In this paper we propose DL-Lite_bool, an extension of DL-Lite with full Booleans and number restrictions, and study the complexity of reasoning in DL-Lite_bool and its significant sub-logics. We obtain our results, together with useful insights into the properties of the studied logics, by a novel reduction to the one-variable fragment of first-order logic. We study the computational complexity of satisfiability and subsumption, and the data complexity of answering positive existential queries (which extend unions of conjunctive queries). Notably, we extend the LOGSPACE upper bound for the data complexity of answering unions of conjunctive queries in DL-Lite to positive queries and to the possibility of expressing also number restrictions, and hence local functionality in the TBox.
Aspects of distributed and modular ontology reasoning
- In IJCAI
, 2005
Cited by 31 (10 self)
We investigate a formalism for reasoning with multiple local ontologies, connected by directional semantic mappings. We propose: (1) a relatively small change of semantics which localizes inconsistency (thereby making global satisfiability checks unnecessary), and preserves directionality of “knowledge import”; (2) a characterization of inferences using a fixed-point operator, which can form the basis of a cache-based implementation for local reasoners; (3) a truly distributed tableaux algorithm for cases when the local reasoners use subsets of SHIQ. Throughout, we indicate the applicability of the results to several recent proposals for knowledge representation and reasoning that support modularity, scalability and distributed reasoning.
Reconciling while Tolerating Disagreement in Collaborative Data Sharing
, 2006
Cited by 31 (13 self)
In many data sharing settings, such as within the biological and biomedical communities, global data consistency is not always attainable: different sites’ data may be dirty, uncertain, or even controversial. Collaborators are willing to share their data, and in many cases they also want to selectively import data from others, but must occasionally diverge when they disagree about uncertain or controversial facts or values. For this reason, traditional data sharing and data integration approaches are not applicable, since they require a globally consistent data instance. Additionally, many of these approaches do not allow participants to make updates; if they do, concurrency control algorithms or inconsistency repair techniques must be used to ensure a consistent view of the data for all users. In this paper, ...
Consistent Query Answers in Virtual Data Integration Systems
- In Inconsistency Tolerance, Springer LNCS 3300
, 2005
Cited by 30 (18 self)
When data sources are virtually integrated there is no common and centralized mechanism for maintaining global consistency. In consequence, it is likely that inconsistencies with respect to certain global integrity constraints (ICs) will occur. In this chapter we consider the problem of defining and computing those answers that are consistent wrt the global ICs when global queries are posed to virtual data integration systems whose sources are specified following the local-as-view approach.
Orchestra: Rapid, collaborative sharing of dynamic data
- In CIDR
, 2005
Cited by 25 (7 self)
Conventional data integration techniques employ a “top-down” design philosophy, starting by assessing requirements and defining a global schema, and then mapping data sources to that schema. This works well if the problem domain is well-understood and relatively static, as with enterprise data. However, it is fundamentally mismatched with the “bottom-up” model of scientific data sharing, in which new data needs to be rapidly developed, published, and then assessed, filtered, and revised by others. We address the need for bottom-up collaborative data sharing, in which independent researchers or groups with different goals, schemas, and data can share information in the absence of global agreement. Each group independently curates, revises, and extends its data; eventually the groups compare and reconcile their changes, but they are not required to agree. This paper describes our initial design and prototype of the ORCHESTRA system, which focuses on managing disagreement among multiple data representations and instances. Our work represents an important evolution of the concepts of peer-to-peer data sharing [23], which considers revision, disagreement, authority, and intermittent participation. ∗ Work done while an M.S. student at the Univ. of Pennsylvania.
Inconsistency tolerance in p2p data integration: an epistemic logic approach
- In Proc. of the 10th Int. Workshop on Database Programming Languages (DBPL)
, 2005
Cited by 22 (4 self)
We study peer-to-peer data integration, where each peer models an autonomous system that exports data in terms of its own schema, and data interoperation is achieved by means of mappings among the peer schemas, rather than through a global schema. We propose a multi-modal epistemic semantics based on the idea that each peer is conceived as a rational agent that exchanges knowledge/belief with other peers, thus nicely modeling the modular structure of the system. We then address the issue of dealing with possible inconsistencies, and distinguish between two types of inconsistencies, called local and P2P, respectively. We define a nonmonotonic extension of our logic that is able to reason on the beliefs of peers under inconsistency tolerance. Tolerance to local inconsistency essentially means that the presence of inconsistency within one peer does not affect the consistency of the whole system. Tolerance to P2P inconsistency means being able to resolve inconsistencies arising from the interaction between peers. We study query answering and its data complexity in this setting, and we present an algorithm that is sound and complete with respect to the proposed semantics, and optimal with respect to worst-case complexity.
Reconciling concepts and relations in heterogeneous ontologies
- In: Proc. ESWC 2006, Budva
, 2006
Cited by 16 (2 self)
In the extensive usage of ontologies envisaged by the Semantic Web there is a compelling need for expressing mappings between the components of heterogeneous ontologies. These mappings are of many different forms and involve the different components of ontologies. State-of-the-art languages for ontology mapping make it possible to express semantic relations between homogeneous components of different ontologies; namely, they allow mapping concepts into concepts, individuals into individuals, and properties into properties. Many real cases, however, highlight the necessity to establish semantic relations between heterogeneous components, for example to map a concept into a relation or vice versa. To support the interoperability of ontologies we therefore need to enrich mapping languages with constructs for the representation of heterogeneous mappings. In this paper, we propose an extension of Distributed Description Logics (DDL) to allow for the representation of mappings between concepts and relations. We provide a semantics of the proposed language and show its main logical properties.
Hyper: A framework for peer-to-peer data integration on grids
- In Proc. of the Int. Conference on Semantics of a Networked World: Semantics for Grid Databases (ICSNW 2004)
, 2004
Cited by 12 (2 self)
Data Grids allow for seeing heterogeneous, distributed, and dynamic informational resources as if they were a uniform, stable, secure, and reliable database. According to this view, current proposals for data integration on Grids are based on the notion of global schema built over a collection of autonomous information sources. On the other hand, in dynamic and distributed environments, such a hierarchical and centralized architecture is not well suited for effective information integration. Peer-to-peer data integration aims at overcoming these drawbacks by modeling autonomous information systems as peers, and establishing mappings among peers without resorting to any hierarchical structure. In this paper, we present Hyper, a joint research initiative of Università di Roma “La Sapienza” and IBM Italia, which aims at developing principles and techniques for peer-to-peer data integration on a Grid infrastructure. The main contributions presented are a semantic characterization of P2P data integration, the deployment of our P2P framework on a Grid architecture, and the design of a query answering algorithm that is coherent both with the semantics and with the Grid infrastructure.
On Reconciling Data Exchange, Data Integration, and Peer Data Management
Cited by 12 (1 self)
Data exchange and virtual data integration have been the subject of several investigations in the recent literature. At the same time, the notion of peer data management has emerged as a powerful abstraction of many forms of flexible and dynamic data-centered distributed systems. Although research on the above issues has progressed considerably in recent years, a clear understanding of how to combine data exchange and data integration in peer data management is still missing. This is the subject of the present paper. We start our investigation by first proposing a novel framework for peer data exchange, showing that it is a generalization of the classical data exchange setting. We also present algorithms for all the relevant data exchange tasks, and show that they can all be done in polynomial time with respect to data complexity. Based on the motivation that typical mappings and integrity constraints found in data integration are not captured by peer data exchange, we extend the framework to incorporate these features. One of the main difficulties is that the constraints of this new class are not amenable to materialization. We address this issue by resorting to a suitable combination of virtual and materialized data exchange, showing that the resulting framework is a generalization of both classical data exchange and classical data integration, and that the new setting incorporates the most expressive types of mappings and constraints considered in the two contexts. Finally, we present algorithms for all the relevant data management tasks in the new setting as well, and show that, again, their data complexity is polynomial.

