Data Exchange: Semantics and Query Answering
 In ICDT
, 2003
Cited by 420
Data exchange is the problem of taking data structured under a source schema and creating an instance of a target schema that reflects the source data as accurately as possible. In this paper, we address foundational and algorithmic issues related to the semantics of data exchange and to query answering in the context of data exchange. These issues arise because, given a source instance, there may be many target instances that satisfy the constraints of the data exchange problem. We give an algebraic specification that selects, among all solutions to the data exchange problem, a special class of solutions that we call universal. A universal solution has no more and no less data than required for data exchange and it represents the entire space of possible solutions. We then identify fairly general, and practical, conditions that guarantee the existence of a universal solution and yield algorithms to compute a canonical universal solution efficiently. We adopt the notion of "certain answers" in indefinite databases for the semantics for query answering in data exchange. We investigate the computational complexity of computing the certain answers in this context and also study the problem of computing the certain answers of target queries by simply evaluating them on a canonical universal solution.
Data Exchange: Getting to the Core
, 2003
Cited by 174
Data exchange is the problem of taking data structured under a source schema and creating an instance of a target schema that reflects the source data as accurately as possible. Given a source instance, there may be many solutions to the data exchange problem, that is, many target instances that satisfy the constraints of the data exchange problem. In an earlier paper, we identified a special class of solutions that we call universal. A universal solution has homomorphisms into every possible solution, and hence is a "most general possible" solution. Nonetheless, given a source instance, there may be many universal solutions. This naturally raises the question of whether there is a "best" universal solution, and hence a best solution for data exchange. We answer this question by considering the well-known notion of the core of a structure, a notion that was first studied in graph theory, but has also played a role in conjunctive-query processing. The core of a structure is the smallest substructure that is also a homomorphic image of the structure. All universal solutions have the same core (up to isomorphism); we show that this core is also a universal solution, and hence the smallest universal solution. The uniqueness of the core of a universal solution together with its minimality make the core an ideal solution for data exchange. Furthermore, we show that the core is the best among all universal solutions for answering unions of conjunctive queries with inequalities. After this, we investigate the computational complexity of producing the core. Well-known results by Chandra and Merlin imply that, unless P = NP, there is no polynomial-time algorithm that, given a structure as input, returns the core of that structure as output. In contrast, in the context of data e...
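The definition of the core given in this abstract can be illustrated with a small brute-force sketch. It is not the algorithm from the paper: the representation of an instance as a set of (relation, tuple) facts, the convention that labeled nulls are strings starting with "_", and the names is_null and core are all assumptions made for illustration. The search tries every mapping of nulls to values, so it is exponential in the number of nulls, in line with the hardness result mentioned above for general structures.

```python
from itertools import product


def is_null(value):
    # Assumed convention: labeled nulls are strings starting with "_";
    # every other value is a constant that homomorphisms must keep fixed.
    return isinstance(value, str) and value.startswith("_")


def core(instance):
    """Return a smallest sub-instance that is a homomorphic image of `instance`.

    `instance` is an iterable of facts (relation_name, tuple_of_values).
    Brute force: enumerate every mapping of nulls to values in the instance.
    """
    facts = frozenset(instance)
    values = sorted({v for _, args in facts for v in args}, key=repr)
    nulls = [v for v in values if is_null(v)]

    best = facts  # the identity mapping always yields the instance itself
    for targets in product(values, repeat=len(nulls)):
        h = dict(zip(nulls, targets))
        image = frozenset(
            (rel, tuple(h.get(v, v) for v in args)) for rel, args in facts
        )
        # Keep the mapping only if its image stays inside the instance
        # (an endomorphism) and is smaller than the best image so far.
        if image <= facts and len(image) < len(best):
            best = image
    return best


# E(a, _x) is subsumed by E(a, b): mapping _x to b folds the first fact away.
redundant = {("E", ("a", "_x")), ("E", ("a", "b"))}
print(core(redundant))

# Here no null can be folded away, so the instance is its own core.
chain = {("E", ("a", "_x")), ("E", ("_x", "b"))}
print(core(chain))
```

A minimum-size endomorphic image cannot be mapped into a proper part of itself, since composing the two mappings would give a smaller image still; that is why taking the smallest image suffices in this sketch.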
Logic and databases: a deductive approach
 ACM Computing Surveys
, 1984
Cited by 168
The purpose of this paper is to show that logic provides a convenient formalism for studying classical database problems. There are two main parts to the paper, devoted respectively to conventional databases and deductive databases. In the first part, we focus on query languages, integrity modeling and maintenance, query optimization, and data
A general Datalog-based framework for tractable query answering over ontologies
 In Proc. PODS 2009. ACM
, 2009
Cited by 135
Ontologies play a key role in the Semantic Web [4], data modeling, and information integration [16]. Recent trends in ontological reasoning have shifted from decidability issues to tractability ones, as e.g. reflected by the work on the DL-Lite family of tractable description logics (DLs) [11, 19]. An important result of these works is that the main
Schema Mappings, Data Exchange, and Metadata Management
, 2005
Cited by 126
Schema mappings are high-level specifications that describe the relationship between database schemas. Schema mappings are prominent in several different areas of database management, including database design, information integration, data exchange, metadata management, and peer-to-peer data management systems. Our main aim in this paper is to present an overview of recent advances in data exchange and metadata management, where the schema mappings are between relational schemas. In addition, we highlight some research issues and directions for future work.
Reformulation of XML queries and constraints
 In ICDT’03
, 2003
Cited by 108
Abstract. We state and solve the query reformulation problem for XML publishing in a general setting that allows mixed (XML and relational) storage for the proprietary data and exploits redundancies (materialized views, indexes and caches) to enhance performance. The correspondence between published and proprietary schemas is specified by views in both directions, and the same algorithm performs rewriting-with-views, composition-with-views, or the combined effect of both, unifying the Global-As-View and Local-As-View approaches to data integration. We prove a completeness theorem which guarantees that under certain conditions, our algorithm will find a minimal reformulation if one exists. Moreover, we identify conditions when this algorithm achieves optimal complexity bounds. We solve the reformulation problem for constraints by exploiting a reduction to the problem of query reformulation.
Fuzzy Functional Dependencies and Lossless Join Decomposition of Fuzzy Relational Database Systems
 ACM Transactions on Database Systems
, 1988
Cited by 90
This paper deals with the application of fuzzy logic in a relational database environment with the objective of capturing more meaning of the data. It is shown that with suitable interpretations for the fuzzy membership functions, a fuzzy relational data model can be used to represent ambiguities in data values as well as impreciseness in the association among them. Relational operators for fuzzy relations have been studied, and applicability of fuzzy logic in capturing integrity constraints has been investigated. By introducing a fuzzy resemblance measure EQUAL for comparing domain values, the definition of classical functional dependency has been generalized to fuzzy functional dependency (ffd). The implication problem of ffds has been examined and a set of sound and complete inference axioms has been proposed. Next, the problem of lossless join decomposition of fuzzy relations for a given set of fuzzy functional dependencies is investigated. It is proved that with a suitable restriction on EQUAL, the design theory of a classical relational database with functional dependencies can be extended to fuzzy relations satisfying fuzzy functional dependencies.
Horn clauses and database dependencies
 Journal of the ACM
, 1982
Cited by 89
Abstract. Certain first-order sentences, called "dependencies," about relations in a database are defined and studied. These dependencies seem to include all previously defined dependencies as special cases. A new concept is introduced, called "faithfulness (with respect to direct product)," which enables powerful results to be proved about the existence of "Armstrong relations" in the presence of these new dependencies. (An Armstrong relation is a relation that obeys precisely those dependencies that are the logical consequences of a given set of dependencies.) Results are also obtained about characterizing the class of projections of those relations that obey a given set of dependencies.
J.B.: The chase revisited
 In: PODS (2008)
Cited by 83
We revisit the classical chase procedure, studying its properties as well as its applicability to standard database problems. We settle (in the negative) the open problem of decidability of termination of the standard chase, and we provide sufficient termination conditions which are strictly less over-conservative than the best previously known. We investigate the adequacy of the standard chase for checking query containment under constraints, constraint implication and computing certain answers in data exchange. We find room for improvement after gaining a deeper understanding of the chase by separating the algorithm from its result. We identify the properties of the chase result that are essential to the above applications, and we introduce the more general notion of an F-universal model set, which supports query and constraint languages that are closed under a class F of mappings. By choosing F appropriately, we extend prior results all the way to existential first-order queries and ∀∃-first-order constraints (and various standard sublanguages). We show that the standard chase is incomplete for finding universal model sets, and we introduce the extended core chase which is complete, i.e. finds an F-universal model set when it exists. A key advantage of the new chase is that the same algorithm can be applied for the mapping classes F of interest, by simply modifying appropriately the set of constraints given as input. Even when restricted to the typical input in prior work (unions of conjunctive queries and embedded dependencies), the new chase supports certain answer computation and containment/implication tests in strictly more cases than the incomplete standard chase.
Constraint-Based XML Query Rewriting for Data Integration
 In SIGMOD
, 2004
Cited by 78
We study the problem of answering queries through a target schema, given a set of mappings between one or more source schemas and this target schema, and given that the data is at the sources. The schemas can be any combination of relational or XML schemas, and can be independently designed. In addition to the source-to-target mappings, we consider as part of the mapping scenario a set of target constraints specifying additional properties on the target schema. This becomes particularly important when integrating data from multiple data sources with overlapping data and when such constraints can express data merging rules at the target. We define the semantics of query answering in such an integration scenario, and design two novel algorithms, basic query rewrite and query resolution, to implement the semantics. The basic query rewrite algorithm reformulates target queries in terms of the source schemas, based on the mappings. The query resolution algorithm generates additional rewritings that merge related information from multiple sources and assemble a coherent view of the data, by incorporating target constraints. The algorithms are implemented and then evaluated using a comprehensive set of experiments based on both synthetic and real-life data integration scenarios.