Data Exchange: Semantics and Query Answering
In ICDT, 2003
Cited by 220 (28 self)
Data exchange is the problem of taking data structured under a source schema and creating an instance of a target schema that reflects the source data as accurately as possible. In this paper, we address foundational and algorithmic issues related to the semantics of data exchange and to query answering in the context of data exchange. These issues arise because, given a source instance, there may be many target instances that satisfy the constraints of the data exchange problem. We give an algebraic specification that selects, among all solutions to the data exchange problem, a special class of solutions that we call universal. A universal solution has no more and no less data than required for data exchange and it represents the entire space of possible solutions. We then identify fairly general, and practical, conditions that guarantee the existence of a universal solution and yield algorithms to compute a canonical universal solution efficiently. We adopt the notion of "certain answers" in indefinite databases for the semantics for query answering in data exchange. We investigate the computational complexity of computing the certain answers in this context and also study the problem of computing the certain answers of target queries by simply evaluating them on a canonical universal solution.
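The idea of materializing a canonical universal solution and then answering target queries on it can be sketched in a few lines. This is a minimal illustration, not the paper's algorithm: it assumes the mapping consists only of source-to-target dependencies whose atoms contain variables only, it fires each dependency once per body match (a naive chase), and the names `Null`, `match`, `chase`, `certain_answers`, and the `Emp`/`Works`/`Mgr` schema are all hypothetical. The last function relies on the known property that, for conjunctive queries, discarding answer tuples that contain nulls leaves the certain answers.

```python
from itertools import count


class Null:
    """A labeled null: a placeholder for a value the source does not supply."""

    def __init__(self, n):
        self.n = n

    def __repr__(self):
        return f"N{self.n}"


def match(atoms, instance, binding=None):
    """Yield every variable binding that maps all atoms to facts of the instance."""
    binding = binding or {}
    if not atoms:
        yield dict(binding)
        return
    (rel, args), rest = atoms[0], atoms[1:]
    for fact in instance.get(rel, ()):
        if len(fact) != len(args):
            continue
        new = dict(binding)
        # Bind each variable, rejecting the fact on a conflicting binding.
        if all(new.setdefault(a, v) == v for a, v in zip(args, fact)):
            yield from match(rest, instance, new)


def chase(source, tgds):
    """Build a target instance by firing each dependency once per body match.

    Each dependency is (body_atoms, head_atoms, existential_variables).
    """
    fresh = count(1)
    target = {}
    for body, head, existential in tgds:
        for b in match(body, source):
            env = dict(b)
            for x in existential:
                env[x] = Null(next(fresh))  # fresh null per firing
            for rel, args in head:
                target.setdefault(rel, set()).add(tuple(env[a] for a in args))
    return target


def certain_answers(query_atoms, out_vars, target):
    """Evaluate a conjunctive query and keep only the null-free tuples."""
    answers = set()
    for b in match(query_atoms, target):
        row = tuple(b[v] for v in out_vars)
        if not any(isinstance(v, Null) for v in row):
            answers.add(row)
    return answers


# Hypothetical mapping: Emp(n, d) -> exists m. Works(n, d) and Mgr(d, m)
source = {"Emp": {("alice", "db"), ("bob", "db")}}
tgds = [([("Emp", ("n", "d"))],
         [("Works", ("n", "d")), ("Mgr", ("d", "m"))],
         ["m"])]
target = chase(source, tgds)
```

In this example every employee is a certain answer to "who works in a department that has a manager", whereas asking for the managers themselves yields no certain answers, because the only managers in `target` are nulls.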
Translating Web Data
In VLDB, 2002
Cited by 156 (31 self)
We present a novel framework for mapping between any combination of XML and relational schemas, in which a high-level, user-specified mapping is translated into semantically meaningful queries that transform source data into the target representation. Our approach works in two phases. In the first phase, the high-level mapping, expressed as a set of inter-schema correspondences, is converted into a set of mappings that capture the design choices made in the source and target schemas (including their hierarchical organization as well as their nested referential constraints).
Description Logics For Conceptual Data Modeling
1998
Cited by 123 (22 self)
The article aims at establishing a logical approach to class-based data modeling. After a discussion on class-based formalisms for data modeling, we introduce a family of logics, called Description Logics, which stem from research on Knowledge Representation in Artificial Intelligence. The logics of this family are particularly well suited for specifying data classes and relationships among classes, and are equipped with both formal semantics and inference mechanisms. We demonstrate that several popular data modeling formalisms, including the Entity-Relationship Model, and the most common variants of object-oriented data models, can be expressed in terms of specific logics of the family. For this purpose we use a unifying Description Logic, which incorporates all the features needed for the logical reformulation of the data models used in the various contexts. We also discuss the problem of devising reasoning procedures for the unifying formalism, and show that they provide valuable support for several important data modeling activities.
Data Exchange: Getting to the Core
2003
Cited by 100 (15 self)
Data exchange is the problem of taking data structured under a source schema and creating an instance of a target schema that reflects the source data as accurately as possible. Given a source instance, there may be many solutions to the data exchange problem, that is, many target instances that satisfy the constraints of the data exchange problem. In an earlier paper, we identified a special class of solutions that we call universal. A universal solution has homomorphisms into every possible solution, and hence is a "most general possible" solution. Nonetheless, given a source instance, there may be many universal solutions. This naturally raises the question of whether there is a "best" universal solution, and hence a best solution for data exchange. We answer this question by considering the well-known notion of the core of a structure, a notion that was first studied in graph theory, but has also played a role in conjunctive-query processing. The core of a structure is the smallest substructure that is also a homomorphic image of the structure. All universal solutions have the same core (up to isomorphism); we show that this core is also a universal solution, and hence the smallest universal solution. The uniqueness of the core of a universal solution together with its minimality make the core an ideal solution for data exchange. Furthermore, we show that the core is the best among all universal solutions for answering unions of conjunctive queries with inequalities. After this, we investigate the computational complexity of producing the core. Well-known results by Chandra and Merlin imply that, unless P = NP, there is no polynomial-time algorithm that, given a structure as input, returns the core of that structure as output. In contrast, in the context of data e...
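The definition of the core as the smallest substructure that is also a homomorphic image can be illustrated with a brute-force sketch. It is not the algorithm studied in the paper: it enumerates every mapping of nulls to values, so it is exponential in the number of nulls, in line with the hardness remark for arbitrary structures. The encoding is an assumption made for the example: facts are `(relation, tuple)` pairs, nulls are strings starting with `_`, and the names `is_null`, `apply_map`, `shrink`, and `core` are hypothetical.

```python
from itertools import product


def is_null(v):
    """Assumed convention: nulls are strings that start with an underscore."""
    return isinstance(v, str) and v.startswith("_")


def apply_map(h, facts):
    """Apply a mapping of nulls to values; constants are left unchanged."""
    return {(rel, tuple(h.get(v, v) for v in args)) for rel, args in facts}


def shrink(facts):
    """Return a proper sub-instance that is a homomorphic image, or None."""
    nulls = list({v for _, args in facts for v in args if is_null(v)})
    values = list({v for _, args in facts for v in args})
    for image in product(values, repeat=len(nulls)):
        mapped = apply_map(dict(zip(nulls, image)), facts)
        if mapped < facts:  # strictly fewer facts, all of them original
            return mapped
    return None


def core(facts):
    """Shrink repeatedly until no endomorphism gives a proper sub-instance."""
    facts = set(facts)
    while True:
        smaller = shrink(facts)
        if smaller is None:
            return facts
        facts = smaller


# Hypothetical universal solution with a redundant null manager.
solution = {
    ("Works", ("alice", "db")),
    ("Mgr", ("db", "_n1")),
    ("Mgr", ("db", "carol")),
}
minimal = core(solution)
```

Here the fact with the null `_n1` is subsumed by the fact naming `carol`, so `minimal` keeps only the two null-free facts.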
Unifying class-based representation formalisms
J. of Artificial Intelligence Research, 1999
Cited by 83 (32 self)
The notion of class is ubiquitous in computer science and is central in many formalisms for the representation of structured knowledge used both in knowledge representation and in databases. In this paper we study the basic issues underlying such representation formalisms and single out both their common characteristics and their distinguishing features. Such investigation leads us to propose a unifying framework in which we are able to capture the fundamental aspects of several representation languages used in different contexts. The proposed formalism is expressed in the style of description logics, which have been introduced in knowledge representation as a means to provide a semantically well-founded basis for the structural aspects of knowledge representation systems. The description logic considered in this paper is a subset of first order logic with nice computational characteristics. It is quite expressive and features a novel combination of constructs that has not been studied before. The distinguishing constructs are number restrictions, which generalize existence and functional dependencies, inverse roles, which allow one to refer to the inverse of a relationship, and possibly cyclic assertions, which are necessary for capturing real world
Foundations of Entity-Relationship Modeling
1991
Cited by 39 (5 self)
Database design methodologies should facilitate database modeling, effectively support database processing and transform a conceptual schema of the database to a high-performance database schema in the model of the corresponding DBMS. The Entity-Relationship Model is extended to the Higher-order Entity-Relationship Model (HERM) which can be used as a high-level, simple and comprehensive database design model for the complete database information on the structure, operations, static and dynamic semantics. The model has the expressive power of semantic models and possesses the simplicity of the entity-relationship model. The paper shows that the model has a well-founded semantics. Several semantical constraints are considered for this model.
1 Introduction. The problem of database design can be stated as follows: Design the logical and physical structure of a database in a given database management system to contain all the information required by the user and required for an efficient b...
Query Folding with Inclusion Dependencies
In Proc. of the 14th IEEE Int. Conf. on Data Engineering (ICDE'98), 1998
Cited by 36 (2 self)
Query folding is a technique for determining how a query may be answered using a given set of resources, which may include materialized views, cached results of previous queries, or queries answerable by other databases. The power of query folding can be considerably enhanced by taking into account integrity constraints that are known to hold on base relations. This paper describes an extension of query folding that utilizes inclusion dependencies to find foldings of queries that would otherwise be overlooked. We describe a complete strategy for finding foldings in the presence of inclusion dependencies and present a basic algorithm that implements that strategy. We also describe extensions to this algorithm when both inclusion and functional dependencies are considered.
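A small example may help show why an inclusion dependency can expose a folding that would otherwise be missed. It is a sketch under assumed names, not the paper's algorithm: the relations `R` and `S`, the query Q(x) :- R(x, y), S(y), the single materialized view V(x, y) :- R(x, y), and the dependency "every second column value of R occurs in S" are all hypothetical. With the dependency, the join with `S` is redundant and Q can be answered from V alone; without it, the same rewriting can return extra tuples.

```python
def satisfies_ind(R, S):
    """Check the inclusion dependency R[2] is contained in S[1]."""
    return {y for _, y in R} <= {s for (s,) in S}


def query(R, S):
    """Q(x) :- R(x, y), S(y), evaluated directly on the base relations."""
    return {x for x, y in R if (y,) in S}


def view(R):
    """Materialized view V(x, y) :- R(x, y)."""
    return set(R)


def folding(V):
    """Candidate folding Q'(x) :- V(x, y), which never touches S."""
    return {x for x, _ in V}


R = {(1, "a"), (2, "b")}
S_ok = {("a",), ("b",), ("c",)}   # the dependency holds
S_bad = {("a",)}                  # the dependency is violated

# Equivalent when the dependency holds ...
same_with_ind = query(R, S_ok) == folding(view(R))
# ... but not in general once it is violated.
same_without_ind = query(R, S_bad) == folding(view(R))
```

The comparison only tests two sample databases; establishing that a folding is correct for every database satisfying the dependencies is the reasoning task the paper addresses.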
Specifying and reasoning about dynamic access-control policies
of Lecture Notes in Computer Science, 2006
Cited by 29 (2 self)
Access-control policies have grown from simple matrices to non-trivial specifications written in sophisticated languages. The increasing complexity of these policies demands correspondingly strong automated reasoning techniques for understanding and debugging them. The need for these techniques is even more pressing given the rich and dynamic nature of the environments in which these policies evaluate. We define a framework to represent the behavior of access-control policies in a dynamic environment. We then specify several interesting, decidable analyses using first-order temporal logic. Our work illustrates the subtle interplay between logical and state-based methods, particularly in the presence of three-valued policies. We also define a notion of policy equivalence that is especially useful for modular reasoning.
Making Object-Oriented Schemas More Expressive
1994
Cited by 28 (11 self)
Current object-oriented data models lack several important features that would allow one to express relevant knowledge about the classes of a schema. In particular, there is no data model supporting simultaneously the inverse of the functions represented by attributes, the union, the intersection and the complement of classes, the possibility of using nonbinary relations, and the possibility of expressing cardinality constraints on attributes and relations. In this paper we define a new data model, called CAR, which extends the basic core of current object-oriented data models with all the above mentioned features. A technique is then presented both for checking the consistency of class definitions, and for computing the logical consequences of the knowledge represented in the schema. Finally, the inherent complexity of reasoning in CAR is investigated, and the complexity of our inferencing technique is studied, depending on various assumptions on the schema.
1 Introduction. Many recen...
Probabilistic data exchange
In Proc. ICDT, 2010
Cited by 21 (3 self)
The work reported here lays the foundations of data exchange in the presence of probabilistic data. This requires rethinking the very basic concepts of traditional data exchange, such as solution, universal solution, and the certain answers of target queries. We develop a framework for data exchange over probabilistic databases, and make a case for its coherence and robustness. This framework applies to arbitrary schema mappings, and finite or countably infinite probability spaces on the source and target instances. After establishing this framework and formulating the key concepts, we study the application of the framework to a concrete and practical setting where probabilistic databases are compactly encoded by means of annotations formulated over random Boolean variables. In this setting, we study the problems of testing for the existence of solutions and universal solutions, materializing such solutions, and evaluating target queries (for unions of conjunctive queries) in both the exact sense and the approximate sense. For each of the problems, we carry out a complexity analysis based on properties of the annotation, in various classes of dependencies. Finally, we show that the framework and results easily and completely generalize to allow not only the data, but also the schema mapping itself to be probabilistic.

