The Use of Information Capacity in Schema Integration and Translation
In VLDB, 1993
Cited by 67 (9 self)
In this paper, we carefully explore the assumptions behind using information capacity equivalence as a measure of correctness for judging transformed schemas in schema integration and translation methodologies. We present a classification of common integration and translation tasks based on their operational goals and derive from them the relative information capacity requirements of the original and transformed schemas. We show that for many tasks, information capacity equivalence of the schemas is not strictly required. Based on this, we present a new definition of correctness that reflects each undertaken task. We then examine existing methodologies and show how anomalies can arise when using those that do not meet the proposed correctness criteria.
1 Introduction
Formal work on schema equivalence has largely been ignored within practical schema integration and translation tools. Practitioners have felt that theoretical work is too narrow in scope to be applicable to the problems ...
Schema Equivalence in Heterogeneous Systems: Bridging Theory and Practice
1993
Cited by 60 (2 self)
Current theoretical work offers measures of schema equivalence based on the information capacity of schemas. This work is based on the existence of abstract functions satisfying various restrictions between the sets of all instances of two schemas. In considering schemas that arise in practice, however, it is not clear how to reason about the existence of such abstract functions. Further, these notions of equivalence tend to be too liberal in that schemas are often considered equivalent when a practitioner would consider them to be different. As a result, practical integration methodologies have not utilized this theoretical foundation and most of them have relied on ad-hoc approaches. We present results that seek to bridge this gap. First, we consider the problem of deciding information capacity equivalence and dominance of schemas that occur in practice, i.e., those that can express inheritance and simple integrity constraints. We show that this problem is undecidable. This undecidab...
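The abstract above does not spell out the "abstract functions" it refers to. As a rough illustration only, the sketch below assumes one common formulation: a mapping between the instance sets of two schemas preserves information when it is injective, so that no two distinct source instances collapse into the same target instance. The toy schema, the finite domains, and all function names are hypothetical and are not taken from the paper.

```python
from itertools import combinations, product


def instances(domains):
    """Enumerate every instance of a one-relation schema over finite domains.

    An instance is any subset of the cross product of the attribute domains.
    """
    tuples = list(product(*domains))
    return [
        frozenset(subset)
        for size in range(len(tuples) + 1)
        for subset in combinations(tuples, size)
    ]


def is_injective(mapping, source_instances):
    """Return True if no two source instances map to the same image."""
    images = [mapping(inst) for inst in source_instances]
    return len(set(images)) == len(images)


def swap_columns(inst):
    """Map R(a, b) to R'(b, a); nothing is lost."""
    return frozenset((b, a) for a, b in inst)


def project_first(inst):
    """Map R(a, b) to P(a); the b values are discarded."""
    return frozenset((a,) for a, _ in inst)


# Toy schema R(a, b) with a and b ranging over {0, 1}:
# 4 possible tuples, hence 2**4 = 16 instances.
r_instances = instances([(0, 1), (0, 1)])

print(is_injective(swap_columns, r_instances))   # True
print(is_injective(project_first, r_instances))  # False
```

Brute-force enumeration like this only works for tiny finite domains. It is meant to make the definition concrete, not to suggest a decision procedure for the schemas with inheritance and integrity constraints that the paper considers.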
Algebraic Graph-Based Approach to Management of Multi-Base Systems,II: Mathematical Aspects of Schema Integration
TR-9502, FRAME INFORM SYSTEMS, 1995
Management of Multiple Models in an Extensible Database Design Tool
In Proceedings of EDBT’96, LNCS 1057, 1996
Cited by 34 (6 self)
We describe the development of a tool, called MDM, for the management of multiple models and the translation of database schemes. This tool can be at the basis of an integrated CASE environment, supporting the analysis and design of information systems, that allows different representations for the same data schemes. We first present a graph-theoretic framework that allows us to formally investigate desirable properties of schema translations. The formalism is based on a classification of the constructs used in the known data model into a limited set of types. Then, on the basis of formal results, we develop general methodologies for deriving "good" translations between schemes and, more generally, between models. Finally, we define the architecture and the functionalities of a first prototype that implements the various features of the approach.
1 Introduction
During the past decade, the availability and use of automated tools for the analysis and development of information systems...
Bootstrapping pay-as-you-go data integration systems
In Proc. of SIGMOD, 2008
Cited by 29 (6 self)
Data integration systems offer a uniform interface to a set of data sources. Despite recent progress, setting up and maintaining a data integration application still requires significant upfront effort of creating a mediated schema and semantic mappings from the data sources to the mediated schema. Many application contexts involving multiple data sources (e.g., the web, personal information management, enterprise intranets) do not require full integration in order to provide useful services, motivating a pay-as-you-go approach to integration. With that approach, a system starts with very few (or inaccurate) semantic mappings and these mappings are improved over time as deemed necessary. This paper describes the first completely self-configuring data integration system. The goal of our work is to investigate how advanced a starting point we can provide for a pay-as-you-go system. Our system is based on the new concept of a probabilistic mediated schema that is automatically created from the data sources. We automatically create probabilistic schema mappings between the sources and the mediated schema. We describe experiments in multiple domains, including 50-800 data sources, and show that our system is able to produce high-quality answers with no human intervention.
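The abstract does not define a probabilistic mediated schema. The sketch below is a toy reading of the idea, under the assumption that such a schema is a set of candidate mediated schemas, each a clustering of source attributes with an attached probability. The attribute names, the probabilities, and the helper function are hypothetical and do not come from the paper.

```python
# Hypothetical probabilistic mediated schema: each entry pairs one candidate
# clustering of source attributes with the probability assigned to it.
P_MED_SCHEMA = [
    ([{"name"}, {"phone", "home_phone"}, {"office_phone"}], 0.6),
    ([{"name"}, {"phone", "office_phone"}, {"home_phone"}], 0.4),
]


def same_cluster_probability(p_med_schema, attr_a, attr_b):
    """Total probability of the candidate schemas that group both attributes."""
    return sum(
        probability
        for clustering, probability in p_med_schema
        if any(attr_a in cluster and attr_b in cluster for cluster in clustering)
    )


print(same_cluster_probability(P_MED_SCHEMA, "phone", "home_phone"))
print(same_cluster_probability(P_MED_SCHEMA, "phone", "office_phone"))
print(same_cluster_probability(P_MED_SCHEMA, "home_phone", "office_phone"))
```

Keeping several weighted clusterings, rather than committing to one, lets the uncertainty about which source attributes mean the same thing be carried forward into query answering.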
MDM: a Multiple-Data-Model Tool for the Management of Heterogeneous Database Schemes
Cited by 13 (4 self)
MDM is a tool that enables the users to define schemes of different data models and to perform translations of schemes from one model to another. These functionalities can be at the basis of a customizable and integrated CASE environment supporting the analysis and design of information systems. MDM has two main components: the Model Manager and the Schema Manager. The Model Manager supports a specialized user, the model engineer, in the definition of a variety of models, on the basis of a limited set of metaconstructs covering almost all known conceptual models. The Schema Manager allows designers to create and modify schemes over the defined models, and to generate at each time a translation of a scheme into any of the data models currently available. Translations between models are automatically derived, at definition time, by combining a predefined set of elementary transformations, which implement the standard translations between simple combinations of constructs.
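The idea of deriving a translation by combining elementary transformations can be sketched as follows. This is a minimal illustration, assuming that a model is described by the set of constructs it allows and that each elementary step eliminates one construct in favour of others. The construct names, the step table, and `derive_translation` are hypothetical and are not MDM's actual metaconstructs or interface.

```python
def derive_translation(source_constructs, target_constructs, elementary_steps):
    """Return an ordered plan of elementary steps.

    The plan rewrites a scheme that uses `source_constructs` into one that
    uses only constructs allowed by the target model.
    """
    current = set(source_constructs)
    target = set(target_constructs)
    eliminated = set()
    plan = []
    while not current <= target:
        # Pick the next unsupported construct in a deterministic order.
        construct = sorted(current - target)[0]
        if construct not in elementary_steps or construct in eliminated:
            raise ValueError(f"no elementary step can eliminate {construct!r}")
        current.remove(construct)
        current |= elementary_steps[construct]
        eliminated.add(construct)
        plan.append(f"eliminate {construct}")
    return plan


# Hypothetical step table: construct to eliminate -> constructs that replace it.
ELEMENTARY_STEPS = {
    "generalization": {"binary_relationship"},
    "nary_relationship": {"entity", "binary_relationship"},
}

plan = derive_translation(
    source_constructs={"entity", "generalization", "nary_relationship"},
    target_constructs={"entity", "binary_relationship"},
    elementary_steps=ELEMENTARY_STEPS,
)
print(plan)  # ['eliminate generalization', 'eliminate nary_relationship']
```

The `eliminated` set guards against a step table that reintroduces a construct it has already removed, which would otherwise loop forever.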
Databases as Graphical Algebras: Algebraic Graph-Based Approach to Data Modeling and Database Design
1996
Cited by 8 (5 self)
The approach we suggest is based on a graphical specification language possessing formal semantics so that graphical images themselves are precise specifications suitable for implementation. Our specifications are similar to the sketches developed in category theory but, in contrast to them, enjoy the possibility of setting arbitrary signatures of diagram properties and operations. An important (and sometimes crucial) step in the process of database design is schema (or view) integration, that is, an activity aimed at producing a global conceptual schema of a database from a set of locally developed user-oriented schemas (views). In our approach, correspondence between semantic schemas to be integrated is specified by equations so that the integration procedure can be reduced to algebraic manipulations with sketches representing schemas. This provides the possibility of automated view integration and, correspondingly, automated database design. In the paper the mathemat...
Databases as Diagram Algebras: Specifying Queries and Views Via the Graph-Based Logic of Sketches
1996
Cited by 5 (5 self)
The goal of the paper is to develop a graphical formalism for specifying queries and views within the sketch data model (SkeDM) introduced in [17]. Sketches are directed multigraphs in which some diagrams are labeled with special markers. These markers denote predicates and operations over diagrams of sets and functions. Given a signature of operations (query language), any sketch (database schema) can be extended with derived items denoting data that can be retrieved from the database. Views to a schema $S$ are then sketch morphisms $v : S_V \to S'$ from some sketch (view schema) $S_V$ into an augmentation of $S$ with derived items, $S'$. In this way one obtains a unifying graph-based formal language for data and metadata definition and manipulation. In particular, a formalized specification framework for heterogeneous multibase systems can be built. The approach is described with a number of examples and then precisely formalized. The main technical contribution is the development of alge...
A Graphical Yet Formalized Framework for Specifying View Systems
1997
Cited by 5 (5 self)
A graphical formalized language is proposed for specifying systems of views over database schemas. The language is based on the notion of arrow (mapping) between data schemas and is suitable for any data model for which schema mappings are defined. In particular, the constructs of query, query language, view and view integration can be consistently expressed in this arrow formalism and correspondingly specified. This gives rise to a general graph-based framework for specifying complex view systems. The basic constructions of the language, as well as the entire framework, can be considered specializations of very general constructs developed in mathematical category theory.
1 Introduction
The notion of view is one of the central ones in database (DB) technology. Views make it possible to provide each application with its own presentation of data and isolate them from inessential (for them) details and changes of DB schemas. The practical importance of views is commonly recognized...
Object Views of Relations
In Proceedings of the 2nd International Conference on Applications of Databases - ADB '95, 1995
Cited by 4 (2 self)
This paper investigates the problem of integration between relational and object-oriented databases. We discuss an approach based on class and attribute mappings and show how OQL, the ODMG query language, can be embedded in a set of primitives of a mapping language to serve as a basis for object-relational data integration. We then describe the technique used in the implementation of a prototype built on top of an object-oriented view mechanism, which allows the construction of an object view of a relational database.
1 Introduction
Many new applications have to manipulate legacy data (data existing in traditional database systems) in addition to new data manipulated with the support of the new generation of object database systems (ODBMSs). ODBMSs provide advanced technology for complex applications but their use is not yet so widespread. Therefore, the problem is to migrate or integrate these legacy data into object based environments without being forced to redevelop existing applic...

