Query Caching and Optimization in Distributed Mediator Systems
- In Proc. of ACM SIGMOD Conf. on Management of Data
, 1996
Cited by 176 (10 self)
Query processing and optimization in mediator systems that access distributed non-proprietary sources pose many novel problems. Cost-based query optimization is hard because the mediator does not have access to source statistics information and, furthermore, it may not be easy to model the source's performance. At the same time, querying remote sources may be very expensive because of high connection overhead, long computation time, financial charges, and temporary unavailability. We propose a cost-based optimization technique that caches statistics of actual calls to the sources and consequently estimates the cost of the possible execution plans based on the statistics cache. We investigate issues pertaining to the design of the statistics cache and experimentally analyze various tradeoffs. We also present a query result caching mechanism that allows us to effectively use results of prior queries when the source is not readily available. We employ the novel invariants mechanism, which s...
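The statistics-cache idea in this abstract can be illustrated with a minimal sketch. It is not the paper's design: the class name `StatisticsCache`, the `(source, call_pattern)` key, the use of a plain mean as the estimator, and the `default_cost` fallback are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


class StatisticsCache:
    """Records observed costs of source calls and estimates plan costs from them."""

    def __init__(self, default_cost=1.0):
        # default_cost is a hypothetical fallback for calls never observed before.
        self.default_cost = default_cost
        self._observed = defaultdict(list)

    def record(self, source, call_pattern, elapsed_seconds):
        """Store the measured cost of one actual call to a source."""
        self._observed[(source, call_pattern)].append(elapsed_seconds)

    def estimate_call(self, source, call_pattern):
        """Estimate a call's cost as the mean of its recorded costs."""
        samples = self._observed.get((source, call_pattern))
        return mean(samples) if samples else self.default_cost

    def estimate_plan(self, plan):
        """Estimate a plan's cost as the sum of the estimates of its calls."""
        return sum(self.estimate_call(s, p) for s, p in plan)


def cheapest_plan(cache, plans):
    """Pick the candidate plan with the lowest estimated cost."""
    return min(plans, key=cache.estimate_plan)
```

A mediator following this sketch would call `record` after every source call and `cheapest_plan` when choosing among candidate execution plans, each plan being a list of `(source, call_pattern)` pairs.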
Model Independent Assertions for Integration of Heterogeneous Schemas
, 1991
Cited by 130 (7 self)
Due to the proliferation of database applications, the integration of existing databases into a distributed or federated system is one of the major challenges in responding to enterprises' information requirements. Some proposed integration techniques aim at providing database administrators (DBAs) with a view definition language they can use to build the desired integrated schema. These techniques leave to the DBA the responsibility of appropriately restructuring schema elements from existing local schemas and of solving inter-schema conflicts. This paper investigates the assertion-based approach, in which the DBA's action is limited to pointing out corresponding elements in the schemas and to defining the nature of the correspondence between them. This methodology is capable of: ensuring better integration by taking into account additional semantic information (assertions about links); automatically solving structural conflicts; building the integrated schema without requiring conforming of initial schemas; applying integration rules to a variety of data models; and performing view as well as database integration. This paper presents the basic ideas underlying our approach and focuses on resolution of structural conflicts.
View integration: A step forward in solving structural conflicts
- IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
, 1994
Cited by 96 (6 self)
Thanks to the development of the federated systems approach on the one hand and the emphasis on user involvement in database design on the other, the interest in schema integration techniques is significantly increasing. Theories, methods and sometimes tools have been proposed. Conflict resolution is the key issue. Different perceptions by schema designers may lead to different representations. A way must be found to support these different representations within a single system. Most current integration methodologies rely on modification of initial schemas to solve the conflicts. This approach needs a strong interaction with the database administrator, who has authority to modify the initial schemas. This paper presents an approach to view integration specifically intended to support the coexistence of different representations of the same real-world objects. The main characteristics of
An approach to resolving semantic heterogeneity in a federation of autonomous, heterogeneous database systems
- INTERNATIONAL JOURNAL OF INTELLIGENT AND COOPERATIVE INFORMATION SYSTEMS
, 1993
Cited by 83 (3 self)
An approach to accommodating semantic heterogeneity in a federation of interoperable, autonomous, heterogeneous databases is presented. A mechanism is described for identifying and resolving semantic heterogeneity while at the same time honoring the autonomy of the database components that participate in the federation. A minimal, common data model is introduced as the basis for describing sharable information, and a three-pronged facility for determining the relationships between information units (objects) is developed. Our approach serves as a basis for the sharing of related concepts through (partial) schema unification without the need for a global view of the data that is stored in the different components. The mechanism presented here can be seen in contrast with more traditional approaches such as “integrated databases” or “distributed databases”. An experimental prototype implementation has been constructed within the framework of the Remote-Exchange experimental system.
Merging Models Based on Given Correspondences
, 2003
Cited by 73 (8 self)
A model is a formal description of a complex application artifact, such as a database schema, an application interface, a UML model, an ontology, or a message format. The problem of merging such models lies at the core of many metadata applications, such as view integration, mediated schema creation for data integration, and ontology merging. This paper examines the problem of merging two models given correspondences between them. It presents requirements for conducting a merge and a specific algorithm that subsumes previous work.
Virtual Schemas and Bases
- In Proc. EDBT
, 1994
Cited by 72 (17 self)
We propose the notions of virtual schemas and virtual bases as a coherent way of integrating various features in OODB views. A virtual schema is defined based on some existing (real) schema. A virtual base is obtained when a (real) base is attached to a virtual schema. We study the consequences of this simple assumption. In particular, we observe the differences between a real schema and a virtual one. We also consider an extension (that we call generic schemas) where it is necessary to specify several real bases to attach data to a virtual schema. We show how the flexibility provided by virtual schemas can be used to cope with various dynamic features of database systems.
1 Introduction
Views are intended to increase the flexibility of database systems and their definition in the object-oriented database (OODB) context comes as a natural extension of the original paradigm. The yet relatively young research on this topic has introduced a large variety of indispensable new features. H...
The Use of Information Capacity in Schema Integration and Translation
- In VLDB
, 1993
Cited by 67 (9 self)
In this paper, we carefully explore the assumptions behind using information capacity equivalence as a measure of correctness for judging transformed schemas in schema integration and translation methodologies. We present a classification of common integration and translation tasks based on their operational goals and derive from them the relative information capacity requirements of the original and transformed schemas. We show that for many tasks, information capacity equivalence of the schemas is not strictly required. Based on this, we present a new definition of correctness that reflects each undertaken task. We then examine existing methodologies and show how anomalies can arise when using those that do not meet the proposed correctness criteria.
1 Introduction
Formal work on schema equivalence has largely been ignored within practical schema integration and translation tools. Practitioners have felt that theoretical work is too narrow in scope to be applicable to the problems ...
Theoretical aspects of schema merging
- EDBT
, 1992
Cited by 63 (3 self)
A general technique for merging database schemas is developed that has a number of advantages over existing techniques, the most important of which is that schemas are placed in a partial order that has bounded joins. This means that the merging operation, when it succeeds, is both associative and commutative, i.e., that the merge of schemas is independent of the order in which they are considered, a property not possessed by existing methods. The operation is appropriate for the design of interactive programs as it allows user assertions about relationships between nodes in the schemas to be considered as elementary schemas. These can be combined with existing schemas using precisely the same merging operation. The technique is general and can be applied to a variety of data models. It can also deal with certain cardinality constraints that arise through the imposition of keys. A prototype implementation, together with a graphical interface, has been developed.
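The order-independence claimed in this abstract can be illustrated with a deliberately simplified sketch. It is an assumption-laden toy, not the paper's construction: schemas are modeled as sets of nodes and labeled edges, the partial order is plain set inclusion, and the merge is set union, which is the least upper bound under that order. Unlike the paper's operation, this toy merge always succeeds and ignores keys and cardinality constraints. The names `Schema`, `schema`, `is_subschema`, and `merge` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Schema:
    nodes: frozenset
    edges: frozenset  # (source, label, target) triples


def schema(nodes=(), edges=()):
    """Build a schema; edge endpoints are added to the node set automatically."""
    edges = frozenset(edges)
    endpoints = {n for (s, _, t) in edges for n in (s, t)}
    return Schema(frozenset(nodes) | endpoints, edges)


def is_subschema(a, b):
    """Partial order assumed here: a <= b when b contains all of a."""
    return a.nodes <= b.nodes and a.edges <= b.edges


def merge(a, b):
    """Least upper bound under is_subschema: the union of both schemas."""
    return Schema(a.nodes | b.nodes, a.edges | b.edges)


# A user assertion is itself an elementary schema and is merged the same way.
people = schema(edges=[("Person", "name", "String")])
staff = schema(edges=[("Employee", "salary", "Int")])
assertion = schema(edges=[("Employee", "isa", "Person")])
merged = merge(merge(people, staff), assertion)
```

Because union is associative and commutative, merging `people`, `staff`, and `assertion` in any order yields the same `merged` schema, which is the property the abstract highlights.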
Challenges in Integrating Biological Data Sources
- Journal of Computational Biology
, 1995
Cited by 62 (4 self)
this report, we examine the technical challenges to integration, critique the available tools and resources, and compare the cost and advantages of various methodologies. We begin by analyzing the basic steps in strict and complete integration: 1) transformation of the various schemas to a common data model; 2) matching of semantically related schema objects; 3) schema integration; 4) transformation of data to the federated database on demand; and 5) matching of semantically equivalent data. Some progress has been made on generic problems such as (1) and (3) within the wider database community, but issues of semantics (steps (2) and (5)) have only been dealt with any degree of success by domain experts within the biological community. We then look at the solution space of integration strategies as defined by two axes, the "tightness" of federation and the "degree" of instantiation, discuss where various solutions fall on this plane, and examine their cost and advantages/disadvantages. Finally, we examine technical challenges that are not
July 12, 1995
Suitability of Data Models as Canonical Models for Federated Databases
- SIGMOD Record
, 1991
Cited by 62 (9 self)
We develop a framework of characteristics, essential and recommended, that a data model should have to be suitable as canonical model for federated databases. This framework is based on the two factors of the representation ability of a model: expressiveness and semantic relativism. Several data models are analyzed with respect to the characteristics of the framework, to evaluate their adequacy as canonical models.
1 Introduction
When several databases (DBs) are to interoperate, they form a federation, and a data model must be chosen as the canonical data model (CDM) for the federation (we use the terminology of [SL90]). Work on federated or interoperable databases has often used an Entity Relationship (ER) model, or some extension of it, as the CDM; others have adopted an Object Oriented (OO) model. Is any data model equally adequate as CDM? This paper discusses some characteristics of a data model that make it suitable as the CDM of a federation. Note that we are not trying to une...

