Results 1 - 10
of
25
Object exchange across heterogeneous information sources
- INTERNATIONAL CONFERENCE ON DATA ENGINEERING
, 1995
"... We address the problem of providing integrated access to diverse and dynamic information sources. We explain how this problem differs from the traditional database integration problem and we focus on one aspect of the information integration problem, namely information exchange. We define an object- ..."
Abstract
-
Cited by 465 (56 self)
- Add to MetaCart
We address the problem of providing integrated access to diverse and dynamic information sources. We explain how this problem differs from the traditional database integration problem and we focus on one aspect of the information integration problem, namely information exchange. We define an object-based information exchange model and a corresponding query language that we believe are well suited for integration of diverse information sources. We describe how, the model and language have been used to integrate heterogeneous bibliographic information sources. We also describe two general-purpose libraries we have implemented for object exchange between clients and servers.
The TSIMMIS Project: Integration of Heterogeneous Information Sources
"... The goal of the Tsimmis Project is to develop tools that facilitate the rapid integration of heterogeneous information sources that may include both structured and unstructured data. This paper gives an overview of the project, describing components that extract properties from unstructured objects, ..."
Abstract
-
Cited by 451 (16 self)
- Add to MetaCart
The goal of the Tsimmis Project is to develop tools that facilitate the rapid integration of heterogeneous information sources that may include both structured and unstructured data. This paper gives an overview of the project, describing components that extract properties from unstructured objects, that translate information into a common object model, that combine information from several sources, that allow browsing of information, and that manage constraints across heterogeneous sites. Tsimmis is a joint project between Stanford and the IBM Almaden Research Center.
Object fusion in mediator systems
- INTERNATIONAL CONFERENCE ON VERY LARGE DATA BASES
, 1996
"... One of the main tasks of mediators is to fuse information from heterogeneous information sources. This may involve, for example, removing redundancies, and resolving inconsistencies in favor of the most reliable source. The problem becomes harder when the sources are unstructured/semistructured and ..."
Abstract
-
Cited by 155 (29 self)
- Add to MetaCart
One of the main tasks of mediators is to fuse information from heterogeneous information sources. This may involve, for example, removing redundancies, and resolving inconsistencies in favor of the most reliable source. The problem becomes harder when the sources are unstructured/semistructured and we do not have complete knowledge of their contents and structure. In this paper we show how many common fusion operations can be specified non-procedurally and succinctly. The key to our approach is to assign semantically meaningful object ids to objects as they are "imported " into the mediator.
A Query Translation Scheme for Rapid Implementation of Wrappers
, 1995
"... Wrappers provide access to heterogeneous information sources by converting application queries into source specific queries or commands. In this paper we present a wrapper implementation toolkit that facilitates rapid development of wrappers. We focus on the query translation component of the toolki ..."
Abstract
-
Cited by 123 (22 self)
- Add to MetaCart
Wrappers provide access to heterogeneous information sources by converting application queries into source specific queries or commands. In this paper we present a wrapper implementation toolkit that facilitates rapid development of wrappers. We focus on the query translation component of the toolkit, called the converter. The converter takes as input a Query Description and Translation Language (QDTL) description of the queries that can be processed by the underlying source. Based on this description the converter decides if an application query is (a) directly supported, i.e., it can be translated to a query of the underlying system following instructions in the QDTL description; (b) logically supported, i.e., logically equivalent to a directly supported query; (c) indirectly supported, i.e., it can be computed by applying a filter, automatically generated by the converter, to the result of a directly supported query. 1 Introduction A wrapper or translator [C + 94, PGMW95] is a s...
MedMaker: A Mediation System Based on Declarative Specifications
- INTERNATIONAL CONFERENCE ON DATA ENGINEERING
, 1996
"... Mediators are used for integration of heterogeneous information sources. We present a system for declaratively specifying mediators. It is targeted for integration of sources with unstructured or semi-structured data and/or sources with changing schemas. We illustrate the main features of the Mediat ..."
Abstract
-
Cited by 120 (17 self)
- Add to MetaCart
Mediators are used for integration of heterogeneous information sources. We present a system for declaratively specifying mediators. It is targeted for integration of sources with unstructured or semi-structured data and/or sources with changing schemas. We illustrate the main features of the Mediator Specification Language (MSL), show how they facilitate integration, and describe the implementation of the system that interprets the MSL specifications.
Scaling Heterogeneous Databases and the Design of Disco
, 1995
"... Access to large numbers of data sources introduces new problems for users of heterogeneous distributed databases. End users and application programmers must deal with unavailable data sources. Database administrators must deal with incorporating each new data source into the system. Database impleme ..."
Abstract
-
Cited by 116 (13 self)
- Add to MetaCart
Access to large numbers of data sources introduces new problems for users of heterogeneous distributed databases. End users and application programmers must deal with unavailable data sources. Database administrators must deal with incorporating each new data source into the system. Database implementors must deal with the transformation of queries between query languages and schemas. The Distributed Information Search COmponent (DISCO) addresses these problems. Query processing semantics give meaning to queries that reference unavailable data sources. Data modeling techniques manage connections to data sources. The component interface to data sources flexibly handles different query languages and different interface functionalities. This paper describes in detail (a) the distributed mediator architecture of DISCO, (b) its query processing semantics, (c) the data model and its modeling of data source connections, and (d) the interface to underlying data sources. We describe several advantages of our system and describe the internal architecture of our planned prototype.
Capabilities-based Query Rewriting in Mediator Systems
- INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED INFORMATION SYSTEMS
, 1996
"... Users today are struggling to integrate a broad range of information sources providing different levels of query capabilities. Currently, data sources with different and limited capabilities are accessed either by writing rich functional wrappers for the more primitive sources, or by dealing with al ..."
Abstract
-
Cited by 77 (15 self)
- Add to MetaCart
Users today are struggling to integrate a broad range of information sources providing different levels of query capabilities. Currently, data sources with different and limited capabilities are accessed either by writing rich functional wrappers for the more primitive sources, or by dealing with all sources at a “lowest common denominator”. This paper explores a third approach, in which a mediator ensures that sources receive queries they can handle, while still taking advantage of all of the query power of the source. We propose an architecture that enables this, and identify a key component of that architecture, the Capabilities-Based Rewriter (CBR). The CBR takes as input a description of the capabilities of a data source, and a query targeted for that data source. From these, the CBR determines component queries to be sent to the sources, commensurate with their abilities, and computes a plan for combining their results using joins, unions, selections, and projections. We provide a language to describe the query capability of data sources and a plan generation algorithm. Our description language and plan generation algorithm are schema independent and handle SPJ queries.
Integrating and Accessing Heterogeneous Information Sources in TSIMMIS
- In Proceedings of the AAAI Symposium on Information Gathering
, 1995
"... The goal of the Tsimmis Project is to develop tools that facilitate the rapid integration of heterogeneous information sources that may include both structured and unstructured data. This paper gives an overview of the project, describing components that extract properties from unstructured objects, ..."
Abstract
-
Cited by 67 (2 self)
- Add to MetaCart
The goal of the Tsimmis Project is to develop tools that facilitate the rapid integration of heterogeneous information sources that may include both structured and unstructured data. This paper gives an overview of the project, describing components that extract properties from unstructured objects, that translate information into a common object model, that combine information from several sources, and that allow browsing of information.
A Methodology For Query Reformulation In Cis Using Semantic Knowledge
, 1996
"... We consider Cooperative Information Systems (CIS) that are multidatabase systems (MDBMS), with a common object-oriented model, based on the ODMG standard, together with local databases that may be relational, object-oriented, or dedicated data servers. The MDBMS interface (or mediator interface) tha ..."
Abstract
-
Cited by 43 (8 self)
- Add to MetaCart
We consider Cooperative Information Systems (CIS) that are multidatabase systems (MDBMS), with a common object-oriented model, based on the ODMG standard, together with local databases that may be relational, object-oriented, or dedicated data servers. The MDBMS interface (or mediator interface) that describes this CIS could be different from the union of the local interfaces, that describe each local database. In particular, the mediator interface may be defined by semantic knowledge, that includes views over particular local databases, integrity constraints, and knowledge about data replication in local databases. We present a methodology for query reformulation which is based on the uniform representation of all semantic knowledge in the form of integrity assertions and mapping rules. A reformulation algorithm exploits this semantic knowledge, and performs semantic rewriting based on pattern-matching, to obtain a query on the union of the local interfaces. A decomposition algorithm ...
Expressive Capabilities Description Languages and Query Rewriting Algorithms
- Journal of Logic Programming
"... Information integration systems have to cope with a wide variety of different information sources, which support query interfaces with very varied capabilities. To deal with this problem, the integration systems need descriptions of the query capabilities of each source, i.e., the set of queries sup ..."
Abstract
-
Cited by 27 (11 self)
- Add to MetaCart
Information integration systems have to cope with a wide variety of different information sources, which support query interfaces with very varied capabilities. To deal with this problem, the integration systems need descriptions of the query capabilities of each source, i.e., the set of queries supported by each source. Moreover, the integration systems need algorithms for deciding how a query can be answered given the capabilities of the sources. Finally, they need to translate a query into the format that the source understands. We present two languages suitable for descriptions of query capabilities of sources and compare their expressive power. We also use one of the languages to automatically derive the capabilities description of the integration system itself, in terms of the capabilities of the sources it integrates. We describe algorithms for deciding whether a query "matches" the description and show their application to the problem of translating user queries into source-spe...

