Object exchange across heterogeneous information sources
INTERNATIONAL CONFERENCE ON DATA ENGINEERING, 1995. Cited by 465 (56 self)
We address the problem of providing integrated access to diverse and dynamic information sources. We explain how this problem differs from the traditional database integration problem and we focus on one aspect of the information integration problem, namely information exchange. We define an object-based information exchange model and a corresponding query language that we believe are well suited for integration of diverse information sources. We describe how the model and language have been used to integrate heterogeneous bibliographic information sources. We also describe two general-purpose libraries we have implemented for object exchange between clients and servers.
The TSIMMIS Project: Integration of Heterogeneous Information Sources
Cited by 451 (16 self)
The goal of the Tsimmis Project is to develop tools that facilitate the rapid integration of heterogeneous information sources that may include both structured and unstructured data. This paper gives an overview of the project, describing components that extract properties from unstructured objects, that translate information into a common object model, that combine information from several sources, that allow browsing of information, and that manage constraints across heterogeneous sites. Tsimmis is a joint project between Stanford and the IBM Almaden Research Center.
The TSIMMIS Approach to Mediation: Data Models and Languages
JOURNAL OF INTELLIGENT INFORMATION SYSTEMS, 1997. Cited by 344 (8 self)
TSIMMIS -- The Stanford-IBM Manager of Multiple Information Sources -- is a system for integrating information. It offers a data model and a common query language that are designed to support the combining of information from many different sources. It also offers tools for automatically generating the components that are needed to build systems for integrating information. In this paper we shall discuss the principal architectural features and their rationale.
Wrapper Induction: Efficiency and Expressiveness
Artificial Intelligence, 2000. Cited by 191 (12 self)
The Internet presents numerous sources of useful information---telephone directories, product catalogs, stock quotes, event listings, etc. Recently, many systems have been built that automatically gather and manipulate such information on a user's behalf. However, these resources are usually formatted for use by people (e.g., the relevant content is embedded in HTML pages), so extracting their content is difficult. Most systems use customized wrapper procedures to perform this extraction task. Unfortunately, writing wrappers is tedious and error-prone. As an alternative, we advocate wrapper induction, a technique for automatically constructing wrappers. In this article, we describe six wrapper classes, and use a combination of empirical and analytical techniques to evaluate the computational tradeoffs among them. We first consider expressiveness: how well the classes can handle actual Internet resources, and the extent to which wrappers in one class can mimic those in another. We then...
Object fusion in mediator systems
INTERNATIONAL CONFERENCE ON VERY LARGE DATA BASES, 1996. Cited by 155 (29 self)
One of the main tasks of mediators is to fuse information from heterogeneous information sources. This may involve, for example, removing redundancies and resolving inconsistencies in favor of the most reliable source. The problem becomes harder when the sources are unstructured/semistructured and we do not have complete knowledge of their contents and structure. In this paper we show how many common fusion operations can be specified non-procedurally and succinctly. The key to our approach is to assign semantically meaningful object ids to objects as they are "imported" into the mediator.
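As a rough illustration of the fusion idea in this abstract (not the paper's specification language), the Python sketch below gives each imported record a semantic object id and merges records that share one, resolving conflicting values in favor of the more reliable source. The record layout, the `semantic_id` rule (normalized title plus year), and the numeric reliability ranks are all assumptions made for the example.

```python
def semantic_id(record):
    # Hypothetical id rule: whitespace-normalized, lowercased title plus year.
    return (" ".join(record["title"].lower().split()), record["year"])

def fuse(sources):
    """Merge records from several sources.

    sources: list of (reliability, records) pairs; a higher reliability
    value wins when two sources disagree on an attribute.
    """
    fused = {}
    # Visit the least reliable source first so that values from more
    # reliable sources overwrite conflicting ones later.
    for _, records in sorted(sources, key=lambda s: s[0]):
        for rec in records:
            fused.setdefault(semantic_id(rec), {}).update(rec)
    return fused
```

Attributes that only a less reliable source provides are kept, so the fused object is the union of what the sources know, with conflicts settled by rank.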
A Query Translation Scheme for Rapid Implementation of Wrappers
1995. Cited by 123 (22 self)
Wrappers provide access to heterogeneous information sources by converting application queries into source-specific queries or commands. In this paper we present a wrapper implementation toolkit that facilitates rapid development of wrappers. We focus on the query translation component of the toolkit, called the converter. The converter takes as input a Query Description and Translation Language (QDTL) description of the queries that can be processed by the underlying source. Based on this description the converter decides if an application query is (a) directly supported, i.e., it can be translated to a query of the underlying system following instructions in the QDTL description; (b) logically supported, i.e., logically equivalent to a directly supported query; (c) indirectly supported, i.e., it can be computed by applying a filter, automatically generated by the converter, to the result of a directly supported query.
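The three support levels can be sketched with a toy model. This is not QDTL: the source is assumed to accept only conjunctions of equality tests, in the attribute orders listed in `TEMPLATES`, and a query is a list of `(attribute, value)` pairs with no repeated attribute. The names `TEMPLATES` and `classify` are hypothetical.

```python
# Toy capability description: attribute orders the source accepts as-is.
TEMPLATES = [("author",), ("author", "year")]

def classify(query):
    """Return (support level, conditions sent to the source, local filter)."""
    attrs = tuple(a for a, _ in query)
    by_attr = dict(query)
    if attrs in TEMPLATES:
        return "direct", query, []
    # Conjunction is commutative, so a reordering of a template is
    # logically equivalent to a directly supported query.
    for t in TEMPLATES:
        if sorted(t) == sorted(attrs):
            return "logical", [(a, by_attr[a]) for a in t], []
    # Otherwise send the largest supported subset of the conditions and
    # keep the remaining ones as a filter over the source's result.
    for t in sorted(TEMPLATES, key=len, reverse=True):
        if set(t) <= set(attrs):
            sent = [(a, by_attr[a]) for a in t]
            residual = [(a, v) for a, v in query if a not in t]
            return "indirect", sent, residual
    return "unsupported", [], query
```

For example, a query on `author` and `title` would be classified as indirectly supported here: the `author` condition goes to the source and the `title` condition is applied as a filter afterwards.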
MedMaker: A Mediation System Based on Declarative Specifications
INTERNATIONAL CONFERENCE ON DATA ENGINEERING, 1996. Cited by 120 (17 self)
Mediators are used for integration of heterogeneous information sources. We present a system for declaratively specifying mediators. It is targeted for integration of sources with unstructured or semi-structured data and/or sources with changing schemas. We illustrate the main features of the Mediator Specification Language (MSL), show how they facilitate integration, and describe the implementation of the system that interprets the MSL specifications.
Capabilities-based Query Rewriting in Mediator Systems
INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED INFORMATION SYSTEMS, 1996. Cited by 77 (15 self)
Users today are struggling to integrate a broad range of information sources providing different levels of query capabilities. Currently, data sources with different and limited capabilities are accessed either by writing rich functional wrappers for the more primitive sources, or by dealing with all sources at a “lowest common denominator”. This paper explores a third approach, in which a mediator ensures that sources receive queries they can handle, while still taking advantage of all of the query power of the source. We propose an architecture that enables this, and identify a key component of that architecture, the Capabilities-Based Rewriter (CBR). The CBR takes as input a description of the capabilities of a data source, and a query targeted for that data source. From these, the CBR determines component queries to be sent to the sources, commensurate with their abilities, and computes a plan for combining their results using joins, unions, selections, and projections. We provide a language to describe the query capability of data sources and a plan generation algorithm. Our description language and plan generation algorithm are schema independent and handle SPJ queries.
Integrating and Accessing Heterogeneous Information Sources in TSIMMIS
In Proceedings of the AAAI Symposium on Information Gathering, 1995. Cited by 67 (2 self)
The goal of the Tsimmis Project is to develop tools that facilitate the rapid integration of heterogeneous information sources that may include both structured and unstructured data. This paper gives an overview of the project, describing components that extract properties from unstructured objects, that translate information into a common object model, that combine information from several sources, and that allow browsing of information.
A methodology for integration of heterogeneous databases
IEEE Transactions on Knowledge and Data Engineering, 1994
"... PROFIT sponsor firms. ..."

