Using semantic values to facilitate interoperability among heterogeneous information systems
- In ACM Transactions on Database Systems
, 1994
Cited by 175 (27 self)
Large organizations need to exchange information among many separately developed systems. In order for this exchange to be useful, the individual systems must agree on the meaning of their exchanged data. That is, the organization must ensure semantic interoperability. This paper provides a theory of semantic values as a unit of exchange that facilitates semantic interoperability between heterogeneous information systems. We show how semantic values can either be stored explicitly or be defined by environments. A system architecture is presented that allows autonomous components to share semantic values. The key component in this architecture is called the context mediator, whose job is to identify and construct the semantic values being sent, to determine when the exchange is meaningful, and to convert the semantic values to the form required by the receiver. Our theory is then applied to the relational model. We provide an interpretation of standard SQL queries in which context conversions and manipulations are transparent to the user. We also introduce an extension of SQL, called Context-SQL (C-SQL), in which the context of a semantic value can be explicitly accessed and updated. Finally, we describe the implementation of a prototype context mediator for a relational C-SQL system.
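The idea of a semantic value (a datum paired with its context) and of a mediator that converts it for the receiver can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the names SemanticValue, mediate and RATES_PER_USD, the two context properties (currency and scale), and the exchange rate are hypothetical and are not taken from the paper or from C-SQL.

```python
from dataclasses import dataclass, field

# Hypothetical conversion table: units of each currency per one USD.
RATES_PER_USD = {"USD": 1.0, "JPY": 150.0}


@dataclass
class SemanticValue:
    """A plain value together with the context needed to interpret it."""
    value: float
    context: dict = field(default_factory=dict)


def mediate(sv, target_context):
    """Convert sv into the receiver's context, or refuse the exchange."""
    source = sv.context
    # The exchange is treated as meaningful only if both sides describe
    # the same context properties (an assumption of this sketch).
    if set(source) != set(target_context):
        raise ValueError("contexts are not comparable")
    # Normalise to unscaled USD, then re-express in the target context.
    in_usd = sv.value * source["scale"] / RATES_PER_USD[source["currency"]]
    converted = in_usd * RATES_PER_USD[target_context["currency"]]
    converted /= target_context["scale"]
    return SemanticValue(converted, dict(target_context))


# A sender reports 3 (thousands of JPY); the receiver expects plain USD.
price = SemanticValue(3.0, {"currency": "JPY", "scale": 1000})
received = mediate(price, {"currency": "USD", "scale": 1})
print(received.value)
```

The receiver never sees the sender's units; the conversion happens entirely inside the mediator, which is the transparency the abstract describes for standard SQL queries.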
Semantic and schematic similarities between database objects: A context-based approach
- VLDB Journal
, 1996
Cited by 141 (12 self)
In a multidatabase system, schematic conflicts between two objects are usually of interest only when the objects have some semantic similarity. We use the concept of semantic proximity, which is essentially an abstraction/mapping between the domains of the two objects associated with the context of comparison. An explicit though partial context representation is proposed and the specificity relationship between contexts is defined. The contexts are organized as a meet semi-lattice and associated operations like the greatest lower bound (glb) are defined. The context of comparison and the type of abstractions used to relate the two objects form the basis of a semantic taxonomy. At the semantic level, the intensional description of database objects provided by the context is expressed in a description logic language. Schema correspondences are used to store mappings from the semantic level to the data level and are associated with the respective contexts. Inferences about database content at the federation level are modeled as changes in the context and the associated schema correspondences. We try to reconcile the dual (schematic and semantic) perspectives by enumerating possible semantic similarities between objects having schema and data conflicts, and modeling schema correspondences as the projection of semantic proximity with respect to context.
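A rough Python sketch of a specificity ordering on contexts and its greatest lower bound follows. The representation is an assumption made for illustration, not the paper's formalism: a context is modeled as a dictionary from a contextual coordinate to the set of values it allows, and the names is_more_specific and glb are hypothetical.

```python
def is_more_specific(c1, c2):
    """True if c1 satisfies every constraint of c2 (c1 is below c2)."""
    return all(k in c1 and c1[k] <= c2[k] for k in c2)


def glb(c1, c2):
    """Most general context that is at least as specific as c1 and c2."""
    result = {}
    for coord in c1.keys() | c2.keys():
        if coord in c1 and coord in c2:
            # Shared coordinate: keep only the values both contexts allow.
            result[coord] = c1[coord] & c2[coord]
        else:
            # Coordinate constrained by one side only: carry it over.
            result[coord] = c1.get(coord, c2.get(coord))
    return result


employees = {"employer": frozenset({"acme", "globex"})}
managers = {"role": frozenset({"manager"})}

both = glb(employees, managers)
print(is_more_specific(both, employees), is_more_specific(both, managers))
```

In this simplified model an empty value set on some coordinate would signal two contexts that cannot hold together.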
Semantic E-Workflow Composition
- Journal of Intelligent Information Systems
, 2003
Cited by 112 (19 self)
Systems and infrastructures are currently being developed to support Web services. The main idea is to encapsulate an organization’s functionality within an appropriate interface and advertise it as Web services. While in some cases Web services may be utilized in an isolated form, it is normal to expect Web services to be integrated as part of workflow processes. The composition of workflow processes that model e-service applications differs from the design of traditional workflows, in terms of the number of tasks (Web services) available to the composition process, in their heterogeneity, and in their autonomy. Therefore, two problems need to be solved: how to efficiently discover Web services – based on functional and operational requirements – and how to facilitate the interoperability of heterogeneous Web services. In this paper, we present a solution within the context of the emerging Semantic Web, that includes use of ontologies to overcome some of the problems. We start by illustrating the steps involved in the composition of a workflow. Two of these steps are the discovery of Web services and their posterior integration into a workflow. To assist designers with those two steps, we have devised an algorithm to simultaneously discover Web services and resolve heterogeneity among their interfaces and the workflow host. Finally, we describe a prototype that has been implemented to illustrate how discovery and interoperability functions are achieved.
So Far (Schematically) yet so Near (Semantically)
, 1992
Cited by 93 (1 self)
In a multidatabase system, schematic conflicts between two objects are usually of interest only when the objects have some semantic affinity. In this paper we try to reconcile the two perspectives. We first define the concept of semantic proximity and provide a semantic taxonomy. We then enumerate and classify the schematic and data conflicts. We discuss possible semantic similarities between two objects that have various types of schematic and data conflicts. Issues of uncertain information and inconsistent information are also addressed.
An approach to resolving semantic heterogeneity in a federation of autonomous, heterogeneous database systems
- International Journal of Intelligent and Cooperative Information Systems
, 1993
Cited by 83 (3 self)
An approach to accommodating semantic heterogeneity in a federation of interoperable, autonomous, heterogeneous databases is presented. A mechanism is described for identifying and resolving semantic heterogeneity while at the same time honoring the autonomy of the database components that participate in the federation. A minimal, common data model is introduced as the basis for describing sharable information, and a three-pronged facility for determining the relationships between information units (objects) is developed. Our approach serves as a basis for the sharing of related concepts through (partial) schema unification without the need for a global view of the data that is stored in the different components. The mechanism presented here can be seen in contrast with more traditional approaches such as “integrated databases” or “distributed databases”. An experimental prototype implementation has been constructed within the framework of the Remote-Exchange experimental system.
BioKleisli: A Digital Library for Biomedical Researchers
, 1996
Cited by 70 (15 self)
Data of interest to biomedical researchers associated with the Human Genome Project (HGP) is stored all over the world in a number of different electronic data formats and accessible through a variety of interfaces and retrieval languages. These data sources include conventional relational databases with SQL interfaces, formatted text files on top of which indexing is provided for efficient retrieval (ASN.1-Entrez), and binary files that can be interpreted textually or graphically via special purpose interfaces (ACeDB). Researchers within the HGP want to combine data from these different data sources, add value through sophisticated data analysis techniques (such as the biosequence comparison software BLAST and FASTA), and view it using special purpose scientific visualization tools. However, currently there are no commercial tools for enabling such an integrated digital library, and a fundamental barrier to developing such tools appears to be one of language design and optimization: The data f...
A Data Transformation System for Biological Data Sources
- In Proceedings of 21st International Conference on Very Large Data Bases
, 1995
Cited by 69 (19 self)
Scientific data of importance to biologists in the Human Genome Project resides not only in conventional databases, but in structured files maintained in a number of different formats (e.g. ASN.1 and ACE) as well as sequence analysis packages (e.g. BLAST and FASTA). These formats and packages contain a number of data types not found in conventional databases, such as lists and variants, and may be deeply nested. We present in this paper techniques for querying and transforming such data, and illustrate their use in a prototype system developed in conjunction with the Human Genome Center for Chromosome 22. We also describe optimizations performed by the system, a crucial issue for bulk data.
1 Introduction
The goal of the Human Genome Project (HGP) is to sequence the 24 distinct chromosomes comprising the human genome. Much of the information associated with the HGP resides not in conventional databases, but in files that have been formatted according to a variety of conventions. These...
Storage and Querying of E-Commerce Data
, 2001
Cited by 60 (1 self)
A new generation of e-commerce applications requires data schemas that are constantly evolving and sparsely populated. The conventional horizontal row representation fails to meet these requirements. We represent objects in a vertical format, storing an object as a set of tuples. Each tuple consists of an object identifier and an attribute name-value pair. Schema evolution is now easy. However, writing queries against this format becomes cumbersome. We create a logical horizontal view of the vertical representation and transform queries on this view to the vertical table. We present alternative implementations and performance results that show the effectiveness of the vertical representation for sparse data. We also identify additional facilities needed in database systems to support these applications well.
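The vertical format and its horizontal view can be pictured with a small Python sketch. The sample data and the function name horizontal_view are hypothetical. The sketch materializes the view in memory, whereas the paper describes transforming queries on the view into queries on the vertical table, so this illustrates the data layout only.

```python
from collections import defaultdict

# Vertical representation: one (object id, attribute, value) tuple per fact.
# Sparse objects simply have fewer tuples; new attributes need no schema change.
vertical = [
    (1, "name", "laptop"),
    (1, "ram_gb", 16),
    (2, "name", "desk"),
    (2, "width_cm", 120),
]


def horizontal_view(rows, attributes):
    """Build one wide row per object, with None for missing attributes."""
    objects = defaultdict(dict)
    for oid, attr, val in rows:
        objects[oid][attr] = val
    return [
        {"oid": oid, **{a: attrs.get(a) for a in attributes}}
        for oid, attrs in sorted(objects.items())
    ]


view = horizontal_view(vertical, ["name", "ram_gb", "width_cm"])
# A query that is awkward on the vertical table is a simple filter here.
wide_items = [row["name"] for row in view if (row["width_cm"] or 0) > 100]
print(wide_items)
```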
Using Schematically Heterogeneous Structures
- In SIGMOD
, 1998
Cited by 58 (3 self)
Schematic heterogeneity arises when information that is represented as data under one schema, is represented within the schema (as metadata) in another. Schematic heterogeneity is an important class of heterogeneity that arises frequently in integrating legacy data in federated or data warehousing applications. Traditional query languages and view mechanisms are insufficient for reconciling and translating data between schematically heterogeneous schemas. Higher order query languages, that permit quantification over schema labels, have been proposed to permit querying and restructuring of data between schematically disparate schemas. We extend this work by considering how these languages can be used in practice. Specifically, we consider a restricted class of higher order views and show the power of these views in integrating legacy structures. Our results provide insights into the properties of restructuring transformations required to resolve schematic discrepancies. In addition, we ...
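As a concrete picture of schematic heterogeneity, the Python sketch below moves the same information between a schema where ticker symbols are data and one where they are column labels. The stock-quote example and the names data_to_schema and schema_to_data are assumptions for illustration; the sketch does not implement the higher order query languages or views the abstract refers to.

```python
def data_to_schema(rows):
    """Turn (date, ticker, price) rows into one row per date, one column per ticker."""
    by_date = {}
    for date, ticker, price in rows:
        # The ticker, a data value on input, becomes a schema label on output.
        by_date.setdefault(date, {"date": date})[ticker] = price
    return [by_date[d] for d in sorted(by_date)]


def schema_to_data(wide_rows):
    """Inverse restructuring: iterate over column labels to recover the tuples."""
    out = []
    for row in wide_rows:
        for label, value in row.items():
            if label != "date":
                out.append((row["date"], label, value))
    return sorted(out)


quotes = [
    ("1998-01-02", "ibm", 10),
    ("1998-01-02", "hp", 20),
    ("1998-01-03", "ibm", 11),
]
wide = data_to_schema(quotes)
print(wide)
```

The loop over row.items() in schema_to_data is the step an ordinary relational query cannot express, since it ranges over column names instead of values.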
A Polygen Model for Heterogeneous Database Systems: The Source Tagging Perspective
- WP # 3119-90 MSA (Sloan School of Management, MIT)
, 1990
Cited by 47 (7 self)
This paper studies heterogeneous database systems from the multiple (poly) source (gen) perspective. It aims at addressing issues such as “where is the data from” and “which intermediate data sources were used to arrive at that data”, issues which are critical to many users in utilizing information composed from multiple sources. Specifically, it presents a polygen model for resolving the Data Source Tagging and Intermediate Source Tagging problems. Secondly, it presents a data-driven query translation mechanism for mapping a polygen query into a set of local queries dynamically. A concrete example is also provided to exemplify polygen query processing. The significance of this paper lies not only in a precise characterization of a practical problem and a solution per se, but also in the establishment of a foundation for resolving many other critical research issues such as domain mismatch, semantic reconciliation, and data conflict amongst data retrieved from different sources. In a federated database environment with hundreds of databases, all of these issues are critical to their effective use.
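Source tagging can be sketched in Python by attaching to every cell the set of databases it originated from and the set of databases consulted along the way. The tagging rule used in join below is a simplification assumed for illustration and is not the paper's polygen algebra; the names Cell, tag and join, and the sample databases DB1 and DB2, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    datum: object
    origin: frozenset                      # where the datum itself came from
    intermediate: frozenset = frozenset()  # sources consulted to select it


def tag(rows, source):
    """Wrap every value of a local relation with its originating source."""
    return [{k: Cell(v, frozenset({source})) for k, v in row.items()} for row in rows]


def join(left, right, key):
    """Equi-join that records the sources of the matched key cells."""
    result = []
    for l in left:
        for r in right:
            if l[key].datum == r[key].datum:
                used = l[key].origin | r[key].origin
                merged = {**l, **r}
                # The matched key is confirmed by both sides.
                merged[key] = Cell(l[key].datum, used)
                result.append({
                    k: Cell(c.datum, c.origin, c.intermediate | used)
                    for k, c in merged.items()
                })
    return result


firms = tag([{"firm": "Acme", "ceo": "alice"}], "DB1")
sites = tag([{"firm": "Acme", "hq": "Boston"}], "DB2")
answer = join(firms, sites, "firm")
print(sorted(answer[0]["ceo"].origin), sorted(answer[0]["ceo"].intermediate))
```

With these tags a user can see that the ceo value came from DB1 alone, while DB2 was consulted to produce the joined row.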

