Semantic integration research in the database community: A brief survey
- AI Magazine, 2005
Cited by 75 (4 self)
Semantic integration has been a long-standing challenge for the database community. It has received steady attention over the past two decades, and has now become a prominent area of database research. In this article, we first review database applications that require semantic integration, and discuss the difficulties underlying the integration process. We then describe recent progress and identify open research issues. We will focus in particular on schema matching, a topic that has received much attention in the database community, but will also discuss data matching (e.g., tuple deduplication), and open issues beyond the match discovery context (e.g., reasoning with matches, match verification and repair, and reconciling inconsistent data values). For previous surveys of database research on semantic integration, see (Rahm & Bernstein 2001;
An interactive clustering-based approach to integrating source query interfaces on the deep web
- In SIGMOD, 2004
Cited by 73 (14 self)
An increasing number of data sources are now available on the Web, but often their contents are only accessible through query interfaces. For a domain of interest, there often exist many such sources with varied coverage or querying capabilities. As an important step to the integration of these sources, we consider the integration of their query interfaces. More specifically, we focus on the crucial step of the integration: accurately matching the interfaces. While the integration of query interfaces has received more attention recently, current approaches are not sufficiently general: (a) they all model interfaces with flat schemas; (b) most of them only consider 1:1 mappings of fields over the interfaces; (c) they all perform the integration in a black-box fashion and the whole process has to be restarted from scratch if anything goes wrong; and (d) they often require laborious parameter tuning. In this paper, we propose an interactive, clustering-based approach to matching query interfaces. The hierarchical nature of interfaces is captured with ordered trees. Varied types of complex mappings of fields are examined and several approaches are proposed to effectively identify these mappings. We put the human integrator back in the loop and propose several novel approaches to the interactive learning of parameters and the resolution of uncertain mappings. Extensive experiments are conducted and results show that our approach is highly effective.
Generic Model Management: Concepts and Algorithms
- Ph.D. thesis, 2003
Cited by 58 (3 self)
Many challenging problems facing information systems engineering involve
the manipulation of complex metadata artifacts, or models, such as database
schemas, interface specifications, or object diagrams, and mappings between
models. The applications that solve metadata manipulation problems are
complex and hard to build. The goal of generic model management is to
reduce the amount of programming needed to develop such applications by
providing a database infrastructure in which a set of high-level algebraic
operators, such as Match, Merge, and Compose, are applied to models and
mappings as a whole rather than to their individual building blocks.
This dissertation presents an initial study of the concepts and algorithms
for generic model management. We describe the first prototype of a generic
model management system, introduce the algebraic operators that are used to
manipulate models and mappings, clarify the semantics of the operators, and
develop novel algorithms for implementing them. In particular, we present an
innovative algorithm based on fixpoint computation that is used for implementing
the generic operator Match, which finds correspondences between
two models. Using the prototype and the operators presented in the dissertation,
we develop solutions for several practically relevant problems, such as
change propagation and reintegration.
An Ontology-Driven Framework for Data Transformation in Scientific Workflows
- DILS, 2004
Cited by 42 (7 self)
Ecologists spend considerable effort integrating heterogeneous data for statistical analyses and simulations, for example, to run and test predictive models. Our research is focused on reducing this effort by providing data integration and transformation tools, allowing researchers to focus on "real science," that is, discovering new knowledge through analysis and modeling. This paper defines a generic framework for transforming heterogeneous data within scientific workflows. Our approach relies on a formalized ontology, which serves as a simple, unstructured global schema. In the framework, inputs and outputs of services within scientific workflows can have structural types and separate semantic types (expressions of the target ontology). In addition, a registration mapping can be defined to relate input and output structural types to their corresponding semantic types. Using registration mappings, appropriate data transformations can then be generated for each desired service composition. Here, we describe our proposed framework and an initial implementation for services that consume and produce XML data.
First experiments with the ATL model transformation language: Transforming XSLT into XQuery
- 2nd OOPSLA Workshop on Generative Techniques in the context of Model Driven Architecture, 2003
Cited by 40 (6 self)
ATL (Atlas Transformation Language) has been defined to perform general transformations within the MDA framework (Model Driven Architecture) recently proposed by the OMG. We are currently learning from the first applications developed with this language. The example used here is a transformation from XSLT to XQuery. Since these are two standard notations that don’t pertain to the MDA space, we first need to provide some justification about this work. The global organization of technological spaces presented at the beginning of the paper is intended to answer this first question. Furthermore we propose the original characterization of a technological space as a framework based on a given unique meta-model. After having briefly presented the ATL framework, we describe the XSLT2XQuery transformation. We may then draw several conclusions from this experiment, suggesting possible improvements to general model transformation frameworks. ATL is still evolving since it is supposed to match the forthcoming QVT/RFP recommendation when it is ready. As a consequence, the concrete expression of the transformation presented in this paper may change, but the general ideas should remain stable.
Reflective Model Driven Engineering
- 2003
Cited by 30 (3 self)
In many large organizations, the model transformations allowing the engineers to more or less automatically go from platform-independent models (PIM) to platform-specific models (PSM) are increasingly seen as vital assets. As tools evolve, it is critical that these transformations are not prisoners of a given CASE tool. Considering in this paper that a CASE tool can be seen as a platform for processing a model transformation, we propose to reflectively apply the MDA to itself. We propose
Implementing Mapping Composition
- In VLDB, 2006
Cited by 29 (3 self)
Mapping composition is a fundamental operation in metadata-driven applications. Given a mapping over schemas σ1 and σ2 and a mapping over schemas σ2 and σ3, the composition problem is to compute an equivalent mapping over σ1 and σ3. We describe a new composition algorithm that targets practical applications. It incorporates view unfolding. It eliminates as many σ2 symbols as possible, even if not all can be eliminated. It covers constraints expressed using arbitrary monotone relational operators and, to a lesser extent, non-monotone operators. And it introduces the new technique of left composition. We describe our implementation, explain how to extend it to support user-defined operators, and present experimental results which validate its effectiveness.
View merging in the presence of incompleteness and inconsistency
- Requir. Eng., 2006
Cited by 24 (10 self)
View merging, also called view integration, is a key problem in conceptual modeling. Large models are often constructed and accessed by manipulating individual views, but it is important to be able to consolidate a set of views to gain a unified perspective, to understand interactions between views, or to perform various types of analysis. View merging is complicated by incompleteness and inconsistency: Stakeholders often have varying degrees of confidence about their statements. Their views capture different but overlapping aspects of a problem, and may have discrepancies over the terminology being used, the concepts being modeled, or how these concepts should be structured. Once views are merged, it is important to be able to trace the elements of the merged view back to their sources and to the merge assumptions related to them. In this paper, we present a framework for merging incomplete and inconsistent graph-based views. We introduce a formalism, called annotated graphs, with a built-in annotation scheme for modeling incompleteness and inconsistency. We show how structure-preserving maps can be employed to express the relationships between disparate views modeled as annotated graphs, and provide a general algorithm for merging views with arbitrary interconnections. We provide a systematic way to generate and represent the traceability information required for tracing the merged view elements back to their sources, and to the merge assumptions giving rise to the elements.
Supporting executable mappings in model management
- In SIGMOD, 2005
Cited by 21 (1 self)
Model management is an approach to simplify the programming of metadata-intensive applications. It offers developers powerful operators, such as Compose, Diff, and Merge, that are applied to models, such as database schemas or interface specifications, and to mappings between models. Prior model management solutions focused on a simple class of mappings that do not have executable semantics. Yet many metadata applications require that mappings be executable, expressed in SQL, XSLT, or other data transformation languages. In this paper, we develop a semantics for model-management operators that allows applying the operators to executable mappings. Our semantics captures previously-proposed desiderata and is language-independent: the effect of the operators is expressed in terms of what they do to the instances of models and mappings. We describe an implemented prototype in which mappings are represented as dependencies between relational schemas, and discuss algebraic optimization of model-management scripts.

