Extending the relational algebra with the Mapper operator (2005)
Citations: 3 (2 self)
BibTeX
@TECHREPORT{Carreira05extendingthe,
author = {Paulo Carreira and Antónia Lopes and Helena Galhardas and João Pereira},
title = {Extending the relational algebra with the Mapper operator},
institution = {},
year = {2005}
}
Abstract
Application scenarios such as legacy data migration, Extract-Transform-Load (ETL) processes, and data cleaning require the transformation of input tuples into output tuples. Traditional approaches implement these data transformations either as Persistent Stored Modules (PSM) executed by an RDBMS or as transformation code written with a commercial ETL tool. Neither of these is easily maintainable or optimizable. A third approach consists of combining SQL queries with external code written in a programming language. However, this solution is not expressive enough to specify an important class of data transformations: those that produce several output tuples for a single input tuple. In this paper, we propose the data mapper operator as an extension to the relational algebra to address this class of data transformations. Furthermore, we supply a set of algebraic rewriting rules for optimizing expressions that combine standard relational operators with mappers. Finally, experimental results show the benefits brought by some of the proposed semantic optimizations.
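The class of transformations the abstract targets, one input tuple yielding several output tuples, can be illustrated with a minimal Python sketch. This is not the paper's formal definition of the mapper operator. The names `apply_mapper` and `split_loan` and the loan/installment data are hypothetical and chosen only for illustration.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List

Row = Dict[str, Any]  # a tuple of a relation, modelled as a dict (assumption)


def apply_mapper(rows: Iterable[Row],
                 fn: Callable[[Row], Iterable[Row]]) -> Iterator[Row]:
    """Apply a one-to-many function to every input tuple (hypothetical helper).

    Each input tuple may contribute zero, one, or several output tuples.
    """
    for row in rows:
        yield from fn(row)


def split_loan(row: Row) -> Iterator[Row]:
    """Hypothetical mapper function: expand one loan into its installments."""
    count = row["installments"]
    share = row["amount"] / count
    for number in range(1, count + 1):
        yield {"loan_id": row["loan_id"], "installment": number, "amount": share}


# Illustrative input relation (made-up data).
loans: List[Row] = [
    {"loan_id": 1, "amount": 300.0, "installments": 3},
    {"loan_id": 2, "amount": 100.0, "installments": 1},
]

payments = list(apply_mapper(loans, split_loan))
```

Here the first loan produces three output tuples and the second produces one, which is the one-to-many behavior the abstract describes as the motivating case for the mapper operator.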







