Efficient query reformulation in peer-data management systems (2004)

by I Tatarinov, A Y Halevy
Venue: SIGMOD

Results 1 - 10 of 67

Implementing Mapping Composition

by Philip A. Bernstein, Todd J. Green, Sergey Melnik, Alan Nash - In VLDB, 2006
Cited by 29 (3 self)
Mapping composition is a fundamental operation in metadata driven applications. Given a mapping over schemas σ1 and σ2 and a mapping over schemas σ2 and σ3, the composition problem is to compute an equivalent mapping over σ1 and σ3. We describe a new composition algorithm that targets practical applications. It incorporates view unfolding. It eliminates as many σ2 symbols as possible, even if not all can be eliminated. It covers constraints expressed using arbitrary monotone relational operators and, to a lesser extent, non-monotone operators. And it introduces the new technique of left composition. We describe our implementation, explain how to extend it to support user-defined operators, and present experimental results which validate its effectiveness.

Orchestra: Rapid, collaborative sharing of dynamic data

by Zachary Ives, Nitin Khandelwal, Aneesh Kapur, Murat Cakir - In CIDR, 2005
Cited by 25 (7 self)
Conventional data integration techniques employ a “top-down” design philosophy, starting by assessing requirements and defining a global schema, and then mapping data sources to that schema. This works well if the problem domain is well-understood and relatively static, as with enterprise data. However, it is fundamentally mismatched with the “bottom-up” model of scientific data sharing, in which new data needs to be rapidly developed, published, and then assessed, filtered, and revised by others. We address the need for bottom-up collaborative data sharing, in which independent researchers or groups with different goals, schemas, and data can share information in the absence of global agreement. Each group independently curates, revises, and extends its data; eventually the groups compare and reconcile their changes, but they are not required to agree. This paper describes our initial design and prototype of the ORCHESTRA system, which focuses on managing disagreement among multiple data representations and instances. Our work represents an important evolution of the concepts of peer-to-peer data sharing [23], which considers revision, disagreement, authority, and intermittent participation.

Query routing in a peer-to-peer semantic link network

by Hai Zhuge, Jie Liu, Liang Feng, Xiaoping Sun, Chao He - Computational Intelligence, 2005
Cited by 24 (8 self)
A semantic link peer-to-peer (P2P) network specifies and manages semantic relationships between peers’ data schemas and can be used as the semantic layer of a scalable Knowledge Grid. The proposed approach consists of an automatic semantic link discovery method, a tool for building and maintaining P2P semantic link networks (P2PSLNs), a semantic-based peer similarity measurement for efficient query routing, and the schema mapping algorithms for query reformulation and heterogeneous data integration. The proposed approach has three important aspects. First, it uses semantic links to enrich the relationships between peers’ data schemas. Second, it considers not only nodes but also the XML structure in measuring the similarity between schemas to efficiently and accurately forward queries to relevant peers. Third, it copes with semantic and structural heterogeneity and data inconsistency so that peers can exchange and translate heterogeneous information within a uniform view.

Inconsistency tolerance in p2p data integration: an epistemic logic approach

by Diego Calvanese, Domenico Lembo, Maurizio Lenzerini, Riccardo Rosati - In Proc. of the 10th Int. Workshop on Database Programming Languages (DBPL), 2005
Cited by 22 (4 self)
We study peer-to-peer data integration, where each peer models an autonomous system that exports data in terms of its own schema, and data interoperation is achieved by means of mappings among the peer schemas, rather than through a global schema. We propose a multi-modal epistemic semantics based on the idea that each peer is conceived as a rational agent that exchanges knowledge/belief with other peers, thus nicely modeling the modular structure of the system. We then address the issue of dealing with possible inconsistencies, and distinguish between two types of inconsistencies, called local and P2P, respectively. We define a nonmonotonic extension of our logic that is able to reason on the beliefs of peers under inconsistency tolerance. Tolerance to local inconsistency essentially means that the presence of inconsistency within one peer does not affect the consistency of the whole system. Tolerance to P2P inconsistency means being able to resolve inconsistencies arising from the interaction between peers. We study query answering and its data complexity in this setting, and we present an algorithm that is sound and complete with respect to the proposed semantics, and optimal with respect to worst-case complexity.

Peer-to-peer management of XML data: Issues and research challenges

by Georgia Koloniari, Evaggelia Pitoura - SIGMOD Rec, 2005
Cited by 18 (0 self)
Peer-to-peer (p2p) systems are attracting increasing attention as an efficient means of sharing data among large, diverse and dynamic sets of users. The widespread use of XML as a standard for representing and exchanging data on the Internet suggests using XML for describing data shared in a p2p system. However, sharing XML data imposes new challenges in p2p systems related to supporting advanced querying beyond simple keyword-based retrieval. In this paper, we focus on data management issues for processing XML data in a p2p setting, namely indexing, replication, clustering and query routing and processing. For each of these topics, we present the issues that arise, survey related research and highlight open research problems.

Constructing and Querying Peer-to-Peer Warehouses of XML Resources

by Serge Abiteboul, Ioana Manolescu, Nicoleta Preda - In ICDE, 2004
Cited by 15 (5 self)
We present KADOP, a distributed infrastructure for warehousing XML resources in a peer-to-peer framework. KADOP allows users to build a shared, distributed repository of resources such as XML documents, semantic information about such documents, Web services, and collections of such items. KADOP leverages several existing technologies and models: it uses distributed hash tables as a peer communication layer, and ActiveXML as a model for constructing and querying the resources in the peer network.

Xpath lookup queries in p2p networks

by Angela Bonifati, Alfredo Cuzzocrea - In WIDM’04: Proceedings of the 6th annual ACM international workshop on Web information and data management, 2004
Cited by 14 (0 self)
We address the problem of querying XML data over a P2P network. In P2P networks, the allowed kinds of queries are usually exact-match queries over file names. We discuss the extensions needed to deal with XML data and XPath queries. A single peer can hold a whole document or a partial/complete fragment of the latter. Each XML fragment/document is identified by a distinct path expression, which is encoded in a distributed hash table. Our framework differs from content-based routing mechanisms, which are biased towards finding the most relevant peers holding the data. We perform fragment placement and enable fragment lookup by exploiting only a few path expressions stored on each peer. By taking advantage of quasi-zero replication of global catalogs, our system supports fast full and partial XPath querying. To this purpose, we have extended the Chord simulator and performed an experimental evaluation of our approach.

StreamGlobe: Adaptive query processing and optimization in streaming P2P environments

by Bernhard Stegmaier, Richard Kuntschke, Alfons Kemper - In Proc. of the Intl. Workshop on Data Management for Sensor Networks, 2004
Cited by 13 (4 self)
Recent research and development efforts show the increasing importance of processing data streams, not only in the context of sensor networks, but also in information retrieval networks. With the advent of various mobile devices being able to participate in ubiquitous (wireless) networks, a major challenge is to develop data stream management systems (DSMS) for information retrieval in such networks. In this paper, we present the architecture of our StreamGlobe system, which is focused on meeting the challenges of efficiently querying data streams in an ad-hoc network environment. StreamGlobe is based on a federation of heterogeneous peers ranging from small, possibly mobile devices to stationary servers. On this foundation, self-organizing network optimization and expressive in-network query processing capabilities enable powerful information processing and retrieval. Data streams in StreamGlobe are represented in XML and queried using XQuery. We report on our ongoing implementation effort and briefly show our research agenda.

On Reconciling Data Exchange, Data Integration, and Peer Data Management

by Domenico Lembo, Maurizio Lenzerini, Riccardo Rosati (Sapienza Università di Roma)
Cited by 12 (1 self)
Data exchange and virtual data integration have been the subject of several investigations in the recent literature. At the same time, the notion of peer data management has emerged as a powerful abstraction of many forms of flexible and dynamic data-centered distributed systems. Although research on the above issues has progressed considerably in recent years, a clear understanding of how to combine data exchange and data integration in peer data management is still missing. This is the subject of the present paper. We start our investigation by first proposing a novel framework for peer data exchange, showing that it is a generalization of the classical data exchange setting. We also present algorithms for all the relevant data exchange tasks, and show that they can all be done in polynomial time with respect to data complexity. Based on the motivation that typical mappings and integrity constraints found in data integration are not captured by peer data exchange, we extend the framework to incorporate these features. One of the main difficulties is that the constraints of this new class are not amenable to materialization. We address this issue by resorting to a suitable combination of virtual and materialized data exchange, showing that the resulting framework is a generalization of both classical data exchange and classical data integration, and that the new setting incorporates the most expressive types of mappings and constraints considered in the two contexts. Finally, we present algorithms for all the relevant data management tasks in the new setting as well, and show that, again, their data complexity is polynomial.

Semex: Toward on-the-fly personal information integration

by Xin Dong, Alon Halevy, Ema Nemes, Stephan B. Sigurdsson, Pedro Domingos - In Workshop on Information Integration on the Web (IIWEB), 2004
Cited by 11 (3 self)
On-the-fly information integration attempts to change the basic cost-benefit equation associated with building information integration applications. This paper argues that on-the-fly integration can be supported by extending one’s personal information space. As a first step in this direction, we describe the Semex system that provides a logical and integrated view of one’s personal information.