WebContent: Efficient P2P Warehousing of Web Data, 2008
Cited by 4 (1 self)
We present the WebContent platform for managing distributed repositories of XML and semantic Web data. The platform allows integrating various data processing building blocks (crawling, translation, semantic annotation, full-text search, structured XML querying, and semantic querying), presented as Web services, into a large-scale efficient platform. Calls to various services are combined inside ActiveXML [8] documents, which are XML documents including service calls. An ActiveXML optimizer is used to: (i) efficiently distribute computations among sites; (ii) perform XQuery-specific optimizations by leveraging an algebraic XQuery optimizer; and (iii) given an XML query, choose among several distributed indices the most appropriate one to answer the query.
OptimAX: Optimizing Distributed ActiveXML Applications
- In ICWE, 2008
Cited by 3 (1 self)
The Web has become a platform of choice for the deployment of complex applications involving several business partners. Typically, such applications interoperate by means of Web services, exchanging XML information. We present OptimAX, an optimization Web service that applies at the static level (prior to enacting an application) in order to rewrite it into one whose execution will be more performant. OptimAX builds on the ActiveXML (AXML) data-centric Web service composition language, and demonstrates how database-style techniques can be efficiently integrated in a loosely-coupled, distributed application based on Web services. OptimAX has been fully implemented and we describe its experimental performance.
Materialized views for P2P XML warehousing
Cited by 2 (2 self)
We consider the efficient, scalable management of XML documents in structured peer-to-peer networks based on distributed hash table (DHT) indices. We present an approach for exploiting materialized views deployed in the DHT network independently by the peers, to answer an interesting dialect of tree pattern queries. We provide algorithms to index and materialize views in the DHT, show that the rewriting problem is polynomial in the number of views, and describe rewriting algorithms. Our approach is validated by experiments on the complete platform deployed on 1000 peers in a wide area network.
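As a rough illustration of indexing view definitions in a DHT, the Python sketch below publishes each view under the element labels it mentions and uses a label-subset test as a crude pre-filter for candidate views. This is an assumed simplification, not the indexing or rewriting algorithm of the paper; `ToyDHT`, `publish_view`, and `candidate_views` are hypothetical names, and the hash ring runs in a single process.

```python
import hashlib
from bisect import bisect_right


class ToyDHT:
    """In-process stand-in for a DHT: keys are mapped to peers on a hash ring."""

    def __init__(self, peer_names):
        self.ring = sorted((self._hash(p), p) for p in peer_names)
        self.store = {p: {} for p in peer_names}

    @staticmethod
    def _hash(key):
        return int(hashlib.sha1(key.encode()).hexdigest(), 16)

    def peer_for(self, key):
        points = [h for h, _ in self.ring]
        i = bisect_right(points, self._hash(key)) % len(self.ring)
        return self.ring[i][1]

    def put(self, key, value):
        self.store[self.peer_for(key)].setdefault(key, set()).add(value)

    def get(self, key):
        return self.store[self.peer_for(key)].get(key, set())


def publish_view(dht, view_id, labels):
    """Index a view definition under every element label it mentions."""
    entry = (view_id, frozenset(labels))
    for label in labels:
        dht.put(label, entry)


def candidate_views(dht, query_labels):
    """Assumed pre-filter: keep views whose labels all occur in the query."""
    wanted = set(query_labels)
    found = set()
    for label in wanted:
        found |= dht.get(label)
    return sorted(view_id for view_id, labels in found if labels <= wanted)


dht = ToyDHT(["peer1", "peer2", "peer3"])
publish_view(dht, "v1", {"book", "title"})
publish_view(dht, "v2", {"book", "author"})
publish_view(dht, "v3", {"article", "title"})
candidates = candidate_views(dht, {"book", "title", "year"})
print(candidates)
```

A real rewriting step would still have to check that the candidate views can be combined into a plan equivalent to the query; the subset test only narrows the search.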
XML materialized views in P2P networks
- In Fourth International Workshop on Database Technologies for Handling XML Information on the Web, Saint Petersburg, Russia, 2009, http://hal.inria.fr/inria-00425627/en/
Cited by 1 (0 self)
We consider the efficient, scalable management of XML documents in structured peer-to-peer networks based on distributed hash table (DHT) indices. We present an approach for exploiting indices (or materialized views) deployed in the P2P network independently by the peers, to answer an interesting dialect of tree pattern queries. We describe the query (and view) language, provide a rewriting algorithm, discuss view definition indexing strategies based on the DHT, and compare their performance through a set of experiments on a completely deployed platform.
Routing of Structured Queries in Large-Scale Distributed Systems
In order to search XML document collections, structural information – given by a user in the form of a structured query or provided by the self-describing structure of XML documents – has been used in the past years to improve Information Retrieval (IR) quality in terms of recall and precision. However, all known approaches have only been used in classical client/server (C/S) architectures. None has been applied to improve retrieval in large-scale distributed systems such as Peer-to-Peer (P2P) networks, where efficiency issues have to be dealt with carefully, e.g. in order to reduce communication overhead between distributed nodes. As P2P networks can be considered promising alternatives to C/S systems for storing large amounts of information including XML documents, possibilities for improving retrieval in such networks should be investigated. In this paper, we concentrate on query routing in such a scenario and raise the question of how structured queries can be routed in a highly distributed environment so as to increase both efficiency and effectiveness. We provide an infrastructure for investigating this question and propose techniques for performing routing based on a mixture of document-, element-, collection- and peer-evidence. We also report on preliminary evaluation results with the INEX collection.
Optimized Union of Non-disjoint Distributed Data Sets
In a variety of applications, ranging from data integration to distributed query evaluation, there is a need to obtain sets of data items from several sources (peers) and compute their union. As these sets often contain common data items, avoiding the transmission of redundant information is essential for effective union computation. In this paper we define the notion of optimal union plans for non-disjoint data sets residing on distinct peers, and present efficient algorithms for computing and executing such optimal plans. Our algorithms avoid redundant data transmission and optimally exploit the network bandwidth capabilities. A challenge in the design of optimal plans is the lack of a complete map of the distribution of the data items among peers. We analyze the information required for optimal planning and propose novel techniques to obtain compact, cheap-to-communicate descriptions of the data sources. We then exploit these descriptions for efficient union computation with reasonable accuracy. We demonstrate experimentally the superiority of our approach over the common naive union computation, showing that it improves performance by an order of magnitude.
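The benefit of avoiding redundant transmission can be seen in a small Python sketch. It assumes a coordinator that sees a short digest of every item on every peer and assigns each distinct item to the first peer holding it. This greedy plan is an illustration only, not the optimal planning algorithm or the compact source descriptions of the paper; `digest` and `plan_union` are hypothetical names.

```python
import hashlib


def digest(item):
    """Short fingerprint of an item (assumption: truncated SHA-1, collisions ignored)."""
    return hashlib.sha1(item.encode()).hexdigest()[:8]


def plan_union(peer_sets):
    """Assign each distinct item to the first peer that holds it."""
    assigned = set()  # digests already covered by an earlier peer
    plan = {}
    for peer, items in peer_sets.items():
        digests = {digest(x): x for x in items}
        plan[peer] = sorted(x for d, x in digests.items() if d not in assigned)
        assigned.update(digests)
    return plan


peers = {
    "p1": {"a", "b", "c"},
    "p2": {"b", "c", "d"},
    "p3": {"c", "d", "e"},
}
plan = plan_union(peers)
union = set().union(*plan.values())
naive_cost = sum(len(s) for s in peers.values())    # every peer ships everything
planned_cost = sum(len(s) for s in plan.values())   # each item is shipped once
print(plan, naive_cost, planned_cost)
```

In this toy instance the naive union ships 9 items while the plan ships 5, one per distinct item. Exchanging exact digests is itself costly at scale, which is the motivation for more compact descriptions.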
LCA-based Selection for XML Document Collections
In this paper, we address the problem of database selection for XML document collections, that is, given a set of collections and a user query, how to rank the collections based on their goodness to the query. Goodness is determined by the relevance of the documents in the collection to the query.
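A toy Python sketch of ranking collections by goodness follows. The abstract does not give the scoring formula, so the one used here is an assumption: a document's relevance is the depth of the deepest element whose subtree text contains all query keywords (a simple lowest-common-ancestor measure), plus one, and a collection's goodness is the sum over its documents. `lca_depth` and `goodness` are hypothetical names.

```python
import xml.etree.ElementTree as ET


def lca_depth(root, keywords):
    """Depth of the deepest element whose subtree text contains all keywords; -1 if none."""
    best = -1

    def visit(node, depth):
        nonlocal best
        text = " ".join(node.itertext()).lower()
        if all(k in text for k in keywords):
            best = max(best, depth)
            for child in node:  # descend only while all keywords are still covered
                visit(child, depth + 1)

    visit(root, 0)
    return best


def goodness(collection, keywords):
    """Assumed score: sum of (LCA depth + 1) over documents; 0 for non-matching ones."""
    return sum(lca_depth(ET.fromstring(doc), keywords) + 1 for doc in collection)


collections = {
    "c2": ["<lib><book><title>xml</title><note>query</note></book></lib>"],
    "c3": ["<lib><a>xml</a></lib>"],
    "c1": ["<lib><book><title>xml query</title></book></lib>"],
}
query = ["xml", "query"]
ranking = sorted(collections, key=lambda name: goodness(collections[name], query), reverse=True)
print(ranking)
```

Collections whose documents hold the keywords close together in the tree rank first; documents missing a keyword contribute nothing.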
The ViP2P Platform: XML Views in P2P
The growing volumes of XML data sources on the Web or produced by enterprises, organizations, etc., raise many performance challenges for data management applications. In this work, we are concerned with the distributed, peer-to-peer management of large corpora of XML documents, based on distributed hash table (or DHT, in short) overlay networks. We present ViP2P (standing for Views in Peer-to-Peer), a distributed platform for sharing XML documents based on a structured P2P network infrastructure (DHT). At the core of ViP2P stand distributed materialized XML views, defined by arbitrary XML queries, filled in with data published anywhere in the network, and exploited to efficiently answer queries issued by any network peer. ViP2P allows user queries to be evaluated over XML documents published by peers in two modes. The first is a long-running subscription mode, in which a query is registered in the system and receives answers incrementally when and if published data matches the query. The second is an ad-hoc, snapshot mode, in which results are required immediately and must be computed based on the results of other long-running subscription queries. ViP2P innovates over other similar DHT-based XML sharing platforms by using a very expressive structured XML query language. This expressivity leads to a very flexible distribution of XML content in the ViP2P network, and

