Results 11 - 20 of 139
Network-Aware Query Processing for Stream-based Applications
, 2004
Cited by 36 (0 self)
This paper investigates the benefits of network awareness when processing queries in widely distributed environments such as the Internet.
Iterative Dynamic Programming: A New Class of Query Optimization Algorithms
- ACM Trans. on Database Systems
, 1998
Cited by 36 (5 self)
The query optimizer is one of the most important components of a database system. Most commercial query optimizers today are based on a dynamic-programming algorithm, as proposed in [SAC+79]. While this algorithm produces good optimization results (i.e., good plans), its high complexity can be prohibitive if complex queries need to be processed, new query execution techniques need to be integrated, or in certain programming environments (e.g., distributed database systems). In this paper, we present and thoroughly evaluate a new class of query optimization algorithms that are based on a principle that we call iterative dynamic programming, or IDP for short. IDP has several important advantages: First, IDP-algorithms produce the best plans of all known algorithms in situations in which dynamic programming is not viable because of its high complexity. Second, some IDP variants are adaptive and produce as good plans as dynamic programming if dynamic programming is viable an...
Resource-Aware Distributed Stream Management using Dynamic Overlays
- In Proc. of 25th IEEE International Conference on Distributed Computing Systems (ICDCS-2005)
, 2005
Cited by 35 (10 self)
We consider distributed applications that continuously stream data across the network, where data needs to be aggregated and processed to produce a 'useful' stream of updates. Centralized approaches to performing data aggregation suffer from high communication overheads, lack of scalability, and unpredictably high processing workloads at central servers. This paper describes a scalable and efficient solution to distributed stream management based on (1) resource-awareness, which is middleware-level knowledge of underlying network and processing resources, (2) overlay-based in-network data aggregation, and (3) high-level programming constructs to describe data-flow graphs for composing useful streams. Technical contributions include a novel algorithm based on resource-aware network partitioning to support dynamic deployment of dataflow graph components across the network, where efficiency of the deployed overlay is maintained by making use of partition-level resource-awareness. Contributions also include efficient middleware-based support for component deployment, utilizing runtime code generation rather than interpretation techniques, thereby addressing both high performance and resource-constrained applications. Finally, simulation experiments and benchmarks attained with actual operational data corroborate this paper's claims.
Yars2: A federated repository for querying graph structured data from the web
- of Lecture Notes in Computer Science
, 2007
Cited by 33 (5 self)
We present the architecture of an end-to-end semantic search engine that uses a graph data model to enable interactive query answering over structured and interlinked data collected from many disparate sources on the Web. In particular, we study distributed indexing methods for graph-structured data and parallel query evaluation methods on a cluster of computers. We evaluate the system on a dataset with 430 million statements collected from the Web, and provide scale-up experiments on 7 billion synthetically generated statements.
Service-Based Distributed Querying on the Grid
- IN PROC. OF THE 1ST INT. CONF. ON SERVICE ORIENTED COMPUTING
, 2003
Cited by 32 (21 self)
Service-based approaches (such as Web Services and the Open Grid Services Architecture) have gained considerable attention recently for supporting distributed application development in e-business and e-science. The emergence of a service-oriented view of hardware and software resources raises the question as to how database management systems and technologies can best be deployed or adapted for use in such an environment. This paper explores one aspect of service-based computing and data management, viz., how to integrate query processing technology with a service-based Grid. The paper describes in detail the design and implementation of a service-based distributed query processor for the Grid. The query processor is service-based in two orthogonal senses: firstly, it supports querying over data storage and analysis resources that are made available as services, and, secondly, its internal architecture factors out as services the functionalities related to the construction of distributed query plans on the one hand, and to their execution over the Grid on the other. The resulting system both provides a declarative approach to service orchestration in the Grid, and demonstrates how query processing can benefit from dynamic access to computational resources on the Grid.
A Taxonomy of Data Grids for Distributed Data Sharing, Management and Processing
- ACM Comput. Surv
, 2006
Cited by 27 (7 self)
Data Grids have been adopted as the platform for scientific communities that need to share, access, transport, process and manage large data collections distributed worldwide. They combine high-end computing technologies with high-performance networking and wide-area storage management techniques. In this paper, we discuss the key concepts behind Data Grids and compare them with other data sharing and distribution paradigms such as content delivery networks, peer-to-peer networks and distributed databases.
Distributed Query Processing on the Grid
, 2002
Cited by 25 (14 self)
Distributed query processing (DQP) has been widely used in data intensive applications where data of relevance to users is stored in multiple locations. This paper argues: (i) that DQP can be important in the Grid, as a means of providing high-level, declarative languages for integrating data access and analysis
iDM: a unified and versatile data model for personal dataspace management
- In VLDB
, 2006
"... dbis.ethz.ch | iMeMex.org ..."
Index Structures and Algorithms for Querying Distributed RDF Repositories
- WWW2004, MAY 17–22; THE NETHERLANDS
, 2004
Cited by 23 (0 self)
A technical infrastructure for storing, querying and managing RDF data is a key element in the current semantic web development. Systems like Jena, Sesame or the ICS-FORTH RDF Suite are widely used for building semantic web applications. Currently, none of these systems supports the integrated querying of distributed RDF repositories. We consider this a major shortcoming since the semantic web is distributed by nature. In this paper we present an architecture for querying distributed RDF repositories by extending the existing Sesame system. We discuss the implications of our architecture and propose an index structure as well as algorithms for query processing and optimization in such a distributed context.
Distributed XQuery
- In IIWeb
, 2004
Cited by 20 (1 self)
XQuery is increasingly being used for ad-hoc integration of heterogeneous data sources that are logically mapped to XML. For example, scientists need to query multiple scientific databases, which are distributed over a large geographic area, and it is possible to use XQuery for that. However, the language currently supports only the data shipping query evaluation model (through the document() function): it fetches all data sources to a single server, then runs the query there. This is a major limitation for many applications, especially when some data sources are very large, or when a data source is only a virtual XML view over some other logical data model. We propose here a simple extension to XQuery that allows query shipping to be expressed in the language, in addition to data shipping. Example 1.1 For a simple illustration, consider the following example:
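The paper's own Example 1.1 is not reproduced in this listing. As a separate illustration of the distinction the abstract draws, the following is a minimal Python sketch (not XQuery, and not the authors' proposed extension) contrasting data shipping with query shipping. The `Source` class, its `fetch_all` and `run_query` methods, and the record-count cost measure are all hypothetical, chosen only to show how many records would cross the network under each model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Record = Dict[str, int]
Predicate = Callable[[Record], bool]


@dataclass
class Source:
    """Hypothetical remote data source; stands in for a remote XML document."""
    name: str
    records: List[Record]

    def fetch_all(self) -> List[Record]:
        # Data shipping: every record leaves the source.
        return list(self.records)

    def run_query(self, predicate: Predicate) -> List[Record]:
        # Query shipping: the predicate is evaluated at the source,
        # so only matching records leave it.
        return [r for r in self.records if predicate(r)]


def data_shipping(sources: List[Source], predicate: Predicate) -> Tuple[List[Record], int]:
    """Fetch every source to one place, then filter there."""
    shipped = 0
    result: List[Record] = []
    for source in sources:
        rows = source.fetch_all()
        shipped += len(rows)  # records sent over the (simulated) network
        result.extend(r for r in rows if predicate(r))
    return result, shipped


def query_shipping(sources: List[Source], predicate: Predicate) -> Tuple[List[Record], int]:
    """Send the predicate to each source and collect only the matches."""
    shipped = 0
    result: List[Record] = []
    for source in sources:
        rows = source.run_query(predicate)
        shipped += len(rows)  # only matching records are sent
        result.extend(rows)
    return result, shipped
```

Both functions return the same answer; they differ only in the second element of the returned tuple, the number of records transferred. With a selective predicate over large sources, query shipping moves far fewer records, which is the limitation of pure data shipping that the abstract points out.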

