Bullet: High Bandwidth Data Dissemination Using an Overlay Mesh
, 2003
Cited by 297 (19 self)
In recent years, overlay networks have become an effective alternative to IP multicast for efficient point-to-multipoint communication across the Internet. Typically, nodes self-organize with the goal of forming an efficient overlay tree, one that meets performance targets without placing undue burden on the underlying network. In this paper, we target high-bandwidth data distribution from a single source to a large number of receivers. Applications include large-file transfers and real-time multimedia streaming. For these applications, we argue that an overlay mesh, rather than a tree, can deliver fundamentally higher bandwidth and reliability relative to typical tree structures. This paper presents Bullet, a scalable and distributed algorithm that enables nodes spread across the Internet to self-organize into a high-bandwidth overlay mesh. We construct Bullet around the insight that data should be distributed in a disjoint manner to strategic points in the network. Individual Bullet receivers are then responsible for locating and retrieving the data from multiple points in parallel. Key contributions of this work include: i) an algorithm that sends data to different points in the overlay such that any data object is equally likely to appear at any node, ii) a scalable and decentralized algorithm that allows nodes to locate and recover missing data items, and iii) a complete implementation and evaluation of Bullet, running across the Internet and in a large-scale emulation environment, that reveals up to a factor of two bandwidth improvement under a variety of circumstances. In addition, we find that, relative to tree-based solutions, Bullet reduces the need to perform expensive bandwidth probing.
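The two ideas in this abstract, disjoint distribution and parallel recovery from several peers, can be illustrated with a toy model. The sketch below is not Bullet's algorithm: the abstract does not say how blocks are assigned or how peers are chosen, so uniform random assignment and a least-loaded peer choice are assumptions, and the names `disseminate` and `plan_recovery` are hypothetical.

```python
import random
from typing import Dict, List, Sequence, Set


def disseminate(blocks: Sequence[int], nodes: Sequence[str],
                seed: int = 0) -> Dict[str, Set[int]]:
    """Send each block to exactly one node, chosen uniformly at random.

    Assumption: uniform random choice stands in for "any data object is
    equally likely to appear at any node".
    """
    rng = random.Random(seed)
    held: Dict[str, Set[int]] = {n: set() for n in nodes}
    for block in blocks:
        held[rng.choice(list(nodes))].add(block)
    return held


def plan_recovery(me: str, held: Dict[str, Set[int]],
                  blocks: Sequence[int]) -> Dict[str, List[int]]:
    """Decide which peer to ask for each block that `me` is missing.

    Assumption: requests go to the holder with the fewest requests so far,
    so that retrieval is spread across peers.
    """
    plan: Dict[str, List[int]] = {}
    for block in sorted(set(blocks) - held[me]):
        holders = [p for p in held if p != me and block in held[p]]
        if holders:
            peer = min(holders, key=lambda p: len(plan.get(p, [])))
            plan.setdefault(peer, []).append(block)
    return plan


blocks = list(range(12))
held = disseminate(blocks, ["n1", "n2", "n3"])
plan = plan_recovery("n1", held, blocks)  # peer name -> blocks to request
```

In this toy model every block lives at exactly one node, so a receiver has to contact other nodes to complete the file.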
Astrolabe: A Robust and Scalable Technology for Distributed System Monitoring, Management, and Data Mining
- ACM Transactions on Computer Systems
, 2001
Cited by 288 (16 self)
In this paper, we describe a new information management service called Astrolabe. Astrolabe monitors the dynamically changing state of a collection of distributed resources, reporting summaries of this information to its users. Like DNS, Astrolabe organizes the resources into a hierarchy of domains, which we call zones to avoid confusion, and associates attributes with each zone. Unlike DNS, zones are not bound to specific servers, the attributes may be highly dynamic, and updates propagate quickly; typically, in tens of seconds.
Processing XML Streams with deterministic automata
, 2003
Cited by 107 (3 self)
We consider the problem of evaluating a large number of XPath expressions on an XML stream. Our main contribution consists in showing that Deterministic Finite Automata (DFA) can be used effectively for this problem: in our experiments we achieve a throughput of about 5.4 MB/s, independent of the number of XPath expressions (up to 1,000,000 in our tests). The major problem we face is that of the size of the DFA. Since the number of states grows exponentially with the number of XPath expressions, it was previously believed that DFAs cannot be used to process large sets of expressions. We make a theoretical analysis of the number of states in the DFA resulting from XPath expressions, and consider both the case when it is constructed eagerly, and when it is constructed lazily. Our analysis indicates that, when the automaton is constructed lazily, and under certain assumptions about the structure of the input XML data, the number of states in the lazy DFA is manageable. We also validate our findings experimentally, on both synthetic and real XML data sets.
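The lazy construction mentioned here can be sketched as subset construction performed on demand: a DFA state is a set of NFA states, and a transition is computed and cached only when the input stream first needs it. This is a minimal illustration under simplifying assumptions, not the paper's implementation. It handles only linear paths with the child (`/`) and descendant (`//`) axes, tag names and `*`; the `(axis, label)` step encoding and the names `LazyDfa` and `run` are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

Step = Tuple[str, str]        # (axis, label): axis is "/" or "//", label is a tag or "*"
NfaState = Tuple[int, int]    # (query index, number of steps matched so far)
DfaState = FrozenSet[NfaState]


class LazyDfa:
    def __init__(self, queries: List[List[Step]]) -> None:
        self.queries = queries
        self.start: DfaState = frozenset((q, 0) for q in range(len(queries)))
        # Transition table, filled in only for (state, tag) pairs actually seen.
        self.table: Dict[Tuple[DfaState, str], DfaState] = {}

    def step(self, state: DfaState, tag: str) -> DfaState:
        key = (state, tag)
        if key not in self.table:
            nxt = set()
            for q, pos in state:
                steps = self.queries[q]
                if pos < len(steps):
                    axis, label = steps[pos]
                    if label in ("*", tag):
                        nxt.add((q, pos + 1))   # this element matches the step
                    if axis == "//":
                        nxt.add((q, pos))       # descendant axis may skip the element
            self.table[key] = frozenset(nxt)
        return self.table[key]

    def accepting(self, state: DfaState) -> List[int]:
        return sorted(q for q, pos in state if pos == len(self.queries[q]))


def run(dfa: LazyDfa, events: List[Tuple[str, str]]) -> List[Tuple[str, List[int]]]:
    """Feed start/end element events; report the queries matched at each element."""
    stack = [dfa.start]
    hits = []
    for kind, tag in events:
        if kind == "start":
            stack.append(dfa.step(stack[-1], tag))
            hits.append((tag, dfa.accepting(stack[-1])))
        else:
            stack.pop()   # leaving the element restores the parent's state
    return hits


# Queries: /a/b, //c, /a//c evaluated over <a><b><c/></b></a>
dfa = LazyDfa([[("/", "a"), ("/", "b")], [("//", "c")], [("/", "a"), ("//", "c")]])
doc = [("start", "a"), ("start", "b"), ("start", "c"),
       ("end", "c"), ("end", "b"), ("end", "a")]
print(run(dfa, doc))   # [('a', []), ('b', [0]), ('c', [1, 2])]
```

After this run `dfa.table` holds three entries, one per distinct `(state, tag)` pair seen in the document. Reprocessing the same document adds none, which is the sense in which the lazy automaton grows with the data actually observed and not with the full set of possible states.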
Towards an Internet-Scale XML Dissemination Service
, 2004
Cited by 87 (3 self)
Publish/subscribe systems have demonstrated the ability to scale to large numbers of users and high data rates when providing content-based data dissemination services on the Internet. However, their services are limited by the data semantics and query expressiveness that they support. On the other hand, the recent work on selective dissemination of XML data has made significant progress in moving from XML filtering to the richer functionality of transformation for result customization, but in general has ignored the challenges of deploying such XML-based services on an Internet-scale. In this paper, we address these challenges in the context of incorporating the rich functionality of XML data dissemination in a highly scalable system. We present the architectural design of ONYX, a system based on an overlay network. We identify the salient technical challenges in supporting XML filtering and transformation in this environment and propose techniques for solving them.
Containment and equivalence for a fragment of XPath
- Journal of the ACM
, 2004
Cited by 74 (0 self)
XPath is a language for navigating an XML document and selecting a set of element nodes. XPath expressions are used to query XML data, describe key constraints, express transformations, and reference elements in remote documents. This article studies the containment and equivalence problems for a fragment of the XPath query language, with applications in all these contexts. In particular, we study a class of XPath queries that contain branching, label wildcards and can express descendant relationships between nodes. Prior work has shown that languages that combine any two of these three features have efficient containment algorithms. However, we show that for the combination of features, containment is coNP-complete. We provide a sound and complete algorithm for containment that runs in exponential time, and study parameterized PTIME special cases. While we identify one parameterized class of queries for which containment can be decided efficiently, we also show that even with some bounded parameters, containment remains coNP-complete. In response to these negative results, we describe a sound algorithm that is efficient for all queries, but may return false negatives in some cases.
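One standard way to obtain a check that is sound but incomplete is to search for a homomorphism between tree patterns: if the pattern for q maps into the pattern for p, then every document matching p also matches q. The abstract does not name the technique behind its sound algorithm, so the sketch below is only an illustration of that general idea. It assumes Boolean patterns rooted at the same element, and the names `Node`, `maps_to` and `contained_in` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Node:
    label: str                                   # tag name, or "*" for a wildcard
    children: List[Tuple[str, "Node"]] = field(default_factory=list)  # (axis, child)


def descendants(n: Node) -> Iterator[Node]:
    """All proper descendants of n in the pattern tree, over either axis."""
    for _, child in n.children:
        yield child
        yield from descendants(child)


def maps_to(q: Node, p: Node) -> bool:
    """True if the subpattern rooted at q can be mapped onto p."""
    if q.label != "*" and q.label != p.label:
        return False
    for axis, q_child in q.children:
        if axis == "/":
            # A child edge must land on a child edge.
            targets = [c for a, c in p.children if a == "/"]
        else:
            # A descendant edge may land on any proper descendant.
            targets = list(descendants(p))
        if not any(maps_to(q_child, t) for t in targets):
            return False
    return True


def contained_in(p: Node, q: Node) -> bool:
    """Sound test for p ⊆ q: True is always correct, False may be a false negative."""
    return maps_to(q, p)


p = Node("a", [("/", Node("b", [("/", Node("c"))]))])   # a[b/c]
q = Node("a", [("//", Node("c"))])                      # a[.//c]
print(contained_in(p, q), contained_in(q, p))           # True False
```

A `True` answer can be trusted. A `False` answer only means that no such mapping exists, which is consistent with the abstract's remark that an efficient sound algorithm may return false negatives.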
Best-Path vs. Multi-Path Overlay Routing
- In Proc. ACM SIGCOMM Internet Measurement Conference
, 2003
Cited by 56 (6 self)
Time-varying congestion on Internet paths and failures due to software, hardware, and configuration errors often disrupt packet delivery on the Internet. Many approaches to avoiding these problems use multiple paths between two network locations. These approaches rely on a path-independence assumption in order to work well; i.e., they work best when the problems on different paths between two locations are uncorrelated in time. This
Exactly-once Delivery in a Content-based Publish-Subscribe System
- DSN
, 2002
Cited by 48 (6 self)
This paper presents a general knowledge model for propagating information in a content-based publish-subscribe system. The model is used to derive an efficient and scalable protocol for exactly-once delivery to large numbers (tens of thousands per broker) of content-based subscribers in either publisher order or uniform total order. Our protocol allows intermediate content filtering at each hop, but requires persistent storage only at the publishing site. It is tolerant of message drops, message reorderings, node failures, and link failures, and maintains only "soft" state at intermediate nodes. We evaluate the performance of our implementation both under failure-free conditions and with fault injection.
Opus: an Overlay Peer Utility Service
- In Proceedings of the 5th International Conference on Open Architectures and Network Programming (OPENARCH)
, 2002
Cited by 36 (9 self)
Today, an increasing number of important network services, such as content distribution, replicated services, and storage systems, are deploying overlays across multiple Internet sites to deliver better performance, reliability and adaptability. Currently, however, such network services must individually reimplement substantially similar functionality. For example, applications must configure the overlay to meet their specific demands for scale, service quality and reliability. Further, they must dynamically map data and functions onto network resources (including servers, storage, and network paths) to adapt to changes in load or network conditions.
Filter Similarities in Content-Based Publish/Subscribe Systems
- In International Conference on Architecture of Computing Systems (ARCS)
, 2002
Cited by 34 (8 self)
Matching notifications to subscriptions and routing notifications from producers to interested consumers are the main problems in large-scale publish/subscribe systems.
Maintaining High Bandwidth Under Dynamic Network Conditions
- In Proceedings of USENIX Annual Technical Conference
, 2005
Cited by 30 (6 self)
The need to distribute large files across multiple wide-area sites is becoming increasingly common, for instance, in support of scientific computing, configuring distributed systems, distributing software updates such as open source ISOs or Windows patches, or disseminating multimedia content. Recently a number of techniques have been proposed for simultaneously retrieving portions of a file from multiple remote sites with the twin goals of filling the client’s pipe and overcoming any performance bottlenecks between the client and any individual server. While there are a number of interesting tradeoffs in locating appropriate download sites in the face of dynamically changing network conditions, to date there has been no systematic evaluation of the merits of different protocols. This paper explores the design space of file distribution protocols and conducts a detailed performance evaluation of a number of competing systems running in both controlled emulation environments and live across the Internet. Based on our experience with these systems under a variety of conditions, we propose, implement and evaluate Bullet′ (Bullet prime), a mesh-based high-bandwidth data dissemination system that outperforms previous techniques under both static and dynamic conditions.

