Results 1 - 10 of 57
Experiences Building PlanetLab
- In Proceedings of the 7th USENIX Symp. on Operating Systems Design and Implementation (OSDI), 2006
Cited by 90 (11 self)
This paper reports our experiences building PlanetLab over the last four years. It identifies the requirements that shaped PlanetLab, explains the design decisions that resulted from resolving conflicts among these requirements, and reports our experience implementing and supporting the system. Due in large part to the nature of the “PlanetLab experiment,” the discussion focuses on synthesis rather than new techniques, on balancing system-wide considerations rather than improving performance along a single dimension, and on learning from feedback from a live system rather than from controlled experiments using synthetic workloads.
Antfarm: Efficient Content Distribution with Managed Swarms
Cited by 55 (1 self)
This paper describes Antfarm, a content distribution system based on managed swarms. A managed swarm couples peer-to-peer data exchange with a coordinator that directs bandwidth allocation at each peer. Antfarm achieves high throughput by viewing content distribution as a global optimization problem, where the goal is to minimize download latencies for participants subject to bandwidth constraints and swarm dynamics. The system is based on a wire protocol that enables the Antfarm coordinator to gather information on swarm dynamics, detect misbehaving hosts, and direct the peers’ allotment of upload bandwidth among multiple swarms. Antfarm’s coordinator grants autonomy and local optimization opportunities to participating nodes while guiding the swarms toward an efficient allocation of resources. Extensive simulations and a PlanetLab deployment show that the system can significantly outperform centralized distribution services as well as swarming systems such as BitTorrent.
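The coordinator's global allocation problem described in this abstract can be approximated with a simple greedy water-filling scheme: repeatedly direct spare seeder bandwidth to the swarm whose demand is least covered by its peers. This is an illustrative sketch under invented inputs (per-swarm demand and aggregate peer upload), not Antfarm's actual wire protocol or optimizer:

```python
def allocate_seeder_bandwidth(swarms, seeder_capacity, step=1.0):
    """Greedy water-filling: repeatedly give `step` units of seeder
    upload bandwidth to the swarm with the largest unmet demand.
    `swarms` maps name -> (download_demand, aggregate_peer_upload)."""
    alloc = {name: 0.0 for name in swarms}
    remaining = seeder_capacity
    while remaining >= step:
        # deficit: bandwidth still needed beyond what peers supply
        deficits = {n: demand - supply - alloc[n]
                    for n, (demand, supply) in swarms.items()}
        target = max(deficits, key=deficits.get)
        if deficits[target] <= 0:          # every swarm is self-sustaining
            break
        alloc[target] += step
        remaining -= step
    return alloc

swarms = {"popular": (100.0, 90.0),   # demand, peer-supplied upload
          "niche":   (40.0, 10.0)}
print(allocate_seeder_bandwidth(swarms, seeder_capacity=25.0))
```

Under these toy numbers the bandwidth-starved "niche" swarm receives most of the seeder capacity, mirroring the paper's point that a coordinator can steer resources where peers alone cannot sustain the swarm.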
Client behavior and feed characteristics of RSS, a publish-subscribe system for web micronews
- In IMC’05: Proceedings of the Internet Measurement Conference, 2005
Cited by 53 (1 self)
While publish-subscribe systems have attracted much research interest over the last decade, few established benchmarks have emerged, and there has been little characterization of how publish-subscribe systems are used in practice. This paper examines RSS, a newly emerging, widely used publish-subscribe system for Web micronews. Based on a trace study spanning 45 days at a medium-size academic department and periodic polling of approximately 100,000 RSS feeds, we extract characteristics of RSS content and usage. We find that the RSS workload resembles the Web in content size and popularity: feeds are typically small (less than 10 KB), albeit with a heavy tail, and feed popularity follows a power-law distribution. The update rate of RSS feeds is widely distributed: 55% of RSS feeds are updated hourly, while 25% show no updates for several days. Moreover, only small portions of RSS content typically change during an update: 64% of updates involve fewer than three lines of the RSS content. Overall, this paper presents an analysis of RSS, the first widely deployed publish-subscribe system, and provides insights for the design of next-generation publish-subscribe systems.
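The per-update delta measurement this study relies on (how much of a feed actually changes between polls) can be sketched by diffing consecutive feed snapshots line by line. A minimal illustration with invented snapshot data, not the authors' actual measurement code:

```python
import difflib

def update_fraction(old_feed, new_feed):
    """Fraction of lines in the new snapshot that differ from the old one,
    the kind of per-poll delta used to characterize RSS update behavior."""
    old_lines = old_feed.splitlines()
    new_lines = new_feed.splitlines()
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines)
    # count lines of the new snapshot that also appear, in order, in the old one
    unchanged = sum(size for _, _, size in matcher.get_matching_blocks())
    changed = max(len(new_lines) - unchanged, 0)
    return changed / max(len(new_lines), 1)

old = "<item>story A</item>\n<item>story B</item>\n<item>story C</item>"
new = "<item>story D</item>\n<item>story B</item>\n<item>story C</item>"
print(update_fraction(old, new))
```

Aggregating this fraction over many polls of many feeds yields the kind of update-size distribution the abstract summarizes.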
SpiderCast: A scalable interest-aware overlay for topic-based pub/sub communication
- In Proceedings of the 2007 inaugural international conference on Distributed Event-Based Systems (DEBS 2007), 2007
Cited by 41 (11 self)
We introduce SpiderCast, a distributed protocol for constructing scalable churn-resistant overlay topologies that support decentralized topic-based pub/sub communication. SpiderCast is designed to strike an effective balance between average overlay degree and the communication cost of event dissemination. It employs a novel coverage-optimizing heuristic in which the nodes use partial subscription views (provided by a decentralized membership service) to reduce the average node degree while guaranteeing (with high probability) that the events posted on each topic can be routed solely through the nodes interested in that topic (in other words, the overlay is topic-connected). SpiderCast is unique in maintaining an overlay topology that scales well with the average number of topics a node is subscribed to, assuming the subscriptions are correlated to the extent found in most typical workloads. Furthermore, the degree grows logarithmically in the total number of topics, and slowly decreases as the number of nodes increases. We show experimentally that, for many practical workloads, the SpiderCast overlays are both topic-connected and have a low per-topic diameter while requiring each node to maintain a low average number of connections. These properties hold even in very large settings involving up to 10,000 nodes, 1,000 topics, and 70 subscriptions per node, and under high churn rates. In addition, our results demonstrate that, in a large setting, the average node degree in SpiderCast is at least 45% smaller than in other overlays typically used to support decentralized pub/sub communication (e.g., similarity-based, ring-based, and random overlays).
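The coverage-optimizing heuristic in this abstract can be illustrated as a greedy set cover over a partial subscription view: a node keeps picking the candidate neighbor that covers the most of its still-uncovered topics. A simplified sketch with invented node names and topic sets; the real protocol also weighs degree targets and churn:

```python
def choose_neighbors(my_topics, candidates):
    """Greedy set-cover sketch: pick few neighbors whose subscriptions
    jointly cover all of `my_topics`. `candidates` maps node -> topic set."""
    uncovered = set(my_topics)
    chosen = []
    while uncovered:
        # candidate covering the most still-uncovered topics
        best = max(candidates, key=lambda n: len(candidates[n] & uncovered))
        gain = candidates[best] & uncovered
        if not gain:
            break  # remaining topics have no interested candidate
        chosen.append(best)
        uncovered -= gain
    return chosen

candidates = {
    "n1": {"sports", "news"},
    "n2": {"news", "weather"},
    "n3": {"sports", "weather", "finance"},
}
print(choose_neighbors({"sports", "news", "weather"}, candidates))
```

Because each chosen neighbor shares at least one topic with the choosing node, every connection is useful for topic-connected routing, which is the intuition behind keeping the average degree low.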
COPS: An Efficient Content Oriented Publish/Subscribe System
Cited by 17 (5 self)
Content-Centric Networks (CCN) provide substantial flexibility for users to obtain information without regard to the source of the information or its current location. Publish/subscribe (pub/sub) systems have gained popularity because they remove the temporal dependency of the user having to indicate an interest each time he or she wants to receive a particular piece of related information. Currently, on the Internet, such pub/sub systems are built on top of an IP-based network, with the additional responsibility placed on end systems and servers to do the work of getting a piece of information to interested recipients. We propose the Content-Oriented Pub/Sub system (COPS) to achieve an efficient pub/sub capability for CCN. COPS enhances the heretofore inherently pull-based CCN architectures by integrating push-based multicast at the content-centric layer. We emulate an application that is particularly emblematic of a pub/sub environment, Twitter, but one where subscribers are interested in content (e.g., identified by keywords) rather than in tweets from a particular individual. Using trace-driven simulation, we demonstrate that our architecture can achieve a scalable and efficient pub/sub content-centric network. The simulator is parameterized using the results of
Cobra: Content-based filtering and aggregation of blogs and RSS feeds
- In Proc. NSDI ’07
Cited by 16 (1 self)
Blogs and RSS feeds are becoming increasingly popular. The blogging site LiveJournal has over 11 million user accounts, and according to one report, over 1.6 million postings are made to blogs every day. The “Blogosphere” is a new hotbed of Internet-based media that represents a shift from mostly static content to dynamic, continuously updated discussions. The problem is that finding and tracking blogs with interesting content is an extremely cumbersome process. In this paper, we present Cobra (Content-Based RSS Aggregator), a system that crawls, filters, and aggregates vast numbers of RSS feeds, delivering to each user a personalized feed based on his or her interests. Cobra consists of a three-tiered network of crawlers that scan web feeds, filters that match crawled articles to user subscriptions, and reflectors that provide recently matching articles on each subscription as an RSS feed, which can be browsed using a standard RSS reader. We present the design, implementation, and evaluation of Cobra in three settings: a dedicated cluster, the Emulab testbed, and PlanetLab. A detailed performance study of the Cobra system demonstrates that it scales well to support a large number of source feeds and users; that the mean update-detection latency is low (bounded by the crawler rate); and that an offline service-provisioning step combined with several performance optimizations is effective at reducing memory usage and network load.
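The job of Cobra's filter tier (matching crawled articles to user subscriptions) can be sketched as conjunctive keyword matching. A toy illustration with invented subscription data, not Cobra's actual matching engine:

```python
def match_subscriptions(article_text, subscriptions):
    """Content-based filtering sketch: return the subscriptions whose
    keywords all appear in the article (case-insensitive word match)."""
    words = set(article_text.lower().split())
    return [name for name, keywords in subscriptions.items()
            if all(k.lower() in words for k in keywords)]

subs = {"alice": ["planetlab", "overlay"],
        "bob":   ["bitcoin"]}
article = "Deploying an overlay network on PlanetLab nodes"
print(match_subscriptions(article, subs))
```

In the full system this matching runs continuously over the crawler output, and a reflector republishes each user's matches as a personalized RSS feed.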
Quasar: A Probabilistic Publish-Subscribe System for Social Networks
Cited by 9 (0 self)
Existing peer-to-peer publish-subscribe systems rely on structured overlays and rendezvous nodes to store and relay group membership information. While conceptually simple, this design incurs the significant cost of creating and maintaining rigid structures, and introduces hotspots in the system at nodes that are neither publishers nor subscribers. In this paper, we introduce Quasar, a rendezvous-less probabilistic publish-subscribe system that caters to the specific needs of social networks. It is designed to handle social networks of many groups, on the order of the number of users in the system. It creates a routing infrastructure based on the proactive dissemination of highly aggregated routing vectors to provide anycast-like directed walks in the overlay. This primitive, when coupled with a novel mechanism for dynamically negating routes, enables scalable and efficient group multicast that obviates the need for structure and rendezvous nodes. We examine the feasibility of this approach and show in a large-scale simulation that the system is scalable and efficient.
Intelligent personal health record: experience and open issues
- In Proceedings of IHI’10, 2010
Cited by 8 (4 self)
Web-based personal health records (PHRs) are under massive deployment. To improve the capability and usability of PHRs, we previously proposed the concept of the intelligent PHR (iPHR). By introducing and extending expert-system technology and Web search technology into the PHR domain, iPHR can automatically provide users with personalized healthcare information to facilitate their daily activities of living. Our iPHR system currently provides three functions: guided search for disease information, recommendation of home nursing activities, and recommendation of home medical products. This paper discusses our experience with iPHR as well as open issues, including both enhancements to the existing functions and potential new functions. We outline some preliminary solutions, though a main purpose of this paper is to stimulate future research in the area of consumer health informatics.
Supporting Generic Cost Models for Wide-Area Stream Processing
Cited by 6 (0 self)
Existing stream processing systems are optimized for a specific metric, which may limit their applicability to diverse applications and environments. This paper presents XFlow, a generic data stream collection, processing, and dissemination system that addresses this limitation efficiently. XFlow can express and optimize a variety of optimization metrics and constraints by distributing stream processing queries across a wide-area network. It uses metric-independent decentralized algorithms that work on localized, aggregated statistics while avoiding local optima. To facilitate lightweight dynamic changes to the query deployment, XFlow relies on a loosely coupled, flexible architecture consisting of multiple publish-subscribe overlay trees that can gracefully scale and adapt to changes in network and workload conditions. Based on the desired performance goals, the system progressively refines the query deployment, the structure of the overlay trees, and the statistics collection process. We provide an overview of XFlow’s architecture and discuss its decentralized optimization model. We demonstrate its flexibility and effectiveness using real-world streams and experimental results obtained from XFlow’s deployment on PlanetLab. The experiments reveal that XFlow can effectively optimize various performance metrics in the presence of varying network and workload conditions.
Practical High-Throughput Content-Based Routing Using Unicast State and Probabilistic Encodings
, 2009
Cited by 4 (3 self)
We address the problem that existing publish/subscribe messaging systems, including such commonly used ones as Apache’s ActiveMQ and IBM’s WebSphere MQ, exhibit degraded end-to-end throughput in a wide-area network setting. We contend that the cause of this problem is the lack of an appropriate routing protocol. Building on the idea of a content-based network, we introduce a protocol called B-DRP that demonstrably improves the situation. A content-based network is a content-based publish/subscribe system architected as a datagram network: a message is forwarded hop by hop and delivered to any and all hosts that have expressed interest in the message content. This fits well with the character of a wide-area messaging system. B-DRP is based on two main techniques: a message delivery mechanism that exploits unicast forwarding state, which can be easily maintained using standard protocols, and a probabilistic data structure that efficiently represents and evaluates receiver interests. We present the design of B-DRP and the results of an experimental evaluation demonstrating its improved throughput in a wide-area setting. Publish/subscribe messaging is a central feature of modern enterprise computing platforms, which are typically based on high-performance implementations of the Java Message Service (JMS). Examples include IBM’s Web-
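The probabilistic encoding of receiver interests mentioned in this abstract can be illustrated with a standard Bloom filter, which admits false positives (occasional spurious forwarding) but never false negatives (no lost messages). This is a generic sketch, not B-DRP's actual encoding or parameters:

```python
import hashlib

class InterestFilter:
    """Bloom-filter sketch of a receiver's interest set (hypothetical
    helper; the paper's wire representation may differ). A set bit can
    collide, so matches are 'maybe'; misses are definitive."""
    def __init__(self, m=1024, k=3):
        self.m, self.k, self.bits = m, k, 0

    def _positions(self, topic):
        # k independent positions derived from salted SHA-256 digests
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{topic}".encode()).digest()
            yield int.from_bytes(digest[:4], "big") % self.m

    def add(self, topic):
        for pos in self._positions(topic):
            self.bits |= 1 << pos

    def might_match(self, topic):
        return all(self.bits >> pos & 1 for pos in self._positions(topic))

f = InterestFilter()
f.add("stock/IBM")
print(f.might_match("stock/IBM"))   # always True: no false negatives
print(f.might_match("stock/AAPL"))  # false positive possible but unlikely
```

A router holding such a filter per next hop can test a message's content descriptors against each filter and forward only where a match is possible, which is the throughput-saving idea behind compact interest encodings.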