Results 1 - 10 of 17
Volley: Automated Data Placement for Geo-Distributed Cloud Services
"... Abstract: As cloud services grow to span more and more globally distributed datacenters, there is an increasingly urgent need for automated mechanisms to place application data across these datacenters. This placement must deal with business constraints such as WAN bandwidth costs and datacenter cap ..."
Cited by 57 (0 self)
Abstract: As cloud services grow to span more and more globally distributed datacenters, there is an increasingly urgent need for automated mechanisms to place application data across these datacenters. This placement must deal with business constraints such as WAN bandwidth costs and datacenter capacity limits, while also minimizing user-perceived latency. The task of placement is further complicated by the issues of shared data, data inter-dependencies, application changes and user mobility. We document these challenges by analyzing month-long traces from Microsoft’s Live Messenger and Live Mesh, two large-scale commercial cloud services. We present Volley, a system that addresses these challenges. Cloud services make use of Volley by submitting logs of datacenter requests. Volley analyzes the logs using an iterative optimization algorithm based on data access patterns and client locations, and outputs migration recommendations back to the cloud service. To scale to the data volumes of cloud service logs, Volley is designed to work in SCOPE [5], a scalable MapReduce-style platform; this allows Volley to perform over 400 machine-hours worth of computation in less than a day. We evaluate Volley on the month-long Live Mesh trace, and we find that, compared to a state-of-the-art heuristic that places data closest to the primary IP address that accesses it, Volley simultaneously reduces datacenter capacity skew by over 2×, reduces inter-datacenter traffic by over 1.8×, and reduces 75th-percentile user latency by over 30%.
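The iterative step lends itself to a short sketch. Below is a minimal, hypothetical Python rendering of the idea: each data item is repeatedly pulled toward the weighted centroid of its clients and of the items it communicates with, then mapped to the nearest datacenter with spare capacity. All names (place, weighted_centroid, nearest_dc) and the planar coordinates are illustrative assumptions; the paper works with spherical coordinates and additional phases not shown here.

    import math

    def weighted_centroid(points):
        # points: list of ((x, y), weight); a planar stand-in for the
        # spherical coordinates the paper actually uses
        total = sum(w for _, w in points)
        x = sum(p[0] * w for p, w in points) / total
        y = sum(p[1] * w for p, w in points) / total
        return (x, y)

    def nearest_dc(pos, datacenters, capacity):
        # closest datacenter that still has room (capacity counted in items)
        for name in sorted(datacenters, key=lambda n: math.dist(pos, datacenters[n])):
            if capacity[name] > 0:
                capacity[name] -= 1
                return name
        raise RuntimeError("all datacenters are full")

    def place(items, client_reqs, deps, datacenters, capacity, rounds=10):
        # client_reqs: item -> [((x, y), weight)], fixed pulls from clients
        # deps:        item -> [(other_item, weight)], data interdependencies
        pos = {i: weighted_centroid(client_reqs[i]) for i in items}
        for _ in range(rounds):
            for i in items:
                pulls = list(client_reqs[i]) + [(pos[j], w) for j, w in deps[i]]
                pos[i] = weighted_centroid(pulls)
        cap = dict(capacity)
        return {i: nearest_dc(pos[i], datacenters, cap) for i in items}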
MillWheel: fault-tolerant stream processing at Internet scale
- In Proceedings of the 39th International Conference on Very Large Data Bases (VLDB)
, 2013
"... MillWheel is a framework for building low-latency data-processing applications that is widely used at Google. Users specify a directed computation graph and application code for individual nodes, and the system manages persistent state and the continuous flow of records, all within the envelope of t ..."
Cited by 20 (1 self)
MillWheel is a framework for building low-latency data-processing applications that is widely used at Google. Users specify a directed computation graph and application code for individual nodes, and the system manages persistent state and the continuous flow of records, all within the envelope of the framework’s fault-tolerance guarantees. This paper describes MillWheel’s programming model as well as its implementation. The case study of a continuous anomaly detector in use at Google serves to motivate how many of MillWheel’s features are used. MillWheel’s programming model provides a notion of logical time, making it simple to write time-based aggregations. MillWheel was designed from the outset with fault tolerance and scalability in mind. In practice, we find that MillWheel’s unique combination of scalability, fault tolerance, and a versatile programming model lends itself to a wide variety of problems at Google.
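The watermark-driven aggregation the abstract alludes to can be sketched compactly. The toy Python class below (hypothetical names, not the MillWheel API) keeps per-key, per-window counts as its persistent state and emits a window only once the low watermark guarantees no earlier-timestamped records can still arrive.

    import collections

    class WindowedCounter:
        # Counts records per (key, window) and fires a window when the
        # low watermark passes the window's end.
        def __init__(self, window_ms):
            self.window_ms = window_ms
            self.counts = collections.Counter()  # persistent state in MillWheel

        def process(self, key, timestamp_ms):
            window = timestamp_ms // self.window_ms
            self.counts[(key, window)] += 1

        def on_watermark(self, watermark_ms):
            # emit every window that ends at or before the watermark
            ready = [kw for kw in self.counts
                     if (kw[1] + 1) * self.window_ms <= watermark_ms]
            for key, window in sorted(ready, key=lambda kw: kw[1]):
                yield key, window, self.counts.pop((key, window))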
Stout: An Adaptive Interface to Scalable Cloud Storage
"... Many of today’s applications are delivered as scalable, multi-tier services deployed in large data centers. These services frequently leverage shared, scale-out, key-value storage layers that can deliver low latency under light workloads, but may exhibit significant queuing delay and even dropped re ..."
Cited by 16 (0 self)
Many of today’s applications are delivered as scalable, multi-tier services deployed in large data centers. These services frequently leverage shared, scale-out, key-value storage layers that can deliver low latency under light workloads, but may exhibit significant queuing delay and even dropped requests under high load. Stout is a system that helps these applications adapt to variation in storage-layer performance by treating scalable key-value storage as a shared resource requiring congestion control. Under light workloads, applications using Stout send requests to the store immediately, minimizing delay. Under heavy workloads, Stout automatically batches the application’s requests together before sending them to the store, resulting in higher throughput and preventing queuing delay. We show experimentally that Stout’s adaptation algorithm converges to an appropriate batch size for workloads that require the batch size to vary by over two orders of magnitude. Compared to a non-adaptive strategy optimized for throughput, Stout delivers over 34× lower latency under light workloads; compared to a non-adaptive strategy optimized for latency, Stout can scale to over 3× as many requests.
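As a rough illustration of treating the store as a congestion-controlled resource, here is a hedged Python sketch of an adaptive batching controller: it grows the batching interval multiplicatively when latency balloons and shrinks it additively when the store keeps up. Stout’s published control law is different (it adapts from the measured latency gradient), so treat this as the flavor of the idea, not the algorithm; all names are hypothetical.

    class AdaptiveBatcher:
        def __init__(self, min_ms=1.0, max_ms=1000.0):
            self.interval_ms = min_ms          # time between batched flushes
            self.min_ms, self.max_ms = min_ms, max_ms
            self.best_latency_ms = None        # lowest per-request latency seen

        def observe(self, latency_ms):
            if self.best_latency_ms is None or latency_ms < self.best_latency_ms:
                self.best_latency_ms = latency_ms
            if latency_ms > 1.5 * self.best_latency_ms:
                # store looks congested: batch more aggressively
                self.interval_ms = min(self.interval_ms * 2, self.max_ms)
            else:
                # store is keeping up: probe toward smaller batches
                self.interval_ms = max(self.interval_ms - 1.0, self.min_ms)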
Opportunistic multipath forwarding in content-based publish/subscribe overlays
- In Middleware ’12
"... Abstract. Fine-grained filtering capabilities prevalent in content-based Publish/Subscribe (pub/sub)overlays lead toscenarios in which publicationspassthroughbrokerswithnomatchinglocal subscribers. Processing of publications at these pure forwarding brokers amounts to inefficient use of resources an ..."
Cited by 7 (4 self)
Abstract: Fine-grained filtering capabilities prevalent in content-based Publish/Subscribe (pub/sub) overlays lead to scenarios in which publications pass through brokers with no matching local subscribers. Processing of publications at these pure forwarding brokers amounts to inefficient use of resources and should ideally be avoided. This paper develops an approach that largely mitigates this problem by building and adaptively maintaining a highly connected overlay mesh superimposed atop a low-connectivity primary overlay network. While the primary network provides basic end-to-end forwarding routes, the mesh structure provides a rich set of alternative forwarding choices which can be used to bypass pure forwarding brokers. This provides unique opportunities for load balancing and congestion avoidance. Through extensive experimental evaluation on the SciNet cluster and PlanetLab, we compare the performance of our approach with that of conventional pub/sub algorithms as a baseline. Our results indicate that our approach improves publication delivery delay and lowers network traffic while incurring negligible computational and bandwidth overhead. Furthermore, compared to the baseline, we observed significant gains of up to 115% in terms of system throughput.
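A minimal sketch of the forwarding decision this enables, in hypothetical Python: given mesh neighbors that all make progress toward a publication’s subscribers, prefer one that is not a pure forwarding broker and is lightly loaded, falling back to the primary route. The real protocol additionally builds and maintains the mesh and guards against routing loops, none of which is shown.

    def choose_next_hop(primary_next, candidates, load, is_pure_forwarder):
        # candidates: mesh neighbors already known to make progress toward
        # the publication's destinations; load: broker -> current load
        viable = [b for b in candidates if not is_pure_forwarder[b]]
        if viable:
            # bypass pure forwarders; break ties by picking the least loaded
            return min(viable, key=lambda b: load[b])
        return primary_next  # fall back to the primary overlay route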
Adaptive performance-aware distributed memory caching
- USENIX International Conference on Autonomic Computing
, 2013
"... Distributed in-memory caching systems such as mem-cached have become crucial for improving the perfor-mance of web applications. However, memcached by itself does not control which node is responsible for each data object, and inefficient partitioning schemes can easily lead to load imbalances. Furt ..."
Cited by 6 (3 self)
Distributed in-memory caching systems such as memcached have become crucial for improving the performance of web applications. However, memcached by itself does not control which node is responsible for each data object, and inefficient partitioning schemes can easily lead to load imbalances. Further, a statically sized memcached cluster can be insufficient or inefficient when demand rises and falls. In this paper we present an automated cache management system that both intelligently decides how to scale a distributed caching system and uses a new, adaptive partitioning algorithm that ensures that load is evenly distributed despite variations in object size and popularity. We have implemented an adaptive hashing system as a proxy and node control framework for memcached, and evaluate it on EC2 using a set of realistic benchmarks including database dumps and traces from Wikipedia.
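For a feel of adaptive partitioning over memcached, here is a small consistent-hash ring in Python in which a server’s share of keys tracks its number of virtual nodes, so a controller can shed load from a hot node by lowering its weight. This is a generic sketch of the standard technique, not the paper’s specific algorithm; all identifiers are hypothetical.

    import bisect
    import hashlib

    class AdaptiveRing:
        def __init__(self):
            self.ring = []  # sorted list of (hash, server)

        @staticmethod
        def _h(s):
            return int(hashlib.md5(s.encode()).hexdigest(), 16)

        def set_weight(self, server, vnodes):
            # replace the server's virtual nodes; more vnodes = more keys
            self.ring = [(h, s) for h, s in self.ring if s != server]
            for i in range(vnodes):
                self.ring.append((self._h(f"{server}#{i}"), server))
            self.ring.sort()

        def lookup(self, key):
            # first virtual node clockwise from the key's hash owns it
            i = bisect.bisect(self.ring, (self._h(key), ""))
            return self.ring[i % len(self.ring)][1]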
Natjam: Design and evaluation of eviction policies for supporting priorities and deadlines in mapreduce clusters
- In SoCC, ACM
, 2013
"... Abstract This paper presents Natjam, a system that supports arbitrary job priorities, hard real-time scheduling, and efficient preemption for Mapreduce clusters that are resource-constrained. Our contributions include: i) exploration and evaluation of smart eviction policies for jobs and for tasks, ..."
Cited by 5 (1 self)
Abstract: This paper presents Natjam, a system that supports arbitrary job priorities, hard real-time scheduling, and efficient preemption for Mapreduce clusters that are resource-constrained. Our contributions include: i) exploration and evaluation of smart eviction policies for jobs and for tasks, based on resource usage, task runtime, and job deadlines; and ii) a work-conserving task preemption mechanism for Mapreduce. We incorporated Natjam into the Hadoop YARN scheduler framework (in Hadoop 0.23). We present experiments from deployments on a test cluster, on Emulab, and on a Yahoo! Inc. commercial cluster, using both synthetic workloads as well as Hadoop cluster traces from Yahoo!. Our results reveal that Natjam incurs overheads as low as 7%, and is preferable to existing approaches.
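The eviction-policy space is easy to make concrete. The sketch below (hypothetical names, Python) shows two task-level policies in the spirit of those the paper evaluates, selecting a victim by remaining runtime; the paper’s policies also weigh resource usage and job deadlines, which this toy omits.

    def pick_victim_task(tasks, policy="SRT"):
        # tasks: running low-priority tasks, each with a remaining_secs
        # attribute; choose one to suspend for an incoming high-priority job
        if policy == "SRT":
            # shortest remaining time first
            return min(tasks, key=lambda t: t.remaining_secs)
        if policy == "LRT":
            # longest remaining time first
            return max(tasks, key=lambda t: t.remaining_secs)
        raise ValueError(f"unknown policy: {policy}")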
No compromises: Distributed transactions with consistency, availability, and performance
- In Proc. 25th ACM Symposium on Operating Systems Principles (SOSP)
, 2015
"... Abstract Transactions with strong consistency and high availability simplify building and reasoning about distributed systems. However, previous implementations performed poorly. This forced system designers to avoid transactions completely, to weaken consistency guarantees, or to provide single-ma ..."
Cited by 4 (1 self)
Abstract: Transactions with strong consistency and high availability simplify building and reasoning about distributed systems. However, previous implementations performed poorly. This forced system designers to avoid transactions completely, to weaken consistency guarantees, or to provide single-machine transactions that require programmers to partition their data. In this paper, we show that there is no need to compromise in modern data centers. We show that a main memory distributed computing platform called FaRM can provide distributed transactions with strict serializability, high performance, durability, and high availability. FaRM achieves a peak throughput of 140 million TATP transactions per second on 90 machines with a 4.9 TB database, and it recovers from a failure in less than 50 ms. Key to achieving these results was the design of new transaction, replication, and recovery protocols from first principles to leverage commodity networks with RDMA and a new, inexpensive approach to providing non-volatile DRAM.
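The kind of optimistic validation such a platform relies on can be sketched in a few lines. The single-process Python toy below buffers writes, records read versions, and commits only if every read version is unchanged; FaRM’s actual protocol distributes this over RDMA with locking, replication, and recovery, none of which appears here, and the names are illustrative.

    class Txn:
        def __init__(self, store):
            self.store = store               # key -> (version, value)
            self.reads, self.writes = {}, {}

        def read(self, k):
            if k in self.writes:             # read-your-own-writes
                return self.writes[k]
            version, value = self.store[k]
            self.reads[k] = version          # remember the version we saw
            return value

        def write(self, k, v):
            self.writes[k] = v               # buffer until commit

        def commit(self):
            # validate: every key read must still be at the version we saw
            for k, seen in self.reads.items():
                if self.store[k][0] != seen:
                    return False             # conflict: abort
            # install new versions for the write set
            for k, v in self.writes.items():
                old_version = self.store.get(k, (0, None))[0]
                self.store[k] = (old_version + 1, v)
            return True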
Commodifying replicated state machines with OpenReplica, 2012, available at http://openreplica.org/static/papers/OpenReplica.pdf
"... This paper describes OpenReplica, an open service that provides replication and synchronization support for large-scale distributed systems. OpenReplica is designed to commodify Paxos replicated state machines by provid-ing infrastructure for their construction, deployment and maintenance. OpenRepli ..."
Cited by 4 (1 self)
This paper describes OpenReplica, an open service that provides replication and synchronization support for large-scale distributed systems. OpenReplica is designed to commodify Paxos replicated state machines by providing infrastructure for their construction, deployment and maintenance. OpenReplica is based on a novel Paxos replicated state machine implementation that employs an object-oriented approach in which the system actively creates and maintains live replicas for user-provided objects. Clients access these replicated objects transparently as if they are local objects. OpenReplica supports complex distributed synchronization constructs through a multi-return mechanism that enables the replicated objects to control the execution flow of their clients, in essence providing blocking and non-blocking method invocations that can be used to implement richer synchronization constructs. Further, it supports elasticity requirements of cloud deployments by enabling any number of servers to be replaced dynamically. A rack-aware placement manager places replicas on nodes that are unlikely to fail together. Experiments with the system show that the latencies associated with replication are comparable to ZooKeeper, and that the system scales well.
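The multi-return mechanism is the most unusual piece, and a toy Python version conveys it. In the sketch below (hypothetical names, not OpenReplica’s API), a replicated lock object returns a BLOCK sentinel to withhold a caller’s reply, which blocks that client until a later release() names it in an unblock reply.

    BLOCK = object()  # sentinel: "hold this client's reply for later"

    class ReplicatedLock:
        # Server-side object the RSM replicates; methods run identically
        # on every replica, so this state stays consistent.
        def __init__(self):
            self.holder, self.waiters = None, []

        def acquire(self, client):
            if self.holder is None:
                self.holder = client
                return "acquired"
            self.waiters.append(client)
            return BLOCK  # client's stub blocks: no reply is sent yet

        def release(self, client):
            assert client == self.holder
            if self.waiters:
                self.holder = self.waiters.pop(0)
                # deliver a deferred reply to the next waiter, unblocking it
                return ("unblock", self.holder, "acquired")
            self.holder = None
            return "released"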
An Algorithm for implementing BFT Registers in Distributed Systems with Bounded Churn
- In Proceedings of the 13th International Symposium on Stabilization, Safety, and Security of Distributed Systems (SSS)
, 2011
"... Distributed storage service is one of the main abstractions provided to the developers of distributed applications due to its capability to hide the complexity generated by the messages exchanged between processes. Many protocols have been proposed to build byzantine-faulttolerant storage services o ..."
Cited by 4 (3 self)
A distributed storage service is one of the main abstractions provided to the developers of distributed applications, due to its capability to hide the complexity generated by the messages exchanged between processes. Many protocols have been proposed to build Byzantine-fault-tolerant storage services on top of a message-passing system, but they do not consider the possibility of servers joining and leaving the computation (the churn phenomenon). This phenomenon, if not properly mastered, can either block protocols or violate the safety of the storage. In this paper, we address the problem of building a safe register storage resilient to Byzantine failures in a distributed system affected by churn. A protocol implementing a safe register in an eventually synchronous system is proposed, and some feasibility constraints on the arrival and departure of processes are given. The protocol is proved correct under the assumption that the constraint on the churn is satisfied.
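The Byzantine-quorum reasoning behind such a register can be shown in miniature. This static-membership Python sketch accepts a read result only when f+1 servers report the same (timestamp, value) pair, so at least one correct server vouches for it; the paper’s actual contribution, keeping such quorums sound while servers join and leave, is not captured here, and the names are hypothetical.

    from collections import Counter

    def read_value(replies, f):
        # replies: [(timestamp, value), ...], one per responding server;
        # f: maximum number of Byzantine servers tolerated
        if not replies:
            return None
        pair, count = Counter(replies).most_common(1)[0]
        if count >= f + 1:
            return pair   # at least one honest server reported this pair
        return None       # not enough matching replies yet; keep waiting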
Semantics of Caching with SPOCA: A Stateless, Proportional, Optimally-Consistent Addressing Algorithm
"... A key measure for the success of a Content Delivery Network is controlling cost of the infrastructure required to serve content to its end users. In this paper, we take a closer look at how Yahoo! efficiently serves millions of videos from its video library. A significant portion of this video libra ..."
Cited by 2 (0 self)
A key measure of the success of a Content Delivery Network is controlling the cost of the infrastructure required to serve content to its end users. In this paper, we take a closer look at how Yahoo! efficiently serves millions of videos from its video library. A significant portion of this video library consists of a large number of relatively unpopular user-generated content and a small set of popular videos that changes over time. Yahoo!’s initial architecture to handle the distribution of videos to Internet clients used shared storage to hold the videos and a hardware load balancer to handle failures and balance the load across the front-end servers that did the actual transfers to the clients. The front-end servers used both their memory and hard drives as caches for the content they served. We found that this simple architecture did not use the front-end server caches effectively. We were able to improve our front-end caching while still being able to tolerate faults, gracefully handle the addition and removal of servers, and take advantage of geographic locality when serving content. We describe our solution, called SPOCA (Stateless, Proportional, Optimally-Consistent Addressing), which reduces disk cache misses from 5% to less than 1% and increases memory cache hits from 45% to 80%, thereby raising overall cache hits from 95% to 99.6%. Unlike other consistent addressing mechanisms, SPOCA facilitates nearly-optimal load balancing.
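The addressing scheme itself is compact enough to sketch. Following the paper’s description, the hypothetical Python below hashes a content name into a space larger than the sum of the per-server segments (segments sized in proportion to capacity) and rehashes deterministically until the point lands inside one, so lookups need no per-name state and a given name always resolves to the same server. The segment layout, names, and bound are illustrative assumptions.

    import hashlib

    def spoca_lookup(name, segments, space=2**32, max_hops=64):
        # segments: server -> (start, end) half-open ranges in the hash
        # space; ranges are disjoint and cover only part of the space
        h = int(hashlib.md5(name.encode()).hexdigest(), 16) % space
        for _ in range(max_hops):
            for server, (start, end) in segments.items():
                if start <= h < end:
                    return server
            # point fell in unassigned space: rehash deterministically
            h = int(hashlib.md5(str(h).encode()).hexdigest(), 16) % space
        raise RuntimeError("no segment hit; segments cover too little space")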