Results 1 - 10 of 57
Towards Elastic Transactional Cloud Storage with Range Query Support
"... Cloud storage is an emerging infrastructure that offers Platforms as a Service (PaaS). On such platforms, storage and compute power are adjusted dynamically, and therefore it is important to build a highly scalable and reliable storage that can elastically scale ondemand with minimal startup cost. I ..."
Abstract
-
Cited by 19 (5 self)
Cloud storage is an emerging infrastructure that offers Platform as a Service (PaaS). On such platforms, storage and compute power are adjusted dynamically, so it is important to build a highly scalable and reliable storage system that can elastically scale on demand with minimal startup cost. In this paper, we propose ecStore, an elastic cloud storage system that supports automated data partitioning and replication, load balancing, efficient range queries, and transactional access. In ecStore, data objects are distributed and replicated in a cluster of commodity compute nodes located in the cloud. Users access data via transactions, which bundle read and write operations on multiple data items stored on possibly different cluster nodes. The architecture of ecStore follows a stratum design that leverages an underlying distributed index, with a replication layer in the middle and a transaction management layer on top. ecStore provides adaptive read consistency on replicated data. We also enhance the system with an effective load-balancing scheme that uses a self-tuning replication technique specially designed for large-scale data. Furthermore, a multi-version optimistic concurrency control scheme matches well with the characteristics of data in cloud storage. To validate the performance of the system, we have conducted extensive experiments on various platforms, including a commercial cloud (Amazon’s EC2), an in-house cluster, and PlanetLab.
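The multi-version optimistic concurrency control scheme the abstract mentions pairs versioned records with commit-time validation. The paper's actual protocol is not reproduced here; the sketch below is a generic rendering of that technique, with every class and method name invented for illustration.

```python
import threading

class MVOCCStore:
    """Minimal multi-version store with optimistic commit-time validation.
    Illustrative sketch only; not ecStore's actual protocol."""

    def __init__(self):
        self.versions = {}   # key -> list of (commit_ts, value), oldest first
        self.commit_ts = 0   # monotonically increasing commit timestamp
        self.lock = threading.Lock()

    def latest(self, key):
        """Return (commit_ts, value) of the newest committed version."""
        hist = self.versions.get(key, [])
        return hist[-1] if hist else (0, None)

    def commit(self, read_set, write_set):
        """read_set: {key: ts the transaction observed};
        write_set: {key: new value}. Returns True on commit, False on abort."""
        with self.lock:
            # Optimistic validation: abort if any key we read has since
            # gained a newer committed version.
            for key, ts in read_set.items():
                if self.latest(key)[0] != ts:
                    return False
            # Validation passed: install all writes under one commit timestamp.
            self.commit_ts += 1
            for key, value in write_set.items():
                self.versions.setdefault(key, []).append((self.commit_ts, value))
            return True
```

A transaction would read via `latest()`, remember the timestamps it saw, buffer its writes locally, and hand both sets to `commit()`; retained old versions also give read-only transactions a consistent snapshot.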
In-situ MapReduce for Log Processing
"... Log analytics are a bedrock component of running many of today’s Internet sites. Application and click logs form the basis for tracking and analyzing customer behaviors and preferences, and they form the basic inputs to ad-targeting algorithms. Logs are also critical for performance and security mon ..."
Abstract
-
Cited by 18 (0 self)
Log analytics are a bedrock component of running many of today’s Internet sites. Application and click logs form the basis for tracking and analyzing customer behaviors and preferences, and they are the basic inputs to ad-targeting algorithms. Logs are also critical for performance and security monitoring, debugging, and optimizing the large compute infrastructures that make up the compute “cloud”: thousands of machines spanning multiple data centers. With current log generation rates on the order of 1–10 MB/s per machine, a single data center can create tens of TBs of log data a day. While bulk data processing has proven to be an essential tool for log processing, current practice transfers all logs to a centralized compute cluster. This not only consumes large amounts of network and disk bandwidth, but also delays the completion of time-sensitive analytics. We present an in-situ MapReduce architecture that mines data “on location”, bypassing the cost and wait time of this store-first-query-later approach. Unlike current approaches, our architecture explicitly supports reduced data fidelity, allowing users to annotate queries with latency and fidelity requirements. This approach fills an important gap in current bulk processing systems, allowing users to trade potential decreases in data fidelity for improved response times or reduced load on end systems. We report on the design and implementation of our in-situ MapReduce architecture, and illustrate how it improves our ability to accommodate increasing log generation rates.
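To make the latency/fidelity annotations concrete, here is a hedged sketch of a query that stops collecting map outputs at a deadline and reports the fraction of log-hosting nodes it reached. The function and its parameters are assumptions for illustration, not the system's API.

```python
import time

def insitu_query(nodes, mapper, deadline_s, min_fidelity):
    """Run `mapper` on each log-hosting node until `deadline_s` elapses,
    then return the partial results plus the achieved fidelity (fraction
    of nodes that answered). Sequential stand-in for a distributed job."""
    start, results, answered = time.time(), [], 0
    for node in nodes:
        if time.time() - start > deadline_s:
            break  # deadline hit: settle for reduced data fidelity
        results.extend(mapper(node))
        answered += 1
    fidelity = answered / len(nodes)
    if fidelity < min_fidelity:
        raise TimeoutError(f"only reached fidelity {fidelity:.2f}")
    return results, fidelity
```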
Consistency-Based Service Level Agreements for Cloud Storage
"... Choosing a cloud storage system and specific operations for reading and writing data requires developers to make decisions that trade off consistency for availability and performance. Applications may be locked into a choice that is not ideal for all clients and changing conditions. Pileus is a repl ..."
Abstract
-
Cited by 15 (1 self)
Choosing a cloud storage system and specific operations for reading and writing data requires developers to make decisions that trade off consistency for availability and performance. Applications may be locked into a choice that is not ideal for all clients and changing conditions. Pileus is a replicated key-value store that allows applications to declare their consistency and latency priorities via consistency-based service level agreements (SLAs). It dynamically selects which servers to access in order to deliver the best service given the current configuration and system conditions. In application-specific SLAs, developers can request both strong and eventual consistency as well as intermediate guarantees such as read-my-writes. Evaluations running on a worldwide test bed with geo-replicated data show that the system adapts to varying client-server latencies to provide service that matches or exceeds the best static consistency choice and server selection scheme.
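Pileus SLAs rank alternative consistency/latency targets by their value to the application. The sketch below shows the general shape of such a declaration and a matching server-selection loop; the field names and the `choose_server` helper are illustrative assumptions, not Pileus's actual interface.

```python
from dataclasses import dataclass

@dataclass
class SubSLA:
    consistency: str   # e.g. "strong", "read_my_writes", "eventual"
    latency_ms: float  # acceptable response time for a read
    utility: float     # value to the application if this subSLA is met

# Hypothetical SLA: prefer strong consistency, but degrade gracefully
# to weaker guarantees rather than block on a distant primary.
sla = [
    SubSLA("strong", 150, 1.0),
    SubSLA("read_my_writes", 100, 0.7),
    SubSLA("eventual", 50, 0.3),
]

def choose_server(sla, servers):
    """Return the (server, subSLA) pair with the highest attainable utility,
    given each server's estimated RTT and the guarantees it can serve."""
    for sub in sorted(sla, key=lambda s: -s.utility):
        for srv in servers:
            if sub.consistency in srv["guarantees"] and srv["rtt_ms"] <= sub.latency_ms:
                return srv, sub
    return None  # no subSLA is attainable under current conditions
```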
Declarative Automated Cloud Resource Orchestration
"... As cloud computing becomes widely deployed, one of the challenges faced involves the ability to orchestrate a highly complex set of subsystems (compute, storage, network resources) that span large geographic areas serving diverse clients. To ease this process, we present COPE (Cloud Orchestration Po ..."
Abstract
-
Cited by 13 (8 self)
As cloud computing becomes widely deployed, one key challenge is orchestrating a highly complex set of subsystems (compute, storage, and network resources) that span large geographic areas and serve diverse clients. To ease this process, we present COPE (Cloud Orchestration Policy Engine), a distributed platform that allows cloud providers to perform declarative automated cloud resource orchestration. In COPE, cloud providers specify system-wide constraints and goals using COPElog, a declarative policy language geared towards specifying distributed constraint optimizations. COPE takes policy specifications and cloud system states as input and then optimizes compute, storage, and network resource allocations within the cloud so that provider operational objectives and customer SLAs can be better met. We describe our proposed integration with a cloud orchestration platform, and present initial evaluation results that demonstrate the viability of COPE using production traces from a large hosting company in the US. We further discuss an orchestration scenario that involves geographically distributed data centers, and conclude with the ongoing status of our work.
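COPElog policies compile down to distributed constraint optimizations. As a stand-in (it is neither COPE's solver nor its declarative syntax), the toy below does the kind of optimization involved: assign VMs to data centers at minimum cost subject to capacity constraints, by exhaustive search at toy scale.

```python
from itertools import product

def orchestrate(vms, datacenters, capacity, cost):
    """Exhaustively search VM -> datacenter assignments that respect each
    datacenter's capacity and minimize total placement cost.
    `capacity`: dc -> max VMs; `cost`: (vm, dc) -> placement cost."""
    best, best_cost = None, float("inf")
    for assign in product(datacenters, repeat=len(vms)):
        load = {dc: assign.count(dc) for dc in datacenters}
        if any(load[dc] > capacity[dc] for dc in datacenters):
            continue  # violates a provider-side constraint
        total = sum(cost[(vm, dc)] for vm, dc in zip(vms, assign))
        if total < best_cost:
            best, best_cost = dict(zip(vms, assign)), total
    return best, best_cost
```

A real engine would hand this to a constraint solver rather than enumerate; the point is only that declarative goals and constraints reduce to a search over allocations.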
Surviving failures in bandwidth-constrained datacenters
- In Proceedings of the ACM SIGCOMM 2012 conference on Applications, technologies, architectures, and protocols for computer communication, SIGCOMM ’12
, 2012
"... Abstract-Datacenter networks have been designed to tolerate failures of network equipment and provide sufficient bandwidth. In practice, however, failures and maintenance of networking and power equipment often make tens to thousands of servers unavailable, and network congestion can increase servi ..."
Abstract
-
Cited by 12 (1 self)
Datacenter networks have been designed to tolerate failures of network equipment and provide sufficient bandwidth. In practice, however, failures and maintenance of networking and power equipment often make tens to thousands of servers unavailable, and network congestion can increase service latency. Unfortunately, there is an inherent tradeoff between achieving high fault tolerance and reducing bandwidth usage in the network core: spreading servers across fault domains improves fault tolerance but requires additional bandwidth, while deploying servers together reduces bandwidth usage but also decreases fault tolerance. We present a detailed analysis of a large-scale Web application and its communication patterns. Based on this analysis, we propose and evaluate a novel optimization framework that achieves high fault tolerance while significantly reducing bandwidth usage in the network core by exploiting the skewness in the observed communication patterns.
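The tradeoff can be made concrete by scoring a candidate placement on both axes at once. The formulation below is an assumption for illustration, not the paper's framework: worst-case service loss when a single fault domain fails, versus the traffic a placement pushes across the network core.

```python
def placement_score(placement, traffic):
    """placement: server -> fault domain; traffic: (s1, s2) -> bytes/s.
    Returns (worst_case_loss, core_bandwidth). Spreading servers across
    domains lowers the first term but raises the second: the tradeoff."""
    domains = set(placement.values())
    n = len(placement)
    # Fraction of servers lost if the most-loaded fault domain fails.
    worst_case_loss = max(
        sum(1 for s in placement if placement[s] == d) for d in domains
    ) / n
    # Traffic between servers in different domains crosses the core.
    core_bandwidth = sum(
        b for (s1, s2), b in traffic.items() if placement[s1] != placement[s2]
    )
    return worst_case_loss, core_bandwidth
```

An optimizer would then search for placements that keep both terms low, which the paper shows is possible because real communication patterns are heavily skewed.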
TailGate: Handling Long-Tail Content with a Little Help from Friends
- In Proceedings of WWW’12
, 2012
"... Distributing long-tail content is an inherently difficult task due to the low amortization of bandwidth transfer costs as such content has limited number of views. Two recent trends are making this problem harder. First, the increasing popularity of user-generated content (UGC) and online social net ..."
Abstract
-
Cited by 12 (1 self)
Distributing long-tail content is an inherently difficult task: bandwidth transfer costs amortize poorly because such content receives a limited number of views. Two recent trends make this problem harder. First, the increasing popularity of user-generated content (UGC) and online social networks (OSNs) creates and reinforces such popularity distributions. Second, the practice of geo-replicating content across multiple PoPs spread around the world, done to improve quality of experience (QoE) for users and for redundancy, can lead to unnecessary bandwidth costs. We build TailGate, a system that exploits social relationships, regularities in read access patterns, and time-zone differences to efficiently and selectively distribute long-tail content across PoPs. We evaluate TailGate using large traces from an OSN and show that it can decrease WAN bandwidth costs by as much as 80% while also reducing latency, improving QoE. We deploy TailGate on PlanetLab and show that even when only imprecise social information is available, it can still decrease the latency of accessing long-tail YouTube videos by a factor of 2.
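A minimal sketch of the time-zone lever TailGate pulls: push freshly uploaded content to each PoP during that PoP's local off-peak window, ahead of its evening viewing peak, so transfers ride cheap bandwidth. The fixed peak and lead parameters are assumptions for illustration.

```python
def schedule_pushes(pop_offsets, peak_local_hour=20, lead_hours=6):
    """pop_offsets: PoP name -> UTC offset in hours. Returns, per PoP, the
    UTC hour at which to push new content: `lead_hours` before the PoP's
    local evening peak, i.e. inside its off-peak trough."""
    schedule = {}
    for pop, offset in pop_offsets.items():
        peak_utc = (peak_local_hour - offset) % 24
        schedule[pop] = (peak_utc - lead_hours) % 24
    return schedule

# e.g. schedule_pushes({"london": 0, "tokyo": 9, "new_york": -5})
# -> pushes land before each site's local 8 pm viewing peak
```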
A Hierarchical Model to Evaluate Quality of Experience of Online Services hosted by Cloud Computing
"... Abstract—As online service providers utilize cloud computing to host their services, they are challenged by evaluating the quality of experience and designing redirection strategies in this complicated environment. We propose a hierarchical modeling approach that can easily combine all components of ..."
Abstract
-
Cited by 10 (1 self)
As online service providers utilize cloud computing to host their services, they face the challenge of evaluating quality of experience and designing redirection strategies in this complex environment. We propose a hierarchical modeling approach that can easily combine all components of this environment. Identifying interactions among the components is the key to constructing such models. In this particular environment, we first construct four sub-models: an outbound bandwidth model, a cloud computing availability model, a latency model, and a cloud computing response time model. Then we use a redirection strategy graph to glue them together. We also introduce an all-in-one barometer to ease the evaluation. The numerical results show that our model serves as a very useful analytical tool for online service providers to evaluate cloud computing providers and design redirection strategies.
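As a stand-in for the paper's redirection strategy graph, the sketch below folds the four sub-model outputs into a single all-in-one barometer as a weighted average. The composition and weights here are purely illustrative; the paper's actual glue is the strategy graph itself.

```python
def barometer(sub_scores, weights):
    """Combine sub-model outputs (each normalized to [0, 1]) into one score.
    `weights` expresses the provider's priorities among the sub-models."""
    assert set(sub_scores) == set(weights), "one weight per sub-model"
    total_weight = sum(weights.values())
    return sum(sub_scores[k] * weights[k] for k in sub_scores) / total_weight

# Hypothetical evaluation of one cloud provider's four sub-model scores:
score = barometer(
    {"bandwidth": 0.90, "availability": 0.999, "latency": 0.70, "response": 0.80},
    {"bandwidth": 1, "availability": 3, "latency": 2, "response": 2},
)
```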
Content and Geographical Locality in User-Generated Content Sharing Systems
- In Proceedings of ACM NOSSDAV
, 2012
"... User Generated Content (UGC), such as YouTube videos, accounts for a substantial fraction of the Internet traffic. To optimize their performance, UGC services usually rely on both proactive and reactive approaches that exploit spa-tial and temporal locality in access patterns. Alternative types of l ..."
Abstract
-
Cited by 6 (0 self)
User Generated Content (UGC), such as YouTube videos, accounts for a substantial fraction of Internet traffic. To optimize their performance, UGC services usually rely on both proactive and reactive approaches that exploit spatial and temporal locality in access patterns. Alternative types of locality are also relevant, yet hardly ever considered together. In this paper, we show on a large (more than 650,000 videos) YouTube dataset that content locality (induced by the related-videos feature) and geographic locality are in fact correlated. More specifically, we show how the geographic view distribution of a video can be inferred to a large extent from that of its related videos. We leverage these findings to propose a UGC storage system that proactively places videos close to the expected requests. Compared to a caching-based solution, our system decreases the number of requests served from a different country than that of the requesting user by 16%, and even in this case, the distance between the user and the server is 29% shorter on average.
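The inference can be sketched directly: estimate a video's per-country view distribution as the normalized sum of its related videos' distributions. This is an illustrative estimator built from the correlation the paper reports, not necessarily the authors' exact method.

```python
from collections import Counter

def infer_geo_distribution(related_geo):
    """related_geo: list of {country: view share} dicts, one per related
    video. Returns the normalized average distribution as the estimate
    for the target video."""
    total = Counter()
    for dist in related_geo:
        total.update(dist)  # Counter sums the per-country shares
    norm = sum(total.values())
    return {country: share / norm for country, share in total.items()}

# e.g. infer_geo_distribution([{"FR": 0.6, "BE": 0.4}, {"FR": 0.5, "CH": 0.5}])
# -> {'FR': 0.55, 'BE': 0.2, 'CH': 0.25}
```

A placement system can then store the video in (or near) the countries that dominate the inferred distribution, before the first requests arrive.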
A global name service for a highly mobile internetwork
- In Proceedings of ACM SIGCOMM
, 2014
"... Mobile devices dominate the Internet today, however the Internet rooted in its tethered origins continues to provide poor infrastructure support for mobility. Our position is that in order to address this problem, a key challenge that must be addressed is the design of a massively scalable global na ..."
Abstract
-
Cited by 6 (1 self)
Mobile devices dominate the Internet today; however, the Internet, rooted in its tethered origins, continues to provide poor infrastructure support for mobility. Our position is that, in order to address this problem, a key challenge that must be addressed is the design of a massively scalable global name service that rapidly resolves identities to network locations under high mobility. Our primary contribution is the design, implementation, and evaluation of Auspice, a next-generation global name service that addresses this challenge. A key insight underlying Auspice is a demand-aware replica placement engine that intelligently replicates name records to provide low lookup latency, low update cost, and high availability. We have implemented a prototype of Auspice, compared it against several commercial managed DNS providers as well as state-of-the-art research alternatives, and shown that Auspice significantly outperforms both. We demonstrate a proof of concept showing that Auspice can serve as a complete end-to-end mobility solution and can enable novel context-based communication primitives that generalize name- or address-based communication in today’s Internet.
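A rough rendering of demand-aware placement, under two assumptions read off the abstract: a record's replica count should grow with its lookup-to-update ratio, and replicas should sit where its lookup demand originates. The parameters and helper name are invented for illustration.

```python
def place_replicas(lookups_by_site, updates, k_min=3, ratio_weight=0.5):
    """lookups_by_site: site -> lookup rate for one name record;
    updates: the record's update rate. Returns the chosen replica sites."""
    total_lookups = sum(lookups_by_site.values())
    # More replicas for read-heavy records; updates make replicas costly.
    k = max(k_min, int(ratio_weight * total_lookups / max(updates, 1)))
    ranked = sorted(lookups_by_site, key=lookups_by_site.get, reverse=True)
    return ranked[:k]

# e.g. place_replicas({"us-east": 120, "eu-west": 80, "ap-south": 10}, updates=40)
# -> ["us-east", "eu-west", "ap-south"]
```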
Cost optimization for online social networks on geo-distributed clouds
- In ICNP
, 2012
"... clouds provide an intriguing platform to deploy Online Social Network (OSN) services. To leverage the potential of clouds, a major task of OSN providers is optimizing the monetary cost spent on cloud resource utilization while providing satisfactory Quality of Service (QoS) to OSN users. We thus stu ..."
Abstract
-
Cited by 5 (2 self)
Geo-distributed clouds provide an intriguing platform on which to deploy Online Social Network (OSN) services. To leverage the potential of clouds, a major task of OSN providers is optimizing the monetary cost spent on cloud resource utilization while providing satisfactory Quality of Service (QoS) to OSN users. We thus study the problem of cost optimization for a dynamic OSN on multiple geo-distributed clouds over consecutive time periods, with its QoS meeting a pre-defined requirement. We model the QoS as well as the cost of an OSN, formulate the problem, and design a solution named cosplay. Our experiments with a large-scale Twitter trace show that, while always ensuring the required QoS, cosplay achieves superior one-time cost reduction compared with the state of the art, and also reduces the cumulative cost significantly when continuously evaluated over 48 months with dynamics comparable to real-world OSNs.
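The per-period decision can be sketched as a constrained minimization: in each period, pick the cheapest placement whose predicted QoS meets the floor. This omits the migration costs and social-graph structure of the real formulation and is only an assumed rendering of the problem statement.

```python
def plan_periods(periods, placements, cost, qos, qos_floor):
    """periods: iterable of period ids; placements: candidate placements;
    cost(p, t) and qos(p, t): predicted cost and QoS of placement p in
    period t. Returns the chosen placement per period and the total cost."""
    plan, total = [], 0.0
    for t in periods:
        feasible = [p for p in placements if qos(p, t) >= qos_floor]
        if not feasible:
            raise ValueError(f"no placement meets the QoS floor in period {t}")
        best = min(feasible, key=lambda p: cost(p, t))
        plan.append(best)
        total += cost(best, t)
    return plan, total
```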