Abbadi, “G-Store: A Scalable Data Store for Transactional Multi key Access in the Cloud,” in SOCC, 2010
Cited by 15 (7 self)
Cloud computing has emerged as a preferred platform for deploying scalable web-applications. With the growing scale of these applications and the data associated with them, scalable data management systems form a crucial part of the cloud infrastructure. Key-Value stores – such as Bigtable, PNUTS, Dynamo, and their open source analogues – have been the preferred data stores for applications in the cloud. In these systems, data is represented as Key-Value pairs, and atomic access is provided only at the granularity of single keys. While these properties work well for current applications, they are insufficient for the next generation web applications – such as online gaming, social networks, collaborative editing, and many more – which emphasize collaboration. Since collaboration by definition requires consistent access to groups of keys, scalable and consistent multi key access is critical for such applications. We propose the Key Group abstraction that defines a relationship between a group of keys and is the granule for on-demand transactional access. This abstraction allows the Key Grouping protocol to collocate control for the keys in the group to allow efficient access to the group of keys. Using the Key Grouping protocol, we design and implement G-Store which uses a key-value store as an underlying substrate to provide efficient, scalable, and transactional multi key access. Our implementation using a standard key-value store and experiments using a cluster of commodity machines show that G-Store preserves the desired properties of key-value stores, while providing multi key access functionality at a very low overhead.
Megastore: Providing Scalable, Highly Available Storage for Interactive Services
- Conference on Innovative Data Systems Research (CIDR), 2011
Cited by 15 (0 self)
Megastore is a storage system developed to meet the requirements of today’s interactive online services. Megastore blends the scalability of a NoSQL datastore with the convenience of a traditional RDBMS in a novel way, and provides both strong consistency guarantees and high availability. We provide fully serializable ACID semantics within fine-grained partitions of data. This partitioning allows us to synchronously replicate each write across a wide area network with reasonable latency and support seamless failover between datacenters. This paper describes Megastore’s semantics and replication algorithm. It also describes our experience supporting a wide range of Google production services built with Megastore.
ElasTraS: An Elastic, Scalable, and Self Managing Transactional Database for the Cloud
Cited by 12 (10 self)
Cloud computing has emerged as a pervasive platform for deploying scalable and highly available Internet applications. To facilitate the migration of data-driven applications to the cloud: elasticity, scalability, fault-tolerance, and self-manageability (henceforth referred to as cloud features) are fundamental requirements for database management systems (DBMS) driving such applications. Even though extremely successful in the traditional enterprise setting – the high cost of commercial relational database software, and the lack of the desired cloud features in the open source counterparts – relational databases (RDBMS) are not a competitive choice for cloud-bound applications. As a result, Key-Value stores have emerged as a preferred choice for scalable and fault-tolerant data management, but lack the rich functionality, and transactional guarantees of RDBMS. We present ElasTraS, an Elastic TranSactional relational database, designed to scale out using a cluster of commodity machines while being fault-tolerant and self managing. ElasTraS is designed to support both classes of database needs for the cloud: (i) large databases partitioned across a set of nodes, and (ii) a large number of small and independent databases common in multi-tenant databases. ElasTraS borrows from the design philosophy of scalable Key-Value stores to minimize distributed synchronization and remove scalability bottlenecks, while leveraging decades of research on transaction processing, concurrency control, and recovery to support rich functionality and transactional guarantees. We present the design of ElasTraS, implementation details of our initial prototype system, and experimental results executing the TPC-C benchmark.
Zephyr: Live Migration in Shared Nothing Databases for Elastic Cloud Platforms
Cited by 5 (2 self)
Multitenant data infrastructures for large cloud platforms hosting hundreds of thousands of applications face the challenge of serving applications characterized by small data footprint and unpredictable load patterns. When such a platform is built on an elastic pay-per-use infrastructure, an added challenge is to minimize the system’s operating cost while guaranteeing the tenants’ service level agreements (SLA). Elastic load balancing is therefore an important feature to enable scale-up during high load while scaling down when the load is low. Live migration, a technique to migrate tenants with minimal service interruption and no downtime, is critical to allow lightweight elastic scaling. We focus on the problem of live migration in the database layer. We propose Zephyr, a technique to efficiently migrate a live database in a shared nothing transactional database architecture. Zephyr uses phases of on-demand pull and asynchronous push of data, requires minimal synchronization, results in no service unavailability and few or no aborted transactions, minimizes the data transfer overhead, provides ACID guarantees during migration, and ensures correctness in the presence of failures. We outline a prototype implementation using an open source relational database engine and present a thorough evaluation using various transactional workloads. Zephyr’s efficiency is evident from the few tens of failed operations, 10-20% change in average transaction latency, minimal messaging, and no overhead during normal operation when migrating a live database.
Live Database Migration for Elasticity in a Multitenant Database for Cloud Platforms
Cited by 4 (4 self)
The growing popularity of cloud computing as a platform for deploying internet scale applications has seen a large number of web applications being deployed in the cloud. These applications (or tenants) are typically characterized by small data footprints, different schemas, and variable load patterns. Scalable multitenant database management systems (DBMS) running on a cluster of commodity servers are thus critical for a cloud service provider to support a large number of small applications. Multitenant DBMSs often collocate multiple tenants’ databases on a single server for effective resource sharing. Due to the variability in load, elastic load balancing of tenants’ data is critical for performance and cost minimization. On demand migration of tenants’ databases to distribute load on an elastic cluster of machines is a critical technology for elastic load balancing. Therefore, efficient live database
Albatross: Lightweight Elasticity in Shared Storage Databases for the Cloud using Live Data Migration
Cited by 4 (1 self)
Database systems serving cloud platforms must serve large numbers of applications (or tenants). In addition to managing tenants with small data footprints, different schemas, and variable load patterns, such multitenant data platforms must minimize their operating costs by efficient resource sharing. When deployed over a pay-per-use infrastructure, elastic scaling and load balancing, enabled by low cost live migration of tenant databases, is critical to tolerate load variations while minimizing operating cost. However, existing databases – relational databases and Key-Value stores alike – lack low cost live migration techniques, thus resulting in heavy performance impact during elastic scaling. We present Albatross, a technique for live migration in a multitenant database serving OLTP style workloads where the persistent database image is stored in a network attached storage. Albatross migrates the database cache and the state of active transactions to ensure minimal impact on transaction execution while allowing transactions active during migration to continue execution. It also guarantees serializability while ensuring correctness during failures. Our evaluation using two OLTP benchmarks shows that Albatross can migrate a live tenant database with no aborted transactions, negligible impact on transaction latency and throughput both during and after migration, and an unavailability window as low as 300 ms.
Abbadi. Who’s Driving this Cloud? Towards Efficient Migration for Elastic and Autonomic Multitenant Databases, 2010
Cited by 2 (1 self)
The success of cloud computing as a platform for deploying web applications has led to a deluge of applications characterized by small data footprints but unpredictable access patterns. An autonomic and scalable multitenant database management system (DBMS) is therefore an important component of the software stack for platforms supporting these applications. Elastic load balancing is a key requirement for effective resource utilization and operational cost minimization. Efficient techniques for database migration are thus essential for elasticity in a multitenant DBMS. Our vision is a DBMS where multitenancy is viewed as virtualization in the database layer, and migration is a first class notion with the same stature as scalability, availability, etc. This paper serves as the first step in this direction. We analyze the various models of database multitenancy, formalize the forms of migration, evaluate the off-the-shelf migration techniques, and identify the design space and research goals for an autonomic and elastic multitenant database.
Intelligent management of virtualized resources for database systems in cloud environment
- In ICDE, 2011
Cited by 2 (1 self)
In a cloud computing environment, resources are shared among different clients. Intelligently managing and allocating resources among various clients is important for system providers, whose business model relies on managing the infrastructure resources in a cost-effective manner while satisfying the client service level agreements (SLAs). In this paper, we address the issue of how to intelligently manage the resources in a shared cloud database system and present SmartSLA, a cost-aware resource management system. SmartSLA consists of two main components: the system modeling module and the resource allocation decision module. The system modeling module uses machine learning techniques to learn a model that describes the potential profit margins for each client under different resource allocations. Based on the learned model, the resource allocation decision module dynamically adjusts the resource allocations in order to achieve the optimum profits. We evaluate SmartSLA by using the TPC-W benchmark with workload characteristics derived from real-life systems. The performance results indicate that SmartSLA can successfully compute predictive models under different hardware resource allocations, such as CPU and memory, as well as database specific resources, such as the number of replicas in the database systems. The experimental results also show that SmartSLA can provide intelligent service differentiation according to factors such as variable workloads, SLA levels, resource costs, and deliver improved profit margins. Index Terms: cloud computing, virtualization, database systems, multitenant databases.
a Service (PaaS), and Software as a Service (SaaS). The concept
Cloud computing is an extremely successful paradigm of service oriented computing and has revolutionized the way computing infrastructure is abstracted and used. Three most popular cloud
Big Data and Cloud Computing: Current State and Future Opportunities
Scalable database management systems (DBMS) – both for update intensive application workloads as well as decision support systems for descriptive and deep analytics – are a critical part of the cloud infrastructure and play an important role in ensuring the smooth transition of applications from the traditional enterprise infrastructures to next generation cloud infrastructures. Though scalable data management has been a vision for more than three decades and much research has focussed on large scale data management in traditional enterprise setting, cloud computing brings its own set of novel challenges that must be addressed to ensure the success of data management solutions in the cloud environment. This tutorial presents an organized picture of the challenges faced by application developers and DBMS designers in developing and deploying internet scale applications. Our background study encompasses both classes of systems: (i) for supporting update heavy applications, and (ii) for ad-hoc analytics and decision support. We then focus on providing an in-depth analysis of systems for supporting update intensive web-applications and provide a survey of the state-of-the-art in this domain. We crystallize the design choices made by some successful large scale database management systems, analyze the application demands and access patterns, and enumerate the desiderata for a cloud-bound DBMS.

