Results 1 -
6 of
6
Abbadi, “G-Store: A Scalable Data Store for Transactional Multi key
- Access in the Cloud,” in SOCC, 2010
"... Cloud computing has emerged as a preferred platform for deploying scalable web-applications. With the growing scale of these applications and the data associated with them, scalable data management systems form a crucial part of the cloud infrastructure. Key-Value stores – such as Bigtable, PNUTS, D ..."
Abstract
-
Cited by 15 (7 self)
- Add to MetaCart
Cloud computing has emerged as a preferred platform for deploying scalable web-applications. With the growing scale of these applications and the data associated with them, scalable data management systems form a crucial part of the cloud infrastructure. Key-Value stores – such as Bigtable, PNUTS, Dynamo, and their open source analogues – have been the preferred data stores for applications in the cloud. In these systems, data is represented as Key-Value pairs, and atomic access is provided only at the granularity of single keys. While these properties work well for current applications, they are insufficient for the next generation web applications – such as online gaming, social networks, collaborative editing, and many more – which emphasize collaboration. Since collaboration by definition requires consistent access to groups of keys, scalable and consistent multi key access is critical for such applications. We propose the Key Group abstraction that defines a relationship between a group of keys and is the granule for on-demand transactional access. This abstraction allows the Key Grouping protocol to collocate control for the keys in the group to allow efficient access to the group of keys. Using the Key Grouping protocol, we design and implement G-Store which uses a key-value store as an underlying substrate to provide efficient, scalable, and transactional multi key access. Our implementation using a standard key-value store and experiments using a cluster of commodity machines show that G-Store preserves the desired properties of key-value stores, while providing multi key access functionality at a very low overhead.
Megastore: Providing Scalable, Highly Available Storage for Interactive Services
- CONFERENCE ON INNOVATIVE DATABASE RESEARCH (CIDR) 2011
, 2011
"... Megastore is a storage system developed to meet the requirements of today’s interactive online services. Megastore blends the scalability of a NoSQL datastore with the convenience of a traditional RDBMS in a novel way, and provides both strong consistency guarantees and high availability. We provide ..."
Abstract
-
Cited by 15 (0 self)
- Add to MetaCart
Megastore is a storage system developed to meet the requirements of today’s interactive online services. Megastore blends the scalability of a NoSQL datastore with the convenience of a traditional RDBMS in a novel way, and provides both strong consistency guarantees and high availability. We provide fully serializable ACID semantics within fine-grained partitions of data. This partitioning allows us to synchronously replicate each write across a wide area network with reasonable latency and support seamless failover between datacenters. This paper describes Megastore’s semantics and replication algorithm. It also describes our experience supporting a wide range of Google production services built with Megastore.
An Evaluation of Alternative Architectures for Transaction Processing in the Cloud
"... Cloud computing promises a number of advantages for the deployment of data-intensive applications. One important promise is reduced cost with a pay-as-you-go business model. Another promise is (virtually) unlimited throughput by adding servers if the workload increases. This paper lists alternative ..."
Abstract
-
Cited by 11 (1 self)
- Add to MetaCart
Cloud computing promises a number of advantages for the deployment of data-intensive applications. One important promise is reduced cost with a pay-as-you-go business model. Another promise is (virtually) unlimited throughput by adding servers if the workload increases. This paper lists alternative architectures to effect cloud computing for database applications and reports on the results of a comprehensive evaluation of existing commercial cloud services that have adopted these architectures. The focus of this work is on transaction processing (i.e., read and update workloads), rather than analytics or OLAP workloads, which have recently gained a great deal of attention. The results are surprising in several ways. Most importantly, it seems that all major vendors have adopted a different architecture for their cloud services. As a result, the cost and performance of the services vary significantly depending on the workload.
CloudTPS: Scalable transactions for Web applications in the cloud
, 2010
"... NoSQL Cloud data services provide scalability and high availability properties for web applications but at the same time they sacrifice data consistency. However, many applications cannot afford any data inconsistency. CloudTPS is a scalable transaction manager to allow cloud database services to ex ..."
Abstract
-
Cited by 8 (3 self)
- Add to MetaCart
NoSQL Cloud data services provide scalability and high availability properties for web applications but at the same time they sacrifice data consistency. However, many applications cannot afford any data inconsistency. CloudTPS is a scalable transaction manager to allow cloud database services to execute the ACID transactions of web applications, even in the presence of server failures and network partitions. We implement this approach on top of the two main families of scalable data layers: Bigtable and SimpleDB. Performance evaluation on top of HBase (an open-source version of Bigtable) in our local cluster and Amazon SimpleDB in the Amazon cloud shows that our system scales linearly at least up to 40 nodes in our local cluster and 80 nodes in the
IEEE TRANSACTIONS ON SERVICES COMPUTING, SPECIAL ISSUE ON CLOUD COMPUTING, 2011 1 CloudTPS: Scalable Transactions for Web Applications in the Cloud
"... Abstract—NoSQL Cloud data stores provide scalability and high availability properties for web applications, but at the same time they sacrifice data consistency. However, many applications cannot afford any data inconsistency. CloudTPS is a scalable transaction manager which guarantees full ACID pro ..."
Abstract
- Add to MetaCart
Abstract—NoSQL Cloud data stores provide scalability and high availability properties for web applications, but at the same time they sacrifice data consistency. However, many applications cannot afford any data inconsistency. CloudTPS is a scalable transaction manager which guarantees full ACID properties for multi-item transactions issued by Web applications, even in the presence of server failures and network partitions. We implement this approach on top of the two main families of scalable data layers: Bigtable and SimpleDB. Performance evaluation on top of HBase (an open-source version of Bigtable) in our local cluster and Amazon SimpleDB in the Amazon cloud shows that our system scales linearly at least up to 40 nodes in our local cluster and 80 nodes in the Amazon cloud.
Executing Web Application Queries on a Partitioned Database
"... Partitioning data is an attractive way to increase storage server throughput for web-like workloads, but two key challenges arise: (1) web workloads often do not have one clear partitioning and (2) it is challenging for the web developer to determine how to efficiently execute queries over partition ..."
Abstract
- Add to MetaCart
Partitioning data is an attractive way to increase storage server throughput for web-like workloads, but two key challenges arise: (1) web workloads often do not have one clear partitioning and (2) it is challenging for the web developer to determine how to efficiently execute queries over partitioned tables. These two challenges can lead to per-query overhead, rather than useful work, dominating total throughput. This paper presents Dixie, a SQL query planner, optimizer, and executor for databases horizontally partitioned over multiple servers. Dixie automates the exploitation of tables with multiple copies partitioned in different ways, in order to increase throughput by expanding the portion of queries that need not be sent to all servers. Central to Dixie’s design are a cost model and plan generator that are mindful of queries small enough that query overhead may dominate the cost. We evaluate Dixie on a database and query stream taken from Wikipedia, partitioned across ten MySQL servers. By adding one copy of a 13 MB table and using Dixie’s query optimizer, we achieve a throughput improvement of 3.2X over a single optimized partitioning of each table and 8.5X over the same data on a single server. On specific queries Dixie with table copies increases throughput linearly with the number of servers, while the best single-table-copy partitioning achieves little scaling. For a large class of joins, which traditional wisdom suggests requires tables partitioned on the join keys, Dixie can find higher-performance plans using other partitionings. 1

