Data-Oriented Transaction Execution
Cited by 6 (1 self)
While hardware technology has undergone major advancements over the past decade, transaction processing systems have remained largely unchanged. The number of cores on a chip grows exponentially, following Moore's Law, allowing for an ever-increasing number of transactions to execute in parallel. As the number of concurrently executing transactions increases, contended critical sections become scalability burdens. In typical transaction processing systems, the centralized lock manager is often the first contended component and scalability bottleneck. In this paper, we identify the conventional thread-to-transaction assignment policy as the primary cause of contention. Then, we design DORA, a system that decomposes each transaction into smaller actions and assigns actions to threads based on which data each action is about to access. DORA’s design allows each thread to mostly access thread-local data structures, minimizing interaction with the contention-prone centralized lock manager. Built on top of a conventional storage engine, DORA maintains all the ACID properties. Evaluation of a prototype implementation of DORA on a multicore system demonstrates that DORA attains up to 4.8x higher throughput than a state-of-the-art storage engine when running a variety of synthetic and real-world OLTP workloads.
Categories and Subject Descriptors: H.2.4 [Database Management]: Systems - transaction
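The thread-to-data assignment described in this abstract can be illustrated with a minimal sketch. It is not DORA's implementation: the names `Executor`, `owner_of`, and `run_transaction`, the modulo routing rule, and the use of Python queues are all assumptions made for illustration, and the sketch omits locking, logging, and rollback entirely. It only shows the core idea that each action is routed to the one thread that owns the data it touches, so that thread works on its own local structures.

```python
import queue
import threading

NUM_EXECUTORS = 4  # hypothetical number of data-owning threads


def owner_of(key: int) -> int:
    """Hypothetical routing rule: each key belongs to exactly one executor."""
    return key % NUM_EXECUTORS


class Executor(threading.Thread):
    """A thread that owns one slice of the data and runs only actions on it."""

    def __init__(self, executor_id: int):
        super().__init__(daemon=True)
        self.executor_id = executor_id
        self.store = {}             # thread-local data, touched by this thread only
        self.inbox = queue.Queue()  # actions routed to this executor
        self.actions_run = 0

    def run(self):
        while True:
            item = self.inbox.get()
            if item is None:        # shutdown signal
                break
            key, delta, done = item
            self.store[key] = self.store.get(key, 0) + delta
            self.actions_run += 1
            done.release()          # tell the coordinator this action finished


def start_executors():
    executors = [Executor(i) for i in range(NUM_EXECUTORS)]
    for e in executors:
        e.start()
    return executors


def stop_executors(executors):
    for e in executors:
        e.inbox.put(None)
    for e in executors:
        e.join()


def run_transaction(executors, actions):
    """Split a transaction into (key, delta) actions and route each by data."""
    done = threading.Semaphore(0)
    for key, delta in actions:
        executors[owner_of(key)].inbox.put((key, delta, done))
    for _ in actions:               # wait until every action has completed
        done.acquire()
```

Under these assumptions, a transaction updating keys 0, 1, and 4 sends two actions to executor 0 and one to executor 1, and no shared lock table is consulted along the way.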
HyPer: A hybrid OLTP&OLAP Main Memory Database System based on Virtual Memory Snapshots
In ICDE, 2011
Cited by 3 (3 self)
The two areas of online transaction processing (OLTP) and online analytical processing (OLAP) present different challenges for database architectures. Currently, customers with high rates of mission-critical transactions have split their data into two separate systems, one database for OLTP and one so-called data warehouse for OLAP. While allowing for decent transaction rates, this separation has many disadvantages, including data freshness issues due to the delay caused by only periodically initiating the Extract-Transform-Load data staging, and excessive resource consumption due to maintaining two separate information systems. We present an efficient hybrid system, called HyPer, that can handle both OLTP and OLAP simultaneously by using hardware-assisted replication mechanisms to maintain consistent snapshots of the transactional data. HyPer is a main-memory database system that guarantees the ACID properties of OLTP transactions and executes OLAP query sessions (multiple queries) on the same, arbitrarily current and consistent snapshot. The utilization of the processor-inherent support for virtual memory management (address translation, caching, copy on update) yields both at the same time: unprecedented transaction rates of up to 100,000 per second and very fast OLAP query response times on a single system executing both workloads in parallel. The performance analysis is based on a combined TPC-C and TPC-H benchmark.
Relational Cloud: A Database-as-a-Service for the Cloud
Cited by 3 (1 self)
This paper introduces a new transactional “database-as-a-service” (DBaaS) called Relational Cloud. A DBaaS promises to move much of the operational burden of provisioning, configuration, scaling, performance tuning, backup, privacy, and access control from the database users to the service operator, offering lower overall costs to users. Early DBaaS efforts include Amazon RDS and Microsoft SQL Azure, which are promising in terms of establishing the market need for such a service, but which do not address three important challenges: efficient multi-tenancy, elastic scalability, and database privacy. We argue that these three challenges must be overcome before outsourcing database software and management becomes attractive to many users, and cost-effective for service providers. The key technical features of Relational Cloud include: (1) a workload-aware approach to multi-tenancy that identifies the workloads that can be co-located on a database server, achieving higher consolidation and better performance than existing approaches; (2) the use of a graph-based data partitioning algorithm to achieve near-linear elastic scale-out even for complex transactional workloads; and (3) an adjustable security scheme that enables SQL queries to run over encrypted data, including ordering operations, aggregates, and joins. An underlying theme in the design of the components of Relational Cloud is the notion of workload awareness: by monitoring query patterns and data accesses, the system obtains information useful for various optimization and security functions, reducing the configuration effort for users and operators.
Intrusion Recovery for Database-backed Web Applications
Cited by 2 (2 self)
WARP is a system that helps users and administrators of web applications recover from intrusions such as SQL injection, cross-site scripting, and clickjacking attacks, while preserving legitimate user changes. WARP repairs from an intrusion by rolling back parts of the database to a version before the attack, and replaying subsequent legitimate actions. WARP allows administrators to retroactively patch security vulnerabilities (i.e., apply new security patches to past executions) to recover from intrusions without requiring the administrator to track down or even detect attacks. WARP’s time-travel database allows fine-grained rollback of database rows, and enables repair to proceed concurrently with normal operation of a web application. Finally, WARP captures and replays user input at the level of a browser’s DOM, to recover from attacks that involve a user’s browser. For a web server running MediaWiki, WARP requires no application source code changes to recover from a range of common web application vulnerabilities with minimal user input, at a cost of 24–27% in throughput and 2–3.2 GB/day in storage.
On Predictive Modeling for Optimizing Transaction Execution in Parallel OLTP Systems
Cited by 1 (1 self)
An emerging class of parallel database management systems (DBMS) is designed to take advantage of the partitionable workloads of on-line transaction processing (OLTP) applications [23, 20]. Transactions in these systems are optimized to execute to completion on a single node in a shared-nothing cluster without needing to coordinate with other nodes or use expensive concurrency control measures [18]. But some OLTP applications cannot be partitioned such that all of their transactions execute within a single partition in this manner. These distributed transactions access data not stored within their local partitions and subsequently require more heavy-weight concurrency control protocols. Further difficulties arise when the transaction’s execution properties, such as the number of partitions it may need to access or whether it will abort, are not known beforehand. The DBMS could mitigate these performance issues if it is provided with additional information about transactions. Thus, in this paper we present a Markov model-based approach for automatically selecting which optimizations a DBMS could use, namely (1) more efficient concurrency control schemes, (2) intelligent scheduling, (3) reduced undo logging, and (4) speculative execution. To evaluate our techniques, we implemented our models and integrated them into a parallel, main-memory OLTP DBMS to show that we can improve the performance of applications with diverse workloads.
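As a rough illustration of how a Markov model might inform such decisions, the sketch below estimates transition probabilities from previously observed execution paths. It is not the paper's model: the class `TransitionModel`, the state labels such as `"q1@p0"` (a query executed at a partition), and the idea of thresholding one probability are assumptions introduced here.

```python
from collections import defaultdict


class TransitionModel:
    """First-order Markov model estimated from observed execution paths."""

    def __init__(self):
        # counts[a][b] = number of times state b directly followed state a
        self.counts = defaultdict(lambda: defaultdict(int))

    def observe(self, path):
        """Record one execution path, given as a list of state labels."""
        for current, nxt in zip(path, path[1:]):
            self.counts[current][nxt] += 1

    def prob(self, current, nxt):
        """Estimated probability of moving from `current` to `nxt`."""
        followers = self.counts.get(current, {})
        total = sum(followers.values())
        return followers.get(nxt, 0) / total if total else 0.0


# Hypothetical traces: three runs stay on partition p0, one also visits p1.
model = TransitionModel()
for _ in range(3):
    model.observe(["begin", "q1@p0", "commit"])
model.observe(["begin", "q1@p0", "q2@p1", "commit"])

# Estimated chance that the transaction commits right after q1@p0,
# i.e. that it stays on a single partition.
p_single = model.prob("q1@p0", "commit")
```

A scheduler could compare an estimate like `p_single` against a threshold to decide whether to run the transaction with a lighter-weight, single-partition protocol; the threshold and the fallback behavior are design choices outside this sketch.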
Automated Partitioning Design in Parallel Database Systems
Cited by 1 (0 self)
In recent years, Massively Parallel Processors (MPPs) have gained ground enabling vast amounts of data processing. In such environments, data is partitioned across multiple compute nodes, which results in dramatic performance improvements during parallel query execution. To evaluate certain relational operators in a query correctly, data sometimes needs to be re-partitioned (i.e., moved) across compute nodes. Since data movement operations are much more expensive than relational operations, it is crucial to design a suitable data partitioning strategy that minimizes the cost of such expensive data transfers. A good partitioning strategy strongly depends on how the parallel system would be used. In this paper we present a partitioning advisor that recommends the best partitioning design for an expected workload. Our tool recommends which tables should be replicated (i.e., copied into every compute node) and which ones should be distributed according to specific column(s) so that the cost of evaluating similar workloads is minimized. In contrast to previous work, our techniques are deeply integrated with the underlying parallel query optimizer, which results in more accurate recommendations in a shorter amount of time. Our experimental evaluation using a real MPP system, Microsoft SQL Server 2008 Parallel Data Warehouse, with both real and synthetic workloads shows the effectiveness of the proposed techniques and the importance of deep integration of the partitioning advisor with the underlying query optimizer.
Lookup Tables: Fine-Grained Partitioning for Distributed Databases
Cited by 1 (1 self)
The standard way to scale a distributed OLTP DBMS is to horizontally partition data across several nodes. Ideally, this results in each query/transaction being executed at just one node, to avoid the overhead of distribution and allow the system to scale by adding nodes. For some applications, simple strategies such as hashing on primary key provide this property. Unfortunately, for many applications, including social networking and order-fulfillment, simple partitioning schemes applied to many-to-many relationships create a large fraction of distributed queries/transactions. What is needed is a fine-grained partitioning, where related individual tuples (e.g., cliques of friends) are co-located in the same partition. Maintaining a fine-grained partitioning requires storing the location of each tuple. We call this metadata a lookup table. We present a design that efficiently stores very large tables and maintains them as the database is modified. We show they improve scalability for several difficult-to-partition database workloads, including Wikipedia, Twitter, and TPC-E. Our implementation provides 40% to 300% better throughput on these workloads than simple range or hash partitioning.
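The contrast between hash partitioning and a lookup table can be shown with a small sketch. It does not reproduce the paper's storage design: the names `LookupTable`, `hash_partition`, and `partitions_touched`, the modulo hash, and the fallback to hashing for tuples without an entry are assumptions made for illustration.

```python
NUM_PARTITIONS = 4  # hypothetical cluster size


def hash_partition(key: int) -> int:
    """Simple hash partitioning on the primary key."""
    return key % NUM_PARTITIONS


class LookupTable:
    """Per-tuple placement metadata: maps a tuple key to its partition."""

    def __init__(self):
        self._location = {}

    def place(self, key: int, partition: int) -> None:
        self._location[key] = partition

    def partition_of(self, key: int) -> int:
        # Assumption: tuples without an explicit entry fall back to hashing.
        return self._location.get(key, hash_partition(key))


def partitions_touched(keys, locate):
    """Set of partitions a transaction over `keys` would have to visit."""
    return {locate(k) for k in keys}


# A clique of friends whose tuples are read together in one transaction.
friends = [1, 2, 3]

# Hash partitioning scatters the clique across partitions.
hashed = partitions_touched(friends, hash_partition)

# A lookup table co-locates the clique on partition 0.
table = LookupTable()
for user_id in friends:
    table.place(user_id, 0)
colocated = partitions_touched(friends, table.partition_of)
```

Here the hashed layout makes the transaction distributed across three partitions, while the lookup table keeps it on one. The cost is the per-tuple metadata, which is why the abstract stresses storing very large tables efficiently.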
Database Scalability, Elasticity, and Autonomy in the
Cloud computing has emerged as an extremely successful paradigm for deploying web applications. Scalability, elasticity, pay-per-use pricing, and economies of scale from large-scale operations are the major reasons for the successful and widespread adoption of cloud infrastructures. Since a majority of cloud applications are data driven, database management systems (DBMSs) powering these applications form a critical component in the cloud software stack. In this article, we present an overview of our work on instilling these “cloud features” in a database system designed to support a variety of applications deployed in the cloud: designing scalable database management architectures using the concepts of data fission and data fusion, enabling lightweight elasticity using low-cost live database migration, and designing intelligent and autonomic controllers for system management without human intervention.
No Bits Left Behind
One of the key tenets of database system design is making efficient use of storage and memory resources. However, existing database system implementations are actually extremely wasteful of such resources; for example, most systems leave a great deal of empty space in tuples, index pages, and data pages, and spend many CPU cycles reading cold records from disk that are never used. In this paper, we identify a number of such sources of waste, and present a series of techniques that limit this waste (e.g., forcing better memory locality for hot data and using empty space in index pages to cache popular tuples) without substantially complicating interfaces or system design. We show that these techniques effectively reduce memory requirements for real scenarios from the Wikipedia database (by up to 17.8×) while increasing query performance (by up to 8×).
3.2 Join Query Types
NoSQL Cloud data stores provide scalability and high availability properties for web applications, but do not support complex queries such as joins. Developers must therefore design their programs according to the peculiarities of NoSQL data stores rather than established software engineering practice. This results in complex and error-prone code, especially when it comes to subtle issues such as data consistency under concurrent read/write queries. CloudTPS implements support for join queries and strongly consistent multi-item read-write transactions in a middleware layer which stands between the Web application and its data store. CloudTPS supports the two main families of scalable data layers: Bigtable and SimpleDB. Performance evaluations show that our system scales linearly under a demanding workload composed of join queries and

