Results 1 - 10
of
16
Automatic physical design tuning: workload as a sequence
- In Proceedings of the ACM International Conference on Management of Data (SIGMOD
, 2006
"... The area of automatic selection of physical database design to optimize the performance of a relational database system based on a workload of SQL queries and updates has gained prominence in recent years. Major database vendors have released automated physical database design tools with the goal of ..."
Abstract
-
Cited by 21 (0 self)
- Add to MetaCart
The area of automatic selection of physical database design to optimize the performance of a relational database system based on a workload of SQL queries and updates has gained prominence in recent years. Major database vendors have released automated physical database design tools with the goal of reducing the total cost of ownership. An important assumption underlying these tools is that the workload is a set of SQL statements. In this paper, we show that being able to treat the workload as a sequence, i.e., exploiting the ordering of statements can significantly broaden the usage of such tools. We present scenarios where exploiting sequence information in the workload is crucial for performance tuning. We also propose techniques for addressing the technical challenges arising from treating the workload as a sequence. We evaluate the effectiveness of our techniques through experiments on Microsoft SQL Server.
Self-tuning database systems: A decade of progress
- in VLDB, 2007
"... In this paper we discuss advances in self-tuning database systems over the past decade, based on our experience in the AutoAdmin project at Microsoft Research. This paper primarily focuses on the problem of automated physical database design. We also highlight other areas where research on self-tuni ..."
Abstract
-
Cited by 18 (0 self)
- Add to MetaCart
In this paper we discuss advances in self-tuning database systems over the past decade, based on our experience in the AutoAdmin project at Microsoft Research. This paper primarily focuses on the problem of automated physical database design. We also highlight other areas where research on self-tuning database technology has made significant progress. We conclude with our thoughts on opportunities and open issues. 1. HISTORY OF AUTOADMIN PROJECT Our VLDB 1997 paper [26] reported our first technical results from the AutoAdmin project that was started in Microsoft Research in the summer of 1996. The SQL Server product group at that time had taken on the ambitious task of redesigning the SQL Server code for their next release (SQL Server 7.0). Ease of use and elimination of knobs was a driving force for their design
Physical design refinement: The “Merge-Reduce” approach
- In International Conference on Extending Database Technology (EDBT
, 2006
"... Physical database design tools rely on a DBA-provided workload to pick an “optimal ” set of indexes and materialized views. Such tools allow either creating a new such configuration or adding new structures to existing ones. However, these tools do not provide adequate support for incremental and fl ..."
Abstract
-
Cited by 9 (4 self)
- Add to MetaCart
Physical database design tools rely on a DBA-provided workload to pick an “optimal ” set of indexes and materialized views. Such tools allow either creating a new such configuration or adding new structures to existing ones. However, these tools do not provide adequate support for incremental and flexible refinement of existing physical structures. Although such refinements are often very valuable for DBAs, a completely manual approach to refinement can lead to infeasible solutions (e.g., excessive use of space). In this paper, we focus on the important problem of physical design refinement and propose a transformational architecture that is based upon two novel primitive operations, called merging and reduction. These operators help refine a configuration, treating indexes and materialized views in a unified way, as well as succinctly explain the refinement process to DBAs. Categories and Subject Descriptors: H.2.2 [Physical Design]: Access Methods
A Case for A Collaborative Query Management System
"... Over the past 40 years, database management systems (DBMSs) have evolved to provide a sophisticated variety of data management capabilities. At the same time, tools for managing queries over the data have remained relatively primitive. One reason for this is that queries are typically issued through ..."
Abstract
-
Cited by 7 (2 self)
- Add to MetaCart
Over the past 40 years, database management systems (DBMSs) have evolved to provide a sophisticated variety of data management capabilities. At the same time, tools for managing queries over the data have remained relatively primitive. One reason for this is that queries are typically issued through applications. They are thus debugged once and re-used repeatedly. This mode of interaction, however, is changing. As scientists (and others) store and share increasingly large volumes of data in data centers, they need the ability to analyze the data by issuing exploratory queries. In this paper, we argue that, in these new settings, data management systems must provide powerful query management capabilities, from query browsing to automatic query recommendations. We first discuss the requirements for a collaborative query management system. We outline an early system architecture and discuss the many research challenges associated with building such an engine. 1.
An Object Placement Advisor for DB2 Using Solid State Storage
"... Solid state disks (SSDs) provide much faster random access to data compared to conventional hard disk drives. Therefore, the response time of a database engine could be improved by moving the objects that are frequently accessed in a random fashion to the SSD. Considering the price and limited stora ..."
Abstract
-
Cited by 5 (2 self)
- Add to MetaCart
Solid state disks (SSDs) provide much faster random access to data compared to conventional hard disk drives. Therefore, the response time of a database engine could be improved by moving the objects that are frequently accessed in a random fashion to the SSD. Considering the price and limited storage capacity of solid state disks, the database administrator needs to determine which objects (tables, indexes, materialized views, etc.), if placed on the SSD, would most improve the performance of the system. In this paper we propose a tool called “Object Placement Advisor ” for making a wise decision for the object placement problem. By collecting profile inputs from workload runs, the advisor utility provides a list of objects to be placed on the SSD by applying heuristics like the greedy knapsack technique or dynamic programming. To show that the proposed approach is effective in conventional database management systems, we have conducted experiments on IBM DB2 with queries and schemas based on the TPC-H and TPC-C benchmarks. The results indicate that using a relatively small amount of SSD storage, the response time of the system can be reduced significantly by considering the recommendation of the advisor. 1.
Constrained Physical Design Tuning
, 2008
"... Existing solutions to the automated physical design problem in database systems attempt to minimize execution costs of input workloads for a given a storage constraint. In this paper, we argue that this model is not flexible enough to address several real-world situations. To overcome this limitatio ..."
Abstract
-
Cited by 4 (0 self)
- Add to MetaCart
Existing solutions to the automated physical design problem in database systems attempt to minimize execution costs of input workloads for a given a storage constraint. In this paper, we argue that this model is not flexible enough to address several real-world situations. To overcome this limitation, we introduce a constraint language that is simple yet powerful enough to express many important scenarios. We build upon an existing transformation-based framework to effectively incorporate constraints in the search space. We then show experimentally that we are able to handle a rich class of constraints and that our proposed technique scales gracefully.
The Data Ring: Community Content Sharing
- In Third Biennial Conference on Innovative Data Systems Research (CIDR 2007), Asilomar
, 2007
"... Information ubiquity has created a large crowd of users ..."
Abstract
-
Cited by 4 (3 self)
- Add to MetaCart
Information ubiquity has created a large crowd of users
Why Did My Query Slow Down?
"... Many enterprise environments have databases running on networkattached server-storage infrastructure (referred to as Storage Area Networks or SANs). Both the database and the SAN are complex systems that need their own separate administrative teams. This paper puts forth the vision of an innovative ..."
Abstract
-
Cited by 3 (1 self)
- Add to MetaCart
Many enterprise environments have databases running on networkattached server-storage infrastructure (referred to as Storage Area Networks or SANs). Both the database and the SAN are complex systems that need their own separate administrative teams. This paper puts forth the vision of an innovative management framework to simplify administrative tasks that require an in-depth understanding of both the database and the SAN. As a concrete instance, we consider the task of diagnosing the slowdown in performance of a database query that is executed multiple times (e.g., in a periodic report-generation setting). This task is very challenging because the space of possible causes includes problems specific to the database, problems specific to the SAN, and problems that arise due to interactions between the two systems. In addition, the monitoring data available from these systems can be noisy. We describe the design of DIADS which is an integrated diagnosis tool for database and SAN administrators. DIADS generates and uses a powerful abstraction called Annotated Plan Graphs (APGs) that ties together the execution path of queries in the database and the SAN. Using an innovative workflow that combines domainspecific knowledge with machine-learning techniques, DIADS was applied successfully to diagnose query slowdowns caused by complex combinations of events across a PostgreSQL database and a production SAN. 1.
Diads: Addressing the “my-problem-or-yours” syndrome with integrated san and database diagnosis
- In To Appear in Proceedings of Usenix File and Storage Technologies (FAST’09
"... We present DIADS, an integrated DIAgnosis tool for Databases and Storage area networks (SANs). Existing diagnosis tools in this domain have a database-only (e.g., [11]) or SAN-only (e.g., [28]) focus. DIADS is a firstof-a-kind framework based on a careful integration of information from the database ..."
Abstract
-
Cited by 2 (1 self)
- Add to MetaCart
We present DIADS, an integrated DIAgnosis tool for Databases and Storage area networks (SANs). Existing diagnosis tools in this domain have a database-only (e.g., [11]) or SAN-only (e.g., [28]) focus. DIADS is a firstof-a-kind framework based on a careful integration of information from the database and SAN subsystems; and is not a simple concatenation of database-only and SANonly modules. This approach not only increases the accuracy of diagnosis, but also leads to significant improvements in efficiency. DIADS uses a novel combination of non-intrusive machine learning techniques (e.g., Kernel Density Estimation) and domain knowledge encoded in a new symptoms database design. The machine learning component provides core techniques for problem diagnosis from monitoring data, and domain knowledge acts as checks-andbalances to guide the diagnosis in the right direction. This unique system design enables DIADS to function effectively even in the presence of multiple concurrent problems as well as noisy data prevalent in production environments. We demonstrate the efficacy of our approach through a detailed experimental evaluation of DI-ADS implemented on a real data center testbed with PostgreSQL databases and an enterprise SAN. 1
Generating shifting workloads to benchmark adaptability in relational database systems
- TPCTC ’09: First TPC Technology Conference on Performance Evaluation and Benchmarking, volume 5895 of Lecture Notes in Computer Science
, 2009
"... Abstract. A large body of research concerns the adaptability of database systems. Many commercial systems already contain autonomic processes that adapt configurations as well as data structures and data organization. Yet there is virtually no possibility for a just measurement of the quality of suc ..."
Abstract
-
Cited by 1 (1 self)
- Add to MetaCart
Abstract. A large body of research concerns the adaptability of database systems. Many commercial systems already contain autonomic processes that adapt configurations as well as data structures and data organization. Yet there is virtually no possibility for a just measurement of the quality of such optimizations. While standard benchmarks have been developed that simulate real-world database applications very precisely, none of them considers variations in workloads produced by human factors. Today’s benchmarks test the performance of database systems by measuring peak performance on homogeneous request streams. Nevertheless, in systems with user interaction access patterns are constantly shifting. We present a benchmark that simulates a web information system with interaction of large user groups. It is based on the analysis of a real online eLearning management system with 15,000 users. The benchmark considers the temporal dependency of user interaction. Main focus is to measure the adaptability of a database management system according to shifting workloads. We will give details on our design approach that uses sophisticated pattern analysis and data mining techniques.

