Fine-grained dynamic instrumentation of commodity operating system kernels
, 1999
Cited by 107 (5 self)
We have developed a technology, fine-grained dynamic instrumentation of commodity kernels, which can splice (insert) dynamically generated code before almost any machine code instruction of a completely unmodified running commodity operating system kernel. This technology is well-suited to performance profiling, debugging, code coverage, security auditing, runtime code optimizations, and kernel extensions. We have designed and implemented a tool called KernInst that performs dynamic instrumentation on a stock production Solaris kernel running on an UltraSPARC. On top of KernInst, we have implemented a kernel performance profiling tool, and used it to understand kernel and application performance under a Web proxy server workload. We used this information to make two changes (one to the kernel, one to the proxy) that cumulatively reduce the percentage of elapsed time that the proxy spends opening disk cache files from 40% to 7%.
Minerva: an automated resource provisioning tool for large-scale storage systems
- ACM Transactions on Computer Systems
, 2001
Cited by 103 (24 self)
Enterprise-scale storage systems, which can contain hundreds of host computers and storage devices and up to tens of thousands of disks and logical volumes, are difficult to design. The volume of choices that need to be made is massive, and many choices have unforeseen interactions. Storage system design is tedious and complicated to do by hand, usually leading to solutions that are grossly overprovisioned, substantially under-performing or, in the worst case, both. To solve the configuration nightmare, we present MINERVA: a suite of tools for designing storage systems automatically. MINERVA uses declarative specifications of application requirements and device capabilities; constraint-based formulations of the various subproblems; and optimization techniques to explore the search space of possible solutions. This paper also explores and evaluates the design decisions that went into MINERVA, using specialized micro and macro-benchmarks. We show that MINERVA can successfully handle a workload with substantial complexity (a decision-support database benchmark). MINERVA created a 16-disk design in only a few minutes that achieved the same performance as a 30-disk system manually designed by human experts. Of equal importance, MINERVA was able to predict the resulting system's performance before it was built.
Making the “Box” Transparent: System Call Performance as a First-class Result
Cited by 19 (2 self)
For operating system intensive applications, the ability of designers to understand system call performance behavior is essential to achieving high performance. Conventional performance tools, such as monitoring tools and profilers, collect and present their information off-line or via out-of-band channels. We believe that making this information first-class and exposing it to applications via in-band channels on a per-call basis presents opportunities for performance analysis and tuning not available via other mechanisms. Furthermore, our approach provides direct feedback to applications on time spent in the kernel, resource contention, and time spent blocked, allowing them to immediately observe how their actions affect kernel behavior. Not only does this approach provide greater transparency into the workings of the kernel, but it also allows applications to control how performance information is collected, filtered, and correlated with application-level events. To demonstrate the power of this approach, we show that our implementation, DeBox, obtains precise information about OS behavior at low cost, and that it can be used in debugging and tuning application performance on complex workloads. In particular, we focus on the industry-standard SpecWeb99 benchmark running on the Flash Web Server. Using DeBox, we are able to diagnose a series of problematic interactions between the server and the OS. Addressing these issues as well as other optimization opportunities generates an overall factor of four improvement in our SpecWeb99 score, throughput gains on other benchmarks, and latency reductions ranging from a factor of 4 to 47.
Using Dynamic Kernel Instrumentation for Kernel and Application Tuning
, 1999
Cited by 18 (2 self)
We have designed a new technology, fine-grained dynamic instrumentation of commodity operating system kernels, which can insert runtime-generated code at almost any machine code instruction of an unmodified operating system kernel. This technology is ideally suited for kernel performance profiling, debugging, code coverage, runtime optimization, and extensibility. We have written a tool called KernInst that implements dynamic instrumentation on a stock production Solaris 2.5.1 kernel running on an UltraSparc CPU. We have written a kernel performance profiler on top of KernInst. Measuring kernel performance has a two-way benefit; it can suggest optimizations to both the kernel and to applications that spend much of their time in kernel code. In this paper, we present our experiences using KernInst to identify kernel bottlenecks when running a web proxy server. By profiling kernel routines, we were able to understand performance bottlenecks inherent in the proxy's disk cache organization...
Disk array models in Minerva
, 2001
Cited by 7 (2 self)
Keywords: storage systems, disk arrays, analytical models.
Enterprise storage systems typically depend on disk arrays to satisfy their capacity and availability needs. To design and maintain storage systems that efficiently satisfy evolving requirements, it is critical to be able to evaluate configuration alternatives without having to physically implement them. Because of the large number of candidate configurations that need to be evaluated in real-life situations, simulation models are excessively slow for that task. In this paper, we describe analytical throughput models for RAID 1/0 and RAID 5 storage in the Hewlett-Packard FC-30 disk array. We validate our models against the real array, and report the relative errors in the models' predictions. Our models have a mean error of 5.4% and a maximum error of 19% for the set of validation workloads we used.
Algorithms for Off-Line Clock Synchronisation
, 1995
Cited by 4 (1 self)
Off-line clock synchronisation algorithms, in which synchronisation is performed by adjusting a collection of recorded timestamps, are suitable for use with many monitors for distributed systems. Off-line synchronisation can often achieve very good synchronisation without the need for extra messages. The work described here builds on earlier work in this area [4] by introducing new synchronisation algorithms, developing ways of evaluating algorithms, and performing an extensive set of experiments based on five different algorithms and a considerable amount of data collected by a monitor for Amoeba. The best algorithms achieve excellent synchronisation, and are used in the Amoeba monitor.
Fast Kernel Tracing: A Performance Evaluation Tool for Linux
- Proceedings of the 19th IASTED International Conference on Applied Informatics (AI 2001)
, 2001
Cited by 3 (0 self)
This paper describes a new software performance evaluation tool for Linux called FKT: Fast Kernel Tracing. This tool consists of a number of modifications and additions to the Linux kernel on Pentium PCs that provide a mechanism for recording the flow of control through the kernel with very high precision and very low perturbation to kernel timings. This is accomplished by utilizing special instructions in the Pentium hardware to record the value of the processor's cycle clock and to permit rapid access to shared memory buffers. Data is collected in real-time during a measurement session, and is then analyzed off-line to produce detailed accountings of processor activities. The primary design goal for this tool is to provide a simple, highly-accurate, low-overhead technique for evaluating a kernel's performance. Its use to investigate the TCP/IP protocol stack in Linux will be discussed. It has also proven useful for two other unrelated tasks: kernel debugging and educational instruction, as will also be discussed.
An Approach for Network Forwarding Systems Quality
We present a design pattern for improving the reliability of forwarding failures within protocol processes and kernel routers. Our design pattern makes use of a localised Invisible Recovery (IR) process and a technique for optimising the Kernel process. We evaluate the current techniques that provide recovery within network systems; by doing so we then pinpoint the different approaches and the benefits our IR technique has over others.
An Approach for Network Forwarding
, 2001
We present a design pattern for improving the reliability of forwarding failures within protocol processes and kernel routers. Our design pattern makes use of a localised Invisible Recovery (IR) process and a technique for optimising the Kernel process. We evaluate the current techniques that provide recovery within network systems; by doing so we then pinpoint the different approaches and the benefits our IR technique has over others.

