Selective Cache Ways: On-Demand Cache Resource Allocation
, 2000
Cited by 227 (7 self)
Increasing levels of microprocessor power dissipation call for new approaches at the architectural level that save energy by better matching of on-chip resources to application requirements. Selective cache ways provides the ability to disable a subset of the ways in a set associative cache during periods of modest cache activity, while the full cache may remain operational for more cache-intensive periods. Because this approach leverages the subarray partitioning that is already present for performance reasons, only minor changes to a conventional cache are required, and therefore, full-speed cache operation can be maintained. Furthermore, the tradeoff between performance and energy is flexible, and can be dynamically tailored to meet changing application and machine environmental conditions. We show that trading off a small performance degradation for energy savings can produce a significant reduction in cache energy dissipation using this approach.
The Jalapeño Dynamic Optimizing Compiler for Java
, 1999
Cited by 159 (28 self)
The Jalapeño Dynamic Optimizing Compiler is a key component of the Jalapeño Virtual Machine, a new Java Virtual Machine (JVM) designed to support efficient and scalable execution of Java applications on SMP server machines. This paper describes the design of the Jalapeño Optimizing Compiler, and the implementation results that we have obtained thus far. To the best of our knowledge, this is the first dynamic optimizing compiler for Java that is being used in a JVM with a compile-only approach to program execution.
A Framework for Reducing the Cost of Instrumented Code
- In SIGPLAN Conference on Programming Language Design and Implementation
, 2001
Cited by 147 (8 self)
Instrumenting code to collect profiling information can cause substantial execution overhead. This overhead makes instrumentation difficult to perform at runtime, often preventing many known offline feedback-directed optimizations from being used in online systems. This paper presents a general framework for performing instrumentation sampling to reduce the overhead of previously expensive instrumentation. The framework is simple and effective, using code-duplication and counter-based sampling to allow switching between instrumented and non-instrumented code.
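The idea of switching between two copies of the code under a sampling counter can be illustrated with a minimal Python sketch. This is not the paper's implementation: the names `SamplingProfiler`, `make_sampled`, `square`, and `make_instrumented_square` are hypothetical, and the sketch assumes the check happens once per call rather than at whatever check points the actual framework uses.

```python
class SamplingProfiler:
    """Hypothetical counter-based sampler: fires once every `interval` checks."""

    def __init__(self, interval):
        if interval < 1:
            raise ValueError("interval must be at least 1")
        self.interval = interval
        self.counter = interval
        self.samples = {}  # profile data gathered by the instrumented copy

    def should_sample(self):
        # Cheap check executed on the fast path: decrement and test a counter.
        self.counter -= 1
        if self.counter == 0:
            self.counter = self.interval
            return True
        return False


def make_sampled(fast, instrumented, profiler):
    """Dispatch to the instrumented duplicate only when the counter fires."""
    def dispatch(*args):
        if profiler.should_sample():
            return instrumented(*args)
        return fast(*args)
    return dispatch


def square(x):
    # Non-instrumented copy: no profiling work at all.
    return x * x


def make_instrumented_square(profiler):
    # Instrumented duplicate: same result, plus (potentially expensive) profiling.
    def square_instrumented(x):
        profiler.samples[x] = profiler.samples.get(x, 0) + 1
        return x * x
    return square_instrumented


profiler = SamplingProfiler(interval=3)
sampled_square = make_sampled(square, make_instrumented_square(profiler), profiler)
results = [sampled_square(i) for i in range(9)]
```

With an interval of 3, only every third call pays for the instrumentation, while all calls return the same results as the non-instrumented code.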
The concept of dynamic analysis
- In ESEC / SIGSOFT FSE
, 1999
Cited by 95 (0 self)
Dynamic analysis is the analysis of the properties of a running program. In this paper, we explore two new dynamic analyses based on program profiling. (1) Frequency Spectrum Analysis. We show how analyzing the frequencies of program entities in a single execution can help programmers to decompose a program, identify related computations, and find computations related to specific input and output characteristics of a program. (2) Coverage Concept Analysis. Concept analysis of test coverage data computes dynamic analogs to static control flow relationships such as domination, postdomination, and regions. Comparison of these dynamically computed relationships to their static counterparts can point to areas of code requiring more testing and can aid programmers in understanding how a program and its test sets relate to one another.
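As a rough illustration of the frequency idea only, the following Python sketch groups program entities by how often they appear in a single execution trace. It is a simplified assumption about what such an analysis might start from, not the paper's method; the function name `frequency_spectrum` and the example trace are hypothetical.

```python
from collections import Counter, defaultdict


def frequency_spectrum(trace):
    """Group entities (e.g. function names) by their execution count in one run."""
    counts = Counter(trace)
    spectrum = defaultdict(list)
    for entity, count in counts.items():
        spectrum[count].append(entity)
    # Sort for a stable, readable result: frequency -> entities with that frequency.
    return {count: sorted(names) for count, names in sorted(spectrum.items())}


# Hypothetical trace of function entries recorded during one execution.
trace = ["main", "read", "parse", "read", "parse", "report"]
spectrum = frequency_spectrum(trace)
```

Entities that share a frequency (here `read` and `parse`) are candidates for being part of the same computation, which is the kind of hint a programmer could then investigate.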
Measuring and Characterizing System Behavior Using Kernel-Level Event Logging
, 2000
Cited by 76 (2 self)
Analyzing the dynamic behavior and performance of complex software systems is difficult. Currently available systems either analyze each process in isolation, only provide system level cumulative statistics, or provide a fixed and limited number of process group related statistics. The Linux Trace Toolkit (LTT) introduced here provides a novel, modular, and extensible way of recording and analyzing complete system behavior. Because all significant system events are recorded, it is possible to analyze any desired subset of the running processes and, for instance, distinguish between the time spent waiting for some relevant event (data from disk or another process) versus time spent waiting for some unrelated process to use up its time slice. Despite the ...
Encoding program executions
- In ICSE
, 2001
Cited by 69 (1 self)
Dynamic analysis is based on collecting data as the program runs. However, raw traces tend to be too voluminous and too unstructured to be used directly for visualization and understanding. We address this problem in two phases: the first phase selects subsets of the data and then compacts it, while the second phase encodes the data in an attempt to infer its structure. Our major compaction/selection techniques include gprof-style N-depth call sequences, selection based on class, compaction based on time intervals, and encoding the whole execution as a directed acyclic graph. Our structure inference techniques include run-length encoding, context-free grammar encoding, and the building of finite state automata.
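Of the encodings named above, run-length encoding is the simplest to show. The Python sketch below applies it to a list of trace events; the function names and the sample trace are hypothetical, and the paper's own trace representation may differ.

```python
from itertools import groupby


def run_length_encode(trace):
    """Collapse consecutive repeats of the same event into (event, count) pairs."""
    return [(event, sum(1 for _ in run)) for event, run in groupby(trace)]


def run_length_decode(encoded):
    """Expand (event, count) pairs back into the original event sequence."""
    return [event for event, count in encoded for _ in range(count)]


# Hypothetical trace: a loop calling `a` three times, then `b`, then `a` twice.
trace = ["a", "a", "a", "b", "a", "a"]
encoded = run_length_encode(trace)
decoded = run_length_decode(encoded)
```

The encoding is lossless, so the original trace can always be recovered, and repeated events such as loop iterations become visible as a single pair with a count.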
A Hardware-Driven Profiling Scheme for Identifying Program Hot Spots to Support Runtime Optimization
- In Proceedings of the 26th Annual International Symposium on Computer Architecture
, 1999
Cited by 68 (4 self)
This paper presents a novel hardware-based approach for identifying, profiling, and monitoring hot spots in order to support runtime optimization of general-purpose programs. The proposed approach consists of a set of tightly coupled hardware tables and control logic modules that are placed in the retirement stage of a processor pipeline removed from the critical path. The features of the proposed design include rapid detection of program hot spots after changes in execution behavior, runtime-tunable selection criteria for hot spot detection, and negligible overhead during application execution. Experiments using several SPEC95 benchmarks, as well as several large Windows NT applications, demonstrate the promise of the proposed design. 1 Introduction. Optimizing compilers can gain significant performance benefits by performing code transformations based on a program's runtime profile. Traditionally, profiles are collected by running an instrumented version of the executable. However, bec...
Complete Removal of Redundant Expressions
, 1998
Cited by 64 (13 self)
Partial redundancy elimination (PRE), the most important component of global optimizers, generalizes the removal of common subexpressions and loop-invariant computations. Because existing PRE implementations are based on code motion, they fail to completely remove the redundancies. In fact, we observed that 73% of loop-invariant statements cannot be eliminated from loops by code motion alone. In dynamic terms, traditional PRE eliminates only half of redundancies that are strictly partial. To achieve a complete PRE, control flow restructuring must be applied. However, the resulting code duplication may cause code size explosion. This paper focuses on achieving a complete PRE while incurring an acceptable code growth. First, we present an algorithm for complete removal of partial redundancies, based on the integration of code motion and control flow restructuring. In contrast to existing complete techniques, we resort to restructuring merely to remove obstacles to code motion, rather th...
Software profiling for hot path prediction: less is more
- SIGPLAN Not
Cited by 61 (0 self)
Recently, there has been a growing interest in exploiting profile information in adaptive systems such as just-in-time compilers, dynamic optimizers, and binary translators. In this paper, we show that sophisticated software profiling schemes that provide highly accurate information in an offline setting are ill-suited for these dynamic code generation systems. We experimentally demonstrate that hot path predictions must be made early in order to control the rising cost of missed opportunities that result from the prediction delay. We also show that existing sophisticated path profiling schemes, if used in an online setting, offer no prediction advantages over simpler schemes that exhibit much lower runtime overheads. Based on these observations, we developed a new low-overhead software profiling scheme for hot path prediction. Using an abstract metric, we compare our scheme to path-profile-based prediction and show that our scheme achieves comparable prediction quality. In our second set of experiments, we include runtime overhead and evaluate the performance of our scheme in a realistic application: Dynamo, a dynamic optimization system. The results show that our prediction scheme clearly outperforms path-profile-based prediction and thus confirm that less profiling, as exhibited in our scheme, will actually lead to more effective hot path prediction.
BIT: A Tool for Instrumenting Java Bytecodes
, 1997
Cited by 56 (0 self)
BIT (Bytecode Instrumenting Tool) is a collection of Java classes that allow one to build customized tools to instrument Java Virtual Machine (JVM) bytecodes. Because understanding program behavior is an essential part of developing effective optimization algorithms, researchers and software developers have built numerous tools that carry out program analysis. Although there are existing tools that analyze and modify executables on a variety of operating systems and machine architectures, there currently is no framework for carrying out the same task for JVM bytecodes. In this paper, we describe BIT, which allows the user to insert calls to analysis methods anywhere in the bytecode, so that information can be extracted from the user program while it is being executed. We also describe several simple tools built using BIT and report on BIT's performance. We found that the overhead for execution speed and size was between 23% and 150%.

