Results 1 - 10 of 21
Bounding worst-case instruction cache performance
- In IEEE Real-Time Systems Symposium, 1994
Cited by 108 (35 self)
The use of caches poses a difficult tradeoff for architects of real-time systems. While caches provide significant performance advantages, they have also been viewed as inherently unpredictable since the behavior of a cache reference depends upon the history of the previous references. The use of caches will only be suitable for real-time systems if a reasonably tight bound on the performance of programs using cache memory can be predicted. This paper describes an approach for bounding the worst-case instruction cache performance of large code segments. First, a new method called Static Cache Simulation is used to analyze a program’s control flow to statically categorize the caching behavior of each instruction. A timing analyzer, which uses the categorization information, then estimates the worst-case instruction cache performance for each loop and function in the program.
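To make the idea of feeding per-instruction categorizations into a timing bound more concrete, the following is a minimal sketch only. The category names (`ALWAYS_HIT`, `ALWAYS_MISS`, `FIRST_MISS`), the cycle costs, and the function `worst_case_loop_cycles` are assumptions introduced for illustration; the abstract does not specify them, and the sketch ignores control flow inside the loop body.

```python
from enum import Enum


class Category(Enum):
    """Hypothetical static categories for an instruction fetch."""
    ALWAYS_HIT = "always_hit"    # assumed: found in cache on every execution
    ALWAYS_MISS = "always_miss"  # assumed: never found in cache
    FIRST_MISS = "first_miss"    # assumed: misses once, then hits


HIT_CYCLES = 1    # assumed cost of a cache hit
MISS_CYCLES = 10  # assumed cost of a cache miss


def worst_case_loop_cycles(categories, iterations):
    """Bound the fetch cycles of a straight-line loop body.

    `categories` holds one Category per instruction in the body;
    `iterations` is the maximum number of loop iterations.
    """
    if iterations <= 0:
        return 0
    total = 0
    for cat in categories:
        if cat is Category.ALWAYS_HIT:
            total += iterations * HIT_CYCLES
        elif cat is Category.ALWAYS_MISS:
            total += iterations * MISS_CYCLES
        else:  # FIRST_MISS: one miss, then hits on the remaining iterations
            total += MISS_CYCLES + (iterations - 1) * HIT_CYCLES
    return total
```

For example, a three-instruction body with one instruction in each category, executed at most three times, would be bounded by summing the three per-instruction totals under these assumed costs.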
Bounding Pipeline and Instruction Cache Performance
- IEEE Transactions on Computers, 1999
Cited by 104 (22 self)
Predicting the execution time of code segments in real-time systems is challenging. Most recently designed machines contain pipelines and caches. Pipeline hazards may result in multicycle delays. Instruction or data memory references may not be found in cache and these misses typically require several cycles to resolve. Whether an instruction will stall due to a pipeline hazard or a cache miss depends on the dynamic sequence of previous instructions executed and memory references performed. Furthermore, these penalties are not independent since delays due to pipeline stalls and cache miss penalties may overlap. This paper describes an approach for bounding the worst and best-case performance of large code segments on machines that exploit both pipelining and instruction caching. First, a method is used to analyze a program’s control flow to statically categorize the caching behavior of each instruction. Next, these categorizations are used in the pipeline analysis of sequences of instructions representing paths within the program. A timing analyzer uses the pipeline path analysis to estimate the worst and best-case execution performance of each loop and function in the program. Finally, a graphical user interface is invoked that allows a user to request timing predictions on portions of the program. The results indicate that the timing analyzer efficiently produces tight predictions of worst and best-case performance for pipelining and instruction caching. Index terms: real-time systems, worst-case execution time, best-case execution time, timing analysis, instruction cache, pipelining
Integrating the timing analysis of pipelining and instruction caching
- In IEEE Real-Time Systems Symposium, 1995
Cited by 92 (16 self)
Recently designed machines contain pipelines and caches. While both features provide significant performance advantages, they also pose problems for predicting execution time of code segments in real-time systems. Pipeline hazards may result in multicycle delays. Instruction or data memory references may not be found in cache and these misses typically require several cycles to resolve. Whether an instruction will stall due to a pipeline hazard or a cache miss depends on the dynamic sequence of previous instructions executed and memory references performed. Furthermore, these penalties are not independent since delays due to pipeline stalls and cache miss penalties may overlap. This paper describes an approach for bounding the worst-case performance of large code segments on machines that exploit both pipelining and instruction caching. First, a method is used to analyze a program’s control flow to statically categorize the caching behavior of each instruction. Next, these categorizations are used in the pipeline analysis of sequences of instructions representing paths within the program. A timing analyzer uses the pipeline path analysis to estimate the worst-case execution performance of each loop and function in the program. Finally, a graphical user interface is invoked that allows a user to request timing predictions on portions of the program.
Pipelined Processors And Worst Case Execution Times
- Real-Time Systems, 1993
Cited by 51 (7 self)
The calculation of worst case execution time (WCET) is a fundamental requirement of almost all scheduling approaches for hard real-time systems. Due to their unpredictability, hardware enhancements such as cache and pipelining are often ignored in attempts to find WCET of programs. This results in estimations that are excessively pessimistic. In this paper a simple instruction pipeline is modeled so that more accurate estimations are obtained. The model presented can be used with any schedulability analysis that allows sections of non-preemptable code to be included. Our results indicate that WCET over-estimates at basic block level can be reduced from over 20% to less than 2%, and that the over-estimates for typical structured real-time programs can be reduced by 17%-40%. 1. Introduction In real-time systems, predictable temporal behaviour is an essential requirement. To be able to predict, and therefore guarantee, the timing behaviour of a real-time system two issues need to be ad...
Real-Time Systems
- 1988
Cited by 48 (3 self)
computing with the physical world via sensors and actuators, physical computing systems promise to give society an improved living standard, greater security, and unparalleled convenience and efficiency.
Static Cache Simulation and its Applications
- 1994
Cited by 41 (13 self)
This work takes a fresh look at the simulation of cache memories. It introduces the technique of static cache simulation that statically predicts a large portion of cache references. To efficiently utilize this technique, a method to perform efficient on-the-fly analysis of programs in general is developed and proved correct. This method is combined with static cache simulation for a number of applications. The application of fast instruction cache analysis provides a new framework to evaluate instruction cache memories that outperforms even the fastest techniques published. Static cache simulation is shown to address the issue of predicting cache behavior, contrary to the belief that cache memories introduce unpredictability to real-time systems that cannot be efficiently analyzed. Static cache simulation for instruction caches provides a large degree of predictability for real-time systems. In addition, an architectural modification through bit-encoding is introduced that provides fu...
Predicting Instruction Cache Behavior
- In ACM SIGPLAN Workshop on Language, Compiler, and Tool Support for Real-Time Systems, 1993
Cited by 33 (7 self)
It has been claimed that the execution time of a program can often be predicted more accurately on an uncached system than on a system with cache memory [5, 20]. Thus, caches are often disabled for critical real-time tasks to ensure the predictability required for scheduling analysis. This work shows that instruction caching can be exploited to gain execution speed without sacrificing predictability. A new method called Static Cache Simulation is introduced which uses control-flow information provided by the back-end of a compiler. This simulator statically predicts the caching behavior of a large portion of the instruction cache references of a program. In addition, a fetch-from-memory bit is added to the instruction encoding which indicates whether an instruction shall be fetched from the instruction cache or from main memory. This bit-encoding approach provides a significant speedup in execution time (a factor of 3-8) over systems with a disabled instruction cache without any sacrifice in...
A Reflective Architecture for Real-Time Operating Systems
- 1997
Cited by 14 (3 self)
Introduction. In the field of complex real-time systems, it is a common understanding that we want integrated system-wide solutions so that design, implementation, testing, monitoring, dependability, and validation (both functional and timing) are all addressed. However, building real-time systems for critical applications and showing that they meet functional, fault tolerance, and timing requirements are complex tasks. At the heart of this complexity there exist several opposing factors. These include:
• the desire for predictability versus the need for flexibility to handle nondeterministic environments, failures, and system evolution,
• the need for abstraction to handle complexity versus the need to include implementation details in order to assess timing properties,
• the need for efficient performance and low cost versus understandability, and ...
Compilation Support for Fine-Grained Execution Time Analysis
- In Proceedings of the ACM SIGPLAN Workshop on Languages, Compilers and Tools for Real-Time Systems, 1994
Cited by 13 (0 self)
This paper discusses the problem of calculating accurate source-level execution time bounds for real-time programs in the presence of code-improving transformations. The description focuses on the compiler's role of collecting information about the structure of the program and the execution times of its basic components, and on the problem of maintaining the accuracy of this information in the presence of code-improving transformations. 1 Introduction The automatic calculation of execution time bounds by static analysis is an active area of research (see for instance [2, 6, 8, 5, 7]). To our knowledge, previous approaches have not yet addressed the problems posed by code-improving transformations: They either require manual intervention by the programmer (i.e., specification of execution frequency bounds at the assembler level), or rule out such transformations altogether. We have designed and implemented a system [9] that calculates source-level execution time bounds of programs writt...
The Spring Scheduling Co-Processor: A Scheduling Accelerator
- IEEE Transactions on VLSI, 1993
Cited by 10 (6 self)
We present SSCoP, a novel VLSI scheduling accelerator for multi-processor real-time systems. The co-processor can be used for static scheduling as well as for on-line scheduling. Many different policies such as earliest deadline first, highest value first, or resource-oriented policies, for example, earliest available time first, or their combinations can be used. When any on-line scheduling algorithm is used it is important to assess not only the speed of the scheduling itself, but also the overall performance impact of the interface of the co-processor to the host system. In this paper we describe the co-processor architecture, a CMOS implementation, an implementation of the host--co-processor interface and a study of the overall performance improvement. We show that the current VLSI chip speeds up the main portion of the scheduling operation by over three orders of magnitude. We also present an overall system improvement analysis by accounting for the operating system overheads and i...
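The two ordering policies named in this abstract, earliest deadline first and highest value first, can be illustrated in software with a short sketch. This is not the SSCoP hardware design; the `Task` fields and both function names are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical task record used only for this illustration."""
    name: str
    deadline: int  # absolute deadline, in arbitrary time units
    value: int     # importance of completing the task


def earliest_deadline_first(tasks):
    """Order tasks so the one with the nearest deadline comes first."""
    return sorted(tasks, key=lambda t: t.deadline)


def highest_value_first(tasks):
    """Order tasks so the most valuable one comes first."""
    return sorted(tasks, key=lambda t: t.value, reverse=True)
```

Because both policies reduce to sorting on a single key, a combined policy could be sketched by sorting on a tuple or a weighted sum of the two fields; how the co-processor actually combines policies is not stated in the abstract.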

