Performance Analysis of Embedded Software Using Implicit Path Enumeration
1995
Cited by 146 (1 self)
Embedded computer systems are characterized by the presence of a processor running application specific software. A large number of these systems must satisfy real-time constraints. This paper examines the problem of determining the bound on the running time of a given program on a given processor. An important aspect of this problem is determining the extreme case program paths. The state of the art solution here relies on an explicit enumeration of program paths. This runs out of steam rather quickly since the number of feasible program paths is typically exponential in the size of the program. We present a solution for this problem, which considers all paths implicitly by using integer linear programming. This solution is implemented in the program cinderella which currently targets a popular embedded processor -- the Intel i960. The preliminary results of using this tool are presented here.
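The implicit enumeration idea can be illustrated on a toy control-flow graph. The sketch below is not cinderella's implementation: the block names, cycle costs, and loop bound are invented for illustration, and a brute-force search over small integer counts stands in for a real ILP solver. It maximizes total cost over basic-block execution counts subject to flow-conservation and loop-bound constraints, so no individual path is ever listed.

```python
from itertools import product

# Hypothetical CFG: B1 -> B2 (loop header) -> {B3 | B4} -> B2, and B2 -> B5 (exit).
# Cycle costs per basic block are made up for this example.
COST = {"B1": 5, "B2": 2, "B3": 10, "B4": 4, "B5": 3}
LOOP_BOUND = 3  # assumed maximum number of loop-body iterations per entry


def feasible(x):
    """Structural and functional constraints on block execution counts."""
    return (
        x["B1"] == 1                                   # program entered once
        and x["B5"] == 1                               # program exits once
        and x["B2"] == x["B1"] + x["B3"] + x["B4"]     # flow into the header
        and x["B2"] == x["B3"] + x["B4"] + x["B5"]     # flow out of the header
        and x["B3"] + x["B4"] <= LOOP_BOUND * x["B1"]  # loop bound
    )


def wcet_bound():
    """Maximize sum(cost * count); exhaustive search replaces the ILP solver."""
    names = list(COST)
    best = None
    for counts in product(range(LOOP_BOUND + 2), repeat=len(names)):
        x = dict(zip(names, counts))
        if feasible(x):
            total = sum(COST[b] * x[b] for b in names)
            if best is None or total > best[0]:
                best = (total, x)
    return best


if __name__ == "__main__":
    bound, counts = wcet_bound()
    print(bound, counts)
```

The maximizing counts pick the more expensive branch on every iteration; a real tool would hand the same objective and constraints to an ILP solver rather than search exhaustively.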
Efficient Microarchitecture Modeling and Path Analysis for Real-Time Software
In IEEE Real-Time Systems Symposium, 1995
Cited by 104 (0 self)
Real-time systems are characterized by the presence of timing constraints in which a task must be completed within a specific amount of time. This paper examines the problem of determining the bound on the worst case execution time (WCET) of a given program on a given processor. There are two important issues in solving this problem: (i) program path analysis, which determines what sequence of instructions will be executed in the worst case, and (ii) microarchitecture modeling, which models the hardware system and determines the WCET of a known sequence of instructions. To obtain a tight estimate on the bound, both these issues must be addressed accurately and efficiently. The latter is becoming difficult to model for modern processors due to the presence of pipelined instruction execution units and cached memory systems. Because of the complexity of the problem, all existing methods that we know of focus only on one of the above issues. This limits the accuracy of the estimated bound and the size of the program that can be analyzed. We present a more effective solution that addresses both issues and uses an integer linear programming formulation to solve the problem. This solution is implemented in the program cinderella, which currently targets the Intel i960KB processor, and we present some experimental results of using this tool.
Bounding Pipeline and Instruction Cache Performance
IEEE Transactions on Computers, 1999
Cited by 104 (22 self)
Predicting the execution time of code segments in real-time systems is challenging. Most recently designed machines contain pipelines and caches. Pipeline hazards may result in multicycle delays. Instruction or data memory references may not be found in cache and these misses typically require several cycles to resolve. Whether an instruction will stall due to a pipeline hazard or a cache miss depends on the dynamic sequence of previous instructions executed and memory references performed. Furthermore, these penalties are not independent since delays due to pipeline stalls and cache miss penalties may overlap. This paper describes an approach for bounding the worst and best-case performance of large code segments on machines that exploit both pipelining and instruction caching. First, a method is used to analyze a program’s control flow to statically categorize the caching behavior of each instruction. Next, these categorizations are used in the pipeline analysis of sequences of instructions representing paths within the program. A timing analyzer uses the pipeline path analysis to estimate the worst and best-case execution performance of each loop and function in the program. Finally, a graphical user interface is invoked that allows a user to request timing predictions on portions of the program. The results indicate that the timing analyzer efficiently produces tight predictions of worst and best-case performance for pipelining and instruction caching. Index terms: real-time systems, worst-case execution time, best-case execution time, timing analysis, instruction cache, pipelining
An Accurate Worst Case Timing Analysis for RISC Processors
In IEEE Real-Time Systems Symposium, 1995
Cited by 94 (3 self)
An accurate and safe estimation of a task's worst case execution time (WCET) is crucial for reasoning about the timing properties of real-time systems. In RISC processors, the execution time of a program construct (e.g., a statement) is affected by various factors such as cache hits/misses and pipeline hazards, and these factors impose serious problems in analyzing the WCETs of tasks. To analyze the timing effects of RISC's pipelined execution and cache memory, we propose extensions to the original timing schema where the timing information associated with each program construct is a simple time-bound. In our approach, associated with each program construct is what we call a WCTA (Worst Case Timing Abstraction), which contains detailed timing information of every execution path that might be the worst case execution path of the program construct. This extension leads to a revised timing schema that is similar to the original timing schema except that concatenation and pruning...
Integrating the timing analysis of pipelining and instruction caching
In IEEE Real-Time Systems Symposium, 1995
Cited by 92 (16 self)
Recently designed machines contain pipelines and caches. While both features provide significant performance advantages, they also pose problems for predicting execution time of code segments in real-time systems. Pipeline hazards may result in multicycle delays. Instruction or data memory references may not be found in cache and these misses typically require several cycles to resolve. Whether an instruction will stall due to a pipeline hazard or a cache miss depends on the dynamic sequence of previous instructions executed and memory references performed. Furthermore, these penalties are not independent since delays due to pipeline stalls and cache miss penalties may overlap. This paper describes an approach for bounding the worst-case performance of large code segments on machines that exploit both pipelining and instruction caching. First, a method is used to analyze a program’s control flow to statically categorize the caching behavior of each instruction. Next, these categorizations are used in the pipeline analysis of sequences of instructions representing paths within the program. A timing analyzer uses the pipeline path analysis to estimate the worst-case execution performance of each loop and function in the program. Finally, a graphical user interface is invoked that allows a user to request timing predictions on portions of the program.
Cache Behavior Prediction by Abstract Interpretation
Science of Computer Programming, 1996
Cited by 72 (11 self)
Cache Memories and Real-Time Applications. Caches are used to improve the access times of fast microprocessors to relatively slow main memories. They can reduce the number of cycles a processor is waiting for data by providing faster access to recently referenced regions of memory. (Hennessy and Patterson [8] describe typical values for caches in 1990 workstations.) Programs with hard real time constraints have to be subjected to a schedulability analysis by the compiler [17, 6]; it has to be determined whether all timing constraints can be satisfied. WCETs (Worst Case Execution Times) for processes have to be used for this. For hardware with caches, the appropriate worst case assumption is that all accesses miss the cache. This is an overly pessimistic assumption which leads to a waste of hardware resources.
OS-controlled cache predictability for real-time systems
In Third IEEE Real-time Technology and Applications Symposium (RTAS), 1997
Cited by 65 (12 self)
Cache-partitioning techniques have been invented to make modern processors with an extensive cache structure useful in real-time systems where task switches disrupt cache working sets and hence make execution times unpredictable. This paper describes an OS-controlled application-transparent cache-partitioning technique. The resulting partitions can be transparently assigned to tasks for their exclusive use. The major drawbacks found in other cache-partitioning techniques, namely waste of memory and additions on the critical performance path within CPUs, are avoided using memory coloring techniques that do not require changes within the chips of modern CPUs or on the critical path for performance. A simple filter algorithm commonly used in real-time systems, a matrix multiplication algorithm and the interaction of both are analysed with regard to cache-induced worst case penalties. Worst-case penalties are determined for different widely-used cache architectures. Some insights regarding the impact of cache architectures on worst-case execution are described.
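The memory coloring mentioned above rests on a general property of physically indexed caches: the page frame number determines which slice of the cache a page maps to. A minimal sketch of that mapping, assuming a physically indexed cache and hypothetical sizes (the function names and parameters are illustrative, not taken from the paper):

```python
def num_colors(cache_size, associativity, page_size):
    """Number of distinct page colors: pages of one color share cache sets."""
    way_size = cache_size // associativity  # bytes covered by one cache way
    return max(1, way_size // page_size)


def page_color(phys_addr, cache_size, associativity, page_size):
    """Color of the physical page containing phys_addr."""
    frame = phys_addr // page_size
    return frame % num_colors(cache_size, associativity, page_size)


if __name__ == "__main__":
    # Assumed configuration: 256 KiB direct-mapped cache, 4 KiB pages.
    cfg = (256 * 1024, 1, 4096)
    print(num_colors(*cfg))
    print(page_color(0x41000, *cfg))
```

An allocator that gives each task only page frames of its own colors keeps tasks from evicting one another's cache lines, which is the effect a partition is meant to achieve.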
Timing Analysis for Data Caches and Set-Associative Caches
1997
Cited by 63 (11 self)
The contributions of this paper are twofold. First, an automatic tool-based approach is described to bound worst-case data cache performance. The given approach works on fully optimized code, performs the analysis over the entire control flow of a program, detects and exploits both spatial and temporal locality within data references, produces results typically within a few seconds, and estimates, on average, 30% tighter WCET bounds than can be predicted without analyzing data cache behavior. Results obtained by running the system on representative programs are presented and indicate that timing analysis of data cache behavior can result in significantly tighter worst-case performance predictions. Second, a framework to bound worst-case instruction cache performance for set-associative caches is formally introduced and operationally described. Results of incorporating instruction cache predictions within pipeline simulation show that timing predictions for set-associative caches remain...
Timing Analysis for Instruction Caches
Real-Time Systems, 2000
Cited by 55 (22 self)
This paper contributes a comprehensive study of a framework to bound worst-case instruction cache performance for caches with arbitrary levels of associativity. The framework is formally introduced, operationally described and its correctness is shown. Results of incorporating instruction cache predictions within pipeline simulation show that timing predictions for set-associative caches remain just as tight as predictions for direct-mapped caches. The low cache simulation overhead allows interactive use of the analysis tool and scales well with increasing associativity. The approach taken is based on a data-flow specification of the problem and provides another step toward worst-case execution time prediction of contemporary architectures and its use in schedulability analysis for hard real-time systems.
Analysis of Cache-related Preemption Delay in Fixed-priority Preemptive Scheduling
1996
Cited by 53 (4 self)
We propose a technique for analyzing cache-related preemption delays of tasks that cause unpredictable variation in task execution time in the context of fixed-priority preemptive scheduling. The proposed technique consists of two steps. The first step performs a per-task analysis to estimate cache-related preemption cost for each execution point in a given task. The second step computes the worst case response time of each task that includes the cache-related preemption delay using a response time equation and a linear programming technique. This step takes as its input the preemption cost information of tasks obtained in the first step. This paper also compares the proposed approach with previous approaches. The results show that the proposed approach gives a prediction of the worst case cache-related preemption delay that is up to 60% tighter than the best of predictions obtained from the previous approaches. Index Terms: real-time system, fixed-priority scheduling, cache memory,...
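As a rough illustration of how a preemption cost enters a response time equation, the sketch below iterates the standard fixed-priority recurrence R = C_i + sum over higher-priority tasks j of ceil(R / T_j) * (C_j + gamma_j). Charging a constant cost gamma_j per preemption is a simplifying assumption made here; the paper instead derives the delay from its per-task analysis and a linear programming step. All task parameters are hypothetical.

```python
def response_time(c_i, higher, deadline):
    """Worst-case response time by fixed-point iteration.

    c_i      -- WCET of the task under analysis
    higher   -- list of (C_j, T_j, gamma_j) for higher-priority tasks, where
                gamma_j is an assumed constant cache-related cost per preemption
    deadline -- iteration stops once the estimate exceeds this value
    Returns the response time, or None if it exceeds the deadline.
    """
    r = c_i
    while r <= deadline:
        # -(-r // t) is an integer ceiling of r / t
        nxt = c_i + sum(-(-r // t) * (c + g) for c, t, g in higher)
        if nxt == r:
            return r
        r = nxt
    return None


if __name__ == "__main__":
    no_cache_cost = [(1, 5, 0), (2, 10, 0)]
    with_cache_cost = [(1, 5, 1), (2, 10, 1)]
    print(response_time(3, no_cache_cost, 20))
    print(response_time(3, with_cache_cost, 20))
```

Comparing the two calls shows how even a small per-preemption cost lengthens the computed response time, which is why tighter preemption-delay estimates matter for schedulability.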

