Results 1 - 10
of
18
Compiler-Decided Dynamic Memory Allocation for Scratch-Pad Based Embedded Systems
, 2006
"... In this research we propose a highly predictable, low overhead and yet dynamic, memory allocation strategy for embedded systems with scratch-pad memory. A scratch-pad is a fast compiler-managed SRAM memory that replaces the hardware-managed cache. It is motivated by its better real-time guarantees v ..."
Abstract
-
Cited by 73 (4 self)
- Add to MetaCart
In this research we propose a highly predictable, low overhead and yet dynamic, memory allocation strategy for embedded systems with scratch-pad memory. A scratch-pad is a fast compiler-managed SRAM memory that replaces the hardware-managed cache. It is motivated by its better real-time guarantees vs cache and by its significantly lower overheads in energy consumption, area and overall runtime, even with a simple allocation scheme. Scratch-pad allocation methods primarily are of two types. First, software-caching schemes emulate the workings of a hardware cache in software. Instructions are inserted before each load/store to check the software-maintained cache tags. Such methods incur large overheads in runtime, code size, energy consumption and SRAM space for tags and deliver poor real-time guarantees, just like hardware caches. A second category of algorithms partitions variables at compile-time into the two banks. However, a drawback of such static allocation schemes is that they do not account for dynamic program behavior. We propose a dynamic allocation methodology for global and stack data and
Dynamic Allocation for Scratch-Pad Memory using Compile-Time Decisions
- the ACM Transactions on Embedded Computing Systems (TECS
, 2006
"... In this research we propose a highly predictable, low overhead and yet dynamic, memory allocation strategy for embedded systems with scratch-pad memory. A scratch-pad is a fast compiler-managed SRAM memory that replaces the hardware-managed cache. It is motivated by its better real-time guarantees v ..."
Abstract
-
Cited by 45 (3 self)
- Add to MetaCart
In this research we propose a highly predictable, low overhead and yet dynamic, memory allocation strategy for embedded systems with scratch-pad memory. A scratch-pad is a fast compiler-managed SRAM memory that replaces the hardware-managed cache. It is motivated by its better real-time guarantees vs cache and by its significantly lower overheads in energy consumption, area and overall runtime, even with a simple allocation scheme. Scratch-pad allocation primarily methods are of two types. First, software-caching schemes emulate the workings of a hardware cache in software. Instructions are inserted before each load/store to check the softwaremaintained cache tags. Such methods incur large overheads in runtime, code size, energy consumption and SRAM space for tags and deliver poor real-time guarantees just like hardware caches. A second category of algorithms partitions variables at compile-time into the two banks. However, a drawback of such static allocation schemes is that they do not account for dynamic program behavior. It is easy to see why a data allocation that never changes at runtime cannot achieve the full locality benefits of a cache. We propose a dynamic allocation methodology for global and stack data and program code that, (i) accounts for changing program requirements at runtime (ii) has no software-caching tags (iii) requires no run-time checks (iv) has extremely low overheads, and (v) yields 100 % predictable memory access times. In this method data
Heap Data Allocation To Scratch-Pad Memory In Embedded Systems
, 2007
"... This thesis presents the first-ever compile-time method for allocating a portion of a program’s dynamic data to scratch-pad memory. A scratch-pad is a fast directly addressed compiler-managed SRAM memory that replaces the hardware-managed cache. It is motivated by its better real-time guarantees vs ..."
Abstract
-
Cited by 43 (5 self)
- Add to MetaCart
(Show Context)
This thesis presents the first-ever compile-time method for allocating a portion of a program’s dynamic data to scratch-pad memory. A scratch-pad is a fast directly addressed compiler-managed SRAM memory that replaces the hardware-managed cache. It is motivated by its better real-time guarantees vs cache and by its signifi-cantly lower overheads in access time, energy consumption, area and overall runtime. Dynamic data refers to all objects allocated at run-time in a program, as opposed to static data objects which are allocated at compile-time. Existing compiler methods for allocating data to scratch-pad are able to place only code, global and stack data (static data) in scratch-pad memory; heap and recursive-function objects(dynamic data) are allocated entirely in DRAM, resulting in poor performance for these dy-namic data types. Runtime methods based on software caching can place data in scratch-pad, but because of their high overheads from software address translation, they have not been successful, especially for dynamic data. In this thesis we present a dynamic yet compiler-directed allocation method for dynamic data that for the first time, (i) is able to place a portion of the dynamic
Memory allocation for embedded systems with a compile-time-unknown scratch-pad size
- the ACM Transactions on Embedded Computing Systems (TECS
, 2005
"... ABSTRACT This paper presents the first memory allocation scheme for embedded systems having scratch-pad memory whose size is unknown at compile time. A scratch-pad memory (SPM) is a fast compiler-managed SRAM that replaces the hardware-managed cache. Its uses are motivated by its better real-time gu ..."
Abstract
-
Cited by 27 (1 self)
- Add to MetaCart
(Show Context)
ABSTRACT This paper presents the first memory allocation scheme for embedded systems having scratch-pad memory whose size is unknown at compile time. A scratch-pad memory (SPM) is a fast compiler-managed SRAM that replaces the hardware-managed cache. Its uses are motivated by its better real-time guarantees as compared to cache and by its significantly lower overheads in energy consumption, area and access time. Existing data allocation schemes for SPM all require that the SPM size be known at compile-time. Unfortunately, the resulting executable is tied to that size of SPM and is not portable to processor implementations having a different SPM size. Such portability would be valuable in situations where programs for an embedded system are not burned into the system at the time of manufacture, but rather are downloaded onto it during deployment, either using a network or portable media such as memory sticks. Such postdeployment code updates are common in distributed networks and in personal hand-held devices. The presence of different SPM sizes in different devices is common because of the evolution in VLSI technology across years. The result is that SPM cannot be used in such situations with downloaded code. To overcome this limitation, this work presents a compiler method whose resulting executable is portable across SPMs of any size. The executable at run-time places frequently used objects in SPM; it considers code, global variables and stack variables for placement in SPM. The allocation is decided by modified loader software before the program is first run and once the SPM size can be discovered. The loader then modifies the program binary based on the decided allocation. To keep the overhead low, much of the pre-processing for the allocation is done at compile-time. Results show that our benchmarks average a 36 % speed increase versus an all-DRAM allocation, while the optimal static allocation scheme, which knows the SPM size at compile-time and is thus an un-achievable upper-bound, is only slightly faster (41 % faster than all-DRAM). Results also show that the overhead from our embedded loader averages about 1 % in both code-size and run-time of our benchmarks.
Dynamic data scratchpad memory management for a memory subsystem with an mmu
- In LCTES ’07: Proceedings of the 2007 ACM SIGPLAN/SIGBED Conference on Languages, Compilers, and Tools
"... In this paper, we propose a dynamic scratchpad memory (SPM) management technique for a horizontally-partitioned memory subsystem with an MMU. The memory subsystem consists of a relatively cheap direct-mapped data cache and SPM. Our technique loads required global data and stack pages into the SPM on ..."
Abstract
-
Cited by 11 (4 self)
- Add to MetaCart
(Show Context)
In this paper, we propose a dynamic scratchpad memory (SPM) management technique for a horizontally-partitioned memory subsystem with an MMU. The memory subsystem consists of a relatively cheap direct-mapped data cache and SPM. Our technique loads required global data and stack pages into the SPM on demand when a function is called. A scratchpad memory manager loads/unloads the data pages and maintains a page table for the MMU. Our approach is based on post-pass analysis and optimization techniques, and it handles the whole program including libraries. The data page mapping is determined by solving an integer linear programming (ILP) formulation that approximates our demand paging technique. The ILP model uses a dynamic call graph annotated with the number of memory accesses and/or cache misses obtained by profiling. We evaluate our technique on thirteen embedded applications. We compare the results to a reference system with a 4-way set associative data cache and the ideal case with the same 4-way cache and SPM, where all global and stack data is placed in the SPM. On average, our approach reduces the total system energy consumption by 8.1 % with no performance degradation. This is equivalent to exploiting 60 % of the room available in energy reduction between the reference case and the ideal case. Categories and Subject Descriptors C.4 [Performance of Systems]: Design Studies; D.3.4 [Programming Languages]: Processors—code generation, compilers, optimization; D.4.2 [Operating
Recursive Function Data Allocation to Scratch-Pad Memory
"... ABSTRACT This paper presents the first automatic scheme to allocate local (stack) data in recursive functions to scratch-pad memory (SPM) in embedded systems. A scratch-pad is a fast directly addressed compiler-managed SRAM memory that replaces the hardware-managed cache. It is motivated by its sign ..."
Abstract
-
Cited by 7 (0 self)
- Add to MetaCart
(Show Context)
ABSTRACT This paper presents the first automatic scheme to allocate local (stack) data in recursive functions to scratch-pad memory (SPM) in embedded systems. A scratch-pad is a fast directly addressed compiler-managed SRAM memory that replaces the hardware-managed cache. It is motivated by its significantly lower access time, energy consumption, real-time bounds, area and overall runtime. Existing compiler methods for allocating data to scratch-pad are able to place only code, global, heap and non-recursive stack data in scratch-pad memory; stack data for recursive functions is allocated entirely in DRAM, resulting in poor performance. In this paper we present a dynamic yet compiler-directed allocation method for recursive function stack data that for the first time, is able to place a portion of recursive stack data in scratch-pad. It has almost no software-caching overhead, and is able to move recursive function data back and forth between scratchpad and DRAM to better track the program’s locality characteristics. With our method, all code, global, stack and heap variables can share the same scratch-pad. When compared to placing all recursive function data in DRAM and all other variables in scratch-pad, our results show that our method reduces the average runtime of our benchmarks by 29.3%, and the average power consumption by 31.1%, for the same size of scratch-pad fixed at 5 % of total data size. Furthermore, significant savings were observed when comparing our method against cache-based alternatives for SPM allocation. Finally, we show results that analyze the effects of profile variation on our allocation approach and present a modified version of our method which minimizes variation for profile-based allocations. 1
Scratchpad Allocation for Concurrent Embedded Software
"... Software-controlled scratchpad memory is increasingly employed in embedded systems as it offers better timing predictability compared to caches. Previous scratchpad allocation algorithms typically consider single process applications. But embedded applications are mostly multi-tasking with real-time ..."
Abstract
-
Cited by 6 (1 self)
- Add to MetaCart
Software-controlled scratchpad memory is increasingly employed in embedded systems as it offers better timing predictability compared to caches. Previous scratchpad allocation algorithms typically consider single process applications. But embedded applications are mostly multi-tasking with real-time constraints, where the scratchpad memory space has to be shared among interacting processes that may preempt each other. In this paper, we develop a novel dynamic scratchpad allocation technique that takes these process interferences into account to improve the performance and predictability of the memory system. We model the application as a Message Sequence Chart (MSC) to best capture the interprocess interactions. Our goal is to optimize the worst-case response time (WCRT) of the application through runtime reloading of the scratchpad memory content at appropriate execution points. We propose an iterative allocation algorithm that consists of two critical steps: (1) analyze the MSC along with the existing allocation to determine potential interference patterns, and (2) exploit this interference information to tune the scratchpad reloading points and content so as to best improve the WCRT. We evaluate our memory allocation scheme on a real-world embedded application controlling an Unmanned Aerial Vehicle (UAV). Categories and Subject Descriptors C.3 [Special-purpose and Application-based Systems]: Real-time
A Survey of Scratch-Pad Memory Management Techniques for low-power and-energy
"... Abstract. Scratch-Pad Memories (SPMs) are considered to be effective in helping reduce memory energy consumption. However, the variety of SPM management techniques complicates the choice of the right one to implement. In this paper, we first give a synthesis on existing SPM management techniques for ..."
Abstract
-
Cited by 4 (2 self)
- Add to MetaCart
Abstract. Scratch-Pad Memories (SPMs) are considered to be effective in helping reduce memory energy consumption. However, the variety of SPM management techniques complicates the choice of the right one to implement. In this paper, we first give a synthesis on existing SPM management techniques for low-power and-energy outlining their comparative advantages, drawbacks and trade-offs. Then, we propose a new general classification which encompasses most existing research works. This classification has the advantage of clearly exhibiting lesser explored techniques, hence providing hints for future research. 1
Scratch-Pad Memory Allocation without Compiler Support for Java Applications
"... ABSTRACT This paper presents the first scratch-pad memory allocation scheme that requires no compiler support for interpreted-language based applications. A scratch-pad memory (SPM) is a fast compilermanaged SRAM that replaces the hardware-managed cache. Its uses are motivated by its better real-tim ..."
Abstract
-
Cited by 2 (0 self)
- Add to MetaCart
(Show Context)
ABSTRACT This paper presents the first scratch-pad memory allocation scheme that requires no compiler support for interpreted-language based applications. A scratch-pad memory (SPM) is a fast compilermanaged SRAM that replaces the hardware-managed cache. Its uses are motivated by its better real-time guarantees as compared to cache and by its significantly lower overheads in energy consumption, area and access time. Interpreted languages are languages such as Java that are interpreted by a run-time environment instead of being executed directly on hardware. All existing memory allocation schemes for SPM require compiler analysis to develop the allocation strategy. Specifically, existing allocation schemes for Java-based applications determine the allocations at compile-time. They then annotate the Java bytecodes with these allocation decisions for the Java Virtual Machine (JVM) to implement the actual allocation at runtime. These existing allocation schemes tie the resulting bytecode to specific SPM sizes, therefore preventing the applications from being portable to different SPM sizes. Further, existing methods do not work for unmodified third-party bytecodes produced by compilers other than their specialized compilers. In this paper, we propose the first ever SPM allocation scheme that is completely implemented inside the JVM. Our method requires no compiler support and works for unmodified bytecodes from any source. Moreover, unlike existing methods, it preserves the portability of bytecode to any SPM size. We investigate our method on the Sun Hotspot JVM on which we achieve a 27.8 % improvement on runtime and 21.8 % on energy saving versus not using the SPM – the only existing alternative for unmodified bytecodes. 1
Optimal stack frame placement and transfer for energy reduction targeting embedded processors with scratch-pad memories
"... Abstract—Memory accesses are a major cause of energy con-sumption for embedded systems and the stack is a frequent target for data accesses. This paper presents a fully software technique which aims at reducing the energy consumption related to the stack by allocating and transferring frames or part ..."
Abstract
-
Cited by 1 (1 self)
- Add to MetaCart
(Show Context)
Abstract—Memory accesses are a major cause of energy con-sumption for embedded systems and the stack is a frequent target for data accesses. This paper presents a fully software technique which aims at reducing the energy consumption related to the stack by allocating and transferring frames or part of frames between a scratch-pad memory and the main memory. The technique utilizes an integer linear formulation of the problem in order to find at compile time the optimal management for the frames. The technique is also extended to integrate existing methods which deal with static memory objects and others which deal with recursive functions. Experimental results show that our technique effectively exploits an available scratch-pad memory space which is only one half of what the stack requires to reduce the stack-related energy consumption by more than 90 % for several applications and on an average of 84 % compared to the case where all the frames of the stack are placed into the main memory. I.