WCET centric data allocation to scratchpad memory
In RTSS, 2005
Cited by 18 (3 self)
Scratchpad memory is a popular choice for on-chip storage in real-time embedded systems. The allocation of code/data to scratchpad memory is performed at compile time leading to predictable memory access latencies. Current scratchpad memory allocation techniques improve the average-case execution time of tasks. For hard real-time systems, on the other hand, worst-case execution time (WCET) is a key metric. In this paper, we propose scratchpad allocation techniques for data memory that aim to minimize a task’s WCET. We first develop an integer linear programming (ILP) based solution which constructs the optimal allocation assuming that all program paths are feasible. Next, we employ branch-and-bound search to more accurately construct the optimal allocation by exploiting infeasible path information. However, the branch-and-bound search is too time-consuming in practice. Therefore, we design fast heuristic searches that achieve near-optimal allocations for all our benchmarks.
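To make the WCET objective concrete, here is a minimal sketch, not the paper's ILP formulation. It replaces the ILP with an exhaustive search over variable subsets and, like the ILP stage described above, treats every path as feasible. The latencies, variable names, sizes and access counts are all hypothetical.

```python
from itertools import combinations

# Hypothetical latencies in cycles (assumptions, not from the paper).
SPM_LATENCY = 1
DRAM_LATENCY = 10

def path_time(path, in_spm):
    """Execution time of one path: base cycles plus memory access cycles."""
    mem = sum(
        count * (SPM_LATENCY if var in in_spm else DRAM_LATENCY)
        for var, count in path["accesses"].items()
    )
    return path["base"] + mem

def wcet(paths, in_spm):
    """WCET estimate: the slowest path, treating every path as feasible."""
    return max(path_time(p, in_spm) for p in paths)

def best_allocation(sizes, paths, capacity):
    """Try every subset of variables that fits and keep the lowest WCET."""
    best = (wcet(paths, frozenset()), frozenset())
    names = sorted(sizes)
    for k in range(1, len(names) + 1):
        for subset in combinations(names, k):
            if sum(sizes[v] for v in subset) <= capacity:
                chosen = frozenset(subset)
                candidate = wcet(paths, chosen)
                if candidate < best[0]:
                    best = (candidate, chosen)
    return best

# Hypothetical task: two paths, three variables, 8 bytes of scratchpad.
sizes = {"a": 4, "b": 4, "c": 8}
paths = [
    {"base": 100, "accesses": {"a": 50, "c": 5}},
    {"base": 120, "accesses": {"b": 40, "c": 5}},
]
result = best_allocation(sizes, paths, capacity=8)
```

In this toy input, placing only `a` speeds up the first path but leaves the second path as the new worst case, so the search ends up choosing both `a` and `b`. This shift of the critical path is what separates a WCET-centric allocation from one driven by average-case access counts.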
Memory allocation for embedded systems with a compile-time-unknown scratch-pad size
ACM Transactions on Embedded Computing Systems (TECS), 2005
Cited by 12 (1 self)
This paper presents the first memory allocation scheme for embedded systems having scratch-pad memory whose size is unknown at compile time. A scratch-pad memory (SPM) is a fast compiler-managed SRAM that replaces the hardware-managed cache. Its use is motivated by its better real-time guarantees as compared to cache and by its significantly lower overheads in energy consumption, area and access time. Existing data allocation schemes for SPM all require that the SPM size be known at compile time. Unfortunately, the resulting executable is tied to that size of SPM and is not portable to processor implementations having a different SPM size. Such portability would be valuable in situations where programs for an embedded system are not burned into the system at the time of manufacture, but rather are downloaded onto it during deployment, either using a network or portable media such as memory sticks. Such post-deployment code updates are common in distributed networks and in personal hand-held devices. The presence of different SPM sizes in different devices is common because of the evolution in VLSI technology across years. The result is that SPM cannot be used in such situations with downloaded code. To overcome this limitation, this work presents a compiler method whose resulting executable is portable across SPMs of any size. The executable at run time places frequently used objects in SPM; it considers code, global variables and stack variables for placement in SPM. The allocation is decided by modified loader software before the program is first run and once the SPM size can be discovered. The loader then modifies the program binary based on the decided allocation. To keep the overhead low, much of the pre-processing for the allocation is done at compile time.
Results show that our benchmarks average a 36% speed increase versus an all-DRAM allocation, while the optimal static allocation scheme, which knows the SPM size at compile time and is thus an unachievable upper bound, is only slightly faster (41% faster than all-DRAM). Results also show that the overhead from our embedded loader averages about 1% in both code size and run time of our benchmarks.
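The split between compile-time pre-processing and load-time decision can be sketched as follows. This is an illustration under stated assumptions, not the paper's algorithm: the ranking metric (profiled accesses per byte), the greedy fill, and all object names and numbers are hypothetical.

```python
def compile_time_ranking(objects):
    """Pre-processing step: sort objects by profiled accesses per byte, highest first."""
    return sorted(objects, key=lambda o: o["accesses"] / o["size"], reverse=True)

def load_time_allocate(ranked, spm_size):
    """Loader step: walk the precomputed ranking and place each object that still fits."""
    placed, free = [], spm_size
    for obj in ranked:
        if obj["size"] <= free:
            placed.append(obj["name"])
            free -= obj["size"]
    return placed

# Hypothetical profile data for code, global and stack objects.
objects = [
    {"name": "log_buf", "size": 128, "accesses": 1280},
    {"name": "fir_loop", "size": 64, "accesses": 6400},
    {"name": "coeffs", "size": 32, "accesses": 1600},
]
ranked = compile_time_ranking(objects)   # done once, shipped with the binary

# The same ranking serves devices with different SPM sizes.
small = load_time_allocate(ranked, spm_size=64)
medium = load_time_allocate(ranked, spm_size=128)
large = load_time_allocate(ranked, spm_size=256)
```

The expensive work (profiling and ranking) happens once at compile time, so the loader only needs a linear pass once the SPM size is known.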
Scratchpad memory management for portable systems with a memory management unit
Conf. Embedded Software, 2006
Cited by 10 (4 self)
In this paper, we present a dynamic scratchpad memory allocation strategy targeting a horizontally partitioned memory subsystem for contemporary embedded processors. The memory subsystem is equipped with a memory management unit (MMU), and physically addressed scratchpad memory (SPM) is mapped into the virtual address space. A small minicache is added to further reduce energy consumption and improve performance. Using the MMU’s page fault exception mechanism, we track page accesses and copy frequently executed code sections into the SPM before they are executed. Because the minimal transfer unit between the external memory and the SPM is a single memory page, good code placement is of great importance for the success of our method. Based on profiling information, our postpass optimizer divides the application binary into pageable, cacheable, and uncacheable regions. The latter two are placed at fixed locations in the external memory, and only pageable code is copied on demand to the SPM from the external memory. Pageable code is grouped into sections whose sizes are equal to the physical page size of the MMU. We discuss code grouping techniques and also analyze the effect of the minicache on execution time and energy consumption. We evaluate our SPM allocation strategy with twelve embedded applications, including MPEG-4. Compared to a fully-cached configuration, on average we achieve a 12% improvement in runtime performance and a 33% reduction in energy consumption by the memory system.
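The on-demand copying of pageable code can be illustrated with a small simulation. This is a sketch under assumptions: the abstract does not state a replacement policy, so oldest-first eviction is assumed here, and the `SpmManager` class and its method names are hypothetical.

```python
from collections import deque

class SpmManager:
    """Toy model of a page-fault-driven scratchpad manager."""

    def __init__(self, frames):
        self.frames = frames       # number of MMU-page-sized slots in the SPM
        self.resident = deque()    # pages currently in the SPM, oldest first
        self.faults = 0            # each fault stands for one page copy from external memory

    def access(self, page):
        """Execute code from `page`; copy it into the SPM first if it is not resident."""
        if page in self.resident:
            return "spm"
        self.faults += 1
        if len(self.resident) == self.frames:
            self.resident.popleft()    # assumed policy: evict the oldest page
        self.resident.append(page)
        return "fault"

mgr = SpmManager(frames=2)
trace = [mgr.access(p) for p in [0, 1, 0, 1, 2, 2]]
```

Because every fault costs a full page transfer, grouping code that executes together into the same page lowers the fault count, which is why the abstract stresses code placement.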
Dynamic Allocation for Scratch-Pad Memory using Compile-Time Decisions
ACM Transactions on Embedded Computing Systems (TECS), 2006
Cited by 8 (2 self)
In this research we propose a highly predictable, low-overhead and yet dynamic memory allocation strategy for embedded systems with scratch-pad memory. A scratch-pad is a fast compiler-managed SRAM memory that replaces the hardware-managed cache. It is motivated by its better real-time guarantees versus cache and by its significantly lower overheads in energy consumption, area and overall runtime, even with a simple allocation scheme. Scratch-pad allocation methods are primarily of two types. First, software-caching schemes emulate the workings of a hardware cache in software. Instructions are inserted before each load/store to check the software-maintained cache tags. Such methods incur large overheads in runtime, code size, energy consumption and SRAM space for tags, and deliver poor real-time guarantees just like hardware caches. A second category of algorithms partitions variables at compile time into the two banks. However, a drawback of such static allocation schemes is that they do not account for dynamic program behavior. It is easy to see why a data allocation that never changes at runtime cannot achieve the full locality benefits of a cache. We propose a dynamic allocation methodology for global and stack data and program code that (i) accounts for changing program requirements at runtime, (ii) has no software-caching tags, (iii) requires no run-time checks, (iv) has extremely low overheads, and (v) yields 100% predictable memory access times. In this method data
Integrated Scratchpad Memory Optimization and Task Scheduling for MPSoC Architectures
Cited by 8 (1 self)
Multiprocessor system-on-chip (MPSoC) is an integrated circuit containing multiple instruction-set processors on a single chip that implements most of the functionality of a complex electronic system. An MPSoC architecture is, in general, customized for an embedded application. A critical component of this customization process is the on-chip memory system configuration. Embedded systems increasingly employ software-controlled scratchpad memory (SPM) due to its inherent advantages in terms of area, energy, and timing predictability compared to caches. An application-specific flexible partitioning of the on-chip SPM budget among the processors is critical for performance optimization. Moreover, scheduling the tasks of an application onto the processors and partitioning the SPM are interdependent even though these steps are decoupled in the traditional design space exploration process. In this work, we design an integrated task mapping, scheduling, SPM partitioning, and data allocation technique based on an Integer Linear Programming (ILP) formulation. Our ILP formulation explores the optimal performance limit and shows that integrated task scheduling and SPM optimization improves performance by up to 80% for embedded applications.
Categories and Subject Descriptors C.3 [Special-purpose and Application-based Systems]: Real-time
Dynamic Scratchpad Memory Management for Code in Portable Systems with an MMU
Cited by 6 (0 self)
In this work, we present a dynamic memory allocation technique for a novel, horizontally partitioned memory subsystem targeting contemporary embedded processors with a memory management unit (MMU). We propose to replace the on-chip instruction cache with a scratchpad memory (SPM) and a small minicache. Serializing the address translation with the actual memory access enables the memory system to access either only the SPM or the minicache. Independent of the SPM size and based solely on profiling information, a postpass optimizer classifies the code of an application binary into a pageable and a cacheable code region. The latter is placed at a fixed location in the external memory and cached by the minicache. The former, the pageable code region, is copied on demand to the SPM before execution. Both the pageable code region and the SPM are logically divided into pages the size of an MMU memory page. Using the MMU’s page fault exception mechanism, a runtime scratchpad memory manager (SPMM) tracks page accesses and copies frequently executed code pages to the SPM before they are executed. In order to minimize the number of page transfers from the external memory to the SPM, good code placement techniques become more important with increasing sizes of the MMU pages. We discuss code-grouping techniques and provide an analysis of the effect of the MMU’s page size on execution time, energy consumption, and external
Scratchpad Allocation for Concurrent Embedded Software
Cited by 3 (1 self)
Software-controlled scratchpad memory is increasingly employed in embedded systems as it offers better timing predictability compared to caches. Previous scratchpad allocation algorithms typically consider single-process applications. But embedded applications are mostly multi-tasking with real-time constraints, where the scratchpad memory space has to be shared among interacting processes that may preempt each other. In this paper, we develop a novel dynamic scratchpad allocation technique that takes these process interferences into account to improve the performance and predictability of the memory system. We model the application as a Message Sequence Chart (MSC) to best capture the interprocess interactions. Our goal is to optimize the worst-case response time (WCRT) of the application through runtime reloading of the scratchpad memory content at appropriate execution points. We propose an iterative allocation algorithm that consists of two critical steps: (1) analyze the MSC along with the existing allocation to determine potential interference patterns, and (2) exploit this interference information to tune the scratchpad reloading points and content so as to best improve the WCRT. We evaluate our memory allocation scheme on a real-world embedded application controlling an Unmanned Aerial Vehicle (UAV).
Categories and Subject Descriptors C.3 [Special-purpose and Application-based Systems]: Real-time
Recursive Function Data Allocation to Scratch-Pad Memory
Cited by 3 (0 self)
This paper presents the first automatic scheme to allocate local (stack) data in recursive functions to scratch-pad memory (SPM) in embedded systems. A scratch-pad is a fast directly addressed compiler-managed SRAM memory that replaces the hardware-managed cache. It is motivated by its significantly lower access time, energy consumption, real-time bounds, area and overall runtime. Existing compiler methods for allocating data to scratch-pad are able to place only code, global, heap and non-recursive stack data in scratch-pad memory; stack data for recursive functions is allocated entirely in DRAM, resulting in poor performance. In this paper we present a dynamic yet compiler-directed allocation method for recursive function stack data that, for the first time, is able to place a portion of recursive stack data in scratch-pad. It has almost no software-caching overhead, and is able to move recursive function data back and forth between scratch-pad and DRAM to better track the program’s locality characteristics. With our method, all code, global, stack and heap variables can share the same scratch-pad. When compared to placing all recursive function data in DRAM and all other variables in scratch-pad, our results show that our method reduces the average runtime of our benchmarks by 29.3%, and the average power consumption by 31.1%, for the same size of scratch-pad fixed at 5% of total data size. Furthermore, significant savings were observed when comparing our method against cache-based alternatives for SPM allocation. Finally, we show results that analyze the effects of profile variation on our allocation approach and present a modified version of our method which minimizes variation for profile-based allocations.
Efficient dynamic heap allocation of scratch-pad memory
In ISMM ’08: Proceedings of the 7th International Symposium on Memory Management, 2008
Cited by 3 (0 self)
An increasing number of processor architectures support scratch-pad memory: software-managed on-chip memory. Scratch-pad memory provides low-latency data storage, like on-chip caches, but under explicit software control. The simple design and predictable nature of scratch-pad memories have seen them incorporated into a number of embedded and real-time system processors. They are also employed by multi-core architectures to isolate processor-core-local data and act as low-latency inter-core shared memory. Managing scratch-pad memory by hand is time-consuming, error-prone and potentially wasteful; tools that automatically manage this memory are essential for its use by general-purpose software. While there has been promising work in compile-time allocation of scratch-pad memory, there will always be applications which require run-time allocation. Modern dynamic memory management
A Survey of Scratch-Pad Memory Management Techniques for Low-Power and -Energy
Cited by 2 (0 self)
Scratch-Pad Memories (SPMs) are considered to be effective in helping reduce memory energy consumption. However, the variety of SPM management techniques complicates the choice of the right one to implement. In this paper, we first give a synthesis of existing SPM management techniques for low-power and -energy, outlining their comparative advantages, drawbacks and trade-offs. Then, we propose a new general classification which encompasses most existing research works. This classification has the advantage of clearly exhibiting less-explored techniques, hence providing hints for future research.

