Automatic generation of application-specific architectures for heterogeneous multiprocessor system-on-chip
- Proc. of DAC 2001, 2001
Abstract - Cited by 45 (11 self)
We present a design flow for the generation of application-specific multiprocessor architectures. In the flow, architectural parameters are first extracted from a high-level system specification. These parameters are used to instantiate architectural components, such as processors, memory modules and communication networks. The flow includes the automatic generation of a communication coprocessor that adapts the processor to the communication network in an application-specific way. Experiments with two system examples show the effectiveness of the presented design flow.
Hardware Support for Real-Time Embedded Multiprocessor System-on-a-Chip Memory Management
- Proceedings of the Tenth International Symposium on Hardware/Software Codesign (CODES'02), 2002
Abstract - Cited by 10 (2 self)
The aggressive evolution of the semiconductor industry -- smaller process geometries, higher densities, and greater chip complexity -- has provided design engineers the means to create complex, high-performance System-on-a-Chip (SoC) designs. Such SoC designs typically have more than one processor and a large memory, all on the same chip. Handling global on-chip memory allocation and deallocation in a dynamic yet deterministic way is an important issue for the upcoming billion-transistor multiprocessor SoC designs. To achieve this, we propose a memory management hierarchy we call Two-Level Memory Management. To implement this memory management scheme -- which presents a paradigm shift in the way designers look at on-chip dynamic memory allocation -- we present a System-on-a-Chip Dynamic Memory Management Unit (SoCDMMU) for allocation of the global on-chip memory, which we refer to as Level Two memory management (Level One is the operating system's management of the memory allocated to a particular on-chip Processing Element). In this way, processing elements (heterogeneous or non-heterogeneous hardware or software) in an SoC can request and be granted portions of the global memory in a fast and deterministic time (for an example SoC with four processing elements, dynamic allocation of the global on-chip memory takes sixteen cycles per allocation/deallocation in the worst case). In this paper, we show how to modify an existing Real-Time Operating System (RTOS) to support the proposed SoCDMMU. Our example shows that a multiprocessor SoC that uses the SoCDMMU achieves a 440% overall speedup in application transition time over a fully shared memory that does not use the SoCDMMU.
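The idea of granting portions of a global memory to processing elements in bounded time can be illustrated with a minimal software sketch. This is not the SoCDMMU design from the paper: the class name `GlobalBlockAllocator`, the fixed-size block table, and the first-fit scan are all assumptions chosen for illustration. The only property it is meant to show is that each call does at most one pass over a fixed-size table, so its worst-case cost is known in advance.

```python
class GlobalBlockAllocator:
    """Hypothetical Level Two allocator: fixed-size blocks, bounded-time calls.

    Illustrative only; the names and the first-fit policy are assumptions,
    not the SoCDMMU's actual mechanism.
    """

    def __init__(self, num_blocks, num_pes):
        # owner[i] is the id of the processing element (PE) holding block i,
        # or None when the block is free.
        self.owner = [None] * num_blocks
        self.num_pes = num_pes

    def allocate(self, pe_id, count):
        """Grant `count` contiguous blocks to `pe_id`.

        Returns the index of the first granted block, or None if no run of
        `count` free blocks exists. The scan visits each table entry at most
        once, so the worst case is bounded by the table size.
        """
        if not 0 <= pe_id < self.num_pes or count <= 0:
            raise ValueError("invalid PE id or block count")
        run = 0  # length of the current run of free blocks
        for i, holder in enumerate(self.owner):
            run = run + 1 if holder is None else 0
            if run == count:
                start = i - count + 1
                for j in range(start, i + 1):
                    self.owner[j] = pe_id
                return start
        return None

    def free(self, pe_id, start, count):
        """Return `count` blocks starting at `start`; only the owner may free."""
        if start < 0 or count <= 0 or start + count > len(self.owner):
            raise ValueError("block range out of bounds")
        blocks = range(start, start + count)
        if any(self.owner[j] != pe_id for j in blocks):
            raise ValueError("blocks are not owned by this PE")
        for j in blocks:
            self.owner[j] = None
```

In a two-level arrangement of the kind the abstract describes, each processing element's operating system would then sub-allocate (Level One) from whatever blocks it was granted here.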
Reconfigurable Stream Processors for Wireless Base-Stations
- 2003
Abstract - Cited by 3 (3 self)
The need to support evolving standards, rapid prototyping and fast time-to-market are some of the key reasons for desiring programmability in future wireless base-stations. However, supporting highly complex signal processing algorithms for multiple users at high data rates (in Mbps), requiring billions of operations per second, while providing power efficiency presents challenges in attaining that goal. This paper demonstrates the viability of stream processors for physical layer base-station processing by demonstrating a fully loaded 3G base-station for 32 users meeting 128 Kbps/user (rate 1/2, constraint length 9 coded data rate) at an estimated power consumption of 8.2 W. However, when the system load decreases and the amount of data parallelism reduces, clusters of ALUs in the stream processor remain unutilized and waste power. We provide reconfiguration support in stream processors that allows us to turn off clusters dynamically as the data parallelism reduces. Support is also provided to turn off individual ALUs, so that only the minimum number of ALUs in active clusters needed to meet performance requirements are scheduled. When the system load changes from a fully loaded 32-user base-station at constraint length 9 to, say, a more typical 16 active users at constraint length 7, the power consumption can reduce to 2.23 W, with 5.06 W of power savings due to frequency scaling and a further 0.91 W due to hardware reconfiguration. Thus, by providing real-time and power-efficient support in fully programmable hardware, the architecture and algorithm development for wireless communication systems can be made independent, and limited versions of future systems can be deployed which can co-exist with current systems until programmable architecture research rises to meet the challenge.
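The power figures quoted in this abstract are mutually consistent, which a short worked check makes explicit. The symbols below are notation introduced here for convenience, not taken from the paper; the total saving and the percentage are derived from the abstract's numbers rather than stated in it.

```latex
\begin{align*}
P_{\text{full}} &= 8.2\ \text{W}
  && \text{(32 users, constraint length 9)} \\
\Delta P_{\text{freq}} &= 5.06\ \text{W}
  && \text{(frequency scaling)} \\
\Delta P_{\text{reconf}} &= 0.91\ \text{W}
  && \text{(hardware reconfiguration)} \\
P_{\text{reduced}} &= P_{\text{full}} - \Delta P_{\text{freq}} - \Delta P_{\text{reconf}}
  = 8.2 - 5.06 - 0.91 = 2.23\ \text{W}
  && \text{(16 users, constraint length 7)} \\
\frac{\Delta P_{\text{freq}} + \Delta P_{\text{reconf}}}{P_{\text{full}}}
  &= \frac{5.97}{8.2} \approx 72.8\%
\end{align*}
```

By this arithmetic, frequency scaling accounts for most of the reduction (5.06 W of the 5.97 W total), with hardware reconfiguration contributing the remaining 0.91 W.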

