Allocation and scheduling for MPSoCs via decomposition and no-good generation
- In Procs. of the 11th Intern. Conference on Principles and Practice of Constraint Programming - CP 2005
, 2005
Cited by 27 (13 self)
This paper proposes a decomposition approach to the allocation and scheduling of a multi-task application on a multiprocessor system-on-chip (MPSoC) [Wolf, 2004]. This is currently one of the most critical problems in electronic design automation for Very-Large Scale Integrated (VLSI) circuits. With the limits of chip integration reaching beyond one billion elementary devices, current advanced integrated hardware platforms for high-end consumer applications (e.g. multimedia-enabled phones) contain multiple processors and memories, as well as complex on-chip interconnects. The hardware resources in these MPSoCs need to be optimally allocated and scheduled under tight throughput constraints when executing a target software workload (e.g. a video decoder). The multi-processor system
Codex-dp: Co-design of Communicating Systems Using Dynamic Programming
, 1998
Cited by 19 (0 self)
In this paper, we present a novel algorithm based on dynamic programming with binning to find, subject to a given deadline, the minimum-cost coarse-grain hardware/software partitioning and mapping of communicating processes in a generalized task graph. The task graph includes computational processes which communicate with each other by means of blocking/nonblocking communication mechanisms at times including, but also other than, the beginning or end of their lifetime. The proposed algorithm has been implemented. Experimental results are reported and discussed.
Contention-aware Application Mapping for Network-on-Chip Communication Architectures
Cited by 12 (1 self)
In this paper, we analyze the impact of network contention on the application mapping for tile-based Network-on-Chip (NoC) architectures. Our main theoretical contribution consists of an integer linear programming (ILP) formulation of the contention-aware application mapping problem which aims at minimizing the inter-tile network contention. To solve the scalability problem caused by the ILP formulation, we propose a linear programming (LP) approach followed by a mapping heuristic. Taken together, they provide near-optimal solutions while reducing the runtime significantly. Experimental results show that, compared to other existing mapping approaches based on communication energy minimization, our contention-aware mapping technique achieves a significant decrease in packet latency (and implicitly, a throughput increase) with a negligible communication energy overhead.
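The contention objective described in this abstract can be illustrated with a small sketch. It is not the paper's ILP or LP formulation: it assumes a 2D mesh with deterministic XY routing (the abstract does not name a routing scheme), counts a "contention" whenever two flows share a directed link, and searches mappings exhaustively. All function names here are hypothetical.

```python
from itertools import permutations

def xy_route(src, dst):
    """Directed links used by XY routing (assumed) from tile src to tile dst."""
    (x, y), (dx, dy) = src, dst
    links = []
    while x != dx:  # travel along X first
        nx = x + (1 if dx > x else -1)
        links.append(((x, y), (nx, y)))
        x = nx
    while y != dy:  # then along Y
        ny = y + (1 if dy > y else -1)
        links.append(((x, y), (x, ny)))
        y = ny
    return links

def contention(mapping, flows):
    """Number of flow pairs whose routes share at least one directed link."""
    routes = [set(xy_route(mapping[a], mapping[b])) for a, b in flows]
    return sum(
        1
        for i in range(len(routes))
        for j in range(i + 1, len(routes))
        if routes[i] & routes[j]
    )

def best_mapping(tasks, tiles, flows):
    """Exhaustive search over task-to-tile assignments; tiny instances only."""
    best = None
    for perm in permutations(tiles, len(tasks)):
        mapping = dict(zip(tasks, perm))
        cost = contention(mapping, flows)
        if best is None or cost < best[0]:
            best = (cost, mapping)
    return best
```

For example, with three tiles in a row and two flows that both end at task `b`, `best_mapping(["a", "b", "c"], [(0, 0), (1, 0), (2, 0)], [("a", "b"), ("c", "b")])` places `b` on the middle tile so the two routes use disjoint links. Exhaustive search grows factorially, which is the scalability problem the abstract's LP-plus-heuristic approach is meant to avoid.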
Kuchcinski K.: Design Space Exploration in System Level Synthesis under Memory Constraints
- 25th Euromicro Conference, Workshop on Digital System Design
Synthesis of Application Specific Multiprocessor Architectures for Process Networks
- In Proc. 17th International Conference on VLSI Design
, 2004
Cited by 7 (3 self)
In this paper, we address the problem of synthesis of application-specific multiprocessor SoC architectures for process networks of streaming applications. An application is modeled as a Kahn Process Network (KPN), which makes the parallelism present in the application explicit. The synthesis process involves selection of computation modules, memory modules, and communication architecture, and mapping of KPN processes onto compute units and of channels onto memory modules. Our solution minimizes hardware cost while taking into account the performance constraints. One of the salient features of our work is that it takes into account the additional overheads caused by data communication conflicts. Our method uses the average processing requirements of the KPN to handle data-dependent behavior of processes and cycles within the KPN. In contrast to others, we do not perform static scheduling; only mapping and synthesis are done.
Automated Mapping for Heterogeneous Multiprocessor Embedded Systems
, 2007
Low contention mapping of real-time tasks onto tilepro 64 core processors
- 2009 15th IEEE Real-Time and Embedded Technology and Applications Symposium
Cited by 6 (0 self)
Predictability of task execution is paramount for real-time systems so that upper bounds of execution times can be determined via static timing analysis. Static timing analysis on network-on-chip (NoC) processors may result in unsafe underestimations when the underlying communication paths are not considered. This stems from contention on the underlying network when data from multiple sources share parts of a routing path in the NoC. Contention analysis must be performed to provide safe and reliable bounds. In addition, the overhead incurred by contention due to interprocess communication (IPC) can be reduced by mapping tasks to cores in such a way that contention is minimized. This paper makes several contributions to increase predictability of real-time tasks on NoC architectures. First, we contribute a constraint solver that exhaustively maps real-time tasks onto cores to minimize contention and improve predictability. Second, we develop a novel TDMA-like approach to map communication traces into time frames to ensure separation of analysis for temporally disjoint communication. Third, we contribute a novel multi-heuristic approximation, HSolver, for rapid discovery of low contention solutions. HSolver reduces contention by up to 70% when compared with naïve and constrained exhaustive solutions. We evaluate our experiments using a micro-benchmark of task system IPC on the TilePro64, a real, physical NoC processor with 64 cores. To the best of our knowledge, this is the first work to consider IPC for worst-case time frames to simplify analysis and to measure the impact on actual hardware for NoC-based real-time multicore systems.
Improved constraints for multiprocessor system scheduling
- In DATE
, 2002
Cited by 2 (0 self)
MILP-based models are useful for finding optimal schedules and for proving their optimality. Because of the problem complexity, model improvements have to be investigated. We analyze the constraints necessary for precluding resource conflicts, present novel formulations, and evaluate them. The efficiency of the solution process can be improved significantly by selecting the proper formulation.
The scheduling problem
Scheduling applications in computer science range from HL-Synthesis through Real-Time Embedded Systems to HW/SW-Co-design. We deal with a STATIC scheduling problem for a heterogeneous multiprocessor system, where TASKS have to be assigned to PROCESSING MODULES and to be ordered. COMMUNICATIONS and BUS STRUCTURE
Automated task allocation on single chip, hardware multithreaded, multiprocessor systems
- Workshop on Embedded Parallel Architectures (WEPA-1)
, 2004
Cited by 2 (0 self)
The mapping of application functionality onto multiple multithreaded processing elements of a high performance embedded system is currently a slow and arduous task for application developers. Previous attempts at automation have either ignored hardware support for multithreading and focused on scheduling, or have overlooked the architectural peculiarities of these systems. This work attempts to fill the void by formulating and solving the mapping problem for these architectures. In particular, the task allocation problem for a popular multithreaded, multiprocessor embedded system, the Intel IXP1200 network processor, is encoded into a 0-1 Integer Linear Programming problem. This method proves to be computationally efficient and produces results that are within 5% of aggregate egress bandwidths achieved by hand-tuned implementations on two representative
HW/SW Codesign Incorporating Edge Delays Using Dynamic Programming
- Proceedings of the Euromicro Symposium on Digital System Design, IEEE Computer Society
, 2003
Cited by 2 (0 self)
We present an algorithm based on dynamic programming to perform the HW/SW partitioning and scheduling of a given task graph for minimum latency subject to resource constraints. The major contribution of this paper is to consider the edge communication delays in the dynamic programming solution of the problem. The algorithm has polynomial run-time complexity on trees. We also introduce a pruning technique to reduce the runtime of the worst-case scenario of directed acyclic graphs (DAGs). The algorithm has been implemented and the results are reported. A very fast, high-quality heuristic is also proposed and implemented to provide good solutions in negligible run time.
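To make the idea of dynamic programming with edge delays concrete, here is a minimal sketch under strong simplifying assumptions that are not taken from the paper: the task graph is a simple chain, tasks run strictly one after another, hardware area is an integer budget, and an edge delay is paid only when consecutive tasks sit on different sides of the HW/SW boundary. The function name and parameters are hypothetical; the paper's algorithm targets trees and DAGs.

```python
def partition_chain(sw_time, hw_time, hw_area, edge_delay, area_budget):
    """Minimum total latency of a task chain under a hardware area budget.

    sw_time[i], hw_time[i]: execution time of task i in software / hardware.
    hw_area[i]: integer area used if task i is placed in hardware.
    edge_delay[i]: delay between task i and i+1 when they are on
                   different sides of the HW/SW boundary.
    """
    INF = float("inf")
    n = len(sw_time)
    # best[side][a]: min latency so far with the last task on `side`
    # (0 = SW, 1 = HW) and exactly `a` units of hardware area used.
    best = [[INF] * (area_budget + 1) for _ in range(2)]
    best[0][0] = sw_time[0]
    if hw_area[0] <= area_budget:
        best[1][hw_area[0]] = hw_time[0]
    for i in range(1, n):
        nxt = [[INF] * (area_budget + 1) for _ in range(2)]
        for a in range(area_budget + 1):
            for prev in (0, 1):
                cur = best[prev][a]
                if cur == INF:
                    continue
                # Place task i in software: no extra area.
                cross = edge_delay[i - 1] if prev == 1 else 0
                nxt[0][a] = min(nxt[0][a], cur + cross + sw_time[i])
                # Place task i in hardware: needs hw_area[i] more area.
                a2 = a + hw_area[i]
                if a2 <= area_budget:
                    cross = edge_delay[i - 1] if prev == 0 else 0
                    nxt[1][a2] = min(nxt[1][a2], cur + cross + hw_time[i])
        best = nxt
    return min(min(row) for row in best)
```

The table has `2 * (area_budget + 1)` states per task, so the run time is linear in the number of tasks times the budget. With three tasks of software time 10, hardware time 2, unit area, and edge delays of 5, a budget of one area unit is best spent on an end task, since a hardware task in the middle pays the boundary delay twice.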