Topology Virtualization for Throughput Maximization on Many-Core Platforms
- in Parallel and Distributed Systems (ICPADS), 2012 IEEE 18th International Conference on, 2012
Abstract - As transistor feature sizes continue to scale down into the deep sub-micron domain, IC chip performance variation caused by the manufacturing process becomes non-negligible and can cause significant discrepancies between an application's nominal design and its actual realization on individual many-core platforms. In this paper, we study the problem of how to reduce the total schedule length of a task graph when realizing its nominal design on an individual Network-on-Chip (NoC) based many-core platform with faulty cores. Rather than following traditional approaches that redefine the mapping/scheduling decisions in the nominal design, our methods judiciously mirror the physical architecture of each individual platform onto the logical platform on which the nominal design is conducted. To facilitate this physical/logical architecture virtualization, we develop a performance metric based on opportunity cost, a concept borrowed from economics. Three virtualization heuristics are presented in this paper. Our experimental results show that the proposed approach achieves up to a 30% (average 15%) performance improvement by taking advantage of the heterogeneity of each individual platform.
Heterogeneity Exploration for Peak Temperature Reduction on Multi-Core Platforms
Abstract - As IC technology continues to evolve and more transistors are integrated into a single chip, high chip temperature due to high power density not only increases packaging/cooling cost, but also severely degrades the reliability and performance of computing systems. In the meantime, as transistor feature sizes continue to shrink, it becomes difficult to precisely control the manufacturing process. Manufacturing variations can cause significant differences from core to core and from chip to chip. We believe that the heterogeneity due to manufacturing variations, if handled properly, can in fact improve the design objectives of real-time applications. In this paper, we study the problem of how to reduce the peak temperature of a real-time application by judiciously mirroring the physical architecture of an individual device onto the logical architecture on which the application was initially designed. We develop three computationally efficient algorithms for deploying applications to individual devices. Our simulation study has clearly shown that, by taking advantage of the uniqueness of each individual physical chip, the proposed approaches significantly reduce the peak temperature. The experiments also show that these approaches are efficient and have low operational cost.
Implications of Reliability Enhancement Achieved by Fault Avoidance on Dynamically Reconfigurable Architectures
- in 2011 21st International Conference on Field Programmable Logic and Applications, 2011
Fault avoidance methods on dynamically reconfigurable devices have been proposed to extend device lifetime, but a quantitative comparison among them has not been sufficiently presented. This paper presents a quantitative lifetime evaluation that simulates the fault avoidance procedures of five representative methods under the same conditions of wear-out scenario, application, and device architecture. Experimental results reveal that 1) MTTF is highly correlated with the number of avoided faults, 2) the five fault avoidance methods differ in how efficiently they use spares, and 3) spares themselves should be protected from wear-out so as not to spoil the lifetime enhancement.
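The lifetime evaluation the abstract describes can be illustrated with a minimal Monte Carlo sketch. Everything here is an assumption for illustration (exponential wear-out, a device that dies once failures exceed the spare budget); the paper's actual fault models and avoidance procedures are more detailed.

```python
import random

def lifetime(n_cells=16, n_spares=4, rng=None):
    """One wear-out trial: each unit fails at an exponential random time;
    the device dies when the number of failed units exceeds the spares."""
    rng = rng or random
    fail_times = sorted(rng.expovariate(1.0) for _ in range(n_cells + n_spares))
    # The (n_spares + 1)-th failure exhausts the spare budget.
    return fail_times[n_spares]

def mttf(trials=2000, **kw):
    """Mean time to failure, estimated over many independent trials."""
    rng = random.Random(42)  # fixed seed for a reproducible estimate
    return sum(lifetime(rng=rng, **kw) for _ in range(trials)) / trials
```

Even this toy model reproduces the qualitative finding: a larger effective spare budget (i.e. more avoidable faults) yields a longer simulated lifetime.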
Performance-Asymmetry-Aware Topology Virtualization for Defect-tolerant NoC-based Many-core Processors
Topology virtualization techniques have been proposed for NoC-based many-core processors with core-level redundancy to isolate hardware changes caused by on-chip defective cores. Prior work focuses on homogeneous cores with symmetric performance and optimizes on-chip communication only. However, core-to-core performance asymmetry due to manufacturing process variations poses new challenges for constructing virtual topologies. Lower-performance cores may scatter over a virtual topology, while operating systems typically allocate tasks to contiguous cores. As a result, parallel applications are likely to be assigned to a region containing many slower cores that become bottlenecks. To tackle this problem, in this paper we present a novel performance-asymmetry-aware reconfiguration algorithm, Bubble-Up, based on a new metric called the core fragmentation factor (CFF). Bubble-Up arranges cores with similar performance closer together while maintaining reasonable hop distances between virtual neighbors, thus accelerating applications with a higher degree of parallelism without changing existing OS allocation strategies. Experimental results show its effectiveness.
Hungarian Algorithm Based Virtualization to Maintain Application Timing Similarity for
Abstract - Homogeneous manycore processors are emerging in broad application areas, including those with timing requirements, such as real-time and embedded applications. Typically, these processors employ a Network-on-Chip (NoC) as the communication infrastructure, and core-level redundancy is often used as an effective approach to improve the yield of manycore chips. For a given application's task graph and a task-to-core mapping strategy, the traffic pattern on the NoC is known a priori. However, when defective cores are replaced by redundant ones, the NoC topology changes. As a result, a program fine-tuned to the timing parameters given by one topology may not meet the expected timing behavior under the new one. To address this issue, a timing similarity metric is introduced to evaluate timing resemblance between different NoC topologies. Based on this metric, a Hungarian-method-based algorithm is developed to reconfigure a defect-tolerant manycore platform and form a unified, application-specific virtual core topology for which the timing variations caused by such reconfiguration are minimized. Our case studies indicate that the proposed metric accurately measures the timing differences between different NoC topologies: the standard deviation between the difference calculated using the metric and the difference obtained through simulation is less than 6.58%. Our case studies also indicate that the developed Hungarian-method-based algorithm performs close to the optimal solution in comparison to random defect-redundant core assignments.
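The core of a Hungarian-method-based reconfiguration is an assignment problem: place each logical core on a physical core so that a total dissimilarity cost is minimized. A minimal sketch using SciPy's Hungarian solver follows; the 4x4 cost matrix is purely illustrative, not the paper's timing-similarity metric.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Hypothetical cost matrix: cost[i][j] = timing dissimilarity incurred
# if logical core i is placed on physical core j (illustrative values).
cost = np.array([
    [4, 1, 3, 2],
    [2, 0, 5, 3],
    [3, 2, 2, 1],
    [1, 3, 4, 0],
])

# Hungarian method: finds the one-to-one assignment minimizing total cost.
rows, cols = linear_sum_assignment(cost)
total = cost[rows, cols].sum()
```

The solver runs in O(n^3), which is why such reconfiguration can be performed per individual chip at deployment time rather than exhaustively searching all n! assignments.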
A Novel Approach Using a Minimum Cost Maximum Flow Algorithm for Fault-Tolerant Topology Reconfiguration in NoC Architectures
Abstract - An approach using a minimum cost maximum flow algorithm is proposed for fault-tolerant topology reconfiguration in a Network-on-Chip system. Topology reconfiguration is converted into a network flow problem by constructing a directed graph with capacity constraints. A cost factor is introduced to differentiate between processing elements. This approach maximizes the use of spare cores to repair faulty systems, with minimal impact on area, throughput, and delay. It also provides a transparent virtual topology to relieve the burden on the operating system.
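The flow construction the abstract describes can be sketched with NetworkX: a source feeds the logical cores, each logical core connects to the physical cores it may occupy (edge weight = the cost factor), and all physical cores drain into a sink. The instance below is a made-up toy, not the paper's formulation.

```python
import networkx as nx

G = nx.DiGraph()

# Source -> logical cores; physical cores -> sink (unit capacity, zero cost).
for l in ["l0", "l1"]:
    G.add_edge("s", l, capacity=1, weight=0)
for p in ["p0", "p1", "spare"]:
    G.add_edge(p, "t", capacity=1, weight=0)

# Logical -> physical edges; weights are assumed cost factors
# (e.g. remapping distance) chosen only for illustration.
G.add_edge("l0", "p0", capacity=1, weight=1)
G.add_edge("l0", "spare", capacity=1, weight=4)
G.add_edge("l1", "p1", capacity=1, weight=2)
G.add_edge("l1", "spare", capacity=1, weight=1)

# Min-cost max-flow: every logical core is placed (max flow),
# at the cheapest total cost (min cost).
flow = nx.max_flow_min_cost(G, "s", "t")
mapping = {l: p for l in ["l0", "l1"]
           for p, f in flow[l].items() if f == 1}
```

Because the flow is maximal, spare cores are used whenever they are needed to place every logical core; because it is minimum-cost, the cost factor steers which processing elements are preferred.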
A Fault Tolerant NoC Architecture Using Quad-Spare Mesh Topology and Dynamic Reconfiguration
Network-on-Chip (NoC) is widely used as a communication scheme in modern many-core systems. To guarantee the reliability of communication, effective fault-tolerant techniques are critical for an NoC. In this paper, a novel fault-tolerant architecture employing redundant routers is proposed to maintain the functionality of a network in the presence of failures. This architecture consists of a mesh of 2×2 router blocks with a spare router placed in the center of each block. This spare router provides a viable alternative when a router fails in a block. The proposed fault-tolerant architecture is therefore referred to as a quad-spare mesh. The quad-spare mesh can be dynamically reconfigured by changing control signals without altering the underlying topology. This dynamic reconfiguration and its corresponding routing algorithm are demonstrated in detail. Since the topology after reconfiguration is consistent with the original error-free 2D mesh, the proposed design is transparent to operating systems and application software. Experimental results show that the proposed design achieves significant improvements in reliability compared with those reported in the literature. Compared with the error-free system, a single router failure decreases throughput by only 5.19% and increases latency by 2.40%, at a cost of about 45.9% hardware redundancy.
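The transparency property of the quad-spare scheme comes from a purely local remapping: a failed router is replaced by its own 2×2 block's spare, and every other router keeps its position, so software still sees the original mesh. A minimal sketch of that mapping (coordinates and names are illustrative, not the paper's hardware interface):

```python
def block_of(x, y):
    """Return the index of the 2x2 block that router (x, y) belongs to."""
    return (x // 2, y // 2)

def remap(routers, spares, failed):
    """Map each logical router to itself, or to its block's spare if it failed."""
    mapping = {}
    for (x, y) in routers:
        if (x, y) == failed:
            mapping[(x, y)] = spares[block_of(x, y)]  # redirect to the spare
        else:
            mapping[(x, y)] = (x, y)                  # unchanged position
    return mapping

# Assumed 4x4 mesh tiled into four 2x2 blocks, one spare per block.
routers = [(x, y) for x in range(4) for y in range(4)]
spares = {(bx, by): ("spare", bx, by) for bx in range(2) for by in range(2)}

m = remap(routers, spares, failed=(1, 0))
```

Only the failed router's traffic is redirected; the logical 4×4 topology presented to the OS and applications is unchanged, which is what makes the reconfiguration transparent.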