Results 1 - 10 of 62
Processor Affinity and MPI Performance on SMP-CMP Clusters
"... with multi-core Chip-Multiprocessors (CMP), also known as SMP-CMP clusters, are becoming ubiquitous today. For Message Passing Interface (MPI) programs, such clusters have a multilayer hierarchical communication structure: the performance of intra-node communication is usually higher than that of in ..."
Performance Analysis and Modeling of a Computational Biology Code on CMP Clusters (revised July 2008)
"... Abstract — The current trend in parallel computing systems is shifting towards cluster systems with CMPs (chip multiprocessors). Further, the CMPs are usually configured hierarchically (e.g., multiple CMPs compose a multi-chip module and multiple multi-chip modules compose a node) to compose a node ..."
Predictive Analysis of a Hydrodynamics Application on Large-Scale CMP Clusters
- in Computer Science - Research and Development
"... Abstract We present the development of a predictive performance model for the high-performance computing code Hydra, a hydrodynamics benchmark developed and maintained by the United Kingdom Atomic Weapons Establishment (AWE). The developed model elucidates the parallel computation of Hydra, with whi ..."
, with which it is possible to predict its run-time and scaling performance on varying large-scale chip multiprocessor (CMP) clusters. A key feature of the model is its granularity; with the model we are able to separate the contributing costs, including computation, point-to-point communications, collectives
Performance analysis and optimization of parallel scientific applications on CMP cluster systems
- in Scalable Computing: Practice and Experience, 10(1):188–195
, 2009
"... Abstract. Chip multiprocessors (CMP) are widely used for high performance computing. Further, these CMPs are being configured in a hierarchical manner to compose a node in a cluster system. A major challenge to be addressed is efficient use of such cluster systems for large-scale scientific applicat ..."
Cited by 3 (0 self)
Thread clustering: Sharing-aware scheduling on SMP-CMP-SMT multiprocessors
- in EuroSys
, 2007
"... The major chip manufacturers have all introduced chip multiprocessing (CMP) and simultaneous multithreading (SMT) technology into their processing units. As a result, even low-end computing systems and game consoles have become shared memory multiprocessors with L1 and L2 cache sharing within a chip ..."
Cited by 86 (4 self)
Adaptive Loop Tiling for a Multi-Cluster CMP
"... Abstract. Loop tiling is a fundamental optimization for improving data locality. Selecting the right tile size combined with the parallelization of loops can provide additional performance increases on modern Chip MultiProcessor (CMP) architectures. This paper presents a runtime optimization ..."
system which automatically parallelizes loops and searches empirically for the best tile sizes on a scalable multi-cluster CMP. The system is built on top of a virtual machine and targets the runtime parallelization and optimization of Java programs. Experimental results show that runtime parallelization
Understanding the Energy Efficiency of SMT and CMP with Multiclustering
- in Proceedings of the 2005 International Symposium on Low Power Electronics and Design
, 2005
"... In this paper we study the energy efficiency of SMT and CMP with multiclustering. Through a detailed design space exploration, we show that clustering closes the energy efficiency gap between SMT and CMP at equal performance points. Specifically, we show that the energy efficiency of CMP compared t ..."
Cited by 1 (0 self)
Dynamic Power Aware Packet Processing with CMP
"... Network processors implemented as systems-on-chip with multiple processors and peripherals offer a reliable means of scaling network with high link capacities. As more and more co-processors and peripherals are integrated, the power requirement also dramatically increases. Therefore it is essential ..."
to efficiently parallelize the subsystems to maximize the packet processing capacities while maintaining low power consumption. In this project, we propose a power aware packet processing architecture with chip-multiprocessor (CMP), which consists of a number of processor clusters (or arrays). Each array
Performance Analysis and Comparison of MPI, OpenMP and Hybrid NPB-MZ
"... Abstract—Chip multiprocessors (CMP) are widely used for high performance computing and are being configured in a hierarchical manner to compose a node in a parallel system. CMP clusters provide a natural programming paradigm for hybrid programs. Can current hybrid parallel programming paradigms suc ..."
Exploring instruction caching strategies for tightly-coupled shared-memory clusters
- in System on Chip (SoC), 2011 International Symposium on
"... Abstract—Several Chip-Multiprocessor designs today leverage tightly-coupled computing clusters as a building block. These clusters consist of a fairly large number N of simple cores, featuring fast communication through a shared multibanked L1 data memory and ≈ 1 Instruction-Per-Cycle (IPC) per core ..."
Cited by 2 (1 self)
core. Thus, aggregated I-fetch bandwidth approaches f ∗ N, where f is the cluster clock frequency. An effective instruction cache architecture is key to support this I-fetch bandwidth. In this paper we compare two main architectures for instruction caching targeting tightly coupled CMP clusters: (i