Parallel Software for Inductance Extraction
Abstract

Cited by 2 (2 self)
Next-generation VLSI circuits will be designed with millions of densely packed interconnect segments on a single chip. Inductive effects between these segments begin to dominate signal delay as the clock frequency is increased. Modern parasitic extraction tools that estimate on-chip inductive effects with high accuracy have had limited impact due to large computational and storage requirements. This paper describes a parallel software package for inductance extraction called ParIS, which is capable of analyzing interconnect configurations involving several conductors within reasonable time. The main component of the software is a novel preconditioned iterative method that is used to solve a dense complex linear system of equations. The linear system represents the inductive coupling between filaments that are used to discretize the conductors. A variant of the Fast Multipole Method is used to compute dense matrix-vector products with the coefficient matrix. ParIS uses a two-tier parallel formulation that allows mixed-mode parallelization using both MPI and OpenMP. An MPI process is associated with each conductor. The computation within a conductor is parallelized using OpenMP. The parallel efficiency and scalability of the software are demonstrated through experiments on the IBM p690 and on Intel and AMD Linux clusters. These experiments highlight the portability and efficiency of the software on multiprocessors with shared, distributed, and distributed-shared memory architectures.
Direct N-body Kernels for Multicore Platforms
Abstract

Cited by 2 (2 self)
We present an inter-architectural comparison of single- and double-precision direct n-body implementations on modern multicore platforms, including those based on the Intel Nehalem and AMD Barcelona systems, the Sony-Toshiba-IBM PowerXCell/8i processor, and NVIDIA Tesla C870 and C1060 GPU systems. We compare our implementations across platforms on a variety of proxy measures, including performance, coding complexity, and energy efficiency.
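For readers unfamiliar with the term, a direct n-body kernel evaluates every pairwise interaction explicitly, which costs O(n^2) work per evaluation. The sketch below is a minimal pure-Python illustration of that idea for a gravitational interaction; it is not the implementation from the paper. The function name `direct_accelerations`, the `softening` parameter, and the choice of units (`G = 1`) are assumptions made for illustration.

```python
import math


def direct_accelerations(positions, masses, softening=1e-3, G=1.0):
    """Return the acceleration on each body from all other bodies.

    Direct summation: every pair (i, j) is visited, so the cost is O(n^2).
    `softening` (an illustrative assumption) avoids division by zero when
    two bodies come very close to each other.
    """
    n = len(positions)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        xi, yi, zi = positions[i]
        for j in range(n):
            if i == j:
                continue  # a body exerts no force on itself
            dx = positions[j][0] - xi
            dy = positions[j][1] - yi
            dz = positions[j][2] - zi
            r2 = dx * dx + dy * dy + dz * dz + softening * softening
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            scale = G * masses[j] * inv_r3
            acc[i][0] += scale * dx
            acc[i][1] += scale * dy
            acc[i][2] += scale * dz
    return acc
```

The doubly nested loop is the part that optimized kernels of this kind typically restructure for vector units or GPU threads; the arithmetic inside the loop stays the same.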
Multiset Graph Partitioning
 Math. Methods Oper. Res., 2001
Abstract

Cited by 2 (2 self)
Optimality conditions are given for a quadratic programming formulation of the multiset graph partitioning problem. These conditions are related to the structure of the graph and properties of the weights. Key words: graph partitioning, min-cut, max-cut, quadratic programming, optimality conditions. AMS(MOS) subject classifications: 90C35, 90C27, 90C20. This work was supported by the National Science Foundation.
Definition of a New Circular Space-Filling Curve: βΩ-Indexing
Abstract

Cited by 1 (0 self)
This technical report presents the definition of a circular Hilbert-like space-filling curve. Preliminary evaluations in a simulation environment have shown good locality-preserving properties. The results are compared with known bounds for other indexing schemes: Hilbert, Lebesgue, and H-Indexing. We evaluated partitions induced by the indexing schemes, using the diameter and the surface as measures. For both, we present worst-case and average-case results.
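To make "partitions induced by an indexing scheme" concrete, the sketch below uses the classic 2D Hilbert curve, one of the baseline schemes named in the abstract, rather than the βΩ curve itself, whose definition is not given here. The function names `hilbert_index` and `partition_by_curve` are illustrative assumptions, and the grid side `n` is assumed to be a power of two.

```python
def hilbert_index(n, x, y):
    """Map cell (x, y) of an n-by-n grid (n a power of two) to its
    position along the standard 2D Hilbert curve."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate or flip the quadrant so the sub-curve is oriented correctly.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def partition_by_curve(n, parts):
    """Split the grid into `parts` contiguous runs along the curve.

    Assumes `parts` divides n * n evenly (an illustrative simplification).
    """
    cells = sorted(
        ((x, y) for x in range(n) for y in range(n)),
        key=lambda c: hilbert_index(n, c[0], c[1]),
    )
    size = len(cells) // parts
    return [cells[i * size:(i + 1) * size] for i in range(parts)]
```

Because consecutive curve positions are neighboring cells, each run forms a connected region; measures such as the diameter or the surface of each run can then be computed per partition.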
Parallel Performance of Hierarchical Multipole Algorithms for Inductance Extraction
Abstract

Cited by 1 (1 self)
Parasitic extraction techniques are used to estimate signal delay in VLSI chips. Inductance extraction is a critical component of the parasitic extraction process in which on-chip inductive effects are estimated with high accuracy. In earlier work [1], we described a parallel software package for inductance extraction called ParIS, which uses a novel preconditioned iterative method to solve the dense, complex linear system of equations arising in these problems. The most computationally challenging task in ParIS involves computing dense matrix-vector products efficiently via hierarchical multipole-based approximation techniques. This paper presents a comparative study of two such techniques: a hierarchical algorithm called the Hierarchical Multipole Method (HMM) and the well-known Fast Multipole Method (FMM). We investigate the performance of parallel MPI-based implementations of these algorithms on a Linux cluster. We analyze the impact of various algorithmic parameters and identify regimes where HMM is expected to outperform FMM on uniprocessor as well as multiprocessor platforms.
Bottom-Up Construction and 2:1 Balance Refinement of Linear Octrees in Parallel
 SIAM J. Sci. Comput., Vol. 30, No. 5, pp. 2675–2708, © 2008 Society for Industrial and Applied Mathematics
Abstract
In this article, we propose new parallel algorithms for the construction and 2:1 balance refinement of large linear octrees on distributed memory machines. Such octrees are used in many problems in computational science and engineering, e.g., object representation, image analysis, unstructured meshing, finite elements, adaptive mesh refinement, and N-body simulations. Fixed-size scalability and isogranular analysis of the algorithms, using an MPI-based parallel implementation, were performed on a variety of input data and demonstrated good scalability for different processor counts (1 to 1024 processors) on the Pittsburgh Supercomputing Center's TCS-1 AlphaServer. The results are consistent for different data distributions. Octrees with over a billion octants were constructed and balanced in less than a minute on 1024 processors. Like other existing algorithms for constructing and balancing octrees, our algorithms have O(N log N) work and O(N) storage complexity. Under reasonable assumptions on the distribution of octants and the work per octant, the parallel time complexity is O((N/np) log(N/np) + np log np), where N is the size of the final linear octree and np is the number of processors.
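A linear octree stores only its leaf octants in a sorted array instead of a pointer-based tree. The abstract does not state which ordering the authors use; a common choice for linear octrees is the Morton (Z-order) key, which interleaves the bits of the integer coordinates. The sketch below illustrates that encoding under this assumption. The function names and the `bits` parameter are illustrative, not taken from the paper.

```python
def morton_encode_3d(x, y, z, bits=10):
    """Interleave the low `bits` bits of x, y, z into one Morton key.

    Bit b of x lands at position 3*b, of y at 3*b + 1, of z at 3*b + 2.
    """
    key = 0
    for b in range(bits):
        key |= ((x >> b) & 1) << (3 * b)
        key |= ((y >> b) & 1) << (3 * b + 1)
        key |= ((z >> b) & 1) << (3 * b + 2)
    return key


def morton_decode_3d(key, bits=10):
    """Invert morton_encode_3d and return the (x, y, z) coordinates."""
    x = y = z = 0
    for b in range(bits):
        x |= ((key >> (3 * b)) & 1) << b
        y |= ((key >> (3 * b + 1)) & 1) << b
        z |= ((key >> (3 * b + 2)) & 1) << b
    return x, y, z


def sort_octants(anchors, bits=10):
    """Order octant anchor points by Morton key, as a linear octree
    would store equal-sized leaves."""
    return sorted(anchors, key=lambda a: morton_encode_3d(a[0], a[1], a[2], bits))
```

Sorting by such a key is one reason O(N log N) work is natural for linear octree construction: building the tree reduces largely to sorting keys, which can also be done in parallel.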
Contents lists available at ScienceDirect Journal of Computational Physics
"... journal homepage: www.elsevier.com/locate/jcp ..."
Mathematical and Numerical Aspects of the Adaptive Fast Multipole Poisson-Boltzmann Solver
Abstract
This paper summarizes the mathematical and numerical theories and computational elements of the adaptive fast multipole Poisson-Boltzmann (AFMPB) solver. We introduce and discuss the following components in order: the Poisson-Boltzmann model, boundary integral equation reformulation, surface mesh generation, the node-patch discretization approach, Krylov iterative methods, the new version of fast multipole methods (FMMs), and a dynamic prioritization technique for scheduling parallel operations. For each component, we also remark on feasible approaches for further improvements in efficiency, accuracy, and applicability of the AFMPB solver to large-scale, long-time molecular dynamics simulations. The potential of the solver is demonstrated with preliminary numerical results.
Parallel Algorithms for Inductance Extraction of VLSI Circuits
Abstract
Inductance extraction involves estimating the mutual inductance in a VLSI circuit. Due to increasing clock speed and diminishing feature sizes of modern VLSI circuits, the effects of inductance are increasingly felt during the testing and verification stages. Hence, there is a need for fast and accurate inductance extraction software. A generalized approach for inductance extraction requires the solution of a dense complex symmetric linear system that models mutual inductive effects among circuit elements. Iterative methods are used to solve the system without explicit computation of the matrix itself. Fast hierarchical techniques are used to compute approximate matrix-vector products with the dense system matrix. This work presents an overview of a new parallel software package for inductance extraction of large VLSI circuits. The technique uses a combination of the solenoidal basis method and effective preconditioning schemes to solve the linear system. The Fast Multipole Method (FMM) is used to compute approximate matrix-vector products with the inductance matrix. By formulating the preconditioner as a dense matrix similar to the coefficient matrix, we are able to use FMM for the preconditioning step as well. A two-tier parallelization scheme allows an efficient parallel implementation using both OpenMP and MPI simultaneously. The experiments conducted on various multiprocessor machines demonstrate the portability and parallel performance of the software.
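The phrase "without explicit computation of the matrix itself" means the solver only ever asks for the product of the matrix with a vector. The sketch below illustrates that matrix-free pattern with a plain conjugate gradient iteration on a small real symmetric positive definite stand-in operator. This is a deliberate simplification: the system described in the abstract is dense, complex, and symmetric, and is solved with a preconditioned method whose details are not given here. The names `conjugate_gradient` and `laplacian_1d` are illustrative assumptions.

```python
def dot(u, v):
    """Inner product of two equal-length real vectors."""
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(matvec, b, tol=1e-10, max_iter=1000):
    """Solve A x = b using only the callback matvec(v) = A v.

    The matrix A is never formed; in an extraction code this callback
    could be an approximate fast multipole product instead.
    """
    x = [0.0] * len(b)
    r = list(b)  # residual b - A x for the initial guess x = 0
    p = list(r)
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 <= tol:
            break
        Ap = matvec(p)
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


def laplacian_1d(v):
    """Stand-in operator: apply tridiag(-1, 2, -1) without storing it."""
    n = len(v)
    out = []
    for i in range(n):
        left = v[i - 1] if i > 0 else 0.0
        right = v[i + 1] if i < n - 1 else 0.0
        out.append(2.0 * v[i] - left - right)
    return out
```

A call such as `conjugate_gradient(laplacian_1d, b)` shows the design point: swapping in a faster or approximate `matvec` changes the cost per iteration without touching the solver.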