Results 1 - 10 of 11
Solving Large-Scale Sparse Semidefinite Programs for Combinatorial Optimization
- SIAM JOURNAL ON OPTIMIZATION
, 1998
Cited by 98 (10 self)
We present a dual-scaling interior-point algorithm and show how it exploits the structure and sparsity of some large-scale problems. We solve the positive semidefinite relaxation of combinatorial and quadratic optimization problems subject to boolean constraints. We report the first computational results of interior-point algorithms for approximating the maximum cut semidefinite programs with dimension up to 3000.
Sparse Gaussian Elimination on High Performance Computers
, 1996
Cited by 33 (5 self)
This dissertation presents new techniques for solving large sparse unsymmetric linear systems on high performance computers, using Gaussian elimination with partial pivoting. The efficiencies of the new algorithms are demonstrated for matrices from various fields and for a variety of high performance machines. In the first part we discuss optimizations of a sequential algorithm to exploit the memory hierarchies that exist in most RISC-based superscalar computers. We begin with the left-looking supernode-column algorithm by Eisenstat, Gilbert and Liu, which includes Eisenstat and Liu's symmetric structural reduction for fast symbolic factorization. Our key contribution is to develop both numeric and symbolic schemes to perform supernode-panel updates to achieve better data reuse in cache and floating-point register...
Direct Methods
, 1998
Cited by 4 (0 self)
We review current methods for the direct solution of sparse linear equations. We discuss basic concepts such as fill-in, sparsity orderings, indirect addressing and compare general sparse codes with codes for dense systems. We examine methods for greatly increasing the efficiency when the matrix is symmetric positive definite. We consider frontal and multifrontal methods emphasizing how they can take advantage of vectorization, RISC architectures, and parallelism. Some comparisons are made with other techniques and the availability of software for the direct solution of sparse equations is discussed.
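The fill-in and sparsity-ordering concepts mentioned in this abstract can be illustrated with a small symbolic elimination. The sketch below is not taken from the paper: the function names `count_fill_in` and `arrow_graph` are hypothetical, and the matrix is assumed to be symmetric so that its sparsity pattern can be treated as an undirected graph.

```python
from itertools import combinations


def count_fill_in(adjacency, order):
    """Count fill edges created by symbolic elimination in the given order.

    adjacency maps each vertex to the set of its neighbours (symmetric pattern).
    """
    adj = {v: set(nbrs) for v, nbrs in adjacency.items()}
    fill = 0
    for v in order:
        nbrs = adj.pop(v)
        # Remove the eliminated vertex from the remaining graph.
        for u in nbrs:
            adj[u].discard(v)
        # Eliminating v couples all of its remaining neighbours pairwise;
        # every pair that was not already coupled is a new nonzero (fill-in).
        for a, b in combinations(sorted(nbrs), 2):
            if b not in adj[a]:
                adj[a].add(b)
                adj[b].add(a)
                fill += 1
    return fill


def arrow_graph(n):
    """Pattern of an 'arrow' matrix: vertex 0 is coupled to every other vertex."""
    adj = {i: set() for i in range(n)}
    for i in range(1, n):
        adj[0].add(i)
        adj[i].add(0)
    return adj


# Same matrix, two elimination orders.
hub_first = count_fill_in(arrow_graph(6), [0, 1, 2, 3, 4, 5])
hub_last = count_fill_in(arrow_graph(6), [1, 2, 3, 4, 5, 0])
print(hub_first, hub_last)
```

Eliminating the densely coupled vertex first turns the remaining five vertices into a clique (10 fill entries), while eliminating it last creates no fill at all, which is why the choice of ordering matters so much for sparse direct solvers.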
Two-dimensional Block Partitionings for the Parallel Sparse Cholesky Factorization: the Fan-in Method
, 1997
Cited by 4 (1 self)
This paper presents a discussion on 2D block mappings for the sparse Cholesky factorization on parallel MIMD architectures with distributed memory. It introduces the fan-in algorithm in a general manner and proposes several mapping strategies. The grid mapping with row balancing, inspired by Rothberg's work [21, 22], proved to be more robust than the original fan-out algorithm. Even more efficient is the proportional mapping, as shown by experiments on a 32-processor IBM SP1 and on a Cray T3D. Subforest-to-subcube mappings are also considered and give good results on the T3D.
Use of Computational Kernels in Full and Sparse Linear Solvers, Efficient Code Design on High-Performance RISC Processors
- in Vector and Parallel Processing, VECPAR'96
, 1997
Cited by 2 (0 self)
We believe that the availability of portable and efficient serial and parallel numerical libraries that can be used as building blocks is extremely important for both simplifying application software development and improving reliability. This is illustrated by considering the solution of full and sparse linear systems. We describe successive layers of computational kernels such as the BLAS, the sparse BLAS, blocked algorithms for factorizing full systems, direct and iterative methods for sparse linear systems. We also show how the architecture of today's powerful RISC processors may influence efficient code design. 1 Introduction One of the common problems for application scientists is to exploit as efficiently as possible the hardware of high-performance computers (either serial or parallel) without totally rewriting or redesigning existing codes and algorithms. We believe that the availability of portable and efficient serial and parallel numerical libraries that ca...
A FRISCH-NEWTON ALGORITHM FOR SPARSE QUANTILE REGRESSION
Cited by 2 (2 self)
Recent experience has shown that interior-point methods using a log barrier approach are far superior to classical simplex methods for computing solutions to large parametric quantile regression problems. In many large empirical applications, the design matrix has a very sparse structure. A typical example is the classical fixed-effect model for panel data where the parametric dimension of the model can be quite large, but the number of non-zero elements is quite small. Adopting recent developments in sparse linear algebra we introduce a modified version of the Frisch-Newton algorithm for quantile regression described in Portnoy and Koenker (1997). The new algorithm substantially reduces the storage (memory) requirements and increases computational speed. The modified algorithm also facilitates the development of nonparametric quantile regression methods. The pseudo design matrices employed in nonparametric quantile regression smoothing are inherently sparse in both the fidelity and roughness penalty components. Exploiting the sparse structure of these problems opens up a whole range of new possibilities for multivariate smoothing on large data sets via ANOVA-type decomposition and partial linear models.
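To make the sparsity claim about fixed-effect panel models concrete, the sketch below builds only the unit-dummy block of such a design matrix in coordinate (row, column, value) form and measures its density. It is an illustration under stated assumptions, not the paper's code: the helper `fixed_effect_triplets` and the panel sizes are hypothetical, and a balanced panel with one dummy column per unit is assumed.

```python
def fixed_effect_triplets(n_units, n_periods):
    """Coordinate-format (row, col, value) entries of the unit-dummy block.

    Row unit * n_periods + period has a single 1.0 in the column of its unit.
    """
    triplets = []
    for unit in range(n_units):
        for period in range(n_periods):
            row = unit * n_periods + period
            triplets.append((row, unit, 1.0))
    return triplets


# Hypothetical balanced panel: 100 units observed over 5 periods.
n_units, n_periods = 100, 5
entries = fixed_effect_triplets(n_units, n_periods)
n_rows = n_units * n_periods

# Fraction of the dense n_rows x n_units block that is actually nonzero.
density = len(entries) / (n_rows * n_units)
print(len(entries), n_rows * n_units, density)
```

Each row carries exactly one nonzero, so the density of this block is 1 / n_units (here 500 stored entries against 50,000 dense cells). The number of columns grows with the number of units while the nonzero count grows only with the number of observations, which is the structure a sparse solver can exploit.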
Task Scheduling Using a Block Dependency DAG for Block-Oriented Sparse Cholesky Factorization
- in: Proceedings of the 14th ACM Symposium on Applied Computing
, 2000
Cited by 1 (0 self)
Block-oriented sparse Cholesky factorization decomposes a sparse matrix into rectangular sub-blocks; each block can then be handled as a computational unit in order to increase data reuse in a hierarchical memory system. Also, the factorization method increases the degree of concurrency with the reduction of communication volumes so that it performs more efficiently on a distributed-memory multiprocessor system than the customary column-oriented factorization method. But until now, mapping of blocks to processors has been designed for load balance with restricted communication patterns. In this paper, we represent tasks using a block dependency DAG that shows the execution behavior of block sparse Cholesky factorization in a distributed-memory system. Since the characteristics of tasks for the block Cholesky factorization are different from those of the conventional parallel task model, we propose a new task scheduling algorithm using a block dependency DAG. The proposed algorithm consi...
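A block dependency DAG of the general kind described in this abstract can be sketched for the textbook right-looking block Cholesky factorization. This is not the paper's scheduling algorithm: the task names (`factor`, `solve`, `update`), the functions `block_cholesky_dag` and `levels`, and the assumption that every block in the lower triangle is nonzero are all illustrative choices. A sparse matrix would simply omit the tasks of its empty blocks.

```python
def block_cholesky_dag(nb):
    """Return {task: set of prerequisite tasks} for an nb x nb block grid.

    Assumes every lower-triangular block is nonzero (dense block pattern).
    """
    deps = {}
    for k in range(nb):
        # Diagonal block k can be factored once all updates into it are done.
        deps[("factor", k)] = {("update", k, k, m) for m in range(k)}
        for i in range(k + 1, nb):
            # Block (i, k) needs the factored diagonal block k and every
            # earlier update that was applied to block (i, k).
            deps[("solve", i, k)] = {("factor", k)} | {
                ("update", i, k, m) for m in range(k)
            }
        for i in range(k + 1, nb):
            for j in range(k + 1, i + 1):
                # Block (i, j) is updated with the product of the solved
                # blocks (i, k) and (j, k).
                deps[("update", i, j, k)] = {("solve", i, k), ("solve", j, k)}
    return deps


def levels(deps):
    """Earliest step at which each task could run with unlimited processors."""
    memo = {}

    def level(task):
        if task not in memo:
            memo[task] = 1 + max((level(p) for p in deps[task]), default=-1)
        return memo[task]

    return {task: level(task) for task in deps}


dag = block_cholesky_dag(3)
task_levels = levels(dag)
critical_path = max(task_levels.values()) + 1
print(len(dag), critical_path)
```

For a 3 x 3 block grid this yields 10 tasks with a critical path of 7 steps. Tasks that share a level are independent and could run concurrently, and a scheduler would additionally have to weigh task costs and communication when assigning them to processors.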
Matrix Reordering Effects on a Parallel Frontal Solver for Large Scale Process Simulation
- Computers and Chemical Engineering
, 1998
Cited by 1 (0 self)
For the simulation and optimization of large scale chemical processes, the overall computing time is often dominated by the time needed to solve a large sparse system of linear equations. A parallel frontal solver can be used to significantly reduce the wallclock time required to solve these linear equation systems using parallel/vector supercomputers. This is done by exploiting both multiprocessing and vector processing, using a multifrontal-type approach in which frontal elimination is used for the partial factorization of each front. However, the algorithm is based on a bordered block-diagonal matrix form and thus its performance depends on the extent to which the matrix can be reordered to this form. Various approaches to achieving this ordering are discussed here. The performance of these different matrix reordering strategies for achieving the bordered block-diagonal form is then considered. Results, including a visualization of the different matrix orderings on one problem, are ...
DISTRIBUTED PARALLEL COMPUTING APPLIED TO NUMERICAL SIMULATION OF PETROLEUM RESERVOIRS
The successful characterization and management of petroleum fields depend strongly on knowledge of the hydrocarbon volumes in place and the flow conditions of the phases (water, oil and gas). These data support economic and strategic decisions, such as drilling new wells or abandoning the field. Several analytical models are available, but their applicability is restricted to small models because of the complexity and mathematical effort required in most practical applications. The solution for intermediate and large models is therefore numerical simulation. Other requirements remain when simulating large models: the need for adequate computational resources and the very long simulation time. Models with more than 100,000 blocks need a large amount of memory and usually require a long wall-clock time. Another critical situation arises when a specific procedure demands a huge number of simulations, as in history matching, uncertainty analysis and optimization of production strategy, so the total computational time to solve these procedures can be very large. Parallel computing could handle both situations: large execution wall clock time and procedures with several simulations without losing information and avoiding expensive cost of
SCIENTIFIC COMPUTING
, 2002
We have defined and analyzed a semi-Toeplitz preconditioner for time-dependent and steady-state convection-diffusion problems. Analytic expressions for the eigenvalues of the preconditioned systems are obtained. An asymptotic analysis shows that the eigenvalue spectrum of the time-dependent problem is reduced to two eigenvalues when the number of grid points goes to infinity. The numerical experiments support the results of the theoretical analysis, and the preconditioner exhibits a robust behavior for stretched grids. A semi-Toeplitz preconditioner for the linearized Navier–Stokes equations for compressible flow is proposed and tested. The preconditioner is applied to the linear system of equations to be solved in each time step of an implicit method. The equations are solved with flat plate boundary conditions and are linearized around the Blasius solution. The grids are stretched in the direction normal to the plate and the quotient between the time step and the space step is varied. The preconditioner works well in all tested cases and outperforms the method without preconditioning both in number of iterations and execution time.

