Results 1–10 of 13
Solving Large-Scale Sparse Semidefinite Programs for Combinatorial Optimization
SIAM Journal on Optimization, 1998
"... We present a dualscaling interiorpoint algorithm and show how it exploits the structure and sparsity of some large scale problems. We solve the positive semidefinite relaxation of combinatorial and quadratic optimization problems subject to boolean constraints. We report the first computational re ..."
Abstract

Cited by 116 (11 self)
 Add to MetaCart
We present a dual-scaling interior-point algorithm and show how it exploits the structure and sparsity of some large-scale problems. We solve the positive semidefinite relaxation of combinatorial and quadratic optimization problems subject to boolean constraints. We report the first computational results of interior-point algorithms for approximating the maximum cut semidefinite programs with dimension up to 3000.
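For max-cut, the positive semidefinite relaxation in question is typically the standard one from the literature (a sketch of the general form, not necessarily the exact formulation used in the paper): for edge weights $w_{ij}$ on $n$ vertices,

```latex
\max_{X \in \mathbb{S}^n} \quad \frac{1}{4} \sum_{i,j} w_{ij}\,(1 - X_{ij})
\qquad \text{s.t.} \quad X_{ii} = 1 \ (i = 1, \dots, n), \quad X \succeq 0 .
```

The boolean constraint $x_i \in \{-1, 1\}$ of the original cut problem, under which $X = x x^{\mathsf{T}}$ would have rank one, is relaxed to $X$ being any positive semidefinite matrix with unit diagonal.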
Sparse Gaussian Elimination on High Performance Computers, 1996
"... This dissertation presents new techniques for solving large sparse unsymmetric linear systems on high performance computers, using Gaussian elimination with partial pivoting. The efficiencies of the new algorithms are demonstrated for matrices from various fields and for a variety of high performan ..."
Abstract

Cited by 36 (6 self)
 Add to MetaCart
This dissertation presents new techniques for solving large sparse unsymmetric linear systems on high performance computers, using Gaussian elimination with partial pivoting. The efficiency of the new algorithms is demonstrated for matrices from various fields and for a variety of high performance machines. In the first part we discuss optimizations of a sequential algorithm to exploit the memory hierarchies that exist in most RISC-based superscalar computers. We begin with the left-looking supernode-column algorithm by Eisenstat, Gilbert and Liu, which includes Eisenstat and Liu's symmetric structural reduction for fast symbolic factorization. Our key contribution is to develop both numeric and symbolic schemes to perform supernode-panel updates to achieve better data reuse in cache and floating-point register...
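As a small illustration of sparse unsymmetric LU factorization with partial pivoting (using SciPy's SuperLU wrapper on a hypothetical tridiagonal system, not the dissertation's own code):

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu

# Hypothetical small unsymmetric sparse system (not from the dissertation).
n = 100
main = 4.0 * np.ones(n)
lower = -1.0 * np.ones(n - 1)
upper = -2.0 * np.ones(n - 1)          # unequal off-diagonals: unsymmetric
A = sp.diags([lower, main, upper], [-1, 0, 1], format="csc")

lu = splu(A)                           # sparse LU with partial pivoting
b = np.ones(n)
x = lu.solve(b)
print(np.allclose(A @ x, b))           # True
```

The factor object can be reused to solve against many right-hand sides, which is the usual payoff of a direct method.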
Direct Methods, 1998
"... We review current methods for the direct solution of sparse linear equations. We discuss basic concepts such as fillin, sparsity orderings, indirect addressing and compare general sparse codes with codes for dense systems. We examine methods for greatly increasing the efficiency when the matrix is ..."
Abstract

Cited by 4 (0 self)
 Add to MetaCart
We review current methods for the direct solution of sparse linear equations. We discuss basic concepts such as fill-in, sparsity orderings, and indirect addressing, and compare general sparse codes with codes for dense systems. We examine methods for greatly increasing the efficiency when the matrix is symmetric positive definite. We consider frontal and multifrontal methods, emphasizing how they can take advantage of vectorization, RISC architectures, and parallelism. Some comparisons are made with other techniques, and the availability of software for the direct solution of sparse equations is discussed.
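The effect of a sparsity ordering on fill-in can be seen on a toy "arrow" matrix (a hypothetical example, not one from the paper): under the natural ordering the dense first row and column fill the factors almost completely, while a fill-reducing ordering such as COLAMD keeps them sparse.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu

# Arrow matrix: dense first row/column plus a dominant diagonal.
n = 50
A = sp.lil_matrix((n, n))
A[0, :] = 1.0
A[:, 0] = 1.0
A.setdiag(n + np.arange(n, dtype=float))   # keep pivots on the diagonal
A = A.tocsc()

fill_natural = splu(A, permc_spec="NATURAL").nnz   # eliminate arrow first
fill_colamd = splu(A, permc_spec="COLAMD").nnz     # fill-reducing ordering
print(fill_natural, fill_colamd)  # COLAMD yields far fewer nonzeros in L+U
```

The same matrix, factorized in two different column orders, produces factors whose nonzero counts differ by roughly a factor of n here.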
Two-Dimensional Block Partitionings for the Parallel Sparse Cholesky Factorization: The Fan-in Method, 1997
"... This paper presents a discussion on 2D block mappings for the sparse Cholesky factorization on parallel MIMD architectures with distributed memory. It introduces the fanin algorithm in a general manner and proposes several mapping strategies. The grid mapping with row balancing, inspired from Rothb ..."
Abstract

Cited by 4 (1 self)
 Add to MetaCart
This paper presents a discussion of 2D block mappings for the sparse Cholesky factorization on parallel MIMD architectures with distributed memory. It introduces the fan-in algorithm in a general manner and proposes several mapping strategies. The grid mapping with row balancing, inspired by Rothberg's work [21, 22], proved to be more robust than the original fan-out algorithm. Even more efficient is the proportional mapping, as shown by experiments on a 32-processor IBM SP1 and on a Cray T3D. Subforest-to-subcube mappings are also considered and give good results on the T3D.
A FRISCH-NEWTON ALGORITHM FOR SPARSE QUANTILE REGRESSION
"... Abstract. Recent experience has shown that interiorpoint methods using a log barrier approach are far superior to classical simplex methods for computing solutions to large parametric quantile regression problems. In many large empirical applications, the design matrix has a very sparse structure. ..."
Abstract

Cited by 3 (2 self)
 Add to MetaCart
Recent experience has shown that interior-point methods using a log barrier approach are far superior to classical simplex methods for computing solutions to large parametric quantile regression problems. In many large empirical applications, the design matrix has a very sparse structure. A typical example is the classical fixed-effect model for panel data, where the parametric dimension of the model can be quite large but the number of nonzero elements is quite small. Adopting recent developments in sparse linear algebra, we introduce a modified version of the Frisch-Newton algorithm for quantile regression described in Portnoy and Koenker (1997). The new algorithm substantially reduces the storage (memory) requirements and increases computational speed. The modified algorithm also facilitates the development of nonparametric quantile regression methods. The pseudo design matrices employed in nonparametric quantile regression smoothing are inherently sparse in both the fidelity and roughness penalty components. Exploiting the sparse structure of these problems opens up a whole range of new possibilities for multivariate smoothing on large data sets via ANOVA-type decomposition and partial linear models.
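The underlying problem can be sketched via the standard linear-programming formulation of quantile regression (a generic formulation solved here with SciPy's HiGHS solver, not the authors' Frisch-Newton code): for quantile tau, minimize tau·sum(u) + (1-tau)·sum(v) subject to Xb + u - v = y with u, v ≥ 0. The data below are synthetic.

```python
import numpy as np
from scipy.optimize import linprog
from scipy.sparse import csr_matrix, identity, hstack

rng = np.random.default_rng(0)
n, p, tau = 200, 2, 0.5
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=n)   # true coefficients (1, 2)

# Variables are [b (free), u >= 0, v >= 0]; the equality constraint is sparse.
Xs = csr_matrix(X)
I = identity(n, format="csr")
A_eq = hstack([Xs, I, -I], format="csr")
c = np.concatenate([np.zeros(p), tau * np.ones(n), (1 - tau) * np.ones(n)])
bounds = [(None, None)] * p + [(0, None)] * (2 * n)

res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
beta = res.x[:p]
print(beta)  # close to the true coefficients (1, 2)
```

The constraint matrix is mostly two identity blocks, so the sparsity the abstract emphasizes is visible even in this toy problem.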
Use of Computational Kernels in Full and Sparse Linear Solvers: Efficient Code Design on High-Performance RISC Processors
In Vector and Parallel Processing (VECPAR'96), 1997
"... . We believe that the availability of portable and efficient serial and parallel numerical libraries that can be used as building blocks is extremely important for both simplifying application software development and improving reliability. This is illustrated by considering the solution of full ..."
Abstract

Cited by 2 (0 self)
 Add to MetaCart
We believe that the availability of portable and efficient serial and parallel numerical libraries that can be used as building blocks is extremely important, both for simplifying application software development and for improving reliability. This is illustrated by considering the solution of full and sparse linear systems. We describe successive layers of computational kernels such as the BLAS, the sparse BLAS, blocked algorithms for factorizing full systems, and direct and iterative methods for sparse linear systems. We also show how the architecture of today's powerful RISC processors may influence efficient code design. One common problem for application scientists is to exploit the hardware of high-performance computers (either serial or parallel) as efficiently as possible without totally rewriting or redesigning existing codes and algorithms...
Task Scheduling Using a Block Dependency DAG for Block-Oriented Sparse Cholesky Factorization
In Proceedings of the 14th ACM Symposium on Applied Computing, 2000
"... Blockoriented sparse Cholesky factorization decomposes a sparse matrix into rectangular subblocks; each block can then be handled as a computational unit in order to increase data reuse in a hierarchical memory system. Also, the factorization method increases the degree of concurrency with the red ..."
Abstract

Cited by 1 (0 self)
 Add to MetaCart
Block-oriented sparse Cholesky factorization decomposes a sparse matrix into rectangular sub-blocks; each block can then be handled as a computational unit in order to increase data reuse in a hierarchical memory system. The factorization method also increases the degree of concurrency and reduces communication volume, so that it performs more efficiently on a distributed-memory multiprocessor system than the customary column-oriented factorization method. Until now, however, the mapping of blocks to processors has been designed for load balance with restricted communication patterns. In this paper, we represent tasks using a block dependency DAG that shows the execution behavior of block sparse Cholesky factorization in a distributed-memory system. Since the characteristics of tasks for the block Cholesky factorization differ from those of the conventional parallel task model, we propose a new task scheduling algorithm using a block dependency DAG. The proposed algorithm consi...
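The general idea of DAG-driven scheduling can be sketched with a toy list scheduler (a deliberately simple greedy illustration on a made-up dependency graph, not the algorithm the paper proposes): a task becomes ready once all its predecessors have finished, and each ready task is assigned to the currently least-loaded processor.

```python
from collections import deque

# Hypothetical task DAG: task -> set of tasks it depends on.
deps = {
    "A": set(), "B": set(),
    "C": {"A"}, "D": {"A", "B"},
    "E": {"C", "D"},
}
cost = {"A": 2, "B": 1, "C": 3, "D": 2, "E": 1}
n_procs = 2

# Build successor lists and in-degrees for a topological sweep.
indegree = {t: len(d) for t, d in deps.items()}
succs = {t: [] for t in deps}
for t, d in deps.items():
    for u in d:
        succs[u].append(t)

ready = deque(sorted(t for t, k in indegree.items() if k == 0))
load = [0.0] * n_procs
schedule = []                      # (task, processor) in dispatch order
while ready:
    task = ready.popleft()
    proc = min(range(n_procs), key=lambda i: load[i])  # least-loaded
    load[proc] += cost[task]
    schedule.append((task, proc))
    for s in succs[task]:
        indegree[s] -= 1
        if indegree[s] == 0:
            ready.append(s)

print(schedule)  # a valid topological order with processor assignments
```

A real scheduler for block Cholesky would additionally weigh communication between blocks and predecessor completion times, which is exactly the gap the paper's task model addresses.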
Matrix Reordering Effects on a Parallel Frontal Solver for Large-Scale Process Simulation
Computers and Chemical Engineering, 1998
"... For the simulation and optimization of large scale chemical processes, the overall computing time is often dominated by the time needed to solve a large sparse system of linear equations. A parallel frontal solver can be used to significantly reduce the wallclock time required to solve these linear ..."
Abstract

Cited by 1 (0 self)
 Add to MetaCart
For the simulation and optimization of large-scale chemical processes, the overall computing time is often dominated by the time needed to solve a large sparse system of linear equations. A parallel frontal solver can be used to significantly reduce the wall-clock time required to solve these linear equation systems on parallel/vector supercomputers. This is done by exploiting both multiprocessing and vector processing, using a multifrontal-type approach in which frontal elimination is used for the partial factorization of each front. However, the algorithm is based on a bordered block-diagonal matrix form, and thus its performance depends on the extent to which the matrix can be reordered to this form. Various approaches to achieving this ordering are discussed here, and the performance of these different matrix reordering strategies for achieving the bordered block-diagonal form is then considered. Results, including a visualization of the different matrix orderings on one problem, are ...
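How strongly an ordering shapes a sparse matrix's structure can be illustrated with reverse Cuthill-McKee bandwidth reduction (a generic symmetric ordering, plainly not the bordered block-diagonal orderings studied in the paper): a path-graph matrix with scrambled labels looks unstructured until reordered.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.csgraph import reverse_cuthill_mckee

# Tridiagonal "path" matrix, then scramble its labels with a random permutation.
n = 100
path = sp.diags([np.ones(n - 1), 2 * np.ones(n), np.ones(n - 1)],
                [-1, 0, 1], format="csr")
p = np.random.default_rng(1).permutation(n)
A = path[p][:, p]                      # same graph, large apparent bandwidth

perm = reverse_cuthill_mckee(A, symmetric_mode=True)
B = A[perm][:, perm]                   # reordered matrix

def bandwidth(M):
    coo = M.tocoo()
    return int(np.max(np.abs(coo.row - coo.col)))

print(bandwidth(A), bandwidth(B))      # reordering restores a narrow band
```

The nonzero pattern is identical in both matrices; only the row/column labels differ, which is the essence of the reordering question the paper studies for its block-diagonal target form.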
DISTRIBUTED PARALLEL COMPUTING APPLIED TO NUMERICAL SIMULATION OF PETROLEUM RESERVOIRS
"... The successful characterization and management of petroleum fields depends strongly on the knowledge of the hydrocarbons volumes in place and the flow conditions of the phases (water, oil and gas). These data are the support for the economic and strategic decisions, like drilling new wells or the fi ..."
Abstract
 Add to MetaCart
The successful characterization and management of petroleum fields depend strongly on knowledge of the hydrocarbon volumes in place and the flow conditions of the phases (water, oil, and gas). These data support economic and strategic decisions, such as drilling new wells or abandoning the field. Several analytical models are available, but their applicability is restricted to small models, due to the complexity and mathematical effort required in most practical applications. The solution for intermediate and large models is therefore numerical simulation. Further requirements come with the simulation of large models: adequate computational resources and very long simulation times. Models with more than 100,000 blocks need a large amount of memory and usually require a long wall-clock time. Another critical situation arises when a specific procedure demands a huge number of simulations, as in history matching, uncertainty analysis, and optimization of the production strategy, so that the total computational time for these procedures can be very large. Parallel computing can handle both situations: long execution wall-clock times and procedures with several simulations, without losing information and avoiding expensive cost of
SCIENTIFIC COMPUTING, 2002
"... We have defined and analyzed a semiToeplitz preconditioner for timedependent and steadystate convectiondiffusion problems. Analytic expressions for the eigenvalues of the preconditioned systems are obtained. An asymptotic analysis shows that the eigenvalue spectrum of the timedependent problem is ..."
Abstract
 Add to MetaCart
We have defined and analyzed a semi-Toeplitz preconditioner for time-dependent and steady-state convection-diffusion problems. Analytic expressions for the eigenvalues of the preconditioned systems are obtained. An asymptotic analysis shows that the eigenvalue spectrum of the time-dependent problem is reduced to two eigenvalues when the number of grid points goes to infinity. The numerical experiments sustain the results of the theoretical analysis, and the preconditioner exhibits robust behavior for stretched grids. A semi-Toeplitz preconditioner for the linearized Navier–Stokes equations for compressible flow is proposed and tested. The preconditioner is applied to the linear system of equations to be solved in each time step of an implicit method. The equations are solved with flat-plate boundary conditions and are linearized around the Blasius solution. The grids are stretched in the direction normal to the plate, and the quotient between the time step and the space step is varied. The preconditioner works well in all tested cases and outperforms the method without preconditioning in both number of iterations and execution time.
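The structure being exploited here can be sketched on a constant-coefficient 1D convection-diffusion discretization, whose tridiagonal matrix is exactly Toeplitz and therefore admits a fast direct solve (a hypothetical example showing why Toeplitz structure is attractive, not the paper's semi-Toeplitz construction):

```python
import numpy as np
from scipy.linalg import solve_toeplitz, toeplitz

# -eps*u'' + a*u' discretized with central differences on a uniform grid:
# every diagonal is constant, so the matrix is Toeplitz.
n, a, eps = 200, 1.0, 0.01
h = 1.0 / (n + 1)
diag = 2 * eps / h**2
off_up = -eps / h**2 + a / (2 * h)
off_lo = -eps / h**2 - a / (2 * h)

c = np.zeros(n); c[0] = diag; c[1] = off_lo      # first column
r = np.zeros(n); r[0] = diag; r[1] = off_up      # first row
b = np.ones(n)

x = solve_toeplitz((c, r), b)   # Levinson recursion, no explicit matrix
print(np.allclose(toeplitz(c, r) @ x, b))
```

On a stretched grid the coefficients are no longer constant, which is where a preconditioner built from a nearby (semi-)Toeplitz operator, rather than an exact Toeplitz solve, becomes the natural tool.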