Results 1 - 10
of
63
Highly scalable parallel algorithms for sparse matrix factorization
- IEEE Transactions on Parallel and Distributed Systems
, 1994
"... In this paper, we describe a scalable parallel algorithm for sparse matrix factorization, analyze their performance and scalability, and present experimental results for up to 1024 processors on a Cray T3D parallel computer. Through our analysis and experimental results, we demonstrate that our algo ..."
Abstract
-
Cited by 100 (29 self)
- Add to MetaCart
In this paper, we describe a scalable parallel algorithm for sparse matrix factorization, analyze their performance and scalability, and present experimental results for up to 1024 processors on a Cray T3D parallel computer. Through our analysis and experimental results, we demonstrate that our algorithm substantially improves the state of the art in parallel direct solution of sparse linear systems—both in terms of scalability and overall performance. It is a well known fact that dense matrix factorization scales well and can be implemented efficiently on parallel computers. In this paper, we present the first algorithm to factor a wide class of sparse matrices (including those arising from two- and three-dimensional finite element problems) that is asymptotically as scalable as dense matrix factorization algorithms on a variety of parallel architectures. Our algorithm incurs less communication overhead and is more scalable than any previously known parallel formulation of sparse matrix factorization. Although, in this paper, we discuss Cholesky factorization of symmetric positive definite matrices, the algorithms can be adapted for solving sparse linear least squares problems and for Gaussian elimination of diagonally dominant matrices that are almost symmetric in structure. An implementation of our sparse Cholesky factorization algorithm delivers up to 20 GFlops on a Cray T3D for medium-size structural engineering and linear programming problems. To the best of our knowledge,
Approximate Inverse Preconditioners Via Sparse-Sparse Iterations
, 1998
"... . The standard incomplete LU (ILU) preconditioners often fail for general sparse indefinite matrices because they give rise to `unstable' factors L and U . In such cases, it may be attractive to approximate the inverse of the matrix directly. This paper focuses on approximate inverse preconditioners ..."
Abstract
-
Cited by 71 (17 self)
- Add to MetaCart
. The standard incomplete LU (ILU) preconditioners often fail for general sparse indefinite matrices because they give rise to `unstable' factors L and U . In such cases, it may be attractive to approximate the inverse of the matrix directly. This paper focuses on approximate inverse preconditioners based on minimizing kI \Gamma AMkF , where AM is the preconditioned matrix. An iterative descent-type method is used to approximate each column of the inverse. For this approach to be efficient, the iteration must be done in sparse mode, i.e., with `sparse-matrix by sparse-vector' operations. Numerical dropping is applied to maintain sparsity; compared to previous methods, this is a natural way to determine the sparsity pattern of the approximate inverse. This paper describes Newton, `global' and column-oriented algorithms, and discusses options for initial guesses, self-preconditioning, and dropping strategies. Some limited theoretical results on the properties and convergence of approxima...
Reachability is harder for directed than for undirected finite graphs
- Journal of Symbolic Logic
, 1990
"... Abstract. Although it is known that reachability in undirected finite graphs can be expressed by an existential monadic second-order sentence, our main result is that this is not the case for directed finite graphs (even in the presence of certain “built-in ” relations, such as the successor relatio ..."
Abstract
-
Cited by 69 (8 self)
- Add to MetaCart
Abstract. Although it is known that reachability in undirected finite graphs can be expressed by an existential monadic second-order sentence, our main result is that this is not the case for directed finite graphs (even in the presence of certain “built-in ” relations, such as the successor relation). The proof makes use of Ehrenfeucht-Frai’sse games, along with probabilistic arguments. However, we show that for directed finite graphs with degree at most k, reachability is expressible by an existential monadic second-order sentence. $1. Introduction. If s and t denote distinguished points in a directed (resp. undirected) graph, then we say that a graph is (s, t)-connected if there is a directed (undirected) path from s to t. We sometimes refer to the problem of deciding whether a given directed (undirected) graph with two given points sand t is (s, t)-connected as the directed (undirected) reachability problem.
How Good is Recursive Bisection?
- SIAM J. Sci. Comput
, 1995
"... . The most commonly used p-way partitioning method is recursive bisection (RB). It first divides a graph or a mesh into two equal sized pieces, by a "good" bisection algorithm, and then recursively divides the two pieces. Ideally, we would like to use an optimal bisection algorithm. Because the opti ..."
Abstract
-
Cited by 62 (4 self)
- Add to MetaCart
. The most commonly used p-way partitioning method is recursive bisection (RB). It first divides a graph or a mesh into two equal sized pieces, by a "good" bisection algorithm, and then recursively divides the two pieces. Ideally, we would like to use an optimal bisection algorithm. Because the optimal bisection problem, that partitions a graph into two equal sized subgraphs to minimize the number of edges cut, is NP-complete, practical RB algorithms use more efficient heuristics in place of an optimal bisection algorithm. Most such heuristics are designed to find the best possible bisection within allowed time. We show that the recursive bisection method, even when an optimal bisection algorithm is assumed, may produce a p-way partition that is very far way from the optimal one. Our negative result is complemented by two positive ones: First we show that for some important classes of graphs that occur in practical applications, such as well-shaped finite element and finite difference...
A New Polynomial Factorization Algorithm and its Implementation
- Journal of Symbolic Computation
, 1996
"... We consider the problem of factoring univariate polynomials over a finite field. We demonstrate that the new baby step/giant step factoring method, recently developed by Kaltofen & Shoup, can be made into a very practical algorithm. We describe an implementation of this algorithm, and present the re ..."
Abstract
-
Cited by 61 (5 self)
- Add to MetaCart
We consider the problem of factoring univariate polynomials over a finite field. We demonstrate that the new baby step/giant step factoring method, recently developed by Kaltofen & Shoup, can be made into a very practical algorithm. We describe an implementation of this algorithm, and present the results of empirical tests comparing this new algorithm with others. When factoring polynomials modulo large primes, the algorithm allows much larger polynomials to be factored using a reasonable amount of time and space than was previously possible. For example, this new software has been used to factor a "generic" polynomial of degree 2048 modulo a 2048-bit prime in under 12 days on a Sun SPARC-station 10, using 68 MB of main memory. 1 Introduction We consider the problem of factoring a univariate polynomial of degree n over the field F p of p elements, where p is prime. This problem has been well-studied, and many algorithms for its solution have been proposed. In general, the running tim...
Subquadratic-time factoring of polynomials over finite fields
- Math. Comp
, 1998
"... Abstract. New probabilistic algorithms are presented for factoring univariate polynomials over finite fields. The algorithms factor a polynomial of degree n over a finite field of constant cardinality in time O(n 1.815). Previous algorithms required time Θ(n 2+o(1)). The new algorithms rely on fast ..."
Abstract
-
Cited by 56 (11 self)
- Add to MetaCart
Abstract. New probabilistic algorithms are presented for factoring univariate polynomials over finite fields. The algorithms factor a polynomial of degree n over a finite field of constant cardinality in time O(n 1.815). Previous algorithms required time Θ(n 2+o(1)). The new algorithms rely on fast matrix multiplication techniques. More generally, to factor a polynomial of degree n over the finite field Fq with q elements, the algorithms use O(n 1.815 log q) arithmetic operations in Fq. The new “baby step/giant step ” techniques used in our algorithms also yield new fast practical algorithms at super-quadratic asymptotic running time, and subquadratic-time methods for manipulating normal bases of finite fields. 1.
Nearly Optimal Algorithms For Canonical Matrix Forms
, 1993
"... A Las Vegas type probabilistic algorithm is presented for finding the Frobenius canonical form of an n x n matrix T over any field K. The algorithm requires O~(MM(n)) = MM(n) (log n) ^ O(1) operations in K, where O(MM(n)) operations in K are sufficient to multiply two n x n matrices over K. This nea ..."
Abstract
-
Cited by 55 (11 self)
- Add to MetaCart
A Las Vegas type probabilistic algorithm is presented for finding the Frobenius canonical form of an n x n matrix T over any field K. The algorithm requires O~(MM(n)) = MM(n) (log n) ^ O(1) operations in K, where O(MM(n)) operations in K are sufficient to multiply two n x n matrices over K. This nearly matches the lower bound of \Omega(MM(n)) operations in K for this problem, and improves on the O(n^4) operations in K required by the previously best known algorithms. We also demonstrate a fast parallel implementation of our algorithm for the Frobenius form, which is processor-efficient on a PRAM. As an application we give an algorithm to evaluate a polynomial g(x) in K[x] at T which requires only O~(MM(n)) operations in K when deg g < n^2. Other applications include sequential and parallel algorithms for computing the minimal and characteristic polynomials of a matrix, the rational Jordan form of a matrix, for testing whether two matrices are similar, and for matrix powering, which are substantially faster than those previously known.
On-Line Learning of Linear Functions
- Computational Complexity
, 1991
"... this paper, we present near-optimal strategies for combining opinions in situations like this. In more abstract terms, we study the on-line learning of linear functions. We assume that learning proceeds in a sequence of trials. At trial number t the learning algorithm (the advisor) is presented with ..."
Abstract
-
Cited by 40 (18 self)
- Add to MetaCart
this paper, we present near-optimal strategies for combining opinions in situations like this. In more abstract terms, we study the on-line learning of linear functions. We assume that learning proceeds in a sequence of trials. At trial number t the learning algorithm (the advisor) is presented with an instance ~x t 2 [0; 1]
Sublinear-Time Parallel Algorithms for Matching and Related Problems
, 1988
"... This paper presents the first sublinear-time deterministic parallel algorithms for bipartite matching and several related problems, including maximal node-disjoint paths, depth-first search, and flows in zero-one networks. Our results are based on a better understanding of the combinatorial struc ..."
Abstract
-
Cited by 33 (6 self)
- Add to MetaCart
This paper presents the first sublinear-time deterministic parallel algorithms for bipartite matching and several related problems, including maximal node-disjoint paths, depth-first search, and flows in zero-one networks. Our results are based on a better understanding of the combinatorial structure of the above problems, which leads to new algorithmic techniques. In particular, we show how to use maximal matching to extend, in parallel, a current set of nodedisjoint paths and how to take advantage of the parallelism that arises when a large number of nodes are "active" during an execution of a push-relabel network flow algorithm. We also show how to apply our techniques to design parallel algorithms for the weighted versions of the above problems. In particular, we present sublinear-time deterministic parallel algorithms for finding a minimum-weight bipartite matching and for finding a minimum-cost flow in a network with zero-one capacities, if the weights are polynomially ...
Approximate Inverse Preconditioners for General Sparse Matrices
, 1994
"... The standard Incomplete LU (ILU) preconditioners often fail for general sparse indefinite matrices because they give rise to `unstable' factors L and U . In such cases, it may be attractive to approximate the inverse of the matrix directly. This paper focuses on approximate inverse preconditioner ..."
Abstract
-
Cited by 24 (6 self)
- Add to MetaCart
The standard Incomplete LU (ILU) preconditioners often fail for general sparse indefinite matrices because they give rise to `unstable' factors L and U . In such cases, it may be attractive to approximate the inverse of the matrix directly. This paper focuses on approximate inverse preconditioners based on minimizing kI \GammaAM k F , where AM is the preconditioned matrix. An iterative descent-type method is used to approximate each column of the inverse. For this approach to be efficient, the iteration must be done in sparse mode, i.e., with `sparse-matrix by sparse-vector' operations. Numerical dropping is applied to each column to maintain sparsity in the approximate inverse. Compared to previous methods, this is a natural way to determine the sparsity pattern of the approximate inverse. This paper discusses options such as Newton and `global' iteration, self-preconditioning, dropping strategies, and factorized forms. The performance of the options are compared on standar...

