Results 11–20 of 73
An improved Newton iteration for the generalized inverse of a matrix, with applications
 SIAM J. Sci. Stat. Comput
, 1991
Abstract

Cited by 27 (14 self)
An Improved Newton Iteration for the Generalized Inverse of a Matrix, with Applications
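For context on the object in the title: the classical Newton (Newton–Schulz) iteration for a matrix inverse is X_{k+1} = X_k(2I − A X_k), which converges quadratically from a suitably scaled starting value. A minimal NumPy sketch (the scaling X_0 = A^T/(‖A‖_1 ‖A‖_∞) is a standard textbook choice, not necessarily the paper's improved variant):

```python
import numpy as np

def newton_schulz_pinv(A, iters=60):
    """Approximate the generalized (Moore-Penrose) inverse of A.

    Iterates X <- X (2I - A X); with X_0 = A^T / (||A||_1 ||A||_inf)
    the eigenvalues of A X_0 lie in (0, 1], so the iteration converges
    (quadratically, once contraction sets in) to the generalized inverse.
    """
    X = A.T / (np.linalg.norm(A, 1) * np.linalg.norm(A, np.inf))
    I = np.eye(A.shape[0])
    for _ in range(iters):
        X = X @ (2 * I - A @ X)
    return X

# demo on a random square matrix
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
X = newton_schulz_pinv(A)
```

For a nonsingular square matrix this recovers the ordinary inverse; the same iteration, with the same starting value, handles rectangular and rank-deficient inputs.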
A study of Coppersmith's block Wiedemann algorithm using matrix polynomials
 LMC-IMAG, Report # 975 IM
, 1997
Abstract

Cited by 24 (8 self)
We analyse a randomized block algorithm proposed by Coppersmith for solving large sparse systems of linear equations, Aw = 0, over a finite field K = GF(q). It is a modification of an algorithm of Wiedemann. Coppersmith has given heuristic arguments to understand why the algorithm works, but it was an open question to prove that it may produce a solution, with positive probability, for small finite fields, e.g. for K = GF(2). We answer this question nearly completely. The algorithm uses two random matrices X and Y of dimensions m × N and N × n. Over any finite field, we show how the parameters m and n of the algorithm may be tuned so that, for any input system, a solution is computed with high probability. Conversely, for certain particular input systems, we show that the conditions on the input parameters may be relaxed while still ensuring success. We also improve the probability bound of Kaltofen in the case of large-cardinality fields. Lastly, for the sake of completeness of the...
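The scalar (m = n = 1) ancestor of the block method is compact enough to sketch: project the Krylov sequence with random vectors u, v, recover the minimal generating polynomial of the scalar sequence u·A^i v with Berlekamp–Massey, and evaluate its non-λ^k part at A to land in the kernel. A toy sketch over GF(101) (an illustrative prime; the paper's block version replaces u and v by the random matrices X and Y, and its hard case is small fields such as GF(2)):

```python
import numpy as np

def berlekamp_massey(s, p):
    """Minimal connection polynomial C (C[0] = 1) over GF(p) with
    sum_i C[i] * s[n - i] == 0 (mod p) for all n >= deg C."""
    C, B, L, m, b = [1], [1], 0, 1, 1
    for n in range(len(s)):
        d = s[n] % p
        for i in range(1, L + 1):
            d = (d + C[i] * s[n - i]) % p
        if d == 0:
            m += 1
            continue
        coef = d * pow(b, p - 2, p) % p        # d / b in GF(p)
        T = C[:]
        if len(B) + m > len(C):
            C += [0] * (len(B) + m - len(C))
        for i, Bi in enumerate(B):
            C[i + m] = (C[i + m] - coef * Bi) % p
        if 2 * L <= n:
            L, B, b, m = n + 1 - L, T, d, 1
        else:
            m += 1
    return C

def wiedemann_kernel(A, p, rng, tries=20):
    """Find w != 0 with A w = 0 (mod p) for a singular matrix A."""
    n = A.shape[0]
    for _ in range(tries):
        u, v = rng.integers(0, p, n), rng.integers(0, p, n)
        s, w = [], v % p
        for _ in range(2 * n):                 # s_i = u . A^i v
            s.append(int(u @ w) % p)
            w = (A @ w) % p
        f = berlekamp_massey(s, p)[::-1]       # reversed: minimal polynomial,
        k = 0                                  # low-degree coefficients first
        while k < len(f) and f[k] == 0:
            k += 1                             # factor out lambda^k
        w = np.zeros(n, dtype=np.int64)
        for coef in reversed(f[k:]):           # Horner: w = g(A) v
            w = (A @ w + coef * v) % p
        if not w.any():
            continue                           # unlucky projection; retry
        for _ in range(n):                     # A^k w = 0: walk into the kernel
            Aw = (A @ w) % p
            if not Aw.any():
                return w
            w = Aw
    raise RuntimeError("no kernel vector found (is A nonsingular mod p?)")

# demo: a 4 x 4 singular matrix over GF(101)
p, rng = 101, np.random.default_rng(1)
A = rng.integers(0, p, (4, 4))
A[3] = (A[0] + A[1] + A[2]) % p                # force a rank deficiency
w = wiedemann_kernel(A, p, rng)
```

The retry loop papers over the probabilistic failures whose precise probability, over small fields, is exactly what the paper analyses.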
Approximate Inverse Preconditioners for General Sparse Matrices
, 1994
Abstract

Cited by 24 (6 self)
The standard Incomplete LU (ILU) preconditioners often fail for general sparse indefinite matrices because they give rise to 'unstable' factors L and U. In such cases, it may be attractive to approximate the inverse of the matrix directly. This paper focuses on approximate inverse preconditioners based on minimizing ‖I − AM‖_F, where AM is the preconditioned matrix. An iterative descent-type method is used to approximate each column of the inverse. For this approach to be efficient, the iteration must be done in sparse mode, i.e., with sparse-matrix by sparse-vector operations. Numerical dropping is applied to each column to maintain sparsity in the approximate inverse. Compared to previous methods, this is a natural way to determine the sparsity pattern of the approximate inverse. This paper discusses options such as Newton and 'global' iteration, self-preconditioning, dropping strategies, and factorized forms. The performance of the options is compared on standar...
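The columnwise minimization can be sketched in dense form: for each unit vector e_j, take a few minimal-residual descent steps on ‖e_j − A m_j‖_2 and drop small entries. A toy NumPy version (the paper's efficiency point is precisely that this should instead run in sparse mode; the diagonal initial guess, step count, and drop tolerance here are illustrative assumptions):

```python
import numpy as np

def approx_inverse(A, steps=10, drop_tol=1e-4):
    """Approximate inverse M, one column at a time, by minimal-residual
    descent on ||e_j - A m_j||_2 with numerical dropping."""
    n = A.shape[0]
    M = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = 1.0
        m = e / A[j, j]                    # diagonal initial guess
        for _ in range(steps):
            r = e - A @ m                  # current column residual
            Ar = A @ r
            denom = Ar @ Ar
            if denom == 0.0:
                break
            m = m + (r @ Ar) / denom * r   # minimal-residual step along r
            m[np.abs(m) < drop_tol] = 0.0  # dropping keeps the column sparse
        M[:, j] = m
    return M

# demo on a diagonally dominant test matrix
rng = np.random.default_rng(0)
A = 8.0 * np.eye(8) + 0.5 * rng.random((8, 8))
M = approx_inverse(A)
```

The Frobenius norm of I − AM after a few steps measures directly how good the preconditioner is, which is what makes the ‖I − AM‖_F objective natural.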
Parallel Matrix Multiplication on a Linear Array with a Reconfigurable Pipelined Bus System
 IEEE Transactions on Computers
, 1997
Abstract

Cited by 22 (6 self)
The known fast sequential algorithms for multiplying two N × N matrices (over an arbitrary ring) have time complexity O(N^α), where 2 < α < 3. The current best value of α is less than 2.3755. We show that for all 1 ≤ p ≤ N^α, multiplying two N × N matrices can be performed on a p-processor linear array with a reconfigurable pipelined bus system (LARPBS) in O(N^α/p + (N^2/p^(2/α)) log p) time. This is currently the fastest parallelization of the best known sequential matrix multiplication algorithm on a distributed memory parallel system. In particular, for all 1 ≤ p ≤ N^2.3755, multiplying two N × N matrices can be performed on a p-processor LARPBS in O(N^2.3755/p + (N^2/p^0.8419) log p) time, and linear speedup can be achieved for p as large as O(N^2.3755/(log N)^6.3262). Furthermore, multiplying two N × N matrices can be performed on an LARPBS with O(N^α) processors in O(log N) time. This compares favorably with...
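The classic instance of the O(N^α) family is Strassen's algorithm (α = log₂ 7 ≈ 2.807; the α < 2.3755 figure refers to the asymptotically faster but impractical Coppersmith–Winograd line). A sequential sketch of the seven-multiplication recursion that such parallelizations build on:

```python
import numpy as np

def strassen(A, B, cutoff=32):
    """Strassen's seven-multiplication recursion for N x N arrays,
    N a power of two; falls back to ordinary multiplication at `cutoff`."""
    n = A.shape[0]
    if n <= cutoff:
        return A @ B
    h = n // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
    M1 = strassen(A11 + A22, B11 + B22, cutoff)
    M2 = strassen(A21 + A22, B11, cutoff)
    M3 = strassen(A11, B12 - B22, cutoff)
    M4 = strassen(A22, B21 - B11, cutoff)
    M5 = strassen(A11 + A12, B22, cutoff)
    M6 = strassen(A21 - A11, B11 + B12, cutoff)
    M7 = strassen(A12 - A22, B21 + B22, cutoff)
    C = np.empty((n, n), dtype=M1.dtype)       # assemble the four quadrants
    C[:h, :h] = M1 + M4 - M5 + M7
    C[:h, h:] = M3 + M5
    C[h:, :h] = M2 + M4
    C[h:, h:] = M1 - M2 + M3 + M6
    return C

# demo
rng = np.random.default_rng(0)
A = rng.standard_normal((64, 64))
B = rng.standard_normal((64, 64))
C = strassen(A, B, cutoff=16)
```

Seven recursive products instead of eight is what drives the exponent below 3, and the independent M1..M7 are also what the divide-and-conquer parallelizations exploit.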
Efficient Matrix Preconditioners for Black Box Linear Algebra
 Linear Algebra and its Applications, 343–344 (2002), 119–146. Special issue on structured and infinite systems of linear equations
, 2001
Abstract

Cited by 22 (16 self)
The main idea of the "black box" approach in exact linear algebra is to reduce matrix problems to the computation of minimum polynomials. In most cases preconditioning is necessary to obtain the desired result. Here, good preconditioners will be used to ensure geometric/algebraic properties of matrices, rather than numerical ones, so we do not address a condition number. We offer a review of problems for which (algebraic) preconditioning is used, provide a bestiary of preconditioning problems, and discuss several preconditioner types to solve these problems. We present new conditioners, including conditioners to preserve low displacement rank for Toeplitz-like matrices. We also provide new analyses of preconditioner performance and results on the relations among preconditioning problems and with linear algebra problems. Thus improvements are offered for the efficiency and applicability of preconditioners. The focus is on linear algebra problems over finite fields, but most results are valid for entries from arbitrary fields.
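Concretely, a "black box" matrix is a matrix-vector product one may call but not inspect, and an algebraic preconditioner is composed around it without ever forming the product matrix. A sketch over GF(p) with random invertible diagonal scalings as a simple illustrative conditioner family (the prime and this particular family are assumptions for the demo, not necessarily the paper's recommended choices):

```python
import numpy as np

p = 101  # illustrative word-size prime

def blackbox(A):
    """Wrap an explicit matrix as a black box: only v -> A v is exposed."""
    return lambda v: (A @ v) % p

def diag_conditioned(bb, n, rng):
    """Black box for B = D1 A D2 with random invertible diagonals D1, D2.

    The conditioned operator stays a black box: one call to bb plus
    2n scalar multiplications per application, and rank is preserved
    because D1, D2 are invertible."""
    d1 = rng.integers(1, p, n)
    d2 = rng.integers(1, p, n)
    apply_B = lambda v: (d1 * bb((d2 * v) % p)) % p
    return apply_B, d1, d2

# demo: the closure agrees with the explicit product D1 A D2
rng = np.random.default_rng(0)
A = rng.integers(0, p, (6, 6))
apply_B, d1, d2 = diag_conditioned(blackbox(A), 6, rng)
```

A black-box algorithm (e.g. Wiedemann's) then works with apply_B exactly as it would with the original operator, which is why conditioner cost per application is the figure of merit.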
Fast Computation of Special Resultants
, 2006
Abstract

Cited by 20 (8 self)
We propose fast algorithms for computing composed products and composed sums, as well as diamond products of univariate polynomials. These operations correspond to special multivariate resultants, that we compute using power sums of roots of polynomials, by means of their generating series.
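The power-sum mechanism is concrete enough to show directly: if p_k(f) is the k-th power sum of the roots of f, the composed product f ⊗ g (roots a_i·b_j) satisfies p_k(f ⊗ g) = p_k(f)·p_k(g), and Newton's identities convert between coefficients and power sums. A quadratic-time sketch over the rationals (the paper's contribution is performing these conversions in quasi-linear time via generating series):

```python
from fractions import Fraction

def power_sums(monic, N):
    """First N power sums of the roots of x^n + c[1] x^(n-1) + ... + c[n].

    Newton's identities: p_k + c_1 p_{k-1} + ... + c_{k-1} p_1 + k c_k = 0
    for k <= n, and p_k + c_1 p_{k-1} + ... + c_n p_{k-n} = 0 for k > n."""
    n = len(monic) - 1
    c = [Fraction(a) for a in monic]
    p = []
    for k in range(1, N + 1):
        s = -k * c[k] if k <= n else Fraction(0)
        for i in range(1, min(k - 1, n) + 1):
            s -= c[i] * p[k - i - 1]
        p.append(s)
    return p

def from_power_sums(p, n):
    """Invert Newton's identities: monic coefficients from p_1..p_n."""
    c = [Fraction(1)]
    for k in range(1, n + 1):
        s = p[k - 1]
        for i in range(1, k):
            s += c[i] * p[k - i - 1]
        c.append(-s / k)
    return c

def composed_product(f, g):
    """Coefficients of the polynomial whose roots are all products a_i b_j."""
    n = (len(f) - 1) * (len(g) - 1)
    pf, pg = power_sums(f, n), power_sums(g, n)
    return from_power_sums([a * b for a, b in zip(pf, pg)], n)

# roots {1, 2} and {2, 3} -> composed-product roots {2, 3, 4, 6}
h = composed_product([1, -3, 2], [1, -5, 6])
```

Composed sums work the same way, except the exponential generating series of the power sums are multiplied instead of the term-wise sequences.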
Computing Popov and Hermite forms of polynomial matrices
 In International Symposium on Symbolic and Algebraic Computation, Zürich, Switzerland
, 1996
Abstract

Cited by 19 (10 self)
For a polynomial matrix P(z) of degree d in M_{n,n}(K[z]), where K is a commutative field, a reduction to the Hermite normal form can be computed in O(nd M(n) + M(nd)) arithmetic operations, where M(n) is the time required to multiply two n × n matrices over K. Further, a reduction can be computed using O(log^(θ+1)(nd)) parallel arithmetic steps and O(L(nd)) processors if the same processor bound holds with time O(log^θ(nd)) for determining the lexicographically first maximal linearly independent subset of the set of the columns of an nd × nd matrix over K. These results are obtained by applying, in the matrix case, the techniques used in the scalar case of the gcd of polynomials.
Fast and processor efficient parallel matrix multiplication algorithms on a linear array with a reconfigurable pipelined bus system
 IEEE Trans. on Parallel and Distributed Systems
, 1998
Abstract

Cited by 18 (9 self)
Abstract—We present efficient parallel matrix multiplication algorithms for linear arrays with reconfigurable pipelined bus systems (LARPBS). Such systems are able to support a large volume of parallel communication of various patterns in constant time. An LARPBS can also be reconfigured into many independent subsystems and, thus, is able to support parallel implementations of divide-and-conquer computations like Strassen's algorithm. The main contributions of the paper are as follows. We develop five matrix multiplication algorithms with varying degrees of parallelism on the LARPBS computing model, namely, MM1, MM2, MM3, and compound algorithms C1(ε) and C2(δ). Algorithm C1(ε) has adjustable time complexity at the sublinear level. Algorithm C2(δ) shows that it is feasible to achieve sublogarithmic time using o(N^3) processors for matrix multiplication on a realistic system. Algorithms MM3, C1(ε), and C2(δ) all have o(N^3) cost and, hence, are very processor-efficient. Algorithms MM1, MM3, and C1(ε) are general-purpose matrix multiplication algorithms, where the array elements are in any ring. Algorithms MM2 and C2(δ) are applicable to array elements that are integers of bounded magnitude, or floating-point values of bounded precision and magnitude, or Boolean values. Extensions of algorithms MM2 and C2(δ) to unbounded integers and reals are also discussed.
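As a small illustration of the bounded-value setting that MM2 and C2(δ) target, the Boolean matrix product admits word-level bit parallelism even sequentially (a generic trick, not the LARPBS algorithm itself):

```python
def bool_matmul(A, B):
    """Boolean product C[i][j] = OR_k (A[i][k] AND B[k][j]).

    Rows of B are packed into integers so each accumulation step
    ORs a whole row at machine-word speed."""
    width = len(B[0])
    brows = [sum(bit << j for j, bit in enumerate(row)) for row in B]
    C = []
    for row in A:
        acc = 0
        for a, brow in zip(row, brows):
            if a:                      # OR in row k of B wherever A[i][k] = 1
                acc |= brow
        C.append([(acc >> j) & 1 for j in range(width)])
    return C

# demo
A = [[1, 0, 1], [0, 0, 1]]
B = [[0, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1]]
C = bool_matmul(A, B)
```

Restricting the element domain is exactly what lets such algorithms beat the general-ring bounds.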
Geometric separators for finiteelement meshes
 SIAM J. Sci. Comput
, 1998
Abstract

Cited by 18 (0 self)
Abstract. We propose a class of graphs that would occur naturally in finite-element and finite-difference problems and we prove a bound on separators for this class of graphs. Graphs in this class are embedded in d-dimensional space in a certain manner. For d-dimensional graphs our separator bound is O(n^((d−1)/d)), which is the best possible bound. We also propose a simple randomized algorithm to find this separator in O(n) time. This separator algorithm can be used to partition the mesh among processors of a parallel computer and can also be used for the nested dissection sparse elimination algorithm.
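The flavor of such geometric separators can be shown with a deliberately simplified, deterministic stand-in (a median-coordinate cut rather than the paper's randomized sphere separators): on a grid-like mesh in d dimensions, cutting at the median of the widest coordinate touches about n^((d−1)/d) vertices.

```python
import numpy as np

def median_separator(points, edges):
    """Split a mesh at the median of its widest coordinate.

    Much simpler than the paper's algorithm, but illustrative: one
    endpoint per edge crossing the median hyperplane suffices as a
    vertex separator between the two halves."""
    pts = np.asarray(points, dtype=float)
    axis = int(np.argmax(pts.max(axis=0) - pts.min(axis=0)))  # widest spread
    side = pts[:, axis] < np.median(pts[:, axis])             # boolean halves
    sep = {v for u, v in edges if side[u] != side[v]}         # crossing edges
    return side, sep

# demo: a k x k grid mesh in the plane (n = k^2 vertices)
k = 20
points = [(i, j) for i in range(k) for j in range(k)]
edges = [(i * k + j, i * k + j + 1) for i in range(k) for j in range(k - 1)]
edges += [(i * k + j, (i + 1) * k + j) for i in range(k - 1) for j in range(k)]
side, sep = median_separator(points, edges)
```

Here n = 400 and the separator has k = 20 = n^(1/2) vertices, matching the d = 2 case of the O(n^((d−1)/d)) bound for this regular mesh.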
Constructing Trees in Parallel
, 1989
Abstract

Cited by 17 (1 self)
O(log^2 n)-time, n^2/log n-processor, as well as O(log n)-time, n^3/log n-processor CREW deterministic parallel algorithms are presented for constructing Huffman codes from a given list of frequencies. The time can be reduced to O(log n (log log n)^2) on a CRCW model, using only n processors. Also presented is an optimal O(log n)-time, O(n/log n)-processor EREW parallel algorithm for constructing a tree given a list of leaf depths when the depths are monotonic.
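For reference, the sequential baseline these parallel constructions compete with is the classical greedy heap algorithm, O(n log n):

```python
import heapq

def huffman_code(freqs):
    """Greedy Huffman construction: repeatedly merge the two lightest
    subtrees, prepending one bit to every symbol of each merged subtree."""
    if len(freqs) == 1:
        return {next(iter(freqs)): "0"}
    # (weight, tie-break index, symbols in this subtree)
    heap = [(w, i, (sym,)) for i, (sym, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    code = {sym: "" for sym in freqs}
    while len(heap) > 1:
        w1, _, syms1 = heapq.heappop(heap)
        w2, i2, syms2 = heapq.heappop(heap)
        for s in syms1:
            code[s] = "0" + code[s]
        for s in syms2:
            code[s] = "1" + code[s]
        heapq.heappush(heap, (w1 + w2, i2, syms1 + syms2))
    return code

# the classic textbook frequency table
code = huffman_code({"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5})
```

The difficulty the paper addresses is that this greedy loop is inherently sequential, so the parallel algorithms must reorganize the merging rather than simulate the heap.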