Results 1–10 of 29
Applied Numerical Linear Algebra
 Society for Industrial and Applied Mathematics
, 1997
Abstract

Cited by 525 (26 self)
We survey general techniques and open problems in numerical linear algebra on parallel architectures. We first discuss basic principles of parallel processing, describing the costs of basic operations on parallel machines, including general principles for constructing efficient algorithms. We illustrate these principles using current architectures and software systems, and by showing how one would implement matrix multiplication. Then we present direct and iterative algorithms for solving linear systems of equations, linear least squares problems, the symmetric eigenvalue problem, the nonsymmetric eigenvalue problem, and the singular value decomposition. We consider dense, band, and sparse matrices.
An Updated Set of Basic Linear Algebra Subprograms (BLAS)
 ACM Transactions on Mathematical Software
, 2001
Abstract

Cited by 72 (7 self)
This paper summarizes the BLAS Technical Forum Standard, a specification of a set of kernel routines for linear algebra, historically called the Basic Linear Algebra Subprograms and commonly known as the BLAS. The complete standard can be found in [1] and on the BLAS Technical Forum webpage, http://www.netlib.org/blas/blastforum/.
Faster Numerical Algorithms via Exception Handling
, 1993
Abstract

Cited by 44 (8 self)
In this paper we explore the use of this paradigm in the design of numerical algorithms. We exploit the fact that there are numerical algorithms that run quickly and usually give the right answer, as well as other, slower algorithms that are always right. By "right answer" we mean that the algorithm is stable, or that it computes the exact answer for a problem that is a slight perturbation of its input [9]; this is all we can reasonably ask of most algorithms. To take advantage of the faster but occasionally unstable algorithms, we will use the following paradigm: (1) Use the fast algorithm to compute an answer; this will usually be done stably. (2) Quickly and reliably assess the accuracy of the computed answer. (3) In the unlikely event the answer is not accurate enough, recompute it slowly but accurately.
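The three-step paradigm described in this abstract can be sketched in a few lines. Here `solve_fast`, `solve_safe`, and `residual_ok` are hypothetical stand-ins (reciprocal computation with a residual check), not the paper's actual routines:

```python
def solve_fast(x):
    # Fast path: may fail or lose stability on hard inputs.
    if x == 0:
        raise ZeroDivisionError("fast path failed")
    return 1.0 / x

def solve_safe(x):
    # Slow path: always returns a stable answer.
    return 0.0 if x == 0 else 1.0 / x

def residual_ok(x, y, tol=1e-12):
    # Step 2: cheaply and reliably assess the computed answer.
    return x != 0 and abs(x * y - 1.0) <= tol

def solve(x):
    # Step 1: try the fast algorithm, trapping exceptions.
    try:
        y = solve_fast(x)
    except (OverflowError, ZeroDivisionError):
        return solve_safe(x)   # Step 3: recompute slowly on failure.
    # Steps 2-3: verify; recompute slowly only if the check fails.
    return y if residual_ok(x, y) else solve_safe(x)
```

The exception handler is what makes the fast path safe to attempt: the rare failure costs one extra slow solve, while the common case pays only for the fast algorithm plus the cheap check.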
Enhanced Word Clustering for Hierarchical Text Classification
, 2002
Abstract

Cited by 44 (2 self)
In this paper we propose a new information-theoretic divisive algorithm for word clustering applied to text classification. In previous work, such "distributional clustering" of features has been found to achieve improvements over feature selection in terms of classification accuracy, especially at lower numbers of features [2, 28]. However, the existing clustering techniques are agglomerative in nature and result in (i) suboptimal word clusters and (ii) high computational cost. In order to explicitly capture the optimality of word clusters in an information-theoretic framework, we first derive a global criterion for feature clustering. We then present a fast, divisive algorithm that monotonically decreases this objective function value, thus converging to a local minimum. We show that our algorithm minimizes the "within-cluster Jensen-Shannon divergence" while simultaneously maximizing the "between-cluster Jensen-Shannon divergence". In comparison to the previously proposed agglomerative strategies, our divisive algorithm achieves higher classification accuracy, especially at lower numbers of features. We further show that feature clustering is an effective technique for building smaller class models in hierarchical classification. We present detailed experimental results using Naive Bayes and Support Vector Machines on the 20 Newsgroups data set and a 3-level hierarchy of HTML documents collected from the Dmoz Open Directory.
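The Jensen-Shannon divergence that the abstract's objective is built on can be stated compactly; a minimal sketch (two distributions, equal weights by default, log base 2), not the paper's full clustering criterion:

```python
import math

def kl(p, q):
    # Kullback-Leibler divergence D(p || q) in bits; 0*log(0) taken as 0.
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js(p, q, wp=0.5, wq=0.5):
    # (Weighted) Jensen-Shannon divergence: the quantity minimized
    # within clusters and maximized between clusters.
    m = [wp * pi + wq * qi for pi, qi in zip(p, q)]
    return wp * kl(p, m) + wq * kl(q, m)
```

Unlike KL divergence, JS divergence is symmetric and always finite, which is what makes it a usable distance between word distributions with disjoint supports.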
Orthogonal Eigenvectors and Relative Gaps
, 2002
Abstract

Cited by 38 (16 self)
Let LDL^T be the triangular factorization of a real symmetric n × n tridiagonal matrix, so that L is a unit lower bidiagonal matrix and D is diagonal. Let (λ, v) be an eigenpair, λ ≠ 0, with the property that both λ and v are determined to high relative accuracy by the parameters in L and D. Suppose also that the relative gap between λ and its nearest neighbor μ in the spectrum exceeds 1/n: n|λ − μ| > |λ|. This paper presents a new O(n) algorithm and a proof that, in the presence of roundoff error, the algorithm computes an approximate eigenvector v̂ that is accurate to working precision: |sin ∠(v, v̂)| = O(nε), where ε is the roundoff unit. It follows that v̂ is numerically orthogonal to all the other eigenvectors. This result forms part of a program to compute numerically orthogonal eigenvectors without resorting to the Gram-Schmidt process. The contents of this paper provide a high-level description and theoretical justification for LAPACK (version 3.0) subroutine DLAR1V.
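The LDL^T factorization the abstract starts from can be computed in O(n) for a tridiagonal matrix. A minimal, unpivoted sketch (assumes the factorization exists; illustrative only, with none of the relative-accuracy machinery the paper is about):

```python
def ldlt_tridiagonal(d, e):
    # Factor the symmetric tridiagonal matrix with diagonal d (length n)
    # and subdiagonal e (length n-1) as L D L^T, where L is unit lower
    # bidiagonal. Returns the diagonal of D and the subdiagonal of L.
    n = len(d)
    D = [0.0] * n
    L = [0.0] * (n - 1)
    D[0] = d[0]
    for i in range(1, n):
        L[i - 1] = e[i - 1] / D[i - 1]          # eliminate e[i-1]
        D[i] = d[i] - L[i - 1] * e[i - 1]       # Schur complement pivot
    return D, L
```

For example, d = [2, 2, 2], e = [1, 1] gives pivots D = [2, 3/2, 4/3], the familiar pattern for the 1-2-1 stencil matrix.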
A New O(n²) Algorithm for the Symmetric Tridiagonal Eigenvalue/Eigenvector Problem
 In progress
, 1997
A Serial Implementation of Cuppen's Divide and Conquer Algorithm for the Symmetric Eigenvalue Problem
, 1994
Abstract

Cited by 24 (0 self)
This report discusses a serial implementation of Cuppen's divide and conquer algorithm for computing all eigenvalues and eigenvectors of a real symmetric matrix T = QΛQ^T. This method is compared with the LAPACK implementations of QR, bisection/inverse iteration, and root-free QR/inverse iteration for finding all of the eigenvalues and eigenvectors. On a DEC Alpha using optimized Basic Linear Algebra Subroutines (BLAS), divide and conquer was uniformly the fastest algorithm by a large margin for large tridiagonal eigenproblems. When Fortran BLAS were used, bisection/inverse iteration was somewhat faster (up to a factor of 2) for very large matrices (n ≥ 500) without clustered eigenvalues. When eigenvalues were clustered, divide and conquer was up to 80 times faster. The speedups over QR were so large in the tridiagonal case that the overall problem, including reduction to tridiagonal form, sped up by a factor of 2.5 over QR for n ≥ 500. Nearly universally, the matrix of eigenvectors generated by divide and conquer ...
On The Correctness Of Some BisectionLike Parallel Eigenvalue Algorithms In Floating Point Arithmetic
 Electronic Trans. Num. Anal
, 1995
Abstract

Cited by 13 (4 self)
Bisection is a parallelizable method for finding the eigenvalues of real symmetric tridiagonal matrices, or more generally symmetric acyclic matrices.
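The bisection method named here hinges on a Sturm-sequence count: the number of negative pivots of T − xI equals the number of eigenvalues below x. A minimal sketch for the tridiagonal case (illustrative only; the paper's subject is proving such counts remain monotonic in floating point):

```python
def count_eigs_below(d, e, x):
    # Number of eigenvalues of the symmetric tridiagonal matrix
    # (diagonal d, subdiagonal e) that are less than x, via the
    # inertia of T - x*I (signs of the LDL^T pivots).
    count, t = 0, 1.0
    for i in range(len(d)):
        q = (d[i] - x) - (e[i - 1] ** 2 / t if i > 0 else 0.0)
        if q == 0.0:
            q = 1e-300  # crude guard against an exact zero pivot
        if q < 0:
            count += 1
        t = q
    return count

def bisect_eig(d, e, k, lo, hi, tol=1e-12):
    # Find the k-th smallest eigenvalue in [lo, hi] by bisection on
    # the monotone count function above.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if count_eigs_below(d, e, mid) >= k:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)
```

Each count is an independent O(n) scan, which is why bisection parallelizes so naturally: different processors can refine disjoint eigenvalue intervals with no communication.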
Reliable Computation of the Condition Number of a Tridiagonal Matrix in O(n) Time
, 1997
Abstract

Cited by 13 (1 self)
We present one more algorithm to compute the condition number (for inversion) of an n × n tridiagonal matrix J in O(n) time. Previous O(n) algorithms for this task, given by Higham in [17], are based on the tempting compact representation of the upper (lower) triangle of J⁻¹ as the upper (lower) triangle of a rank-one matrix. However, they suffer from severe overflow and underflow problems, especially on diagonally dominant matrices. Our new algorithm avoids these problems and is as efficient as the earlier algorithms.
Keywords. Tridiagonal matrix, inverse, condition number, norm, overflow, underflow.
AMS subject classifications. 15A12, 15A60, 65F35.
1 Introduction. When solving a linear system Bx = r, we are interested in knowing how accurate the solution is. This question is often answered by showing that the solution computed in finite precision is exact for a matrix "close" to B, and then measuring how sensitive the solution is to a small perturbation. The condition numb...
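For contrast with the O(n) algorithms under discussion, here is a brute-force O(n²) reference computation of κ∞(J) = ‖J‖∞ ‖J⁻¹‖∞ that forms J⁻¹ column by column with the Thomas algorithm. This is a hedged baseline sketch, not the paper's method, and it has none of the overflow/underflow safeguards the paper is concerned with:

```python
def tridiag_solve(a, d, c, b):
    # Thomas algorithm: solve J x = b for tridiagonal J with
    # subdiagonal a, diagonal d, superdiagonal c (no pivoting).
    n = len(d)
    cp, bp = [0.0] * n, [0.0] * n
    cp[0] = c[0] / d[0] if n > 1 else 0.0
    bp[0] = b[0] / d[0]
    for i in range(1, n):
        denom = d[i] - a[i - 1] * cp[i - 1]
        cp[i] = c[i] / denom if i < n - 1 else 0.0
        bp[i] = (b[i] - a[i - 1] * bp[i - 1]) / denom
    x = [0.0] * n
    x[-1] = bp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = bp[i] - cp[i] * x[i + 1]
    return x

def cond_inf(a, d, c):
    # Reference O(n^2) computation of kappa_inf(J); the paper's point
    # is obtaining the ||J^{-1}||_inf factor reliably in O(n).
    n = len(d)
    norm_J = max(abs(d[i])
                 + (abs(a[i - 1]) if i > 0 else 0.0)
                 + (abs(c[i]) if i < n - 1 else 0.0)
                 for i in range(n))
    inv_rows = [[0.0] * n for _ in range(n)]
    for j in range(n):           # solve J x = e_j for each column of J^{-1}
        e = [0.0] * n
        e[j] = 1.0
        col = tridiag_solve(a, d, c, e)
        for i in range(n):
            inv_rows[i][j] = col[i]
    norm_inv = max(sum(abs(v) for v in row) for row in inv_rows)
    return norm_J * norm_inv
```

For the 2 × 2 matrix [[2, 1], [1, 2]] this gives ‖J‖∞ = 3 and ‖J⁻¹‖∞ = 1, so κ∞ = 3.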