Results 1 - 8 of 8
Tuning Strassen's Matrix Multiplication for Memory Efficiency
In Proceedings of SC98 (CD-ROM), 1998
Cited by 38 (4 self)
Abstract
Strassen's algorithm for matrix multiplication gains its lower arithmetic complexity at the expense of reduced locality of reference, which makes it challenging to implement the algorithm efficiently on a modern machine with a hierarchical memory system. We report on an implementation of this algorithm that uses several unconventional techniques to make the algorithm memory-friendly. First, the algorithm internally uses a nonstandard array layout known as Morton order that is based on a quadtree decomposition of the matrix. Second, we dynamically select the recursion truncation point to minimize padding without affecting the performance of the algorithm, which we can do by virtue of the cache behavior of the Morton ordering. Each technique is critical for performance, and their combination as done in our code multiplies their effectiveness. Performance comparisons of our implementation with that of competing implementations show that our implementation often outperforms th...
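A minimal sketch of the Morton-order layout mentioned in this abstract, not the authors' code. It assumes a square matrix whose side is a power of two and one common bit convention (column bits in even positions, row bits in odd positions); the names morton_index and to_morton are hypothetical. With this layout each quadrant of the matrix occupies one contiguous run of the flat array, which is the property a recursive algorithm benefits from.

```python
def morton_index(row: int, col: int, bits: int = 16) -> int:
    """Interleave the bits of row and col (assumed convention:
    column bits land in even positions, row bits in odd positions)."""
    index = 0
    for b in range(bits):
        index |= ((col >> b) & 1) << (2 * b)
        index |= ((row >> b) & 1) << (2 * b + 1)
    return index


def to_morton(matrix):
    """Copy a square row-major matrix (side a power of two)
    into a flat list stored in Morton order."""
    n = len(matrix)
    flat = [None] * (n * n)
    for r in range(n):
        for c in range(n):
            flat[morton_index(r, c)] = matrix[r][c]
    return flat


if __name__ == "__main__":
    m = [[r * 4 + c for c in range(4)] for r in range(4)]
    # The first four entries are the top-left 2 x 2 quadrant.
    print(to_morton(m)[:4])
```

Because a quadrant is contiguous, a recursive call on a submatrix can be handed a slice of the flat array rather than a strided view.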
Matrix Algorithms using Quadtrees
In Proc. ATABLE92, 1992
Cited by 5 (0 self)
Abstract
Many scheduling and synchronization problems for large-scale multiprocessing can be overcome using functional (or applicative) programming. With this observation, it is strange that so much attention within the functional programming community has focused on the "aggregate update problem" [10]: essentially how to implement FORTRAN arrays. This situation is strange because in-place updating of aggregates belongs more to uniprocessing than to mathematics. Several years ago functional style drew me to treatment of d-dimensional arrays as 2^d-ary trees; in particular, matrices become quaternary trees or quadtrees. This convention yields efficient recopying-cum-update of any array; recursive, algebraic decomposition of conventional arithmetic algorithms; and uniform representations and algorithms for both dense and sparse matrices. For instance, any nonsingular subtree is a candidate as the pivot block for Gaussian elimination; the restriction actually helps identification of pivot b...
Pivot-Free Block Matrix Inversion
Cited by 3 (0 self)
Abstract
We present a pivot-free deterministic algorithm for the inversion of block matrices. The method is based on the Moore-Penrose inverse and is applicable over certain general classes of rings. This improves on previous methods that required at least one invertible on-diagonal block, and that otherwise required row- or column-based pivoting, disrupting the block structure. Our method is applicable to any invertible matrix and does not require any particular blocks to be invertible. This is achieved at the cost of two additional specialized matrix multiplications and, in some cases, requires the inversion to be performed in an extended ring.
Block Matrix Inversion
Abstract
Motivation
• Matrices may be represented as quadtrees, using a recursive 2 × 2 block structure with all-zero matrices given by a null pointer. This representation has been studied earlier in the context of computer algebra [1].
• This representation is convenient for reasonably efficient storage in diverse cases, when it is not known whether matrices will be dense, sparse or structured.
• It is also convenient for reasonably efficient communication in diverse cases, when matrices may be accessed by row, column or randomly.
• This representation naturally supports asymptotically fast block algorithms, such as Strassen’s matrix multiplication [2].
• We are interested in measuring the performance of this representation for use in generic libraries [3, 4].
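The quadtree representation described above can be sketched as follows. This is an illustration only, under stated assumptions: the matrix side is a power of two, None plays the role of the null pointer for an all-zero block, a node is a 4-tuple (top-left, top-right, bottom-left, bottom-right), and the product uses the ordinary eight-multiplication block recursion rather than Strassen's scheme. All function names are hypothetical.

```python
def _node(quads):
    """Collapse a node whose four children are all zero into None."""
    return None if all(q is None for q in quads) else tuple(quads)


def build(dense, r=0, c=0, n=None):
    """Build a quadtree from a dense square matrix (side a power of two)."""
    if n is None:
        n = len(dense)
    if n == 1:
        v = dense[r][c]
        return None if v == 0 else v
    h = n // 2
    return _node([build(dense, r + dr, c + dc, h)
                  for dr in (0, h) for dc in (0, h)])


def add(a, b):
    """Sum of two quadtrees of the same size; zero blocks cost nothing."""
    if a is None:
        return b
    if b is None:
        return a
    if not isinstance(a, tuple):          # both are nonzero scalars
        s = a + b
        return None if s == 0 else s
    return _node([add(x, y) for x, y in zip(a, b)])


def mul(a, b):
    """Product of two quadtrees; a zero operand short-circuits the recursion."""
    if a is None or b is None:
        return None
    if not isinstance(a, tuple):          # both are nonzero scalars
        p = a * b
        return None if p == 0 else p
    a11, a12, a21, a22 = a
    b11, b12, b21, b22 = b
    return _node([add(mul(a11, b11), mul(a12, b21)),
                  add(mul(a11, b12), mul(a12, b22)),
                  add(mul(a21, b11), mul(a22, b21)),
                  add(mul(a21, b12), mul(a22, b22))])


def to_dense(q, n):
    """Expand a quadtree of side n back into a list of rows."""
    if q is None:
        return [[0] * n for _ in range(n)]
    if n == 1:
        return [[q]]
    h = n // 2
    q11, q12, q21, q22 = (to_dense(x, h) for x in q)
    top = [left + right for left, right in zip(q11, q12)]
    bottom = [left + right for left, right in zip(q21, q22)]
    return top + bottom
```

For example, build applied to a 4 × 4 identity matrix yields a node whose two off-diagonal children are None, so multiplying by it never descends into those blocks.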
Efficient Parallel Solutions of Indexed Recurrences with Linear Combinations
1997
Abstract
We consider a certain generalization of the well-known 2nd order linear recurrences $X_i = a_i \cdot X_{i-1} + b_i \cdot X_{i-2}$, $i = 1, \ldots, n$, to indexed recurrences with linear combinations $X_{g(i)} = a_i \cdot X_{f(i)} + b_i \cdot X_{h(i)}$, where $g(i)$, $f(i)$, $h(i)$ are arbitrary functions from $[1, \ldots, n]$ to $[1, \ldots, m]$. The problem is to find an efficient parallel algorithm that can compete with the sequential execution of the loop "for $i = 1, \ldots, n$ do $X[g(i)] = a_i \cdot X[f(i)] + b_i \cdot X[h(i)]$", which solves the above recurrence generalization. Such an algorithm (that uses only $O(n)$ work) can be used for automatic parallelization of sequential loops, which in many practical cases fit the form of the above generalized recurrence. A natural solution is to transform the above sequential loop into a set of matrix multiplications and use the associative property of matrix multiplications to compute the result in $\log n$ parallel steps. We show that unlike the ca...
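The "natural solution" mentioned in this abstract can be illustrated for the ordinary (non-indexed) 2nd order recurrence. The sketch below is not the paper's algorithm for indexed recurrences; it only shows how each step becomes a 2 × 2 matrix and how associativity allows the product to be formed in about log n rounds of pairwise multiplications. The rounds run sequentially here, although each round's products are independent and could run in parallel. The function names and the choice of X_{-1} and X_0 as initial values are assumptions.

```python
def sequential_last(a, b, x_m1, x_0):
    """Run X_i = a_i * X_{i-1} + b_i * X_{i-2} for i = 1..n and return X_n.
    a and b hold a_1..a_n and b_1..b_n; x_m1 and x_0 are X_{-1} and X_0."""
    prev2, prev1 = x_m1, x_0
    for ai, bi in zip(a, b):
        prev2, prev1 = prev1, ai * prev1 + bi * prev2
    return prev1


def matmul2(p, q):
    """Product of two 2 x 2 matrices stored as nested tuples."""
    return (
        (p[0][0] * q[0][0] + p[0][1] * q[1][0],
         p[0][0] * q[0][1] + p[0][1] * q[1][1]),
        (p[1][0] * q[0][0] + p[1][1] * q[1][0],
         p[1][0] * q[0][1] + p[1][1] * q[1][1]),
    )


def tree_last(a, b, x_m1, x_0):
    """Same result via (X_i, X_{i-1})^T = M_i (X_{i-1}, X_{i-2})^T with
    M_i = [[a_i, b_i], [1, 0]], combining M_n ... M_1 by pairwise rounds."""
    mats = [((ai, bi), (1, 0)) for ai, bi in zip(a, b)]
    if not mats:
        return x_0
    while len(mats) > 1:
        nxt = []
        for k in range(0, len(mats) - 1, 2):
            # The later step multiplies from the left.
            nxt.append(matmul2(mats[k + 1], mats[k]))
        if len(mats) % 2:
            nxt.append(mats[-1])
        mats = nxt
    m = mats[0]
    return m[0][0] * x_0 + m[0][1] * x_m1
```

With a_i = b_i = 1, X_{-1} = 0 and X_0 = 1 the recurrence generates Fibonacci numbers, which gives a quick way to check that both functions agree.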
iii
Abstract
The Office of Graduate Studies has verified and approved the above named committee members. ACKNOWLEDGEMENTS This thesis owes its existence to my major professor Dr. Robert van Engelen, who showed faith in me and gave me a great opportunity to do research under him. It is because of his guidance and endurance that I was able to bring this thesis to its current shape. I would also like to thank Dr. Lois Hawkes and Dr. Xin Yuan for serving on my graduate committee and providing me valuable guidance in my academic coursework. I feel indebted to my friend and senior Prasad Kulkarni. It wouldn’t have been possible to come to the U.S.A. for graduate studies and complete the term of studies without his guidance, support and encouragement. My parents and sister have also been a great source of strength and support to me. I am grateful to them for their perseverance, without which I couldn’t have imagined what life would have been for me.
Term-Propagation over Structured Data using . . .
2010
Abstract
In this diploma thesis we present pest (term-propagation using eigenvector computation over structured data), a new approach to approximate search over structured data. pest exploits the structure of the data to propagate term weights between related objects. We focus specifically on structured data for which meaningful answers are already given by the application semantics, e.g. pages in wikis or people in social networks. The pest matrix generalizes the Google matrix used in PageRank with factors that depend on the term weights, and allows (semantic) similarity to be weighted differently for different relationships in the data, e.g. friend vs. colleague in a social network. The eigenvectors of this pest matrix represent the distribution of the terms after propagation. Together, the eigenvectors of all terms form a vector space index that incorporates the structure of the data and can be handled with standard information retrieval techniques. In extensive experiments with a real-world wiki, we show how pest improves the quality of search results compared with several existing approaches. In addition, we use experiments with a social bookmarking platform to show how
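For orientation, the dominant eigenvector of the standard Google matrix used in PageRank can be approximated by power iteration, as in the rough sketch below. This is plain PageRank, not the pest matrix: the term-weight factors and per-relationship weights described in the abstract are not modelled. The function name, the damping value 0.85, and the input format (a dict mapping every node to the list of nodes it links to, with every link target also present as a key) are assumptions.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Approximate the dominant eigenvector of the Google matrix.
    links maps each node to the list of nodes it links to; every
    link target is assumed to appear as a key as well."""
    nodes = sorted(links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Teleportation share that every node receives.
        new = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            out = links[v]
            if out:
                share = damping * rank[v] / len(out)
                for w in out:
                    new[w] += share
            else:
                # A node without outgoing links spreads its rank evenly.
                share = damping * rank[v] / n
                for w in nodes:
                    new[w] += share
        rank = new
    return rank


if __name__ == "__main__":
    graph = {"a": ["c"], "b": ["c"], "c": ["a"], "d": []}
    for node, score in sorted(pagerank(graph).items()):
        print(node, round(score, 4))
```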
BLOCK RECURSIVE COMPUTATION OF GENERALIZED INVERSES
Abstract
A fully block recursive method for computing outer generalized inverses of a given square matrix is introduced. The method is applicable even when some of the main diagonal minors of A are singular or A itself is singular. The computational complexity of the method does not exceed that of matrix multiplication, under the assumption that the Strassen matrix inversion algorithm is used. A partially recursive algorithm for computing various classes of generalized inverses is also developed. This method can be used efficiently to accelerate known methods for computing generalized inverses. Key words. Moore-Penrose inverse, outer inverses, Banachiewicz-Schur form, Strassen method. AMS subject classifications. 15A09.