Accessing Multiple Sequences Through Set Associative Caches (1999)

by Peter Sanders
Venue: In Proc

Results 1 - 10 of 11

Fast Priority Queues for Cached Memory

by Peter Sanders - ACM Journal of Experimental Algorithmics, 1999
Cited by 46 (6 self)
This paper advocates the adaptation of external memory algorithms to this purpose. This idea and the practical issues involved are exemplified by engineering a fast priority queue suited to external memory and cached memory that is based on k-way merging. It improves previous external memory algorithms by constant factors crucial for transferring it to cached memory. Running in the cache hierarchy of a workstation, the algorithm is at least two times faster than an optimized implementation of binary heaps and 4-ary heaps for large inputs.
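The k-way merging step that the abstract names as the basis of the priority queue can be sketched as follows. This is only a minimal illustration of merging k sorted runs with a small heap of run heads; it is not the paper's data structure, and the function name `k_way_merge` is a hypothetical choice made for this sketch.

```python
import heapq

def k_way_merge(runs):
    """Merge k sorted lists into one sorted list.

    Illustrative only: a heap holds one (value, run index, position)
    entry per non-empty run, so each output element costs O(log k).
    """
    heap = [(run[0], i, 0) for i, run in enumerate(runs) if run]
    heapq.heapify(heap)
    merged = []
    while heap:
        value, i, pos = heapq.heappop(heap)
        merged.append(value)
        nxt = pos + 1
        if nxt < len(runs[i]):
            # Advance the pointer of the run we just consumed from.
            heapq.heappush(heap, (runs[i][nxt], i, nxt))
    return merged
```

Each run is read strictly left to right, which is the sequential access pattern that makes merging attractive on hierarchical memory.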

Towards a Theory of Cache-Efficient Algorithms

by Sandeep Sen, Siddhartha Chatterjee, Neeraj Dumir, 1999
Cited by 43 (3 self)
We describe a model that enables us to analyze the running time of an algorithm in a computer with a memory hierarchy with limited associativity, in terms of various cache parameters. Our model, an extension of Aggarwal and Vitter's I/O model, enables us to establish useful relationships between the cache complexity and the I/O complexity of computations. As a corollary, we obtain cache-optimal algorithms for some fundamental problems like sorting, FFT, and an important subclass of permutations in the single-level cache model. We also show that ignoring associativity concerns could lead to inferior performance, by analyzing the average-case cache behavior of mergesort. We further extend our model to multiple levels of cache with limited associativity and present optimal algorithms for matrix transpose and sorting. Our techniques may be used for systematic exploitation of the memory hierarchy starting from the algorithm design stage, and dealing with the hitherto unresolved problem of l...

Efficient Sorting Using Registers and Caches

by Lars Arge, Jeff Chase, Jeffrey S. Vitter, Rajiv Wickremesinghe - in Proceedings of the 4th Workshop on Algorithm Engineering (WAE 2000), 2000
Cited by 18 (5 self)
Modern computer systems have increasingly complex memory systems. Common machine models for algorithm analysis do not reflect many of the features...

Adapting Radix Sort to the Memory Hierarchy

by Naila Rahman, Rajeev Raman - In ALENEX, Workshop on Algorithm Engineering and Experimentation, 2000
Cited by 13 (2 self)
... this paper, we focus on one such: the integer sorting algorithm least significant bit (LSB) radix sort. LSB radix sort sorts w-bit integer keys with an r-bit radix in O(⌈w/r⌉(n+2 ...
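For readers unfamiliar with LSB radix sort, a minimal textbook sketch follows. It is not the cache-adapted variant the paper develops; the function name `lsb_radix_sort` and the defaults `w=32`, `r=8` are assumptions made for illustration. The sketch makes ⌈w/r⌉ passes, each of which scans the n keys and a table of 2^r counters.

```python
def lsb_radix_sort(keys, w=32, r=8):
    """Sort non-negative integers below 2**w with an r-bit radix.

    Illustrative only: one stable counting-sort pass per r-bit digit,
    starting from the least significant digit.
    """
    keys = list(keys)              # do not modify the caller's list
    mask = (1 << r) - 1
    passes = -(-w // r)            # ceiling of w / r
    for p in range(passes):
        shift = p * r
        counts = [0] * (1 << r)
        for k in keys:
            counts[(k >> shift) & mask] += 1
        # Turn digit counts into starting offsets (exclusive prefix sums).
        total = 0
        for d in range(1 << r):
            counts[d], total = total, total + counts[d]
        out = [0] * len(keys)
        for k in keys:             # stable placement by current digit
            d = (k >> shift) & mask
            out[counts[d]] = k
            counts[d] += 1
        keys = out
    return keys
```

The placement loop writes to up to 2^r different output positions in an unpredictable order, which is the kind of access pattern that interacts with the memory hierarchy.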

Efficient sorting using registers and caches

by Rajiv Wickremesinghe, Lars Arge, Jeff Chase, Jeffrey Scott Vitter - WAE, Workshop on Algorithm Engineering, Lecture Notes in Computer Science, 2000
Cited by 7 (0 self)
Modern computer systems have increasingly complex memory systems. Common machine models for algorithm analysis do not reflect many of the features of these systems, e.g., large register sets, lockup-free caches, cache hierarchies, associativity, cache line fetching, and streaming behavior. Inadequate models lead to poor algorithmic choices and an incomplete understanding of algorithm behavior on real machines. A key step toward developing better models is to quantify the performance effects of features not reflected in the models. This paper explores the effect of memory system features on sorting performance. We introduce a new cache-conscious sorting algorithm, R-merge, which achieves better performance in practice over algorithms that are superior in the theoretical models. R-merge is designed to minimize memory stall cycles rather than cache misses by considering features common to many system designs.

Algorithm Engineering for Parallel Computation

by David A. Bader, Bernard M. E. Moret, Peter Sanders, 2002
Cited by 7 (4 self)
Abstract not found

Scanning Multiple Sequences Via Cache Memory

by Kurt Mehlhorn, Peter Sanders - Algorithmica, 2003
Cited by 5 (0 self)
We consider the simple problem of scanning multiple sequences. There are k sequences of total length N which are to be scanned concurrently. One pointer into each sequence is maintained and an adversary specifies which pointer is to be advanced. The concept of scanning multiple sequences is ubiquitous in algorithms designed for hierarchical memory.
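The scanning problem described above can be explored with a small simulation. The sketch below assumes an LRU replacement policy and made-up cache parameters (`line_size`, `num_sets`, `ways`); none of these come from the abstract, and the function name `count_misses` is hypothetical.

```python
from collections import OrderedDict

def count_misses(starts, order, line_size=8, num_sets=4, ways=2):
    """Count cache misses while scanning several sequences.

    starts: start address of each sequence (one pointer per sequence).
    order:  for each step, the index of the pointer to advance.
    Illustrative only: a set-associative cache with LRU replacement
    (an assumption of this sketch).
    """
    sets = [OrderedDict() for _ in range(num_sets)]
    pointers = list(starts)
    misses = 0
    for s in order:
        line = pointers[s] // line_size
        cache_set = sets[line % num_sets]
        if line in cache_set:
            cache_set.move_to_end(line)        # mark as most recently used
        else:
            misses += 1
            if len(cache_set) >= ways:
                cache_set.popitem(last=False)  # evict least recently used
            cache_set[line] = True
        pointers[s] += 1
    return misses
```

With three sequences whose current lines fall into the same set, a round-robin order misses on every access when `ways=2`, but only on the first touch of each line when `ways=3`.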

Random Arc Allocation and Applications to Disks, Drums and DRAMs

by Peter Sanders, Berthold Vöcking, 2001
Cited by 1 (0 self)
The paper considers a generalization of the well-known random placement of balls into bins.

Tail bounds and expectations for random arc allocation and applications

by Peter Sanders, Berthold Vöcking - Combinatorics, Probability and Computing
Cited by 1 (0 self)
The paper considers a generalization of the well-known random placement of balls into bins. Given n circular arcs of lengths α_i, 0 ≤ i < n, we study the maximum number of overlapping arcs on a circle if the starting points of the arcs are chosen randomly. We give almost exact tail bounds on the maximum overlap of the arcs. These tail bounds yield a complete characterization of the expected maximum overlap that is tight up to constant factors in the lower order terms. We illustrate the strength of our results by presenting new performance guarantees for several applications: minimizing rotational delays of disks, scheduling accesses to parallel disks, and allocating memory to limit cache interference misses.
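The quantity studied in the abstract, the maximum number of overlapping arcs, can be measured empirically with a short sweep. The sketch below assumes a circle of circumference 1, half-open arcs, and arc lengths of at most 1; the function names `max_overlap` and `random_max_overlap` are hypothetical and nothing here reproduces the paper's bounds.

```python
import random

def max_overlap(starts, lengths):
    """Maximum number of arcs covering a common point of the unit circle.

    Each arc covers [start, start + length) modulo 1, with
    0 <= start < 1 and 0 < length <= 1 (assumptions of this sketch).
    """
    events = []
    for s, a in zip(starts, lengths):
        end = s + a
        events.append((s, 1))
        if end <= 1.0:
            events.append((end, -1))
        else:
            # The arc wraps around: it also covers [0, end - 1).
            events.append((0.0, 1))
            events.append((end - 1.0, -1))
    # At equal positions, -1 sorts before +1, matching half-open arcs.
    events.sort()
    current = best = 0
    for _, delta in events:
        current += delta
        best = max(best, current)
    return best

def random_max_overlap(lengths, rng=random):
    """Place every arc at a uniformly random start and measure the overlap."""
    starts = [rng.random() for _ in lengths]
    return max_overlap(starts, lengths)
```

Repeating `random_max_overlap` over many trials gives an empirical distribution of the maximum overlap for a given list of arc lengths.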

Tail bounds and expectations for random arc allocation and applications

by Peter Sanders, Berthold Vöcking - Combinatorics, Probability and Computing, 2002
Cited by 1 (0 self)
The paper considers a generalization of the well-known random placement of balls into bins. Given n circular arcs of lengths α_i, 0 ≤ i < n, we study the maximum number of overlapping arcs on a circle if the starting points of the arcs are chosen randomly. We give almost exact tail bounds on the maximum overlap of the arcs. These tail bounds yield a complete characterization of the expected maximum overlap that is tight up to constant factors in the lower order terms. We illustrate the strength of our results by presenting new performance guarantees for several applications: minimizing rotational delays of disks, scheduling accesses to parallel disks, and allocating memory to limit cache interference misses.