## Speeding up External Mergesort

Venue: | IEEE Transactions on Knowledge and Data Engineering |

Citations: | 21 - 0 self |

### BibTeX

@ARTICLE{Zheng_speedingup,

author = {Luoquan Zheng and Per-Åke Larson},

title = {Speeding up External Mergesort},

journal = {IEEE Transactions on Knowledge and Data Engineering},

year = {},

volume = {8}

}

### Abstract

External mergesort is normally implemented so that each run is stored contiguously on disk and blocks of data are read exactly in the order they are needed during merging. We investigate two ideas for improving the performance of external mergesort: interleaved layout and a new reading strategy. Interleaved layout places blocks from different runs in consecutive disk addresses. This is done in the hope that interleaving will reduce seek overhead during merging. The new reading strategy precomputes the order in which data blocks are to be read according to where they are located on disk and when they are needed for merging. Extra buffer space makes it possible to read blocks in an order that reduces seek overhead, instead of reading them exactly in the order they are needed for merging. A detailed simulation model was used to compare the two layout strategies and three reading strategies. The effects of using multiple work disks were also investigated. We found that, in most cases, inte...
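The two layouts described in the abstract can be compared on toy data with a minimal sketch. Everything named here (`need_order`, `contiguous_address`, `interleaved_address`, `count_seeks`) is hypothetical and not taken from the paper. The sketch assumes equal-length runs, one disk, and that any read of a non-adjacent block costs exactly one seek, which is far cruder than the detailed simulation model the authors describe. It also reads blocks strictly in the order the merge needs them, so it does not model the paper's new reading strategy.

```python
from typing import Callable, List, Tuple

Block = Tuple[int, int]  # (run index, block index within the run)


def need_order(runs: List[List[List[int]]]) -> List[Block]:
    """Order in which a merge first needs each block.

    Assumption: block j of a run is needed once block j-1 of that run is
    exhausted, i.e. once the last key of block j-1 has been output.
    """
    keyed = []
    for r, blocks in enumerate(runs):
        for j in range(len(blocks)):
            # First blocks are needed immediately; later blocks are
            # triggered by the last key of their predecessor.
            trigger = float("-inf") if j == 0 else blocks[j - 1][-1]
            keyed.append((trigger, r, j))
    keyed.sort()
    return [(r, j) for _, r, j in keyed]


def contiguous_address(block: Block, blocks_per_run: int, num_runs: int) -> int:
    """Each run occupies one contiguous range of disk addresses."""
    r, j = block
    return r * blocks_per_run + j


def interleaved_address(block: Block, blocks_per_run: int, num_runs: int) -> int:
    """Block j of every run is stored side by side (round-robin over runs)."""
    r, j = block
    return j * num_runs + r


def count_seeks(
    order: List[Block],
    address_of: Callable[[Block, int, int], int],
    blocks_per_run: int,
    num_runs: int,
) -> int:
    """Count reads that do not continue from the previously read address."""
    seeks = 0
    prev = None
    for block in order:
        addr = address_of(block, blocks_per_run, num_runs)
        if prev is None or addr != prev + 1:
            seeks += 1
        prev = addr
    return seeks
```

For two runs whose keys alternate evenly, such as `[[1, 3], [5, 7]]` and `[[2, 4], [6, 8]]`, this toy model counts four seeks for the contiguous layout and one for the interleaved layout. Other key distributions can show no difference at all, so the sketch illustrates the mechanism rather than the paper's measured results.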

### Citations

4375 | Computer Architecture: A Quantitative Approach - Hennessy, Patterson - 1996 |

563 |
The input/output complexity of sorting and related problems
- Aggarwal, Vitter
- 1988
Citation Context ...parate partition. Combining the outputs of all the processors gives the final results. Quinn [12], Iyer et al. [7] and Varman et al. [15] investigated algorithms for partitioning. Aggarwal and Vitter [1] showed that mergesort is an optimal external sorting method (up to a constant factor) in the total number of I/O operations required. They also studied the use of P disks to obtain I/O concurrency. H... |

139 |
The Art of Computer Programming, Volume 3
- Knuth
- 1973
Citation Context ...n. Files are often maintained sorted on a key attribute in order to facilitate searching and processing. External sorting refers to sorting more data than can be held in memory at one time. Mergesort [8] is the algorithm most commonly used for external sorting. Mergesort consists of two phases: run formation and merging. In the run formation phase, the data to be sorted is divided into smaller sorted se... |

90 |
Parallel Sorting Algorithms
- Akl
- 1985
Citation Context ...nput buffers and to have two input buffers per run. Since sorting is such a time consuming task, exploiting parallelism is a natural next step. A good introduction to parallel sorting can be found in [2] and [5]. One of the strategies proposed to improve the performance of the merge phase is to have many processors organized into a tree. Each leaf processor merges a portion of the initial runs and pa... |

17 |
Sorting Large Files on a Backend Multiprocessor. Technical Report 86-741, Department
- Beck, Bitton, et al.
- 1986
Citation Context ...is obtained at the root. Parallelism is exploited both through pipelining merge steps between levels of the tree, and through concurrent merging performed by processors on the same level. Beck et al. [4] implemented this idea on a number of backend processors. Due to hardware limitations, very small buffers (1Kb) were used. They also studied different layout strategies. However, their findings are he... |

15 |
FastSort: A Distributed Single-Input Single-Output External Sort
- Salzberg, Tsukerman, et al.
- 1990
Citation Context ...ue to hardware limitations, very small buffers (1Kb) were used. They also studied different layout strategies. However, their findings are heavily influenced by the small buffer size. Salzberg et al. [14] also studied this idea applied to a network of loosely-coupled processors where each processor has local disks and a large amount of main memory. They recommend using enough memory and processors so ... |

11 |
Prefetching with multiple disks for external mergesort: simulation and analysis
- Pai, Varman
- 1992
Citation Context ...e number of I/O operations, it transfers the whole set of data in and out of main memory again. A more practical approach is to use more buffer space to hold the data blocks in memory. Pai and Varman [11] used a Markov chain to model prefetching in a multiple-disk environment. Only the merge phase is studied. Their study is mostly analytical, and the number of I/O operations is the only cost measured.... |

9 |
Parallel sorting algorithms for tightly coupled multiprocessors
- Quinn
- 1988
Citation Context ...se is to split each sorted run into range-disjoint partitions, and have each processor merge data from a separate partition. Combining the outputs of all the processors gives the final results. Quinn [12], Iyer et al. [7] and Varman et al. [15] investigated algorithms for partitioning. Aggarwal and Vitter [1] showed that mergesort is an optimal external sorting method (up to a constant factor) in the ... |

8 |
Analysis and Implementation of Parallel External Sorting Algorithms
- Bitton, Design
- 1981
Citation Context ...fers and to have two input buffers per run. Since sorting is such a time consuming task, exploiting parallelism is a natural next step. A good introduction to parallel sorting can be found in [2] and [5]. One of the strategies proposed to improve the performance of the merge phase is to have many processors organized into a tree. Each leaf processor merges a portion of the initial runs and passes the... |

8 |
Greed Sort: An Optimal External Sorting Algorithm for Multiple Disks
- Nodine, Vitter
- 1990
Citation Context ...itter and Shriver [17] extended the analysis to the case where the P blocks in a parallel operation are stored on different disks. Under this model, an optimal deterministic algorithm is presented in [10]. The basic idea of the algorithm is to distribute the data blocks from each run over the P disks and, during the merge phase, the block that is the earliest to be used in the merge process from each ... |

7 |
The I/O Performance of Multiway Mergesort and Tag Sort
- Kwan, Baer
- 1985
Citation Context ... less on external sorting. Early studies of external sorting focused on using tape as the secondary storage device. Knuth [8] provides extensive coverage of the fundamentals of sorting. Kwan and Baer [9] studied the I/O performance of k-way mergesort. Their disk model assumes that seek time is proportional to the seek distance and rotational latency of half a revolution is charged to every disk acces... |

7 |
Merging Multiple Lists on Hierarchical-Memory Multiprocessors
- Varman, Scheufler, et al.
Citation Context ...e-disjoint partitions, and have each processor merge data from a separate partition. Combining the outputs of all the processors gives the final results. Quinn [12], Iyer et al. [7] and Varman et al. [15] investigated algorithms for partitioning. Aggarwal and Vitter [1] showed that mergesort is an optimal external sorting method (up to a constant factor) in the total number of I/O operations required.... |

4 |
Performance comparison of distributive and mergesort as external sorting algorithms
- Verkamo
- 1989
Citation Context ...onal latency together dominate the transfer time. Their conclusion is that minimizing the number of merge passes by selecting k as large as possible does not always give the best performance. Verkamo [16] compared the performance of mergesort and distributive sorting. He also pointed out that the internal sort phase and the external merge phase need not use the same amount of memory. A space-time inte... |

3 |
Merging Sorted Runs Using Large
- Salzberg
- 1989
Citation Context ...le system. Furthermore, we assume that the sort can be completed in a single merge pass. Given today's main memory sizes and the use of disks, it is hardly ever necessary to use multiple merge passes [13]. The rest of the paper is organized as follows. The next section gives a brief summary of previous work of direct relevance to this study. In Section 3, the traditional mergesort algorithm is analyze... |

1 |
Percentile Finding Algorithm for Multiple Sorted Runs
- Iyer, Ricard, Varman
- 1989
Citation Context ...h sorted run into range-disjoint partitions, and have each processor merge data from a separate partition. Combining the outputs of all the processors gives the final results. Quinn [12], Iyer et al. [7] and Varman et al. [15] investigated algorithms for partitioning. Aggarwal and Vitter [1] showed that mergesort is an optimal external sorting method (up to a constant factor) in the total number of I... |

1 |
Speeding up External Mergesort, Master's thesis, University of Waterloo
- Zheng
Citation Context ...R p := M i ; endfor ; Figure 1: Heuristic algorithm for computing read schedules. We actually tested three different heuristic algorithms. The one in Figure 1 consistently outperformed the other two ([18]). The algorithm attempts to minimize the number of seeks, but does not take into account seek distance or rotational delay. M i is moved to an earlier position if it can be read together with some ot... |