## The SBC-Tree: An Index for Run-Length Compressed Sequences

### BibTeX

@MISC{Eltabakh_thesbc-tree:,
    author = {Mohamed Y. Eltabakh and Wing-kai Hon and Rahul Shah and Walid G. Aref and Jeffrey S. Vitter},
    title = {The SBC-Tree: An Index for Run-Length Compressed Sequences},
    year = {}
}

### Abstract

Run-Length Encoding (RLE) is a data compression technique used in various applications, e.g., time series, biological sequences, and multimedia databases. One of the main challenges is how to operate on (e.g., index, search, and retrieve) compressed data without decompressing it. In this paper, we introduce the String B-tree for Compressed sequences, termed the SBC-tree, for indexing and searching RLE-compressed sequences of arbitrary length. The SBC-tree is a two-level index structure based on the well-known String B-tree and a 3-sided range query structure [7]. The SBC-tree supports pattern matching queries such as substring matching, prefix matching, and range search operations over RLE-compressed sequences. The SBC-tree has an optimal external-memory space complexity of O(N/B) pages, where N is the total length of the compressed sequences and B is the disk page size. Substring matching, prefix matching, and range search execute in an optimal O(log_B N + |p| + T) I/O operations, where |p| is the