FM-KZ: an even simpler alphabet-independent FM-index
- Czech Technical University, Prague, 2006
Cited by 3 (1 self)
Abstract. In an earlier work [6] we presented a simple FM-index variant, based on the idea of Huffman-compressing the text and then applying the Burrows-Wheeler transform over it. The main drawback of using Huffman was its lack of synchronizing properties, forcing us to supply another bit stream indicating the Huffman codeword boundaries. In this way, the resulting index needed O(n(H_0 + 1)) bits of space but with the constant 2 (concerning the main term). There are several options aiming to mitigate the overhead in space, with various effects on the query handling speed. In this work we propose Kautz-Zeckendorf coding as a both simple and practical replacement for Huffman. We dub the new index FM-KZ. We also present an efficient implementation of the rank operation, which is the main building brick of the FM-KZ. Experimental results show that our index provides an attractive space/time tradeoff in comparison with existing succinct data structures, and in the DNA test it even wins both in search time and space use. An additional asset of our solution is its relative simplicity.
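The abstract above does not say how FM-KZ assigns codewords, so the following is only a minimal sketch of the underlying Zeckendorf representation: every positive integer is a unique sum of non-consecutive Fibonacci numbers, so its bit string never contains two adjacent 1s. The function names `zeckendorf_bits` and `zeckendorf_value` are hypothetical and are not taken from the paper.

```python
def zeckendorf_bits(n: int) -> str:
    """Greedy Zeckendorf representation of n >= 1.

    Bits are ordered from the largest Fibonacci weight down to 1,
    using the weights 1, 2, 3, 5, 8, ...
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    fibs = [1, 2]
    while fibs[-1] <= n:
        fibs.append(fibs[-1] + fibs[-2])
    bits = []
    started = False  # suppress leading zeros
    for f in reversed(fibs):
        if f <= n:
            bits.append("1")
            n -= f
            started = True
        elif started:
            bits.append("0")
    return "".join(bits)


def zeckendorf_value(bits: str) -> int:
    """Inverse of zeckendorf_bits: sum the weights of the 1 bits."""
    weight, next_weight = 1, 2
    total = 0
    for ch in reversed(bits):
        if ch == "1":
            total += weight
        weight, next_weight = next_weight, weight + next_weight
    return total
```

Because the pattern "11" never occurs inside such a representation, it is available as a boundary marker, which is the kind of synchronizing property the abstract says Huffman codes lack.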
On the Channel Capacity of Read/Write Isolated Memory
- Discrete Applied Math, 1994
Cited by 3 (0 self)
We apply graph theory to find upper and lower bounds on the channel capacity of a serial, binary, rewritable medium in which consecutive locations may not store 1's, and consecutive locations may not be altered during a single rewriting pass. If the true capacity is close to the upper bound, then a trivial code is nearly optimal.
1 Introduction
A serial, binary (0,1) memory is said to be read isolated if no two consecutive positions may store 1's; it is said to be write isolated if no two consecutive positions may be changed during rewriting. A read/write isolated memory (RWIM) is a binary, linearly ordered, rewritable storage medium obeying both restrictions.
1.1 Origin of the Problem
The first restriction alone, no consecutive 1's, is typical of magnetic recording and has recurred in optical recording. The problem was first studied by Freiman and Wyner [1], and a subcase by Kautz [2]; they showed that the capacity was 0.694... = log_2 φ bits per symbol, where φ is the larger...
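The value 0.694... quoted for the read-isolated constraint alone can be checked numerically. The sketch below counts the binary words of length n with no two consecutive 1's and compares log_2(count)/n with log_2 of the golden ratio. It covers only the first restriction, not the full RWIM capacity, and the function names are illustrative.

```python
import math


def count_no_adjacent_ones(n: int) -> int:
    """Number of binary words of length n >= 1 with no two consecutive 1's."""
    if n < 1:
        raise ValueError("n must be at least 1")
    end0, end1 = 1, 1  # words of length 1 ending in 0 and in 1
    for _ in range(n - 1):
        # A 0 may follow anything; a 1 may only follow a 0.
        end0, end1 = end0 + end1, end0
    return end0 + end1


def rate_estimate(n: int) -> float:
    """Bits per symbol achievable with words of length n."""
    return math.log2(count_no_adjacent_ones(n)) / n


GOLDEN_RATIO = (1 + math.sqrt(5)) / 2
CAPACITY = math.log2(GOLDEN_RATIO)  # limit of rate_estimate(n) as n grows
```

The counts follow the Fibonacci recurrence, which is why the growth rate per symbol tends to log_2 of the golden ratio.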
On-line multiplication in real and complex base
- Proc. IEEE ARITH 16, I.E.E.E. Computer Society Press, 2003
Cited by 2 (1 self)
Multiplication of two numbers represented in base β is shown to be computable by an on-line algorithm when β is a negative integer, a positive non-integer real number, or a complex number of the form i√r, where r is a positive integer.
Error Propagation Assessment of Enumerative Coding Schemes
- Proc. IEEE International Conference on Communications 2, 1999
Cited by 1 (0 self)
Introduction. The technique of enumerative coding [1] makes it possible to translate source words into codewords and vice versa by invoking an algorithmic procedure rather than performing the translation with a look-up table. The usage of long codewords makes it possible to approach a code rate which is arbitrarily close to Shannon's noiseless capacity of the constrained channel. The risk of extreme error propagation precluded its usage in practical systems. Single channel bit errors may result in error propagation that could corrupt the entire data in the decoded word, and, of course, the longer the codeword the greater the number of data symbols affected. This article will evaluate the effects of error propagation of enumerative coding, where it is assumed that the constrained code is used in the conventional code configuration. It will be shown that when certain measures are taken, the average error propagation can be controlled to a level which is quite acceptable for many
Codes for Self-Clocking, AC-Coupled Transmission: Aspects of Synthesis and Analysis
- IBM J. Res. Develop, 1975
Cited by 1 (0 self)
Abstract: We consider NRZI waveform codes that satisfy a given set of run-length constraints and the upper bound on the accumulated dc charge of the waveform. These constraints enable the codeword to be self-clocking, ac-coupled, and suitable for data processing tape and communication applications. Various aspects of synthesis and analysis of such codes, called (d, k, C) codes, are illustrated by means of several examples. The choice of the initial state of the encoder is shown to influence the length of the data sequence over which the encoder must look ahead.
High-Rate Maximum Runlength Constrained Coding Schemes Using Nibble Replacement
Cited by 1 (0 self)
Summary: We will present coding techniques for the character-constrained channel, where information is conveyed using q-bit characters (nibbles), where w prescribed characters are disallowed. Using codes for the character-constrained channel, we present simple and systematic constructions of high-rate binary maximum runlength constrained codes. The new constructions have the virtue that large look-up tables for encoding and decoding are not required. We will compare the error propagation performance of codes based on the new construction with that of prior art codes.
Data Synchronization with Timing
- IEEE Trans. Inform. Theory
This paper proposes and analyzes data synchronization techniques that not only resynchronize after encoded bits are corrupted by insertion, deletion or substitution errors, but also produce estimates of the time indices of the decoded data symbols, in order to determine their positions in the original source sequence. The techniques are based on block codes, and the estimates are of the time indices modulo some integer T, called the timing span, which is desired to be large. Several types of block codes that encode binary data are analyzed on the basis of the maximum attainable timing span for a given coding rate R (or, equivalently, redundancy ρ = 1 - R) and permissible resynchronization delay D. It is found that relatively simple codes can asymptotically attain the maximum timing span among such block codes, which grows exponentially with delay, with exponent D(1 - R) + o(D). Thus large timing span can be attained with little redundancy and only moderate values of delay.
Keywords: cascaded codes, comma-free codes, embedded-index codes, natural marker codes, periodic prefix-synchronized (PPS) codes, prefix-synchronized codes, synchronization delay, sync-timing codes, timing span
This work was supported by NSF Grants NCR-9415754 and CCR-9815006. Portions were published in the proceedings of the Data Compression Conference, Snowbird, Utah, Mar. 1999, and of the 1999 IEEE Information Theory Workshop, Kruger National Park, South Africa, June 1999.
Error Propagation Assessment of Enumerative Coding Schemes
, 1999
Enumerative coding is an attractive algorithmic procedure for translating long source words into codewords and vice versa. The usage of long codewords makes it possible to approach a code rate which is as close as desired to Shannon's noiseless capacity of the constrained channel. Enumerative encoding is prone to massive error propagation as a single bit error could ruin entire decoded words. This contribution will evaluate the effects of error propagation of the enumerative coding of runlength-limited sequences.
Index Terms: Enumerative coding, error propagation, RLL.
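To make the idea of translating source words into codewords "algorithmically" concrete, here is a minimal sketch of enumerative coding for one simple runlength constraint, binary words with no two adjacent 1's. It is not the scheme analyzed in the paper; the constraint and the names `word_counts`, `unrank` and `rank` are assumptions chosen for illustration. An index is mapped to the codeword of that lexicographic position using only a short table of counts instead of a full look-up table.

```python
def word_counts(n: int) -> list[int]:
    """counts[m] = number of valid words of length m (no adjacent 1's)."""
    counts = [1, 2]
    while len(counts) <= n:
        counts.append(counts[-1] + counts[-2])
    return counts


def unrank(index: int, n: int) -> list[int]:
    """Encode: map an index to the index-th valid word of length n."""
    counts = word_counts(n)
    if not 0 <= index < counts[n]:
        raise ValueError("index out of range for this word length")
    bits = []
    for i in range(n):
        zero_branch = counts[n - i - 1]  # words that put a 0 at position i
        if index >= zero_branch:
            bits.append(1)
            index -= zero_branch
        else:
            bits.append(0)
    return bits


def rank(bits: list[int]) -> int:
    """Decode: recover the index by summing the weights of the 1 bits."""
    n = len(bits)
    counts = word_counts(n)
    return sum(counts[n - i - 1] for i, b in enumerate(bits) if b)
```

A channel error that flips the first bit of a codeword changes the decoded index by `counts[n - 1]`, which can alter most bits of the binary source word. This is the error propagation effect the abstract refers to.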
On the Capacity of Precision-Resolution Constrained Systems
Abstract. Arguably, the most famous constrained system is the (d, k)-RLL (Run-Length Limited), in which a stream of bits obeys the constraint that every two 1’s are separated by at least d 0’s, and there are no more than k consecutive 0’s anywhere in the stream. The motivation for this scheme comes from the fact that certain sensor characteristics restrict the minimum time between adjacent 1’s or else the two will be merged in the receiver, while a clock drift between transmitter and receiver may cause spurious 0’s or missing 0’s at the receiver if too many appear consecutively. The interval-modulation scheme introduced by Mukhtar and Bruck extends the RLL constraint and implicitly suggests a way of taking advantage of higher-precision clocks. Their work, however, deals only with an encoder/decoder construction. In this work we introduce a more general framework which we call the precision-resolution (PR) constrained system. In PR systems, the encoder has precision constraints, while the decoder has resolution constraints. We examine the capacity of PR systems and show the gain in the presence of a high-precision encoder (thus, we place the PR system with integral encoder, (p=1,α,θ)-PR, which turns out to be a simple extension of RLL, and the PR system with infinite-precision encoder, (∞,α,θ)-PR, on two ends of a continuum). We derive an exact expression for their capacity in terms of the precision p, the minimal resolvable measurement at the decoder α, and the decoder resolution factor θ. In an analogy to the RLL terminology these are the clock precision, the minimal time between peaks, and the clock drift. Surprisingly, even with an infinite-precision encoder, the capacity is finite.
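The (d, k)-RLL constraint stated at the start of this abstract can be written as a short membership test. The sketch below follows that wording literally: at least d 0's between any two 1's, and no run of more than k 0's anywhere, including at the start and end of the stream. The function name `is_dk_rll` is illustrative, and the PR generalization itself is not modeled.

```python
def is_dk_rll(bits: str, d: int, k: int) -> bool:
    """Return True if the bit string satisfies the (d, k)-RLL constraint."""
    zero_run = 0      # length of the current run of 0's
    seen_one = False  # the minimum-gap rule only applies between two 1's
    for ch in bits:
        if ch == "0":
            zero_run += 1
            if zero_run > k:
                return False  # too many consecutive 0's
        elif ch == "1":
            if seen_one and zero_run < d:
                return False  # two 1's closer than d 0's apart
            seen_one = True
            zero_run = 0
        else:
            raise ValueError("bits must contain only '0' and '1'")
    return True
```

For example, `is_dk_rll("0100100010", 2, 7)` accepts the stream, while `is_dk_rll("0110", 1, 3)` rejects it because the two 1's are adjacent.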

