## Robust Universal Complete Codes for Transmission and Compression (1996)

Venue: Discrete Applied Mathematics

Citations: 10 (4 self)

### BibTeX

```bibtex
@ARTICLE{Fraenkel96robustuniversal,
  author  = {Aviezri S. Fraenkel and Shmuel T. Klein},
  title   = {Robust Universal Complete Codes for Transmission and Compression},
  journal = {Discrete Applied Mathematics},
  year    = {1996},
  volume  = {64},
  pages   = {31--55}
}
```

### Abstract

Several measures are defined and investigated, which allow the comparison of codes as to their robustness against errors. Then new universal and complete sequences of variable-length codewords are proposed, based on representing the integers in a binary Fibonacci numeration system. Each sequence is constant and need not be generated for every probability distribution. These codes can be used as alternatives to Huffman codes when the optimal compression of the latter is not required, and simplicity, faster processing and robustness are preferred. The codes are compared on several "real-life" examples.

1. Motivation and Introduction. Let A = {A_1, A_2, ..., A_n} be a finite set of elements, called cleartext elements, to be encoded by a static uniquely decipherable (UD) code. For notational ease, we use the term `code' as abbreviation for `set of codewords'; the corresponding encoding and decoding algorithms are always either given or clear from the context. A code i...

### Citations

1141 | A universal algorithm for sequential data compression - Ziv, Lempel - 1977

Citation Context: ...ments to the code is fixed during the encoding of the text [23]. In this paper we restrict attention to static codes, thus excluding adaptive methods [26], and in particular the popular LZ techniques [28], [29]. Let p_i be the probability of occurrence of the element A_i. The elements can be single characters, pairs, triplets or any m-gram of characters, they can represent words of a natural language...

947 | A Method for the construction of minimum redundancy codes - Huffman - 1952

Citation Context: ...ion efficiency. If l_i is the length in bits of the binary codeword chosen to represent A_i, it is well known that the weighted average length of a codeword, ∑ p_i l_i, is minimized using Huffman's [18] procedure. However, Huffman codes are extremely error sensitive: a single wrong bit may render the tail of the encoded message following the error useless. As to (ii), a new set of codewords must be ...

731 | Compression of individual sequences via variable-rate coding - Ziv, Lempel - 1978

Citation Context: ...to the code is fixed during the encoding of the text [23]. In this paper we restrict attention to static codes, thus excluding adaptive methods [26], and in particular the popular LZ techniques [28], [29]. Let p_i be the probability of occurrence of the element A_i. The elements can be single characters, pairs, triplets or any m-gram of characters, they can represent words of a natural language, they...

617 | Text Compression - Bell, Cleary, et al. - 1990

347 | Universal Codeword Sets and Representations of the Integers - Elias - 1975

Citation Context: ...101100111001110011111001110011110011100111...; and this example can be extended arbitrarily. Thus SF(L_n(σ)) is not bounded, when the number of codewords tends to infinity. In Elias [6], a code R = {r_1, r_2, ...} is proposed which encodes the cleartext element A_i by a logarithmic ramp representation of the integer i. The first element r_1 is 0. Let B(x) denote the standard bin...
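The logarithmic-ramp construction quoted in this context is easy to prototype. The sketch below is our own illustration (the function name `elias_omega` is an assumption, not from the paper): starting from the terminating 0, each step prepends the standard binary representation B(n) and then encodes its length recursively, with the integer 1 encoded as just `0`.

```python
def elias_omega(n: int) -> str:
    """Logarithmic-ramp ("omega"-style) encoding of a positive integer n.

    The codeword is built back to front: start with the terminating '0',
    then repeatedly prepend B(n), the standard binary representation of n,
    and replace n by len(B(n)) - 1, until n reaches 1 (encoded as '0').
    """
    assert n >= 1
    code = "0"              # every codeword ends with a 0
    while n > 1:
        b = bin(n)[2:]      # B(n), which always starts with a 1
        code = b + code
        n = len(b) - 1      # the next ramp step encodes this length
    return code

# First few codewords of the ramp:
# 1 -> "0", 2 -> "100", 3 -> "110", 4 -> "101000"
```

Each group except the last starts with a 1, so a decoder can read group lengths off the ramp until it meets the terminating 0.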

205 | Theory of Codes - Berstel, Perrin - 1985

Citation Context: ...σ only as suffix is called the set generated by σ, and will be denoted L(σ). Note that we have adjoined σ itself to the code defined by Lakshmanan, in order to get better compression. In Berstel & Perrin [3], L(σ) is called a semaphore code. Various choices of σ are investigated in [13]. Gilbert conjectured that the number G(N) of possible codewords of length N can be maximized by choosing a prefix of the f...

158 | Data Compression: Methods and Theory - Storer - 1988

Citation Context: ...ression efficiency. In Section 5, the codes are compared numerically on various probability distributions of "real-life" alphabets. The broad area of data compression has been ably reviewed in Storer [25] and in Lelewer and Hirschberg [23], and more recently in Williams [26] and Bell, Cleary & Witten [2]; thus we refrain from giving a review here, and cite only those works connected to the present inv...

126 | Information Retrieval: Computational and Theoretical Aspects - Heaps - 1978

Citation Context: ...ity of another variant for the given distribution. The first example is the distribution of the 26 characters in an English text of 100,000 words chosen from many different sources, as given by Heaps [16]. In Table 2, the letters are listed in decreasing probability of occurrence, together with their Huffman code, C^1, C^2 and C^3 codes. For the Huffman code, the codewords for the letters L and K are...

57 | Représentation des nombres naturels par une somme de nombres de Fibonacci ou de nombres de Lucas - Zeckendorf - 1972

Citation Context: ...w family of codes depends only on the number of items to be encoded and the ordering of their frequencies, not on their exact distribution, and is based on the binary Fibonacci numeration system (see [27]). The corresponding coding algorithms are very simple. Our paper is related to [1], where various representations of the integers, based on Fibonacci numbers of order m ≥ 2, are investigated, with an a...
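The numeration system cited here rests on Zeckendorf's theorem: every positive integer has a unique representation as a sum of non-consecutive Fibonacci numbers, found greedily. A minimal sketch of that decomposition (our own illustration, not code from the paper):

```python
def zeckendorf(n: int) -> list:
    """Greedy Zeckendorf decomposition of n >= 1 into a sum of
    non-consecutive Fibonacci numbers (F(2)=1, F(3)=2, F(4)=3, ...).
    The greedy choice forces non-adjacency: after taking the largest
    Fibonacci number f <= n, the remainder is smaller than f's
    predecessor, so the next summand skips at least one position."""
    assert n >= 1
    fibs = [1, 2]
    while fibs[-1] <= n:
        fibs.append(fibs[-1] + fibs[-2])
    parts = []
    for f in reversed(fibs):    # largest first
        if f <= n:
            parts.append(f)
            n -= f
    return parts

# zeckendorf(11) -> [8, 3]; zeckendorf(100) -> [89, 8, 3]
```

Writing a 1-digit for each Fibonacci number used and a 0-digit otherwise yields the binary Fibonacci numeral that the proposed codes are built from.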

56 | Systems of Numeration - Fraenkel - 1985

Citation Context: ...a^(m)_{i-1} + a^(m)_{i-2} for i > 1, for any fixed positive integer m (m = 1 is the Fibonacci case). The resulting codes are (m + 1)-ary codes, and their properties have been investigated by Fraenkel [10], [11]. 2. Robustness. When reliable transmission of a message is needed, error-correcting codes may be used. Often, however, we don't care about single (e.g., transmission or typing) errors, as long ...

44 | Self-synchronizing Huffman codes - Ferguson, Rabinowitz - 1984

Citation Context: ... might be used to respond to our intuitive notion of robustness, since even among those error-sensitive codes, there are some which are more robust than others. For instance, in Ferguson & Rabinowitz [8], a method is proposed for certain classes of probability distributions, yielding Huffman codes which are self-synchronizing in a probabilistic sense: each code contains a so-called synchronizing code...

42 | Two inequalities implied by unique decipherability - McMillan

Citation Context: ... code which is not complete can be extended by adjoining more codewords, thus forming a sequence with better compression capabilities. Every UD code C with codeword lengths l_i satisfies the McMillan [24] inequality: ∑_i 2^(-l_i) ≤ 1. Thus a sufficient condition for the completeness of C is ∑_i 2^(-l_i) = 1. In [22], recurrence relations are developed, giving for every fixed σ the number b_r(σ) of e...
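The completeness condition quoted in this context can be checked mechanically from the codeword lengths alone. A small helper (our own sketch; exact rationals avoid floating-point noise):

```python
from fractions import Fraction

def kraft_sum(lengths):
    """McMillan sum  sum_i 2^(-l_i)  over a list of codeword lengths.

    A value <= 1 is necessary for unique decipherability; equality
    means the code is complete: no further codeword can be adjoined
    without destroying unique decipherability."""
    return sum(Fraction(1, 2 ** l) for l in lengths)

# The complete prefix code {0, 10, 11} has lengths [1, 2, 2], sum 1.
# Dropping a codeword leaves slack: lengths [1, 2] sum to 3/4 < 1,
# so another codeword could still be adjoined.
```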

39 | Adaptive Data Compression - Williams - 1991

Citation Context: ...tatic if the mapping from the set of cleartext elements to the code is fixed during the encoding of the text [23]. In this paper we restrict attention to static codes, thus excluding adaptive methods [26], and in particular the popular LZ techniques [28], [29]. Let p_i be the probability of occurrence of the element A_i. The elements can be single characters, pairs, triplets or any m-gram of characte...

36 | Fibonacci codes for synchronization control - Kautz - 1965

Citation Context: ...01011, 000011, ...}, and the length of C^1_i by l^1_i. The sequence C^1 is one of the possible orderings of L(11). The properties of (generalized) Fibonacci numeration systems were used by Kautz [21] for synchronization control; some fixed-length codes were devised which satisfy the condition that every codeword contains no string of m or more consecutive 1's, for some fixed m ≥ 2. The code C^1 ext...
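The codeword C^1_i can be obtained by writing i in the binary Fibonacci numeration system, least significant digit first, and appending an extra 1, which creates the terminating pair 11. A sketch of this construction (our own code, following the description in the context; the function name is an assumption):

```python
def fibonacci_codeword(i: int) -> str:
    """Codeword C^1_i: the Fibonacci (Zeckendorf) representation of i,
    written least-significant digit first, followed by an extra 1.

    Every codeword thus ends in 11, and 11 occurs nowhere else in it,
    because a Zeckendorf representation has no two adjacent 1-digits."""
    assert i >= 1
    fibs = [1, 2]                      # F(2), F(3), ...
    while fibs[-1] <= i:
        fibs.append(fibs[-1] + fibs[-2])
    digits = []
    for f in reversed(fibs[:-1]):      # most significant digit first
        if f <= i:
            digits.append("1")
            i -= f
        else:
            digits.append("0")
    lsb_first = "".join(digits).lstrip("0")[::-1]
    return lsb_first + "1"

# fibonacci_codeword(1) -> "11", (2) -> "011", (3) -> "0011",
# (5) -> "00011", (7) -> "01011"
```

Since 11 appears only as a suffix, the pair 11 acts as a comma: a decoder scans for it to delimit codewords, which is what makes the code self-synchronizing after an error.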

31 | Efficient decoding of prefix codes - Hirschberg, Lelewer - 1990

Citation Context: ...hat we refer here only to the straightforward approach to the decoding of Huffman codes. In certain cases, more sophisticated data structures may be used, which yield more efficient algorithms, as in [17] or [5]. In this section, we study the code L(σ) for the special case σ = 11 and show that such a mapping exists, because the code is related to the binary Fibonacci numeration system. This relation has ...

29 | Robust transmission of unbounded strings using Fibonacci representations - Apostolico, Fraenkel - 1987

Citation Context: ... of their frequencies, not on their exact distribution, and is based on the binary Fibonacci numeration system (see [27]). The corresponding coding algorithms are very simple. Our paper is related to [1], where various representations of the integers, based on Fibonacci numbers of order m ≥ 2, are investigated, with an application to the transmission of unbounded strings. In the present work we assume ...

22 | Synchronization of binary messages - Gilbert - 1960

Citation Context: ... one codeword is lost. For U_n, the first n codewords of U, the average codeword length is L = ∑_{i=1}^n i·p_i, thus we get from (1) SF(U_n) = (1/L) ∑_{i=1}^n p_i((i − 1) + 2) = 1 + 1/L. In Gilbert [13], the following method for generating block-codes of length N is proposed. These are also called prefix-synchronized codes [14], which are special cases of comma-free codes (see e.g. [20]): fix any bi...

16 | Novel Compression of Sparse Bit-Strings: Preliminary Report - Fraenkel, Klein - 1985

Citation Context: ...ly different nature, provided that there is an unambiguous way to decompose a file into a sequence of these items, in such a way that the file can be reconstructed from this sequence (see for example [12]). We thus think also of applications where n, the size of A, can be large relative to the size of a standard alphabet. Several criteria may govern the choice of a code. We shall concentrate on the fo...

15 | All about the Responsa retrieval project – what you always wanted to know but were afraid to ask - Fraenkel - 1976

Citation Context: ...e is still an open problem. The second example is the distribution of 30 Hebrew letters (including two kinds of apostrophes and blank) as computed from the data base of the Responsa Retrieval Project [9] of about 40 million Hebrew and Aramaic words. Using the method presented in [8], we constructed a Huffman code for this alphabet with one synchronizing codeword, which appeared with probability 0.003...

14 | Economical encoding of commas between strings - Even, Rodeh - 1978

Citation Context: ...odeword does not start where it should, and such an error can propagate indefinitely, so that SF(R) is not bounded. The same result holds for a similar logarithmic ramp code discussed in Even & Rodeh [7]. Finally, for a Huffman code H, an error may be self-correcting after a few codewords, even if it is not a fixed-length code (see Bookstein & Klein [4]). Nevertheless, it is easy to construct arbitra...

8 | Recent results in comma-free codes - Jiggs - 1963

Citation Context: ...: In Gilbert [13], the following method for generating block-codes of length N is proposed. These are also called prefix-synchronized codes [14], which are special cases of comma-free codes (see e.g. [20]): fix any binary pattern σ of k < N bits and consider the set of all strings of the form y = σx, where x is a binary string of length N − k such that the pattern σ occurs in σx only as prefix and suff...

7 | Huffman coding in bit-vector compression - Jakobsson - 1978

Citation Context: ...et with one synchronizing codeword, which appeared with probability 0.0035. The third example is of a different kind. A large sparse bit-vector may be compressed in the following way (see for example [19]): the vector is partitioned into k-bit blocks, then the 2^k possible block-patterns are assigned Huffman (or other) codes according to their probability of occurrence. The statistics were collected f...
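The block scheme described in this context is straightforward to prototype. The sketch below (our own illustration; names and the use of Python's `heapq` are our choices) only computes a Huffman codeword length per block pattern, which is all that is needed to estimate the compressed size; it is not a full encoder.

```python
import heapq
from collections import Counter

def block_code_lengths(bitvec: str, k: int) -> dict:
    """Partition a bit-string into k-bit blocks (last block zero-padded),
    count the occurring block patterns, and return a Huffman codeword
    length for each pattern.  For a sparse vector the all-zero block
    dominates and so receives a very short codeword."""
    blocks = [bitvec[i:i + k].ljust(k, "0") for i in range(0, len(bitvec), k)]
    freq = Counter(blocks)
    if len(freq) == 1:                  # degenerate: a single pattern
        return {next(iter(freq)): 1}
    # Standard Huffman construction: repeatedly merge the two least
    # frequent groups; each merge deepens every pattern it contains by one.
    heap = [(f, j, [p]) for j, (p, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    depth = dict.fromkeys(freq, 0)
    tiebreak = len(heap)                # unique key so tuples stay comparable
    while len(heap) > 1:
        f1, _, ps1 = heapq.heappop(heap)
        f2, _, ps2 = heapq.heappop(heap)
        for p in ps1 + ps2:
            depth[p] += 1
        heapq.heappush(heap, (f1 + f2, tiebreak, ps1 + ps2))
        tiebreak += 1
    return depth
```

On a 64-bit vector with a single 1-bit and k = 8, only two patterns occur (the all-zero block and the block holding the 1), so both get 1-bit codewords; richer inputs yield the usual frequency-skewed lengths.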

5 | Efficient Variants of Huffman Codes - Choueka, Klein, et al. - 1985

Citation Context: ...efer here only to the straightforward approach to the decoding of Huffman codes. In certain cases, more sophisticated data structures may be used, which yield more efficient algorithms, as in [17] or [5]. In this section, we study the code L(σ) for the special case σ = 11 and show that such a mapping exists, because the code is related to the binary Fibonacci numeration system. This relation has not bee...

3 | Is Huffman coding dead? (Computing 50) - Bookstein, Klein - 1993

Citation Context: ...logarithmic ramp code discussed in Even & Rodeh [7]. Finally, for a Huffman code H, an error may be self-correcting after a few codewords, even if it is not a fixed-length code (see Bookstein & Klein [4]). Nevertheless, it is easy to construct arbitrarily long sequences of codewords which are scrambled by a single error, so that SF(H) is not bounded, when the number of encoded cleartext elements grow...

3 | The use and usefulness of numeration systems - Fraenkel - 1989

Citation Context: ...a^(m)_{i-1} + a^(m)_{i-2} for i > 1, for any fixed positive integer m (m = 1 is the Fibonacci case). The resulting codes are (m + 1)-ary codes, and their properties have been investigated by Fraenkel [10], [11]. 2. Robustness. When reliable transmission of a message is needed, error-correcting codes may be used. Often, however, we don't care about single (e.g., transmission or typing) errors, as long as the...

3 | On universal codeword sets - Lakshmanan - 1981

Citation Context: ...and suffix. This allows the receiver of an encoded message to resynchronize (e.g. after a transmission error) by looking for the next appearance of the pattern σ. Another variant appears in Lakshmanan [22], who studied variable-length codes. As he did not consider the above synchronization problem, but was interested mainly in UD codes, he defined the set of strings of the form y = xσ (now σ occurs as suf...

1 | Maximal prefix-synchronized codes - Guibas, Odlyzko - 1978

Citation Context: ... (1) SF(U_n) = (1/L) ∑_{i=1}^n p_i((i − 1) + 2) = 1 + 1/L. In Gilbert [13], the following method for generating block-codes of length N is proposed. These are also called prefix-synchronized codes [14], which are special cases of comma-free codes (see e.g. [20]): fix any binary pattern σ of k < N bits and consider the set of all strings of the form y = σx, where x is a binary string of length N − ...

1 | Coding and Information Theory, 2nd edition - Hamming - 1986

Citation Context: ...-bit blocks. Insertion and deletion errors however have the same devastating effect as for fixed-length codes. We thus consider in this sub-section only substitution errors, as is done for example in [15], and define a new sensitivity factor SF″ similar to SF, but with this restricted interpretation of the word "error". Clearly, SF″(C) ≤ SF(C) for any code C. The parameter m can often be chosen so...