## Efficient Optimal Recompression (1997)

Citations: 4 (0 self)

### BibTeX

@MISC{Klein97efficientoptimal,

author = {Shmuel T. Klein},

title = {Efficient Optimal Recompression},

year = {1997}

}

### Citations

9061 | Introduction to Algorithms
- Cormen, Leiserson, et al.
- 2001
Citation Context ...vertex n + 1. Dijkstra's algorithm [18] may be used to find the shortest path. Its worst-case complexity varies, depending on the data structures used, from O(|V|^2) to O(|E| + |V| log |V|) (see [19]), which would be particularly disturbing for our intended application. However, in our case the directed graph contains no cycles, since a... [running page header in the source: THE COMPUTER JOURNAL, Vol. 40, No. 2/3, 1997, 120, S. T. KLEIN]

1612 | A note on two problems in connexion with graphs
- Dijkstra
Citation Context ...text, relative to the given dictionary and the given encoding scheme, therefore reduces to the well-known problem of finding the shortest path in G from vertex 1 to vertex n + 1. Dijkstra's algorithm [18] may be used to find the shortest path. Its worst-case complexity varies, depending on the data structures used, from O(|V|^2) to O(|E| + |V| log |V|) (see [19]), which would be particularly dist...

1221 | A universal algorithm for sequential data compression
- Ziv, Lempel
- 1977
Citation Context ...TION Text compression techniques are often divided into statistical methods, such as Huffman coding [1] or arithmetic coding [2], and dictionary methods, based generally on the work of Ziv and Lempel [3, 4]. The statistical methods assign codewords to the elements making up the text, the lengths of these codewords depending on the frequencies of the corresponding elements. Dictionary methods replace var...

1053 | A method for the construction of minimum redundancy codes
- Huffman
- 1952
Citation Context ...od, and in particular some LZ77 variants. Received June 18, 1996; revised April 18, 1997 1. INTRODUCTION Text compression techniques are often divided into statistical methods, such as Huffman coding [1] or arithmetic coding [2], and dictionary methods, based generally on the work of Ziv and Lempel [3, 4]. The statistical methods assign codewords to the elements making up the text, the lengths of the...

776 | Compression of individual sequences via variable rate coding
- Lempel, Ziv
- 1978
Citation Context ...TION Text compression techniques are often divided into statistical methods, such as Huffman coding [1] or arithmetic coding [2], and dictionary methods, based generally on the work of Ziv and Lempel [3, 4]. The statistical methods assign codewords to the elements making up the text, the lengths of these codewords depending on the frequencies of the corresponding elements. Dictionary methods replace var...

694 | Arithmetic Coding for Data Compression
- Witten, Neal, et al.
- 1987
Citation Context ...e LZ77 variants. Received June 18, 1996; revised April 18, 1997 1. INTRODUCTION Text compression techniques are often divided into statistical methods, such as Huffman coding [1] or arithmetic coding [2], and dictionary methods, based generally on the work of Ziv and Lempel [3, 4]. The statistical methods assign codewords to the elements making up the text, the lengths of these codewords depending on...

644 | Modeling for text compression
- Bell, Witten, et al.
- 1989
Citation Context ...he Dictionnaire Philosophique by Voltaire; for Hebrew, the Brahot tractate of the Babylonian Talmud. The second set consists of nontextual files. The first four are taken from the Calgary corpus (see [20]): paper1, a paper including formatting commands; progl, Lisp source code; trans, transcript of a terminal session; bib, a bibliographic file. The last file in this set is moricons.dll, which is part ...

358 | Graph Algorithms
- Even
- 1979
Citation Context ...ed by the use of variable-length codes, assigning shorter codewords to elements with higher probability of occurrence. A sufficient condition for a code being UD is to choose it as a prefix code (see [17]). The problem is the following: given the dictionary D and the encoding function #, we are looking for the optimal partition of the text string S, i.e., the sequence of indices i_1, i_2, ... is s...

161 | Data Compression: Methods and Theory - Storer - 1988

106 | Data compression via textual substitution
- Storer, Szymanski
- 1982
Citation Context ...e original text into a sequence of substrings is a problem common to all dictionary-based compression techniques. An optimal technique for a static dictionary is mentioned in [9]. Storer and Szymanski [10] give an optimal parsing algorithm for the sliding window method, and Hirschberg and Stauffer [11] present parallel algorithms for optimal parsing. Generally, for static dictionary techniques, the par...

83 | Adding compression to a full-text retrieval system - Zobel, Moffat - 1995

66 | Data Compression with Finite Windows
- Fiala, Greene
- 1989
Citation Context ...text backwards for each processed character might be prohibitively slow. Many alternatives have been suggested, including, among others, the use of binary trees [5], hashing [6, 7] and Patricia trees [8]. The question of how to parse the original text into a sequence of substrings is a problem common to all dictionary-based compression techniques. An optimal technique for a static dictionary is mentio...

32 | An extremely fast Ziv-Lempel data compression algorithm
- Williams
- 1991
Citation Context ...hod of scanning the whole text backwards for each processed character might be prohibitively slow. Many alternatives have been suggested, including, among others, the use of binary trees [5], hashing [6, 7] and Patricia trees [8]. The question of how to parse the original text into a sequence of substrings is a problem common to all dictionary-based compression techniques. An optimal technique for a stat...

19 | Introduction to Algorithms - Cormen, Leiserson, Rivest - 1990

16 | Common phrases and minimum-space text storage
- Wagner
- 1973
Citation Context ...uestion of how to parse the original text into a sequence of substrings is a problem common to all dictionary-based compression techniques. An optimal technique for a static dictionary is mentioned in [9]. Storer and Szymanski [10] give an optimal parsing algorithm for the sliding window method, and Hirschberg and Stauffer [11] present parallel algorithms for optimal parsing. Generally, for static dic...

14 | The effect of non-greedy parsing in Ziv-Lempel compression methods
- Horspool
Citation Context ... dictionary techniques, the parsing is done by a greedy method, i.e. at any stage, the longest matching element from the dictionary is sought, though non-greedy methods have also been considered (see [12]) and are used, for example, in the popular gzip program. A greedy approach gives good compression [13], and is easy to implement by means of a trie, but is not necessarily optimal. Because the elemen...

10 | A comparison of algorithms for data base compression by use of fragments as language elements. Information Storage and Retrieval
- Schuegraf, Heaps
- 1974
Citation Context ...may be used. In other words, a single decoding routine should be able to process a file, regardless of it having been compressed or recompressed. The method described below has already been mentioned [14, 15], and achieves optimal recompression in the sense that once the method for encoding the elements is given, it finds the optimal way of parsing the text into such elements. Obviously, different encodin...

10 | Storing Text Retrieval Systems on CD-ROM - Klein, Bookstein, Deerwester - 1989

9 | Better OPM/L text compression
- Bell
- 1986
Citation Context ...he simple method of scanning the whole text backwards for each processed character might be prohibitively slow. Many alternatives have been suggested, including, among others, the use of binary trees [5], hashing [6, 7] and Patricia trees [8]. The question of how to parse the original text into a sequence of substrings is a problem common to all dictionary-based compression techniques. An optimal tech...

8 | A linear algorithm for data compression
- Brent
- 1987
Citation Context ...hod of scanning the whole text backwards for each processed character might be prohibitively slow. Many alternatives have been suggested, including, among others, the use of binary trees [5], hashing [6, 7] and Patricia trees [8]. The question of how to parse the original text into a sequence of substrings is a problem common to all dictionary-based compression techniques. An optimal technique for a stat...

8 | Parsing algorithms for dictionary compression on the PRAM
- Hirschberg, Stauffer
- 1994
Citation Context ...on techniques. An optimal technique for a static dictionary is mentioned in [9]. Storer and Szymanski [10] give an optimal parsing algorithm for the sliding window method, and Hirschberg and Stauffer [11] present parallel algorithms for optimal parsing. Generally, for static dictionary techniques, the parsing is done by a greedy method, i.e. at any stage, the longest matching element from the dictiona...

7 | An analysis of the longest match and the greedy heuristics in text encoding
- Katajainen, Raita
Citation Context ... element from the dictionary is sought, though non-greedy methods have also been considered (see [12]) and are used, for example, in the popular gzip program. A greedy approach gives good compression [13], and is easy to implement by means of a trie, but is not necessarily optimal. Because the elements of the dictionary are often overlapping, a different method of parsing might yield better compressio...

4 | An approximation algorithm for space-optimal encoding of a text
- Katajainen, Raita
- 1989
Citation Context ...may be used. In other words, a single decoding routine should be able to process a file, regardless of it having been compressed or recompressed. The method described below has already been mentioned [14, 15], and achieves optimal recompression in the sense that once the method for encoding the elements is given, it finds the optimal way of parsing the text into such elements. Obviously, different encodin...

3 | Data compression with finite windows - Fiala, Greene - 1989
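
The citation contexts above describe optimal parsing as a shortest-path problem on a graph without cycles, and contrast it with greedy (longest-match) parsing. The following is a minimal sketch of that idea, not the paper's implementation: the function names `optimal_parse` and `greedy_parse`, the 0-based vertex numbering, the brute-force dictionary scan, and the `cost` callback standing in for the encoding function are all assumptions made for illustration. Because every edge points forward in the text, one pass over the vertices in reverse order is enough and no priority queue is needed.

```python
from typing import Callable, List, Optional, Sequence


def optimal_parse(text: str, dictionary: Sequence[str],
                  cost: Callable[[str], int]) -> List[str]:
    """Minimum-cost partition of `text` into dictionary elements.

    Vertex i stands for position i in the text; an edge i -> i + len(w)
    exists when dictionary element w occurs at position i, weighted by
    cost(w) (hypothetical stand-in for the codeword length).
    """
    n = len(text)
    inf = float("inf")
    best = [inf] * (n + 1)          # best[i]: cheapest encoding of text[i:]
    choice: List[Optional[str]] = [None] * (n + 1)
    best[n] = 0                     # the empty suffix costs nothing
    # Edges only go forward, so reverse order is a valid topological order.
    for i in range(n - 1, -1, -1):
        for w in dictionary:
            if w and text.startswith(w, i):
                c = cost(w) + best[i + len(w)]
                if c < best[i]:
                    best[i], choice[i] = c, w
    if best[0] == inf:
        raise ValueError("text cannot be parsed with this dictionary")
    parts, i = [], 0
    while i < n:                    # follow the recorded edges from vertex 0
        parts.append(choice[i])
        i += len(choice[i])
    return parts


def greedy_parse(text: str, dictionary: Sequence[str]) -> List[str]:
    """Longest-match parsing: always take the longest element that fits."""
    parts, i = [], 0
    while i < len(text):
        matches = [w for w in dictionary if w and text.startswith(w, i)]
        if not matches:
            raise ValueError("no dictionary element matches at position %d" % i)
        w = max(matches, key=len)
        parts.append(w)
        i += len(w)
    return parts


if __name__ == "__main__":
    toy_dictionary = ["a", "b", "ab", "bcd", "c", "d"]   # made-up example
    print(greedy_parse("abcd", toy_dictionary))
    print(optimal_parse("abcd", toy_dictionary, lambda w: 1))
```

With the toy dictionary and a unit cost per element, the greedy parse uses three elements while the shortest path uses two, which illustrates why overlapping dictionary elements can make the longest match suboptimal. A practical version would replace the dictionary scan with a trie or hash lookup.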