Good Error-Correcting Codes based on Very Sparse Matrices
, 1999
Cited by 349 (25 self)
We study two families of error-correcting codes defined in terms of very sparse matrices. "MN" (MacKay-Neal) codes are recently invented, and "Gallager codes" were first investigated in 1962, but appear to have been largely forgotten, in spite of their excellent properties. The decoding of both codes can be tackled with a practical sum-product algorithm. We prove that these codes are "very good," in that sequences of codes exist which, when optimally decoded, achieve information rates up to the Shannon limit. This result holds not only for the binary-symmetric channel but also for any channel with symmetric stationary ergodic noise. We give experimental results for binary-symmetric channels and Gaussian channels demonstrating that practical performance substantially better than that of standard convolutional and concatenated codes can be achieved; indeed, the performance of Gallager codes is almost as close to the Shannon limit as that of turbo codes. Index Terms: Error-correctio...
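For orientation, the Shannon limit mentioned in this abstract can be made concrete for the binary-symmetric channel, whose capacity is the standard formula C = 1 - H2(f) for flip probability f. The sketch below only evaluates that formula; the function names are illustrative and nothing here comes from the paper's own code.

```python
import math


def binary_entropy(f: float) -> float:
    """Binary entropy H2(f) in bits, with H2(0) = H2(1) = 0 by convention."""
    if f in (0.0, 1.0):
        return 0.0
    return -f * math.log2(f) - (1.0 - f) * math.log2(1.0 - f)


def bsc_capacity(f: float) -> float:
    """Capacity in bits per channel use of a binary-symmetric channel
    that flips each transmitted bit independently with probability f."""
    return 1.0 - binary_entropy(f)


if __name__ == "__main__":
    for f in (0.01, 0.05, 0.1, 0.5):
        print(f"f = {f:.2f}  capacity = {bsc_capacity(f):.4f} bits")
```

A code of rate R can only be decoded reliably on this channel if R is below `bsc_capacity(f)`; the "very good" property claimed in the abstract says that rates arbitrarily close to this bound are achievable under optimal decoding.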
Optimal Prefetching via Data Compression
, 1995
Cited by 226 (11 self)
Caching and prefetching are important mechanisms for speeding up access time to data on secondary storage. Recent work in competitive online algorithms has uncovered several promising new algorithms for caching. In this paper we apply a form of the competitive philosophy for the first time to the problem of prefetching to develop an optimal universal prefetcher in terms of fault ratio, with particular applications to large-scale databases and hypertext systems. Our prediction algorithms for prefetching are novel in that they are based on data compression techniques that are both theoretically optimal and good in practice. Intuitively, in order to compress data effectively, you have to be able to predict future data well, and thus good data compressors should be able to predict well for purposes of prefetching. We show for powerful models such as Markov sources and nth order Markov sources that the page fault rates incurred by our prefetching algorithms are optimal in the limit for almost all sequences of page requests.
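The intuition that a good predictor makes a good prefetcher can be illustrated with a toy simulation. This is a deliberately simplified stand-in, assuming an order-1 frequency predictor and a cache that holds only the current page plus one prefetched page; it is not the compression-based algorithm of the paper, and `count_faults` is a hypothetical name.

```python
from collections import Counter, defaultdict


def count_faults(requests, prefetch=True):
    """Count page faults for a cache holding the current page and,
    optionally, one page prefetched by an order-1 predictor."""
    successors = defaultdict(Counter)  # successors[p][q]: times q followed p
    faults = 0
    prev = None        # page currently in the cache
    predicted = None   # page prefetched for the next request, if any
    for page in requests:
        if page != prev and page != predicted:
            faults += 1
        if prev is not None:
            successors[prev][page] += 1  # learn online from the request stream
        predicted = None
        if prefetch and successors[page]:
            # Prefetch the page that has most often followed the current one.
            predicted = successors[page].most_common(1)[0][0]
        prev = page
    return faults


if __name__ == "__main__":
    trace = list("abcabcabc")
    print(count_faults(trace, prefetch=False))
    print(count_faults(trace, prefetch=True))
```

On a periodic trace the predictor learns each successor after one pass, so faults stop once every transition has been seen; without prefetching every request to a new page faults.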
Space-frequency Quantization for Wavelet Image Coding
, 1997
Cited by 137 (15 self)
Recently, a new class of image coding algorithms coupling standard scalar quantization of frequency coefficients with tree-structured quantization (related to spatial structures) has attracted wide attention because its good performance appears to confirm the promised efficiencies of hierarchical representation [1, 2]. This paper addresses the problem of how spatial quantization modes and standard scalar quantization can be applied in a jointly optimal fashion in an image coder. We consider zerotree quantization (zeroing out tree-structured sets of wavelet coefficients) and the simplest form of scalar quantization (a single common uniform scalar quantizer applied to all non-zeroed coefficients). We formalize the problem of optimizing their joint application and develop an image coding algorithm for solving the resulting optimization problem. Despite the basic form of the two quantizers considered, the resulting algorithm demonstrates coding performance that is competitive (often...
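The two quantization modes named in this abstract can be sketched in a few lines. The example assumes a flat list of coefficients and an explicit set of zeroed positions in place of the paper's tree-structured sets, and it performs no rate-distortion optimization; `quantize` and `dequantize` are illustrative names only.

```python
def quantize(coeffs, step, zeroed):
    """Apply one common uniform scalar quantizer with the given step size.

    Positions listed in `zeroed` are forced to index 0, standing in for
    the coefficients that a zerotree would discard.
    Note: Python's round() rounds exact halves to the nearest even integer.
    """
    return [0 if i in zeroed else round(c / step) for i, c in enumerate(coeffs)]


def dequantize(indices, step):
    """Reconstruct each coefficient as its index times the step size."""
    return [q * step for q in indices]


if __name__ == "__main__":
    coeffs = [4.1, -7.9, 0.6, 12.2]
    indices = quantize(coeffs, step=2.0, zeroed={2})
    print(indices)
    print(dequantize(indices, step=2.0))
```

The joint design problem in the abstract amounts to choosing the zeroed set and the step size together, since zeroing saves bits at the cost of distortion while the step size trades the two off for every remaining coefficient.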
Arithmetic coding revisited
- ACM Transactions on Information Systems
, 1995
Cited by 118 (2 self)
Over the last decade, arithmetic coding has emerged as an important compression tool. It is now the method of choice for adaptive coding on multisymbol alphabets because of its speed, low storage requirements, and effectiveness of compression. This article describes a new implementation of arithmetic coding that incorporates several improvements over a widely used earlier version by Witten, Neal, and Cleary, which has become a de facto standard. These improvements include fewer multiplicative operations, greatly extended range of alphabet sizes and symbol probabilities, and the use of low-precision arithmetic, permitting implementation by fast shift/add operations. We also describe a modular structure that separates the coding, modeling, and probability estimation components of a compression system. To motivate the improved coder, we consider the needs of a word-based text compression program. We report a range of experimental results using this and other models. Complete source code is available.
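The interval-narrowing idea behind arithmetic coding can be shown with exact rational arithmetic. This is a teaching sketch under strong assumptions: a fixed (non-adaptive) model, known message length, and unbounded-precision fractions. It is not the low-precision shift/add implementation the article describes, and all names are illustrative.

```python
from fractions import Fraction


def build_intervals(probs):
    """Assign each symbol a half-open slice [low, high) of [0, 1)."""
    intervals, low = {}, Fraction(0)
    for sym, p in probs.items():
        intervals[sym] = (low, low + p)
        low += p
    return intervals


def encode(message, probs):
    """Narrow [0, 1) once per symbol; return the final (low, width).

    Any number in [low, low + width) identifies the message."""
    intervals = build_intervals(probs)
    low, width = Fraction(0), Fraction(1)
    for sym in message:
        s_low, s_high = intervals[sym]
        low, width = low + width * s_low, width * (s_high - s_low)
    return low, width


def decode(value, length, probs):
    """Recover `length` symbols by locating `value` in successive slices."""
    intervals = build_intervals(probs)
    out = []
    for _ in range(length):
        for sym, (s_low, s_high) in intervals.items():
            if s_low <= value < s_high:
                out.append(sym)
                value = (value - s_low) / (s_high - s_low)  # rescale to [0, 1)
                break
    return "".join(out)


if __name__ == "__main__":
    probs = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}
    low, width = encode("abca", probs)
    print(low, width, decode(low, 4, probs))
```

The final width equals the product of the symbol probabilities, so roughly -log2(width) bits suffice to name a point inside the interval; practical coders replace the fractions with fixed-width integers and renormalization.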
Good Codes based on Very Sparse Matrices
- Cryptography and Coding. 5th IMA Conference, number 1025 in Lecture Notes in Computer Science
, 1995
Cited by 64 (11 self)
We present a new family of error-correcting codes for the binary symmetric channel. These codes are designed to encode a sparse source, and are defined in terms of very sparse invertible matrices, in such a way that the decoder can treat the signal and the noise symmetrically. The decoding problem involves only very sparse matrices and sparse vectors, and so is a promising candidate for practical decoding. It can be proved that these codes are "very good", in that sequences of codes exist which, when optimally decoded, achieve information rates up to the Shannon limit. We give experimental results using a free energy minimization algorithm and a belief propagation algorithm for decoding, demonstrating practical performance superior to that of both Bose-Chaudhuri-Hocquenghem codes and Reed-Muller codes over a wide range of noise levels. We regret that lack of space prevents presentation of all our theoretical and experimental results. The full text of this paper may be found elsewher...
Novel Cluster-Based Probability Model for Texture Synthesis, Classification, and Compression
- In Visual Communications and Image Processing
, 1993
Cited by 62 (6 self)
We present a new probabilistic modeling technique for high-dimensional vector sources, and consider its application to the problems of texture synthesis, classification, and compression. Our model combines kernel estimation with clustering, to obtain a semiparametric probability mass function estimate which summarizes, rather than contains, the training data. Because the model is cluster based, it is inferable from a limited set of training data, despite the model's high dimensionality. Moreover, its functional form allows recursive implementation that avoids exponential growth in required memory as the number of dimensions increases. Experimental results are presented for each of the three applications considered.
1. INTRODUCTION
In many information processing tasks individual data samples exhibit a great deal of statistical interdependence, and should be treated jointly (e.g., in vectors) rather than separately. For some tasks this requires modeling vectors probabilistically....
Cluster-Based Probability Model and Its Application to Image and Texture Processing
, 1997
Cited by 45 (2 self)
We develop, analyze, and apply a specific form of mixture modeling for density estimation, within the context of image and texture processing. The technique captures much of the higher-order, nonlinear statistical relationships present among vector elements by combining aspects of kernel estimation and cluster analysis. Experimental results are presented in the following applications: image restoration, image and texture compression, and texture classification.
1 Introduction
In many signal processing tasks, uncertainty plays a fundamental role. Examples of such tasks are compression, detection, estimation, classification, and restoration; in all of these, the future inputs are not known perfectly at the time of system design, but instead must be characterized only in terms of their "typical," or "likely" behavior, by means of some probabilistic model. Every such system has a probabilistic model, be it explicit or implicit. Often, the level of performance achieved by such a syste...
The Design and Analysis of Efficient Lossless Data Compression Systems
, 1993
Cited by 43 (0 self)
Our thesis is that high compression efficiency for text and images can be obtained by using sophisticated statistical compression techniques, and that greatly increased speed can be achieved at only a small cost in compression efficiency. Our emphasis is on elegant design and mathematical as well as empirical analysis. We analyze arithmetic coding as it is commonly implemented and show rigorously that almost no compression is lost in the implementation. We show that high-efficiency lossless compression of both text and grayscale images can be obtained by using appropriate models in conjunction with arithmetic coding. We introduce a four-component paradigm for lossless image compression and present two methods that give state of the art compression efficiency. In the text compression area, we give a small improvement on the preferred method in the literature. We show that we can often obtain significantly improved throughput at the cost of slightly reduced compression. The extra speed c...
On prediction using variable order Markov models
- JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH
, 2004
Cited by 42 (1 self)
This paper is concerned with algorithms for prediction of discrete sequences over a finite alphabet, using variable order Markov models. The class of such algorithms is large and in principle includes any lossless compression algorithm. We focus on six prominent prediction algorithms, including Context Tree Weighting (CTW), Prediction by Partial Match (PPM) and Probabilistic Suffix Trees (PSTs). We discuss the properties of these algorithms and compare their performance using real life sequences from three domains: proteins, English text and music pieces. The comparison is made with respect to prediction quality as measured by the average log-loss. We also compare classification algorithms based on these predictors with respect to a number of large protein classification tasks. Our results indicate that a “decomposed” CTW (a variant of the CTW algorithm) and PPM outperform all other algorithms in sequence prediction tasks. Somewhat surprisingly, a different algorithm, which is a modification of the Lempel-Ziv compression algorithm, significantly outperforms all algorithms on the protein classification problems.
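The evaluation measure named in this abstract, average log-loss, is easy to compute for any sequence predictor. The sketch below assumes a fixed-order Markov model with add-one smoothing as a simple stand-in; it is not CTW, PPM, or a PST, and `average_log_loss` is an illustrative name.

```python
import math
from collections import Counter, defaultdict


def average_log_loss(train, test, alphabet, order=1):
    """Average log-loss in bits per symbol of a fixed-order Markov
    predictor with add-one (Laplace) smoothing.

    `train` and `test` are strings; the first `order` symbols of `test`
    serve only as context and are not scored.
    """
    counts = defaultdict(Counter)  # counts[context][symbol]
    for i in range(order, len(train)):
        counts[train[i - order:i]][train[i]] += 1

    total, scored = 0.0, 0
    for i in range(order, len(test)):
        seen = counts.get(test[i - order:i], Counter())
        p = (seen[test[i]] + 1) / (sum(seen.values()) + len(alphabet))
        total -= math.log2(p)
        scored += 1
    return total / scored


if __name__ == "__main__":
    print(average_log_loss("", "abcdabcd", "abcd"))
    print(average_log_loss("abababab", "ababab", "ab"))
```

An untrained predictor over a four-symbol alphabet assigns probability 1/4 to every symbol and so scores exactly 2 bits per symbol; lower values indicate better prediction, which is also what a better compressor would achieve on the same data.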

