Semantically Motivated Improvements for PPM Variants
- The Computer Journal, 1997
Cited by 23 (3 self)
This paper explains how to significantly improve the compression performance of any PPM variant
Code Compression Based on Operand Factorization
- in Proceedings of MICRO-31: The 31st Annual International Symposium on Microarchitecture, 1998
Cited by 18 (4 self)
This paper proposes a code compression technique called operand factorization. The central idea of operand factorization is the separation of program expression trees into sequences of tree-patterns (opcodes) and operand-patterns (registers and immediates). Using this technique, we show that tree and operand patterns have exponential frequency distributions. A set of experiments was designed to explore this feature. They reveal an average compression ratio of 43% for SPECInt95 programs. A decompression engine is proposed, which assembles tree and operand patterns into uncompressed instruction sequences. An encoding that improves the design of the decompression engine results in a 48% compression ratio. Compression ratio numbers take into consideration an estimate of the decompression engine size.
On-Line Stochastic Processes in Data Compression
- 1996
Cited by 14 (6 self)
The ability to predict the future based upon the past in finite-alphabet sequences has many applications, including communications, data security, pattern recognition, and natural language processing. By Shannon's theory and the breakthrough development of arithmetic coding, any sequence, $a_1 a_2 \cdots a_n$, can be encoded in a number of bits that is essentially equal to the minimal information-lossless codelength, $\sum_i -\log_2 p(a_i \mid a_1 \cdots a_{i-1})$. The goal of universal on-line modeling, and therefore of universal data compression, is to deduce the model of the input sequence $a_1 a_2 \cdots a_n$ that can estimate each $p(a_i \mid a_1 \cdots a_{i-1})$ knowing only $a_1 a_2 \cdots a_{i-1}$ so that the ex...
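The codelength expression in this abstract can be illustrated with a small sketch. The model below is an assumption made purely for illustration: an adaptive order-0 model with add-one (Laplace) smoothing, which is not the model studied in the paper. The function name `adaptive_code_length` is hypothetical. Each probability is estimated from the preceding symbols only, matching the on-line setting described above.

```python
import math
from collections import Counter

def adaptive_code_length(sequence, alphabet):
    """Sum of -log2 p(a_i | a_1 ... a_{i-1}) in bits.

    Assumption: p is an order-0 add-one (Laplace) estimate built
    from the symbols seen so far; any on-line model could be used.
    """
    counts = Counter()
    seen = 0
    k = len(alphabet)
    total_bits = 0.0
    for symbol in sequence:
        # Estimate uses only the prefix a_1 ... a_{i-1}.
        p = (counts[symbol] + 1) / (seen + k)
        total_bits += -math.log2(p)
        # Update the model after coding the symbol.
        counts[symbol] += 1
        seen += 1
    return total_bits

bits = adaptive_code_length("aaaa", "ab")
```

For the input `"aaaa"` over the alphabet `{a, b}`, the successive estimates are 1/2, 2/3, 3/4, and 4/5, whose product is 1/5, so the total is log2(5) bits. An arithmetic coder driven by the same estimates would need essentially that many bits.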
MDL-based DCG Induction for NP Identification
- 1999
Cited by 6 (3 self)
We introduce a learner capable of automatically extending large, manually written natural language Definite Clause Grammars with missing syntactic rules. It is based upon the Minimum Description Length principle, and can be trained upon either just raw text, or else raw text additionally annotated with parsed corpora. As a demonstration of the learner, we show how full Noun Phrases (NPs that might contain pre- or post-modifying phrases and might also be recursively nested) can be identified in raw text. Preliminary results obtained by varying the amount of syntactic information in the training set suggest that raw text is less useful than additional NP bracketing information. However, using all syntactic information in the training set does not produce a significant improvement over just bracketing information.
Compressed Code Execution on DSP Architectures
- Proc. of 12th International Symposium on System Synthesis, 1999
Cited by 4 (1 self)
Decreasing the program size has become an important goal in the design of embedded systems targeted at mass production. This problem has led to a number of efforts aimed at designing processors with shorter instruction formats (e.g. ARM Thumb and MIPS16), or that can execute compressed code (e.g. IBM CodePack PowerPC). Much of this work, however, has been directed towards RISC architectures. This paper proposes a solution to the problem of executing compressed code on embedded DSPs. The experimental results reveal an average compression ratio of 75% for typical DSP programs running on the TMS320C25 processor. This number includes the size of the decompression engine. Decompression is performed by a state machine that translates codewords into instruction sequences during program execution. The decompression engine is synthesized using the AMS standard cell library and a 0.6µm 5V technology. Gate level simulation of the decompression engine reveals minimum operation frequencies of 150MHz.
A generalization and improvement to PPM's blending
- 1997
Cited by 2 (2 self)
The best-performing method in the data compression literature for computing probability estimates of sequences on-line using a suffix-tree model is the blending technique used by PPM [CW84, Mof90]. Blending can be viewed as a bottom-up recursive procedure for computing a mixture, barring one missing term for each level of the recursion, where a mixture is basically a weighted average of several probability estimates. In [Bun97] we have shown by decomposition into an inheritance weight &{A, B, C, D} and an inheritance evaluation time, Mh, that mixtures generalize the techniques used in DMC variants [CH87], as well as PPM variants, and thus these techniques, along with other variants of mixtures, are interchangeable. Table 1 shows the relative effectiveness of most combinations of mixture weighting functions and inheritance evaluation times. Table 2 is a study on the value of using update exclusion, especially in models using state selection. Table 1: How average compression performance on the Calgary Corpus as a whole is affected by varying mixture inheritance times and mixture weight functions, in models with and without (percolating) state selection.
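The idea of a mixture as a bottom-up weighted average can be sketched as follows. This is a generic full mixture written under assumptions, not the paper's weighting functions and not PPM's blending itself (which, as the abstract notes, omits one term per level). The function name `mixture_estimate`, the fixed weights, and the example distributions are all hypothetical.

```python
def mixture_estimate(estimates, weights, symbol):
    """Bottom-up recursive mixture of per-order probability estimates.

    estimates: list of dicts, lowest order first; the lowest-order
               dict must assign a probability to every symbol.
    weights:   one weight in [0, 1] per higher order (hypothetical
               values; the paper studies several weighting functions).
    """
    # Start from the lowest-order estimate.
    p = estimates[0][symbol]
    # At each higher order, average the local estimate with the
    # value inherited from the orders below it.
    for local, w in zip(estimates[1:], weights):
        p = w * local.get(symbol, 0.0) + (1.0 - w) * p
    return p

# Illustrative distributions for three context orders.
orders = [
    {"a": 0.5, "b": 0.5},    # uniform base estimate
    {"a": 0.75, "b": 0.25},  # shorter context
    {"a": 1.0},              # longest context has only seen "a"
]
p_a = mixture_estimate(orders, [0.5, 0.5], "a")
p_b = mixture_estimate(orders, [0.5, 0.5], "b")
```

With these inputs the result is 0.8125 for `a` and 0.1875 for `b`. Because every level is a convex combination of normalized distributions, the blended estimates still sum to 1.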

