Results 1-10 of 77
Data Compression
ACM Computing Surveys, 1987
Abstract

Cited by 87 (3 self)
This paper surveys a variety of data compression methods spanning almost forty years of research, from the work of Shannon, Fano and Huffman in the late 1940s to a technique developed in 1986. The aim of data compression is to reduce redundancy in stored or communicated data, thus increasing effective data density. Data compression has important applications in the areas of file storage and distributed systems. Concepts from information theory, as they relate to the goals and evaluation of data compression methods, are discussed briefly. A framework for evaluation and comparison of methods is constructed and applied to the algorithms presented. Comparisons of both a theoretical and an empirical nature are reported and possibilities for future research are suggested.
INTRODUCTION
Data compression is often referred to as coding, where coding is a very general term encompassing any special representation of data which satisfies a given need. Information theory is defined to be the study of eff...
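Since the survey centers on the line of work from Shannon and Fano to Huffman, a minimal sketch of Huffman code construction may help fix ideas. This is a generic textbook illustration, not code from the paper; the function name and tie-breaking scheme are our own choices.

```python
import heapq
from collections import Counter

def huffman_code(text):
    """Build a Huffman code (symbol -> bitstring) for the given text."""
    freq = Counter(text)
    if len(freq) == 1:  # degenerate case: only one distinct symbol
        return {next(iter(freq)): "0"}
    # Heap entries are (frequency, tiebreak, tree); the tiebreak keeps
    # tuple comparison from ever reaching the (uncomparable) tree part.
    heap = [(f, i, sym) for i, (sym, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)
        f2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, count, (t1, t2)))
        count += 1
    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):  # internal node: (left, right)
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:                        # leaf: a symbol
            codes[tree] = prefix
    walk(heap[0][2], "")
    return codes

codes = huffman_code("abracadabra")
# Frequent symbols get shorter codewords: 'a' (5 occurrences) is shorter
# than 'c' or 'd' (1 occurrence each), and the code is prefix-free.
encoded = "".join(codes[c] for c in "abracadabra")
```

The prefix-free property is what lets the decoder recover symbol boundaries without separators, which is the sense in which such a code "increases effective data density".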
Compression and Explanation using Hierarchical Grammars
Computer Journal, 1997
Abstract

Cited by 85 (1 self)
This paper describes an algorithm, called SEQUITUR, that identifies hierarchical structure in
A New Challenge for Compression Algorithms: Genetic Sequences
Information Processing &amp; Management, 1994
Abstract

Cited by 70 (0 self)
Universal data compression algorithms fail to compress genetic sequences. This is due to the specificity of this particular kind of "text". We analyze in some detail the properties of the sequences which cause the failure of classical algorithms. We then present a lossless algorithm, biocompress-2, to compress the information contained in DNA and RNA sequences, based on the detection of regularities such as the presence of palindromes. The algorithm combines substitutional and statistical methods and, to the best of our knowledge, leads to the highest compression of DNA. The results, although not satisfactory, give insight into the necessary correlation between compression and comprehension of genetic sequences.
1 Introduction
There are plenty of specific types of data which need to be compressed, for ease of storage and communication. Among them are texts (such as natural language and programs), images, sounds, etc. In this paper, we focus on the compression of a specific kin...
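The "palindromes" the abstract refers to are complementary palindromes: a DNA factor equal to the reverse complement of an earlier factor, which can then be encoded as a back-reference. A naive sketch of such detection (our own illustration, not the biocompress-2 implementation, which uses far more efficient machinery):

```python
# Watson-Crick base pairing: A<->T, C<->G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(s):
    return s.translate(COMPLEMENT)[::-1]

def find_palindromic_repeat(seq, min_len=4):
    """Return (i, j, length) such that seq[j:j+length] is the reverse
    complement of the earlier, non-overlapping factor seq[i:i+length],
    preferring longer matches; None if nothing qualifies.
    Naive O(n^2)-per-length scan, for clarity only."""
    n = len(seq)
    for length in range(n // 2, min_len - 1, -1):
        seen = {}
        for i in range(n - length + 1):
            factor = seq[i:i + length]
            rc = reverse_complement(factor)
            if rc in seen and seen[rc] + length <= i:
                return seen[rc], i, length
            seen.setdefault(factor, i)
    return None

hit = find_palindromic_repeat("AAGGCCTTTTTAAGGCCTT")
```

A compressor exploiting this would emit a pointer (position, length, reverse-complement flag) instead of the second factor, which is exactly the kind of regularity generic text compressors miss.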
Code Density Optimization for Embedded DSP Processors Using Data Compression Techniques
Proceedings of the 15th Conference on Advanced Research in VLSI, 1995
Abstract

Cited by 58 (3 self)
We address the problem of code size minimization in VLSI systems with embedded DSP processors. Reducing code size reduces the production cost of embedded systems. We use data compression methods to develop code size minimization strategies. We present a framework for code size minimization where the compressed data consists of a dictionary and a skeleton. The dictionary can be computed using popular text compression algorithms. We describe two methods to execute the compressed code that have varying performance characteristics and varying degrees of freedom in compressing the code. Experimental results obtained with a TMS320C25 code generator are presented.
1: Introduction
An increasingly common microarchitecture for embedded systems is to integrate a microprocessor or microcontroller, a ROM and an ASIC all on a single integrated circuit (Figure 1). Such a microarchitecture can currently be found in such diverse embedded systems as fax modems, laser printers and cellular telephones....
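The dictionary/skeleton split described above can be sketched as follows. This toy version (function name, the fixed sequence length, and the CALL/LIT markers are all our own illustrative choices, not the paper's method) hoists repeated instruction sequences into a dictionary and rewrites the program as a skeleton of literals and dictionary references:

```python
from collections import Counter

def compress_code(instructions, seq_len=2, min_uses=2):
    """Toy dictionary/skeleton compression of an instruction list."""
    # Count candidate sequences of length seq_len (overlaps counted;
    # acceptable for a sketch).
    counts = Counter(tuple(instructions[i:i + seq_len])
                     for i in range(len(instructions) - seq_len + 1))
    dictionary = [seq for seq, c in counts.items() if c >= min_uses]
    index = {seq: k for k, seq in enumerate(dictionary)}
    skeleton, i = [], 0
    while i < len(instructions):
        seq = tuple(instructions[i:i + seq_len])
        if seq in index:
            skeleton.append(("CALL", index[seq]))  # reference into dictionary
            i += seq_len
        else:
            skeleton.append(("LIT", instructions[i]))  # kept verbatim
            i += 1
    return dictionary, skeleton

prog = ["LD A", "ADD B", "ST C", "LD A", "ADD B", "ST D"]
dictionary, skeleton = compress_code(prog)
# The repeated pair ("LD A", "ADD B") lands in the dictionary; the
# skeleton stores two references to it plus the two unique stores.
```

In the paper's setting the "CALL" would be realized in hardware (the two execution methods mentioned), but the storage saving comes from the same split.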
Data Compression Algorithms for Energy-Constrained Devices in Delay Tolerant Networks
In Proc. of the ACM Conf. on Embedded Networked Sensor Systems (SenSys), 2006
Abstract

Cited by 58 (1 self)
Sensor networks are fundamentally constrained by the difficulty and energy expense of delivering information from sensors to sink. Our work has focused on garnering additional significant energy improvements by devising computationally efficient lossless compression algorithms on the source node. These reduce the amount of data that must be passed through the network and to the sink, and thus have energy benefits that are multiplicative with the number of hops the data travels through the network. Currently, if sensor system designers want to compress acquired data, they must either develop application-specific compression algorithms or use off-the-shelf algorithms not designed for resource-constrained sensor nodes. This paper discusses the design issues involved with implementing, adapting, and customizing compression algorithms specifically geared for sensor nodes. While developing Sensor LZW (S-LZW) and some simple, but effective, variations to this algorithm, we show how different amounts of compression can lead to energy savings on both the compressing node and throughout the network, and that the savings depend heavily on the radio hardware. To validate and evaluate our work, we apply it to datasets from several different real-world deployments and show that our approaches can reduce energy consumption by up to a factor of 4.5 across the network.
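S-LZW builds on textbook LZW, so a minimal LZW compressor is a useful reference point. The sketch below is plain LZW; the sensor-node adaptations the paper describes (small fixed-size dictionary, block restarts, and similar) are not shown:

```python
def lzw_compress(data):
    """Textbook LZW over bytes: emit the dictionary code for the longest
    known prefix, extending the dictionary with each new phrase."""
    dictionary = {bytes([i]): i for i in range(256)}  # all single bytes
    next_code = 256
    w = b""
    out = []
    for byte in data:
        wc = w + bytes([byte])
        if wc in dictionary:
            w = wc                      # keep extending the current phrase
        else:
            out.append(dictionary[w])   # emit longest known prefix
            dictionary[wc] = next_code  # learn the new phrase
            next_code += 1
            w = bytes([byte])
    if w:
        out.append(dictionary[w])
    return out

codes = lzw_compress(b"TOBEORNOTTOBEORTOBEORNOT")
# 16 codes for 24 input bytes: repeated phrases like "TO", "BE", "OR"
# are emitted as single dictionary codes on their second occurrence.
```

The energy argument in the abstract follows directly: fewer output codes means fewer radio transmissions, and the saving compounds at every hop.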
Configuration Compression for Virtex FPGAs
2001
Abstract

Cited by 32 (2 self)
Although runtime reconfigurable systems have been shown to achieve very high performance, the speedups over traditional microprocessor systems are limited by the cost of configuring the hardware. Current reconfigurable systems suffer from a significant overhead due to the time it takes to reconfigure their hardware. In order to deal with this overhead, and to increase the compute power of reconfigurable systems, it is important to develop hardware and software techniques that reduce or eliminate this delay. In this paper, we explore the idea of configuration compression and develop algorithms for reconfigurable systems. These algorithms, targeted at Xilinx Virtex series FPGAs with minimal modification of the hardware, can significantly reduce the amount of data that must be transferred during configuration. In this work we have extensively surveyed current compression techniques, including Huffman coding, arithmetic coding and LZ coding. We have also developed different algorithms targeting different hardware structures. Our readback algorithm allows certain frames to be reused as a dictionary, exploiting the regularities within the configuration bitstream. In addition, we have developed frame reordering techniques that better exploit these regularities by shuffling the sequence of the configuration. We have also developed a wildcard approach that can be used for true partial reconfiguration. The simulation results demonstrate that a compression ratio of a factor of 4 can be achieved.
Offline Compression by Greedy Textual Substitution
Proc. IEEE, 2000
Abstract

Cited by 25 (1 self)
Greedy offline textual substitution refers to the following approach to compression or structural inference. Given a long text string x, a substring w is identified such that replacing all instances of w in x except one by a suitable pair of pointers yields the highest possible contraction of x; the process is then repeated on the contracted text string until substrings capable of producing contractions can no longer be found. This paper examines computational issues arising in the implementation of this paradigm and describes some applications and experiments.
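One greedy step of this paradigm can be sketched as below. The gain model is a simplified assumption of ours (each replaced occurrence of w shrinks to a single pointer symbol), not the paper's exact cost accounting, and the brute-force enumeration is for clarity only:

```python
def best_substring(x, min_len=2):
    """One greedy step: find the substring w whose replacement contracts x
    the most. Simplified gain model: keeping one copy and replacing the
    other k-1 non-overlapping occurrences by one-symbol pointers saves
    (k - 1) * (len(w) - 1) symbols."""
    best, best_gain = None, 0
    seen = set()
    for length in range(min_len, len(x) // 2 + 1):
        for i in range(len(x) - length + 1):
            w = x[i:i + length]
            if w in seen:
                continue
            seen.add(w)
            # Count non-overlapping occurrences, left to right.
            k, pos = 0, x.find(w)
            while pos != -1:
                k += 1
                pos = x.find(w, pos + length)
            gain = (k - 1) * (length - 1)
            if gain > best_gain:
                best, best_gain = w, gain
    return best, best_gain

w, gain = best_substring("abcabcabcabc")
# "abc" wins: 4 occurrences, so replacing 3 of them saves 3 * 2 = 6 symbols.
```

The full algorithm would perform the replacement and repeat on the contracted string until no substring yields a positive gain.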
The Smallest Grammar Problem
IEEE Transactions on Information Theory, 2005
Abstract

Cited by 24 (0 self)
This paper addresses the smallest grammar problem: What is the smallest context-free grammar that generates exactly one given string σ? This is a natural question about a fundamental object connected to many fields, including data compression, Kolmogorov complexity, pattern identification, and addition chains. Due to the problem's inherent complexity, our objective is to find an approximation algorithm which finds a small grammar for the input string. We focus attention on the approximation ratio of the algorithm (and implicitly, worst-case behavior) to establish provable performance guarantees and to address shortcomings in the classical measure of redundancy in the literature. Our first results are a variety of hardness results, most notably that every efficient algorithm for the smallest grammar problem has approximation ratio at least 8569/8568 unless P = NP. We then bound approximation ratios for several of the best-known grammar-based compression algorithms, including LZ78, BISECTION, SEQUENTIAL, LONGEST MATCH, GREEDY, and RE-PAIR. Among these, the best upper bound we show is O(n^{1/2}). We finish by presenting two novel algorithms with exponentially better ratios of O(log^3 n) and O(log(n/m*)), where m* is the size of the smallest grammar for that input. The latter highlights a connection between grammar-based compression and LZ77.
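Of the algorithms the paper analyzes, RE-PAIR is the easiest to illustrate: repeatedly replace the most frequent pair of adjacent symbols with a fresh nonterminal until no pair repeats. The sketch below is a minimal version (the nonterminal naming and the simple left-to-right replacement are our own choices; real RE-PAIR uses priority queues for linear time):

```python
from collections import Counter

def re_pair(s):
    """Minimal RE-PAIR sketch: returns (final sequence, grammar rules)."""
    seq = list(s)
    rules = {}
    next_sym = 0
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        (a, b), count = max(pairs.items(), key=lambda kv: kv[1])
        if count < 2:
            break  # no adjacent pair repeats; grammar is final
        nt = f"R{next_sym}"  # fresh nonterminal
        next_sym += 1
        rules[nt] = (a, b)
        # Replace occurrences of (a, b) left to right.
        new_seq, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and seq[i] == a and seq[i + 1] == b:
                new_seq.append(nt)
                i += 2
            else:
                new_seq.append(seq[i])
                i += 1
        seq = new_seq
    return seq, rules

seq, rules = re_pair("abababab")
# "abababab" -> R0 = ab -> "R0 R0 R0 R0" -> R1 = R0 R0 -> "R1 R1"
```

Expanding R1 R1 with the two rules regenerates exactly the input string, which is what makes such a grammar a lossless compressed representation.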
Browsing in Digital Libraries: A Phrase-Based Approach
1997
Abstract

Cited by 23 (5 self)
this article tends to be answered by making a selection of queries more or less haphazardly to gain a feeling for what the collection contains.
A Unifying Framework for Compressed Pattern Matching
In Proc. 6th International Symp. on String Processing and Information Retrieval, 1999
Abstract

Cited by 22 (6 self)
We introduce a general framework that captures the essence of compressed pattern matching for various dictionary-based compression methods, and propose a compressed pattern matching algorithm for the framework. The goal is to find all occurrences of a pattern in a text without decompression, which is one of the most active topics in string matching. Our framework includes such compression methods as the Lempel-Ziv family (LZ77, LZSS, LZ78, LZW), byte-pair encoding, and the static dictionary based method. Technically, our pattern matching algorithm extends that for LZW compressed text presented by Amir, Benson and Farach.
1 Introduction
Pattern matching is one of the most fundamental operations in string processing. The problem is to find all occurrences of a given pattern in a given text. A lot of classical and advanced pattern matching algorithms have been proposed (see [3, 2]). Data compression is another important research topic, whose aim is to reduce its space u...