Results 1 - 7 of 7
Towards Compressing Web Graphs
- In Proc. of the IEEE Data Compression Conference (DCC), 2000
Cited by 68 (1 self)
In this paper, we consider the problem of compressing graphs of the link structure of the World Wide Web. We provide efficient algorithms for such compression that are motivated by recently proposed random graph models for describing the Web.
Cluster-Based Delta Compression of a Collection of Files
- In Third Int. Conf. on Web Information Systems Engineering, 2002
Cited by 13 (5 self)
Delta compression techniques are commonly used to succinctly represent an updated version of a file with respect to an earlier one. In this paper, we study the use of delta compression in a somewhat different scenario, where we wish to compress a large collection of (more or less) related files by performing a sequence of pairwise delta compressions. The problem of finding an optimal delta encoding for a collection of files by taking pairwise deltas can be reduced to the problem of computing a branching of maximum weight in a weighted directed graph, but this solution is inefficient and thus does not scale to larger file collections. This motivates us to propose a framework for cluster-based delta compression that uses text clustering techniques to prune the graph of possible pairwise delta encodings. To demonstrate the efficacy of our approach, we present experimental results on collections of web pages. Our experiments show that cluster-based delta compression of collections provides significant improvements in compression ratio as compared to individually compressing each file or using tar+gzip, at a moderate cost in efficiency.
Algorithms for Delta Compression and Remote File Synchronization
- In Khalid Sayood, editor, Lossless Compression Handbook, 2002
Cited by 13 (8 self)
Delta compression and remote file synchronization techniques are concerned with efficient file transfer over a slow communication link in the case where the receiving party already has a similar file (or files). This problem arises naturally, e.g., when distributing updated versions of software over a network or synchronizing personal files between different accounts and devices. More generally, the problem is becoming increasingly common in many network-based applications where files and content are widely replicated, frequently modified, and cut and reassembled in different contexts and packagings.
Compressing file collections with a TSP-based approach
- 2004
Cited by 4 (0 self)
Delta compression techniques solve the problem of encoding a given target file with respect to one or more reference files. Recent work in [15, 12, 7] has demonstrated the benefits of using such techniques in the context of file collection compression. In these scenarios, files are often better compressed by computing deltas with respect to other similar files from the same collection, as opposed to compressing each file by itself. It is known that the optimal set of such delta encodings, assuming that only a single reference file is used for each target file, can be found by computing an optimal branching on a directed graph. In this paper we propose two techniques for improving the compression of file collections. The first one utilizes deltas computed with respect to more than one file, while the second one improves the compressibility of batched file collections, such as tar archives, using standard compression tools. Both techniques are based on a reduction to the Traveling Sales Person problem on directed weighted graphs. We present experiments demonstrating the benefits of our methods.
Approximate Maximum Weight Branchings
- 2005
Cited by 2 (1 self)
We consider a special subgraph of a weighted directed graph: one comprising only the k heaviest edges incoming to each vertex. We show that the maximum weight branching in this subgraph closely approximates the maximum weight branching in the original graph. Specifically, it is within a factor of k/(k+1). Our interest in finding branchings in this subgraph is motivated by a data compression application in which calculating edge weights is expensive but estimating which are the heaviest k incoming edges is easy. An additional benefit is that, since algorithms for finding branchings run in time linear in the number of edges, our results imply faster algorithms, although we sacrifice optimality by a small factor. We also extend our results to the case of edge-disjoint branching of maximum weight and to maximum weight spanning forests.
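The subgraph described in this abstract (only the k heaviest incoming edges per vertex) can be illustrated with a minimal sketch. This is not the authors' code: the function name `k_heaviest_incoming` and the `(source, target, weight)` edge format are assumptions made for illustration, and the sketch covers only the pruning step, not the branching computation itself.

```python
import heapq
from collections import defaultdict


def k_heaviest_incoming(edges, k):
    """Keep, for every vertex, only its k heaviest incoming edges.

    `edges` is an iterable of (source, target, weight) tuples; this
    format is a hypothetical choice for the example.
    """
    # Group edges by the vertex they point to.
    incoming = defaultdict(list)
    for src, dst, weight in edges:
        incoming[dst].append((src, dst, weight))

    # For each target vertex, retain the k edges of largest weight.
    pruned = []
    for in_edges in incoming.values():
        pruned.extend(heapq.nlargest(k, in_edges, key=lambda e: e[2]))
    return pruned


# Hypothetical example graph: vertex "c" has three incoming edges.
example_edges = [("a", "c", 5), ("b", "c", 3), ("d", "c", 1), ("a", "b", 2)]
pruned_edges = k_heaviest_incoming(example_edges, 2)
```

A maximum weight branching would then be computed on `pruned_edges` instead of the full edge set, using any standard branching algorithm; that step is omitted here.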
Multispectral Image Coding
- Handbook of Image and Video Coding, Academic
Cited by 1 (0 self)
Multispectral images are a particular class of images that require specialized coding algorithms. In multispectral images, the same spatial region is captured multiple times using different imaging modalities. These modalities often consist of measurements at different optical wavelengths (hence the name multispectral), but the same term is sometimes used when the separate image planes are captured from completely different imaging systems. Medical multispectral images, for example, may combine MRI, CT, and X-ray images into a single multilayer data set [10]. Multispectral images are three-dimensional data sets in which the third (spectral) dimension is qualitatively different from the other two. Because of this, a straightforward extension of two-dimensional image compression algorithms is generally not appropriate. Also, unlike most two-dimensional images, multispectral data sets are often not meant to be viewed by humans. Remotely sensed multispectral images, for e...

