Results 1–10 of 10
On-line Data Compression in a Log-Structured File System
, 1992
Abstract

Cited by 42 (1 self)
We have incorporated on-line data compression into the low levels of a log-structured file system (Rosenblum's Sprite LFS). Each block of data or metadata is compressed as it is written to the disk and decompressed as it is read. The log-structuring overcomes the problems of allocation and fragmentation for variable-sized blocks. We observe compression factors ranging from 1.6 to 2.2, using algorithms running from 1.7 to 0.4 MBytes per second in software on a DECstation 5000/200. System performance is degraded by a few percent for normal activities (such as compiling or editing), and as much as a factor of 1.6 for file-system-intensive operations (such as copying multi-megabyte files). Hardware ...
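The scheme described above can be illustrated with a minimal sketch: compress each fixed-size block on write, append the variable-sized result to a log, and decompress on read. The `Log` class, `zlib`, and the 4 KB block size are illustrative assumptions, not details of Sprite LFS.

```python
import zlib

BLOCK = 4096  # assumed block size; Sprite LFS used its own sizes

class Log:
    """Toy append-only log that compresses every block it stores."""

    def __init__(self):
        self.segments = []       # per-block (offset, compressed length)
        self.data = bytearray()  # the append-only log itself

    def append(self, block: bytes) -> int:
        comp = zlib.compress(block)
        # log-structuring: always append, so variable-sized records
        # cause no allocation or fragmentation problems
        self.segments.append((len(self.data), len(comp)))
        self.data += comp
        return len(self.segments) - 1

    def read(self, block_no: int) -> bytes:
        off, length = self.segments[block_no]
        return zlib.decompress(bytes(self.data[off : off + length]))

log = Log()
n = log.append(b"metadata " * 400)   # a highly redundant block
assert log.read(n) == b"metadata " * 400
```

Because every record is only appended, the index (`segments` here; a map in a real file system) is all that is needed to locate a block despite its compressed size varying.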
MIRAGE+: A Kernel Implementation of Distributed Shared Memory on a Network of Personal Computers
 Software Practice & Experience
, 1994
Abstract

Cited by 19 (9 self)
This paper addresses the architectural dependencies in the design of the system and evaluates performance of the implementation. The new version, MIRAGE+, performs well compared to Mirage even though eight times the amount of data is sent on each page fault because of the larger page size used in the implementation. We show that performance of systems with a large page size to network packet size ratio can be dramatically improved on conventional hardware by applying three well-known techniques: packet blasting, compression, and running at interrupt level ...
Longest-Match String Searching for Ziv-Lempel Compression
, 1993
Abstract

Cited by 14 (2 self)
This paper presents eight data structures that can be used to accelerate the searching, including adaptations of four methods normally used for exact-match searching. The algorithms are evaluated analytically and empirically, indicating the trade-offs available between compression speed and memory consumption. Two of the algorithms are well-known methods of finding the longest match: the time-consuming linear search, and the storage-intensive trie (digital search tree). The trie is adapted along the lines of a PATRICIA tree to operate economically. Hashing, binary search trees, splay trees and the Boyer-Moore searching algorithm are traditionally used to search for exact matches, but we show how these can be adapted to find longest matches. In addition, two data structures specifically designed for the application are presented.
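As a concrete illustration of the hashing adaptation the abstract mentions, here is a minimal sketch of longest-match search using a hash table of position chains, as in an LZ77-style compressor. The 3-byte minimum match and the function names are assumptions for the sketch, not the paper's exact data structures.

```python
from collections import defaultdict

MIN_MATCH = 3  # index every 3-byte prefix (assumed parameter)

def index_prefixes(data: bytes):
    """Hash table mapping each 3-byte key to its positions, in order."""
    chains = defaultdict(list)
    for i in range(len(data) - MIN_MATCH + 1):
        chains[data[i : i + MIN_MATCH]].append(i)
    return chains

def longest_match(data: bytes, pos: int, chains):
    """Return (offset, length) of the longest match ending before pos."""
    best_off, best_len = 0, 0
    key = data[pos : pos + MIN_MATCH]
    for cand in chains.get(key, ()):
        if cand >= pos:          # chains are sorted; only look backwards
            break
        length = 0               # extend the candidate match byte by byte
        while (pos + length < len(data)
               and data[cand + length] == data[pos + length]):
            length += 1
        if length > best_len:
            best_off, best_len = pos - cand, length
    return best_off, best_len

data = b"abracadabra abracadabra"
chains = index_prefixes(data)
```

The hash lookup prunes the search to positions sharing a short prefix, trading the memory for the chains against the linear scan of the whole window; that is exactly the speed/memory trade-off the paper quantifies across its eight structures.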
Fast Higher Bandwidth X
 ACM Multimedia 94, Second ACM International Conference on Multimedia
, 1995
Abstract

Cited by 10 (1 self)
This paper proposes an X Window System protocol compression scheme called Fast Higher Bandwidth X (FHBX). Previous X protocol compression schemes were either much slower (Higher Bandwidth X) or much less effective (Xremote). By using an application-specific predictive hashing technique, FHBX is able to deliver three times the compression performance of Xremote, while running ten times as fast as HBX. The family of structured compression techniques illustrated by FHBX is applicable to other structured protocols, and should enable a host of interactive applications on low-bandwidth wireless devices and telephone links.
1 Introduction This research is targeted at the vast numbers of networks which are, and which will continue to be, too slow. These networks include normal telephone connections, wireless connections, Internet connections, and ISDN connections. If the current rate of increase in CPU performance holds, unloaded 10 Mbit/sec Ethernet connections will be candidates for softwar...
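The general idea behind predictive hashing can be sketched as follows: hash the last few bytes of context into a prediction table, and when the table's prediction hits, send a one-bit flag instead of the literal byte. This is an illustrative sketch of the technique family, not the actual FHBX algorithm; the context length `K` and all names are assumptions.

```python
K = 3  # context length in bytes; chosen arbitrarily for the sketch

def encode(stream: bytes):
    """Turn a byte stream into (hit?, literal-or-None) tokens."""
    table = {}                   # context -> predicted next byte
    out = []
    for i, b in enumerate(stream):
        ctx = stream[max(0, i - K):i]
        if table.get(ctx) == b:
            out.append((True, None))   # prediction hit: 1 flag bit
        else:
            out.append((False, b))     # miss: flag bit + literal byte
        table[ctx] = b                 # learn for next time
    return out

def decode(tokens):
    """Rebuild the stream by maintaining the same prediction table."""
    table = {}
    out = bytearray()
    for hit, lit in tokens:
        ctx = bytes(out[max(0, len(out) - K):])
        b = table[ctx] if hit else lit
        out.append(b)
        table[ctx] = b
    return bytes(out)
```

Because the decoder rebuilds the identical table from the bytes it has already emitted, no table state needs to be transmitted; structured, repetitive protocol traffic yields mostly hit flags, which is where the compression comes from.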
Moving Distributed Shared Memory to the Personal Computer: The MIRAGE+ Experience
 Department of Computer Science, University of California
, 1993
Abstract

Cited by 3 (2 self)
This paper describes the evolution of a distributed shared memory (DSM) system, Mirage, from its original implementation on VAX computers to its current implementation on modern high-end personal computers. Mirage provides a form of shared memory that is network transparent in a loosely coupled environment. The system hides network boundaries for processes that are accessing shared memory and is upward compatible with the System V UNIX interface. This paper addresses the architectural dependencies in the design of the system and evaluates performance of the implementation. Mirage+ performance is similar to Mirage, but the communication bottleneck has become more severe because of the larger page size used in the implementation. We show how this problem can be resolved on conventional hardware at little additional expense by using compression techniques. (UNIX is a registered trademark of AT&T.)
Fast Pattern Matching for Entropy Bounded Text
 in Proceedings of DCC'95 Data Compression Conference, Snowbird
, 1995
Abstract

Cited by 1 (0 self)
We present the first known case of one-dimensional and two-dimensional string matching algorithms for text with bounded entropy. Let n be the length of the text and m be the length of the pattern. We show that the expected complexity of the algorithms is related to the entropy of the text for various assumptions of the distribution of the pattern. For the case of uniformly distributed patterns, our one-dimensional matching algorithm works in O(n log m / (pm)) expected running time, where H is the entropy of the text and p = 1 − (1 − H²)^(H/(1+H)). The worst-case running time T can also be bounded by n log m / (p(m + √V)) ≤ T ≤ n log m / (p(m − √V)), where V is the variance of the source from which the pattern is generated. Our algorithm utilizes data structures and probabilistic analysis techniques that are found in certain lossless data compression schemes.
1 Introduction 1.1 Pattern matching problem Given a text of length n and a pattern of length m, the pattern match...
Online Data Compression
, 1992
Abstract
We have incorporated on-line data compression into the low levels of a log-structured file system (Rosenblum's Sprite LFS). Each block of data or metadata is compressed as it is written to the disk and decompressed as it is read. The log-structuring overcomes the problems of allocation and fragmentation for variable-sized blocks. We observe compression factors ranging from 1.6 to 2.2, using algorithms running from 1.7 to 0.4 MBytes per second in software on a DECstation 5000/200. System performance is degraded by a few percent for normal activities (such as compiling or editing), and as much as a factor of 1.6 for file-system-intensive operations (such as copying multi-megabyte files).
Using Learning and Difficulty of Prediction to Decrease Computation: A Fast Sort and Priority Queue on Entropy Bounded Inputs
Abstract
There has been an upsurge of interest in the Markov model, and in more general stationary ergodic stochastic distributions, in the theoretical computer science community recently (e.g., see [Vitter, Krishnan, FOCS91], [Karlin, Philips, Raghavan, FOCS92], [Raghavan92]) for the use of Markov models in on-line algorithms (e.g., caching and prefetching). Their results used the fact that compressible sources are predictable (and vice versa), and show that on-line algorithms can improve their performance by prediction. Actual page access sequences are in fact somewhat compressible, so their predictive methods can be of benefit. This paper investigates the interesting idea of decreasing computation by using learning in the opposite way, namely to determine the difficulty of prediction. That is, we will approximately learn the input distribution, and then improve the performance of the computation when the input is not too predictable, rather than the reverse. To our knowledge, this is the first case of a computational problem where we do not assume any particular fixed input distribution and yet computation is decreased when the input is less predictable, rather than the reverse. We concentrate our investigation on a basic computational problem, sorting, and a basic data structure problem, maintaining a priority queue. We present the first known case of sorting and priority queue algorithms whose complexity depends on the binary entropy H ≤ 1 of the input keys, where we assume that the input keys are generated from an unknown but arbitrary stationary ergodic source. That is, we assume that each input key can be arbitrarily long, but has entropy H. Note that H
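The high-level idea, "less predictable input means less work", can be sketched very roughly: estimate how random the keys look from a cheap empirical entropy measure, then choose a distribution sort when they are spread out and a comparison sort otherwise. This is an illustrative sketch of the general strategy under assumed parameters (the 4-bit threshold, leading-byte sampling), not the paper's algorithm.

```python
import math

def byte_entropy(keys) -> float:
    """Empirical entropy (bits) of the keys' low-order bytes."""
    counts = {}
    for k in keys:
        counts[k & 0xFF] = counts.get(k & 0xFF, 0) + 1
    n = len(keys)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def entropy_aware_sort(keys):
    if byte_entropy(keys) > 4.0:   # threshold is an assumption
        # High entropy: keys fill their range, so bucketing by value
        # spreads them evenly and runs in near-linear expected time.
        lo, hi = min(keys), max(keys)
        span = max(1, hi - lo + 1)
        buckets = [[] for _ in range(len(keys))]
        for k in keys:
            buckets[(k - lo) * len(keys) // span].append(k)
        return [k for b in buckets for k in sorted(b)]
    # Low entropy: value distribution is skewed and buckets would
    # collapse, so fall back to an O(n log n) comparison sort.
    return sorted(keys)
```

The contrast with the prefetching work cited above is visible in the branch: here the cheap path is taken when the input is *hard* to predict, not when it is easy.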
Density-Based Spam Detector (Industry/Government Track Paper)
Abstract
The volume of mass unsolicited electronic mail, often known as spam, has recently increased enormously and has become a serious threat not only to the Internet but also to society. This paper proposes a new spam detection method which uses document space density information. Although it requires extensive email traffic to acquire the necessary information, an unsupervised learning engine with a short white list can achieve a 98% recall rate and 100% precision. A direct-mapped cache method enables the handling of over 13,000 emails per second. Experimental results, obtained from over 50 million actual emails of traffic, are also reported in this paper.
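The density idea can be sketched as: bulk mail lands in dense regions of document space, so a message whose signature has been seen many times recently is likely spam. The signature function, slot count, and threshold below are assumptions for illustration, not the paper's method; only the direct-mapped cache (one slot per hash, eviction on conflict) mirrors the technique named in the abstract.

```python
import hashlib

CACHE_SLOTS = 1 << 16
DENSITY_THRESHOLD = 10          # assumed; would be tuned on real traffic

# direct-mapped cache: each slot holds one (signature, count) pair
cache = [(None, 0)] * CACHE_SLOTS

def signature(body: str) -> bytes:
    """Crude document-space signature: hash of the sorted word set."""
    words = sorted(set(body.lower().split()))
    return hashlib.sha1(" ".join(words).encode()).digest()

def is_spam(body: str) -> bool:
    sig = signature(body)
    slot = int.from_bytes(sig[:2], "big") % CACHE_SLOTS
    old_sig, count = cache[slot]
    # direct mapping: a conflicting signature simply evicts the old one,
    # so lookup and update are O(1) per message
    count = count + 1 if old_sig == sig else 1
    cache[slot] = (sig, count)
    return count >= DENSITY_THRESHOLD
```

Constant-time, fixed-memory per-message work is what makes throughput figures like the paper's 13,000 emails per second plausible; the cost is that a conflict eviction can reset the count for a genuinely dense message.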
Robert W. Taylor, Director
On-line Data Compression
, 1992
Abstract
DEC’s business and technology objectives require a strong research program. The Systems Research Center (SRC) and three other research laboratories are committed to filling that need. SRC began recruiting its first research scientists in 1984—their charter, to advance the state of knowledge in all aspects of computer systems research. Our current work includes exploring high-performance personal computing, distributed computing, programming environments, system modelling techniques, specification technology, and tightly-coupled multiprocessors. Our approach to both hardware and software research is to create and use real systems so that we can investigate their properties fully. Complex systems cannot be evaluated solely in the abstract. Based on this belief, our strategy is to demonstrate the technical and practical feasibility of our ideas by building prototypes and using them as daily tools. The experience we gain is useful in the short term in enabling us to refine our designs, and invaluable in the long term in helping us to advance the state of knowledge about those systems. Most of the major advances ...