Results 11 - 20
of
458
Min-wise Independent Permutations
- Journal of Computer and System Sciences
, 1998
"... We define and study the notion of min-wise independent families of permutations. We say that F ⊆ Sn is min-wise independent if for any set X ⊆ [n] and any x ∈ X, when π is chosen at random in F we have Pr(min{π(X)} = π(x)) = 1 |X |. In other words we require that all the elements of any fixed set ..."
Abstract
-
Cited by 151 (10 self)
- Add to MetaCart
We define and study the notion of min-wise independent families of permutations. We say that F ⊆ Sn is min-wise independent if for any set X ⊆ [n] and any x ∈ X, when π is chosen at random in F we have Pr(min{π(X)} = π(x)) = 1 |X |. In other words we require that all the elements of any fixed set X have an equal chance to become the minimum element of the image of X under π. Our research was motivated by the fact that such a family (under some relaxations) is essential to the algorithm used in practice by the AltaVista web index software to detect and filter near-duplicate documents. However, in the course of our investigation we have discovered interesting and challenging theoretical questions related to this concept – we present the solutions to some of them and we list the rest as open problems.
What’s hot and what’s not: Tracking most frequent items dynamically
- In Proceedings of ACM Principles of Database Systems
, 2003
"... Most database management systems maintain statistics on the underlying relation. One of the important statistics is that of the “hot items ” in the relation: those that appear many times (most frequently, or more than some threshold). For example, end-biased histograms keep the hot items as part of ..."
Abstract
-
Cited by 139 (12 self)
- Add to MetaCart
Most database management systems maintain statistics on the underlying relation. One of the important statistics is that of the “hot items ” in the relation: those that appear many times (most frequently, or more than some threshold). For example, end-biased histograms keep the hot items as part of the histogram and are used in selectivity estimation. Hot items are used as simple outliers in data mining, and in anomaly detection in many applications. We present new methods for dynamically determining the hot items at any time in a relation which is undergoing deletion operations as well as inserts. Our methods maintain small space data structures that monitor the transactions on the relation, and when required, quickly output all hot items, without rescanning the relation in the database. With user-specified probability, all hot items are correctly reported. Our methods rely on ideas from “group testing”. They are simple to implement, and have provable quality, space and time guarantees. Previously known algorithms for this problem that make similar quality and performance guarantees can not handle deletions, and those that handle deletions can not make similar guarantees without rescanning the database. Our experiments with real and synthetic data show that our algorithms are accurate in dynamically tracking the hot items independent of the rate of insertions and deletions.
Single-Packet IP Traceback
, 2002
"... The design of the IP protocol makes it difficult to reliably identify the originator of an IP packet. Even in the absence of any deliberate attempt to disguise a packet's origin, wide-spread packet forwarding techniques such as NAT and encapsulation may obscure the packet's true source. Techniques h ..."
Abstract
-
Cited by 133 (4 self)
- Add to MetaCart
The design of the IP protocol makes it difficult to reliably identify the originator of an IP packet. Even in the absence of any deliberate attempt to disguise a packet's origin, wide-spread packet forwarding techniques such as NAT and encapsulation may obscure the packet's true source. Techniques have been developed to determine the source of large packet flows, but, to date, no system has been presented to track individual packets in an efficient, scalable fashion. We present a hash-based technique for IP traceback that generates audit trails for traffic within the network, and can trace the origin of a single IP packet delivered by the network in the recent past. We demonstrate that the system is effective, space-efficient (requiring approximately 0.5% of the link capacity per unit time in storage) , and implementable in current or next-generation routing hardware. We present both analytic and simulation results showing the system's effectiveness.
Dynamic Perfect Hashing: Upper and Lower Bounds
, 1990
"... The dynamic dictionary problem is considered: provide an algorithm for storing a dynamic set, allowing the operations insert, delete, and lookup. A dynamic perfect hashing strategy is given: a randomized algorithm for the dynamic dictionary problem that takes O(1) worst-case time for lookups and ..."
Abstract
-
Cited by 123 (13 self)
- Add to MetaCart
The dynamic dictionary problem is considered: provide an algorithm for storing a dynamic set, allowing the operations insert, delete, and lookup. A dynamic perfect hashing strategy is given: a randomized algorithm for the dynamic dictionary problem that takes O(1) worst-case time for lookups and O(1) amortized expected time for insertions and deletions; it uses space proportional to the size of the set stored. Furthermore, lower bounds for the time complexity of a class of deterministic algorithms for the dictionary problem are proved. This class encompasses realistic hashing-based schemes that use linear space. Such algorithms have amortized worst-case time complexity \Omega(log n) for a sequence of n insertions and
On Hiding Information from an Oracle
, 1989
"... : We consider the problem of computing with encrypted data. Player A wishes to know the value f(x) for some x but lacks the power to compute it. Player B has the power to compute f and is willing to send f(y) to A if she sends him y, for any y. Informally, an encryption scheme for the problem f is a ..."
Abstract
-
Cited by 119 (15 self)
- Add to MetaCart
: We consider the problem of computing with encrypted data. Player A wishes to know the value f(x) for some x but lacks the power to compute it. Player B has the power to compute f and is willing to send f(y) to A if she sends him y, for any y. Informally, an encryption scheme for the problem f is a method by which A, using her inferior resources, can transform the cleartext instance x into an encrypted instance y, obtain f(y) from B, and infer f(x) from f(y) in such a way that B cannot infer x from y. When such an encryption scheme exists, we say that f is encryptable. The framework defined in this paper enables us to prove precise statements about what an encrypted instance hides and what it leaks, in an information-theoretic sense. Our definitions are cast in the language of probability theory and do not involve assumptions such as the intractability of factoring or the existence of one-way functions. We use our framework to describe encryption schemes for some well-known function...
Efficient generation of shared RSA keys
- Advances in Cryptology -- CRYPTO 97
, 1997
"... We describe efficient techniques for a number of parties to jointly generate an RSA key. At the end of the protocol an RSA modulus N = pq is publicly known. None of the parties know the factorization of N. In addition a public encryption exponent is publicly known and each party holds a share of the ..."
Abstract
-
Cited by 112 (4 self)
- Add to MetaCart
We describe efficient techniques for a number of parties to jointly generate an RSA key. At the end of the protocol an RSA modulus N = pq is publicly known. None of the parties know the factorization of N. In addition a public encryption exponent is publicly known and each party holds a share of the private exponent that enables threshold decryption. Our protocols are efficient in computation and communication. All results are presented in the honest but curious settings (passive adversary).
Counting Distinct Elements in a Data Stream
, 2002
"... We present three algorithms to count the number of distinct elements in a data stream to within a factor of 1 ± epsilon. Our algorithms improve upon known algorithms for this problem, and offer a spectrum of time/space tradeoffs. ..."
Abstract
-
Cited by 111 (4 self)
- Add to MetaCart
We present three algorithms to count the number of distinct elements in a data stream to within a factor of 1 ± epsilon. Our algorithms improve upon known algorithms for this problem, and offer a spectrum of time/space tradeoffs.
Universal Hash Proofs and a Paradigm for Adaptive Chosen Ciphertext Secure Public-Key Encryption
, 2001
"... We present several new and fairly practical public-key encryption schemes and prove them secure against adaptive chosen ciphertext attack. One scheme is based on Paillier's Decision Composite Residuosity (DCR) assumption [7], while another is based in the classical Quadratic Residuosity (QR) assu ..."
Abstract
-
Cited by 105 (7 self)
- Add to MetaCart
We present several new and fairly practical public-key encryption schemes and prove them secure against adaptive chosen ciphertext attack. One scheme is based on Paillier's Decision Composite Residuosity (DCR) assumption [7], while another is based in the classical Quadratic Residuosity (QR) assumption. The analysis is in the standard cryptographic model, i.e., the security of our schemes does not rely on the Random Oracle model. We also introduce the notion of a universal hash proof system. Essentially, this is a special kind of non-interactive zero-knowledge proof system for an NP language. We do not show that universal hash proof systems exist for all NP languages, but we do show how to construct very ecient universal hash proof systems for a general class of group-theoretic language membership problems. Given an ecient universal hash proof system for a language with certain natural cryptographic indistinguishability properties, we show how to construct an ecient public-key encryption schemes secure against adaptive chosen ciphertext attack in the standard model. Our construction only uses the universal hash proof system as a primitive: no other primitives are required, although even more ecient encryption schemes can be obtained by using hash functions with appropriate collision-resistance properties. We show how to construct ecient universal hash proof systems for languages related to the DCR and QR assumptions. From these we get corresponding public-key encryption schemes that are secure under these assumptions. We also show that the Cramer-Shoup encryption scheme (which up until now was the only practical encryption scheme that could be proved secure against adaptive chosen ciphertext attack under a reasonable assumption, namely, the Decision...
Building a Better NetFlow
, 2004
"... Network operators need to determine the composition of the traffic mix on links when looking for dominant applications, users, or estimating traffic matrices. Cisco's NetFlow has evolved into a solution that satisfies this need by reporting flow records that summarize a sample of the traffic travers ..."
Abstract
-
Cited by 102 (5 self)
- Add to MetaCart
Network operators need to determine the composition of the traffic mix on links when looking for dominant applications, users, or estimating traffic matrices. Cisco's NetFlow has evolved into a solution that satisfies this need by reporting flow records that summarize a sample of the traffic traversing the link. But sampled NetFlow has shortcomings that hinder the collection and analysis of traffic data. First, during flooding attacks router memory and network bandwidth consumed by flow records can increase beyond what is available; second, selecting the right static sampling rate is difficult because no single rate gives the right tradeoff of memory use versus accuracy for all traffic mixes; third, the heuristics routers use to decide when a flow is reported are a poor match to most applications that work with time bins; finally, it is impossible to estimate without bias the number of active flows for aggregates with non-TCP traffic. In thi paper we propose...
Visual Cryptography
, 1995
"... In this paper we consider a new type of cryptographic scheme, which can decode concealed images without any cryptographic computations. The scheme is perfectly secure and very easy to implement. We extend it into a visual variant of the k out of n secret sharing problem, in which a dealer provides a ..."
Abstract
-
Cited by 98 (4 self)
- Add to MetaCart
In this paper we consider a new type of cryptographic scheme, which can decode concealed images without any cryptographic computations. The scheme is perfectly secure and very easy to implement. We extend it into a visual variant of the k out of n secret sharing problem, in which a dealer provides a transparency to each one of the n users; any k of them can see the image by stacking their transparencies, but any k - 1 of them gain no information about it.

