Results 1–10 of 41
Software Defined Traffic Measurement with OpenSketch
"... Most network management tasks in softwaredefined networks (SDN) involve two stages: measurement and control. While many efforts have been focused on network control APIs for SDN, little attention goes into measurement. The key challenge of designing a new measurement API is to strike a careful bala ..."
Abstract

Cited by 29 (3 self)
Most network management tasks in software-defined networks (SDN) involve two stages: measurement and control. While many efforts have focused on network control APIs for SDN, little attention goes to measurement. The key challenge in designing a new measurement API is to strike a careful balance between generality (supporting a wide variety of measurement tasks) and efficiency (enabling high link speed and low cost). We propose a software-defined traffic measurement architecture, OpenSketch, which separates the measurement data plane from the control plane. In the data plane, OpenSketch provides a simple three-stage pipeline (hashing, filtering, and counting), which can be implemented with commodity switch components and supports many measurement tasks. In the control plane, OpenSketch provides a measurement library that automatically configures the pipeline and allocates resources for different measurement tasks. Our evaluations on real-world packet traces, our prototype on NetFPGA, and the implementation of five measurement tasks on top of OpenSketch demonstrate that OpenSketch is general, efficient, and easily programmable.
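The hash/filter/count pipeline the abstract describes can be made concrete with a small count-min-style sketch. The following is an illustrative toy, not the actual OpenSketch data plane or API; the class and parameter names (`ThreeStagePipeline`, `rows`, `cols`, `flow_filter`) are invented for this example:

```python
import hashlib

class ThreeStagePipeline:
    """Toy hash/filter/count measurement pipeline (illustrative only;
    not the real OpenSketch data plane)."""

    def __init__(self, rows=3, cols=64, flow_filter=None):
        self.rows, self.cols = rows, cols
        self.counters = [[0] * cols for _ in range(rows)]
        # Stage 2: an optional predicate selecting which flows to measure.
        self.flow_filter = flow_filter or (lambda key: True)

    def _hash(self, key, row):
        # Stage 1: per-row hashing of the flow key into a counter index.
        h = hashlib.sha256(f"{row}:{key}".encode()).hexdigest()
        return int(h, 16) % self.cols

    def update(self, key, count=1):
        if not self.flow_filter(key):          # Stage 2: filtering
            return
        for r in range(self.rows):             # Stage 3: counting
            self.counters[r][self._hash(key, r)] += count

    def estimate(self, key):
        # Count-min-style readout: the minimum over rows upper-bounds
        # the true count of the key.
        return min(self.counters[r][self._hash(key, r)]
                   for r in range(self.rows))
```

For example, `ThreeStagePipeline(flow_filter=lambda k: k.startswith("10."))` would count only traffic from a chosen prefix while ignoring everything else, mirroring how the filtering stage narrows the measurement to the flows of interest.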
Deamortized Cuckoo Hashing: Provable Worst-Case Performance and Experimental Results
"... Cuckoo hashing is a highly practical dynamic dictionary: it provides amortized constant insertion time, worst case constant deletion time and lookup time, and good memory utilization. However, with a noticeable probability during the insertion of n elements some insertion requires Ω(log n) time. Whe ..."
Abstract

Cited by 20 (3 self)
Cuckoo hashing is a highly practical dynamic dictionary: it provides amortized constant insertion time, worst-case constant deletion and lookup time, and good memory utilization. However, with noticeable probability, during the insertion of n elements some insertion requires Ω(log n) time. Whereas such an amortized guarantee may be suitable for some applications, in others (such as high-performance routing) it is highly undesirable. Kirsch and Mitzenmacher (Allerton ’07) proposed a deamortization of cuckoo hashing using queueing techniques that preserve its attractive properties. They demonstrated a significant improvement to the worst-case performance of cuckoo hashing via experimental results, but left open the problem of constructing a scheme with provable properties. In this work we present a deamortization of cuckoo hashing that provably guarantees constant worst-case operations. Specifically, for any sequence of polynomially many operations, with overwhelming probability over the randomness of the initialization phase, each operation is performed in constant time. In addition, we present a general approach for proving that the performance guarantees are preserved when using hash functions with limited independence.
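The queueing idea behind deamortization can be sketched in a few lines: each operation performs only a constant number of cuckoo displacements, and any key whose placement is unfinished waits in a queue (and remains visible to lookups). This is only an illustration of that idea under assumed parameters; the provable scheme in the paper additionally relies on a carefully sized queue and limited-independence hash functions:

```python
import random

class DeamortizedCuckoo:
    """Toy queue-based cuckoo table: each operation performs at most
    MOVES_PER_OP displacements; unfinished insertions wait in a queue.
    Illustrative only -- not the paper's exact construction."""

    MOVES_PER_OP = 4

    def __init__(self, size=1024, seed=0):
        self.size = size
        self.tables = [[None] * size, [None] * size]
        rng = random.Random(seed)
        self.salts = [rng.getrandbits(32), rng.getrandbits(32)]
        self.queue = []  # keys whose final placement is still pending

    def _pos(self, key, i):
        return hash((self.salts[i], key)) % self.size

    def lookup(self, key):
        # A key is present if it sits in either table or in the queue.
        return (key in self.queue
                or self.tables[0][self._pos(key, 0)] == key
                or self.tables[1][self._pos(key, 1)] == key)

    def insert(self, key):
        self.queue.append(key)
        self._do_bounded_work()

    def _do_bounded_work(self):
        # Constant work per operation: at most MOVES_PER_OP displacements.
        moves = 0
        while self.queue and moves < self.MOVES_PER_OP:
            key = self.queue.pop(0)
            i = 0
            while moves < self.MOVES_PER_OP:
                p = self._pos(key, i)
                key, self.tables[i][p] = self.tables[i][p], key
                moves += 1
                if key is None:
                    break  # displacement chain ended in an empty slot
                i = 1 - i
            else:
                # Budget exhausted with a displaced key in hand: requeue it.
                self.queue.insert(0, key)
```

The invariant worth noticing is that every inserted key is always either in one of the two tables or in the queue, so worst-case constant-time lookups survive the deamortization (the paper bounds the queue length so that the queue check is also constant time).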
Maximum matchings in random bipartite graphs and the space utilization of cuckoo hash tables
, 2009
"... We study the the following question in Random Graphs. We are given two disjoint sets L, R with L  = n = αm and R  = m. We construct a random graph G by allowing each x ∈ L to choose d random neighbours in R. The question discussed is as to the size µ(G) of the largest matching in G. When consi ..."
Abstract

Cited by 17 (0 self)
We study the following question in random graphs. We are given two disjoint sets L, R with |L| = n = αm and |R| = m. We construct a random graph G by allowing each x ∈ L to choose d random neighbours in R. The question discussed is the size μ(G) of the largest matching in G. When considered in the context of cuckoo hashing, one key question is: when is μ(G) = n w.h.p.? We answer this question exactly when d is at least three. We also establish a precise threshold for when Phase 1 of the Karp–Sipser greedy matching algorithm suffices to compute a maximum matching w.h.p.
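The random graph model and the quantity μ(G) are easy to experiment with directly. The sketch below (function names are mine, not the paper's) builds the left-d-regular random bipartite graph and computes a maximum matching with Kuhn's augmenting-path algorithm, so one can empirically probe for which α the matching is perfect:

```python
import random

def random_left_regular(n, m, d, seed=0):
    """Each of the n left vertices picks d distinct random neighbours in [m],
    matching the model in the abstract (n = alpha * m)."""
    rng = random.Random(seed)
    return [rng.sample(range(m), d) for _ in range(n)]

def max_matching(adj, m):
    """Kuhn's augmenting-path algorithm; returns mu(G), the size of a
    maximum matching of the left side into the right side."""
    match_r = [-1] * m  # match_r[v] = left vertex matched to right vertex v

    def try_augment(u, seen):
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                # v is free, or its partner can be rematched elsewhere.
                if match_r[v] == -1 or try_augment(match_r[v], seen):
                    match_r[v] = u
                    return True
        return False

    return sum(try_augment(u, set()) for u in range(len(adj)))
```

Running `max_matching(random_left_regular(n, m, 3), m)` over many seeds for varying α = n/m gives an empirical view of the d = 3 threshold the paper pins down exactly; in cuckoo-hashing terms, μ(G) = n is precisely the event that all n keys can be placed.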
History-Independent Cuckoo Hashing
"... Cuckoo hashing is an efficient and practical dynamic dictionary. It provides expected amortized constant update time, worst case constant lookup time, and good memory utilization. Various experiments demonstrated that cuckoo hashing is highly suitable for modern computer architectures and distribute ..."
Abstract

Cited by 15 (4 self)
Cuckoo hashing is an efficient and practical dynamic dictionary. It provides expected amortized constant update time, worst-case constant lookup time, and good memory utilization. Various experiments have demonstrated that cuckoo hashing is highly suitable for modern computer architectures and distributed settings, and offers significant improvements compared to other schemes. In this work we construct a practical history-independent dynamic dictionary based on cuckoo hashing. In a history-independent data structure, the memory representation at any point in time yields no information on the specific sequence of insertions and deletions that led to its current content, other than the content itself. Such a property is significant when preventing unintended leakage of information, and has also been found useful in several algorithmic settings. Our construction enjoys most of the attractive properties of cuckoo hashing. In particular, no dynamic memory allocation is required, updates are performed in expected amortized constant time, and membership queries are performed in worst-case constant time. Moreover, with high probability, the lookup procedure queries only two memory entries, which are independent and can be queried in parallel. The approach underlying our construction is to enforce a canonical memory representation on cuckoo hashing. That is, up to the initial randomness, each set of elements has a unique memory representation.
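The canonical-representation idea is worth seeing in isolation: if the layout is a fixed deterministic function of the stored *set*, then two tables holding the same content are byte-for-byte identical regardless of the insertion/deletion history. The toy below demonstrates just that property, using a naive rebuild with linear probing rather than the paper's efficient incremental cuckoo scheme; all names are illustrative:

```python
class CanonicalTableToy:
    """Toy history-independent dictionary: after every update the layout is
    rebuilt as a fixed deterministic function of the current set of keys,
    so identical contents always yield identical memory images.  (The paper
    achieves this incrementally within cuckoo hashing; the full rebuild and
    the use of linear probing here are purely for illustration.)"""

    def __init__(self, size=101):
        self.size = size
        self.keys = set()
        self.table = [None] * size

    def _rebuild(self):
        self.table = [None] * self.size
        # Canonical rule: place keys in sorted order by linear probing.
        for key in sorted(self.keys):
            i = hash(key) % self.size  # stand-in hash function
            while self.table[i] is not None:
                i = (i + 1) % self.size
            self.table[i] = key

    def insert(self, key):
        self.keys.add(key)
        self._rebuild()

    def delete(self, key):
        self.keys.discard(key)
        self._rebuild()

    def lookup(self, key):
        # Standard linear-probing search: scan from the home slot until
        # the key or an empty slot is found.
        i = hash(key) % self.size
        while self.table[i] is not None:
            if self.table[i] == key:
                return True
            i = (i + 1) % self.size
        return False
```

The point of the demonstration is the equality of memory images, not efficiency: the paper's contribution is obtaining this canonical-layout guarantee while keeping cuckoo hashing's expected amortized constant-time updates instead of paying for a rebuild.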
Hash-Based Techniques for High-Speed Packet Processing
"... Abstract Hashing is an extremely useful technique for a variety of highspeed packetprocessing applications in routers. In this chapter, we survey much of the recent work in this area, paying particular attention to the interaction between theoretical and applied research. We assume very little bac ..."
Abstract

Cited by 15 (2 self)
Hashing is an extremely useful technique for a variety of high-speed packet-processing applications in routers. In this chapter, we survey much of the recent work in this area, paying particular attention to the interaction between theoretical and applied research. We assume very little background in either the theory or applications of hashing, reviewing the fundamentals as necessary.
Measuring independence of datasets
 CoRR
"... Approximating pairwise, or kwise, independence with sublinear memory is of considerable importance in the data stream model. In the streaming model the joint distribution is given by a stream of ktuples, with the goal of testing correlations among the components measured over the entire stream. In ..."
Abstract

Cited by 13 (2 self)
Approximating pairwise, or k-wise, independence with sublinear memory is of considerable importance in the data stream model. In the streaming model the joint distribution is given by a stream of k-tuples, with the goal of testing correlations among the components measured over the entire stream. Indyk and McGregor (SODA ’08) recently gave exciting new results for measuring pairwise independence in this model. Statistical distance is one of the most fundamental metrics for measuring the similarity of two distributions, and it has been a metric of choice in many papers that discuss distribution closeness. For pairwise independence, the Indyk and McGregor methods provide a log n approximation under statistical distance between the joint and product distributions in the streaming model. Indyk and McGregor leave, as their main open question, the problem of improving their log n approximation for the statistical distance metric. In this paper we solve the main open problem posed by Indyk and McGregor for the statistical distance for pairwise independence and extend this result to any constant k. In particular, we present an algorithm that computes an (ε, δ)-approximation of the statistical distance between the joint and product distributions defined by a stream of m k-tuples. Our algorithm requires O((1/ε² · log(nm/δ))^((30+k)^k)) memory and a single pass over the data stream.
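To make the target quantity concrete, here is the exact (non-streaming) computation of the statistical distance between the empirical joint distribution of a stream of pairs and the product of its marginals; the paper's contribution is approximating this with sublinear memory, which this brute-force version does not attempt (the function name is mine):

```python
from collections import Counter

def statistical_distance_from_pairs(pairs):
    """Exact statistical (total variation) distance between the empirical
    joint distribution of a stream of pairs and the product of its
    marginals.  Stores the full distributions, so it is a reference point
    for the streaming approximation, not a streaming algorithm."""
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(x for x, _ in pairs)
    right = Counter(y for _, y in pairs)
    # Sum over the full product support; joint[p] is 0 for unseen pairs.
    support = {(x, y) for x in left for y in right}
    return 0.5 * sum(abs(joint[p] / n - left[p[0]] * right[p[1]] / n**2)
                     for p in support)
```

A perfectly independent stream gives distance 0, while a perfectly correlated one (e.g., pairs always equal on both coordinates over two symbols) gives 0.5; the streaming algorithm in the paper estimates this value to within (ε, δ) in a single pass.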
Invertible Bloom Lookup Tables
 In Proceedings of the 49th Allerton Conference
, 2011
"... We present a version of the Bloom filter data structure that supports not only the insertion, deletion, and lookup of keyvalue pairs, but also allows a complete listing of the pairs it contains with high probability, as long the number of keyvalue pairs is below a designed threshold. Our structure ..."
Abstract

Cited by 13 (6 self)
We present a version of the Bloom filter data structure that supports not only the insertion, deletion, and lookup of key-value pairs, but also allows a complete listing of the pairs it contains with high probability, as long as the number of key-value pairs is below a designed threshold. Our structure allows the number of key-value pairs to greatly exceed this threshold during normal operation. Exceeding the threshold simply temporarily prevents content listing and reduces the probability of a successful lookup. If entries are later deleted to return the structure below the threshold, everything again functions appropriately. We also show that simple variations of our structure are robust to certain standard errors, such as the deletion of a key without a corresponding insertion or the insertion of two distinct values for a key. The properties of our structure make it suitable for several applications, including database and networking applications that we highlight.
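The insert/delete/list mechanics can be sketched compactly: each cell keeps a count and XOR sums of the keys and values hashed to it, deletions are exact inverses of insertions, and listing peels "pure" cells (count 1) one at a time. This is a minimal illustration for integer keys and values; the paper's full construction adds checksums to detect impure cells reliably and analyzes the listing threshold:

```python
import hashlib

class IBLT:
    """Minimal invertible Bloom lookup table sketch (integer keys/values,
    k distinct cells per key, peeling-based listing).  Illustrative only."""

    def __init__(self, m=60, k=3):
        self.m, self.k = m, k
        self.count = [0] * m
        self.key_sum = [0] * m
        self.val_sum = [0] * m

    def _cells(self, key):
        # k distinct deterministic cell indices for this key.
        out, i = [], 0
        while len(out) < self.k:
            h = hashlib.sha256(f"{i}:{key}".encode()).digest()
            c = int.from_bytes(h[:8], "big") % self.m
            if c not in out:
                out.append(c)
            i += 1
        return out

    def _update(self, key, value, sign):
        for c in self._cells(key):
            self.count[c] += sign
            self.key_sum[c] ^= key
            self.val_sum[c] ^= value

    def insert(self, key, value):
        self._update(key, value, +1)

    def delete(self, key, value):
        self._update(key, value, -1)

    def list_entries(self):
        """Peel pure cells (count == 1); below the design threshold this
        recovers every stored pair w.h.p.  Consumes the structure."""
        out = {}
        progress = True
        while progress:
            progress = False
            for c in range(self.m):
                if self.count[c] == 1:
                    key, val = self.key_sum[c], self.val_sum[c]
                    out[key] = val
                    self.delete(key, val)
                    progress = True
        return out
```

Note how deletion works even for pairs the structure has "forgotten" how to list: because updates are XORs and counter increments, `delete` exactly cancels a prior `insert`, which is what lets the structure recover once the load drops back below the threshold.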
String hashing for linear probing
 In Proc. 20th SODA
, 2009
"... Linear probing is one of the most popular implementations of dynamic hash tables storing all keys in a single array. When we get a key, we first hash it to a location. Next we probe consecutive locations until the key or an empty location is found. At STOC’07, Pagh et al. presented data sets where t ..."
Abstract

Cited by 12 (4 self)
Linear probing is one of the most popular implementations of dynamic hash tables, storing all keys in a single array. When we get a key, we first hash it to a location. Next we probe consecutive locations until the key or an empty location is found. At STOC ’07, Pagh et al. presented data sets where the standard implementation of 2-universal hashing leads to an expected number of Ω(log n) probes. They also showed that with 5-universal hashing, the expected number of probes is constant. Unfortunately, we do not have 5-universal hashing for, say, variable-length strings. When we want to do such complex hashing from a complex domain, the generic standard solution is to first do collision-free hashing (w.h.p.) into a simpler intermediate domain, and second do the complicated hash function on this intermediate domain. Our contribution is that for an expected constant number of linear probes, it suffices that each key has O(1) expected collisions with the first hash function, as long as the second hash function is 5-universal. This means that the intermediate domain can be n times smaller, and such a smaller intermediate domain typically means that the overall hash function can be made simpler and at least twice as fast. The same doubling of hashing speed for O(1) expected probes follows for most domains bigger than 32-bit integers, e.g., 64-bit integers and fixed-length strings. In addition, we study how the overhead from linear probing diminishes as the array gets larger, and what happens if strings are stored directly as intervals of the array. These cases were not considered by Pagh et al.
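The two-stage scheme discussed above is straightforward to sketch: a cheap string hash into an intermediate integer domain, followed by a 5-independent hash (a degree-4 polynomial over a prime field) into the table, driving a plain linear-probing table. This is a simplified illustration of the architecture, with the specific string hash and field chosen by me, not taken from the paper:

```python
import random

P = (1 << 61) - 1  # Mersenne prime; field for polynomial hashing

def make_5_independent(table_size, seed=0):
    """Degree-4 polynomial over F_P: a 5-independent family on [P]."""
    rng = random.Random(seed)
    coeffs = [rng.randrange(P) for _ in range(5)]
    def h(x):
        acc = 0
        for a in coeffs:
            acc = (acc * x + a) % P
        return acc % table_size
    return h

def first_stage(s):
    """Stage one: map a variable-length string into the intermediate
    domain [P].  Any hash with O(1) expected collisions per key suffices;
    a simple polynomial string hash stands in here."""
    acc = 0
    for ch in s.encode():
        acc = (acc * 257 + ch + 1) % P
    return acc

class LinearProbingTable:
    def __init__(self, size=128, seed=0):
        self.size = size
        self.slots = [None] * size
        self.h = make_5_independent(size, seed)

    def _home(self, key):
        return self.h(first_stage(key))   # two-stage hash

    def insert(self, key):
        i = self._home(key)
        while self.slots[i] is not None and self.slots[i] != key:
            i = (i + 1) % self.size       # linear probing
        self.slots[i] = key

    def contains(self, key):
        i = self._home(key)
        while self.slots[i] is not None:
            if self.slots[i] == key:
                return True
            i = (i + 1) % self.size
        return False
```

The paper's point, in these terms, is that `first_stage` need not be collision-free into a domain of size n²: O(1) expected collisions per key into a domain n times smaller already preserves the expected constant number of probes, because the 5-universal second stage absorbs the damage.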
Some Open Questions Related to Cuckoo Hashing
"... Abstract. The purpose of this brief note is to describe recent work in the area of cuckoo hashing, including a clear description of several open problems, with the hope of spurring further research. 1 ..."
Abstract

Cited by 12 (4 self)
The purpose of this brief note is to describe recent work in the area of cuckoo hashing, including a clear description of several open problems, with the hope of spurring further research.
Sample-optimal average-case sparse Fourier transform in two dimensions. Unpublished manuscript
, 2012
"... We present the first sampleoptimal sublinear time algorithms for the sparse Discrete Fourier Transform over a twodimensional √ n × √ n grid. Our algorithms are analyzed for average case signals. For signals whose spectrum is exactly sparse, our algorithms use O(k) samples and run in O(klogk) time ..."
Abstract

Cited by 9 (7 self)
We present the first sample-optimal sublinear-time algorithms for the sparse Discrete Fourier Transform over a two-dimensional √n × √n grid. Our algorithms are analyzed for average-case signals. For signals whose spectrum is exactly sparse, our algorithms use O(k) samples and run in O(k log k) time, where k is the expected sparsity of the signal. For signals whose spectrum is approximately sparse, our algorithm uses O(k log n) samples and runs in O(k log² n) time; the latter algorithm works for k = Θ(√n). The number of samples used by our algorithms matches the known lower bounds for the respective signal models. By a known reduction, our algorithms give similar results for the one-dimensional sparse Discrete Fourier Transform when n is a power of a small composite number (e.g., n = 6^t).
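A core primitive behind sparse FFT algorithms in general is the aliasing identity: subsampling a signal in time folds its spectrum into a small number of buckets, so a k-sparse spectrum can be recovered bucket by bucket with few samples. The one-dimensional demo below (assuming B divides n; function names are mine) verifies that identity numerically; it illustrates the bucketing primitive only, not the paper's two-dimensional algorithm:

```python
import cmath

def dft(x):
    """Plain O(n^2) DFT, adequate for a small demonstration."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * f * t / n)
                for t in range(n))
            for f in range(n)]

def folded_spectrum(x, B):
    """DFT of the time-domain subsampling x[0::n//B].  By the aliasing
    identity this equals (B/n) * sum of X[f] over f = b (mod B) in bucket
    b -- i.e., the spectrum folded into B buckets.  Requires B | n."""
    n = len(x)
    assert n % B == 0
    return dft(x[:: n // B])
```

With only B time samples one obtains B bucket values; if the spectrum is sparse, most buckets isolate a single frequency, which is what lets sparse FFT algorithms spend samples proportional to the sparsity k rather than to n.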