Results 1  10
of
40
Implementing signatures for transactional memory
 40th Intl. Symp. on Microarchitecture
, 2007
"... Transactional Memory (TM) systems must track the read and write sets—items read and written during a transaction—to detect conflicts among concurrent transactions. Several TMs use signatures, which summarize unbounded read/write sets in bounded hardware at a performance cost of false positives (conf ..."
Abstract

Cited by 35 (6 self)
 Add to MetaCart
Transactional Memory (TM) systems must track the read and write sets—items read and written during a transaction—to detect conflicts among concurrent transactions. Several TMs use signatures, which summarize unbounded read/write sets in bounded hardware at a performance cost of false positives (conflicts detected when none exists). This paper examines different organizations to achieve hardwareefficient and accurate TM signatures. First, we find that implementing each signature with a single khashfunction Bloom filter (True Bloom signature) is inefficient, as it requires multiported SRAMs. Instead, we advocate using k singlehashfunction Bloom filters in parallel (Parallel Bloom signature), using areaefficient singleported SRAMs. Our formal analysis shows that both organizations perform equally well in theory and our simulationbased evaluation shows this to hold approximately in practice. We also show that by choosing highquality hash functions we can achieve signature designs noticeably more accurate than the previously proposed implementations. Finally, we adapt Pagh and Rodler’s cuckoo hashing to implement CuckooBloom signatures. While this representation does not support set intersection, it mitigates false positives for the common case of small read/write sets and performs like a Bloom filter for large sets. 1.
Beyond Bloom Filters: From Approximate Membership Checks to Approximate State Machines
 SIGCOMM '06
, 2006
"... Many networking applications require fast state lookups in a concurrent state machine, which tracks the state of a large number of flows simultaneously. We consider the question of how to compactly represent such concurrent state machines. To achieve compactness, we consider data structures for Appr ..."
Abstract

Cited by 31 (4 self)
 Add to MetaCart
Many networking applications require fast state lookups in a concurrent state machine, which tracks the state of a large number of flows simultaneously. We consider the question of how to compactly represent such concurrent state machines. To achieve compactness, we consider data structures for Approximate Concurrent State Machines (ACSMs) that can return false positives, false negatives, or a “don’t know” response. We describe three techniques based on Bloom filters and hashing, and evaluate them using both theoretical analysis and simulation. Our analysis leads us to an extremely efficient hashingbased scheme with several parameters that can be chosen to trade off space, computation, and the impact of errors. Our hashing approach also yields a simple alternative structure with the same functionality as a counting Bloom filter that uses much less space. We show how ACSMs can be used for video congestion control. Using an ACSM, a router can implement sophisticated Active Queue Management (AQM) techniques for video traffic (without the need for standards changes to mark packets or change video formats), with a factor of four reduction in memory compared to fullstate schemes and with very little error. We also show that ACSMs show promise for realtime detection of P2P traffic.
Finding complex concurrency bugs in large multithreaded applications
 In Proceedings of the European Conference on Computer Systems
, 2011
"... Parallel software is increasingly necessary to take advantage of multicore architectures, but it is also prone to concurrency bugs which are particularly hard to avoid, find, and fix, since their occurrence depends on specific thread interleavings. In this paper we propose a concurrency bug detecto ..."
Abstract

Cited by 11 (0 self)
 Add to MetaCart
Parallel software is increasingly necessary to take advantage of multicore architectures, but it is also prone to concurrency bugs which are particularly hard to avoid, find, and fix, since their occurrence depends on specific thread interleavings. In this paper we propose a concurrency bug detector that automatically identifies when an execution of a program triggers a concurrency bug. Unlike previous concurrency bug detectors, we are able to find two particularly hard classes of bugs. The first are bugs that manifest themselves by subtle violation of application semantics, such as returning an incorrect result. The second are latent bugs, which silently corrupt internal data structures, and are especially hard to detect because when these bugs are triggered they do not become immediately visible. PIKE detects these concurrency bugs by checking both the output and the internal state of the application for linearizability at the level of user requests. This paper presents this technique for finding concurrency bugs, its application in the context of a testing tool that systematically searches for such problems, and our experience in applying our approach to MySQL, a largescale complex multithreaded application. We were able to find several concurrency bugs in a stable version of the application, including subtle violations of application semantics, latent bugs, and incorrect error replies.
Rankindexed hashing: A compact construction of bloom filters and extra bits per counter (σ) lg(M/N
"... Abstract—Bloom filter and its variants have found widespread use in many networking applications. For these applications, minimizing storage cost is paramount as these filters often need to be implemented using scarce and costly (onchip) SRAM. Besides supporting membership queries, Bloom filters ha ..."
Abstract

Cited by 9 (1 self)
 Add to MetaCart
Abstract—Bloom filter and its variants have found widespread use in many networking applications. For these applications, minimizing storage cost is paramount as these filters often need to be implemented using scarce and costly (onchip) SRAM. Besides supporting membership queries, Bloom filters have been generalized to support deletions and the encoding of information. Although a standard Bloom filter construction has proven to be extremely spaceefficient, it is unnecessarily costly when generalized. Alternative constructions based on storing fingerprints in hash tables have been proposed that offer the same functionality as some Bloom filter variants, but using less space. In this paper, we propose a new fingerprint hash table construction called RankIndexed Hashing that can achieve very compact representations. A rankindexed hashing construction that offers the same functionality as a counting Bloom filter can be achieved with a factor of three or more in space savings even for a false positive probability of just 1%. Even for a basic Bloom filter function that only supports membership queries, a rankindexed hashing construction requires less space for a false positive probability as high as 0.1%, which is significant since a standard Bloom filter construction is widely regarded as extremely spaceefficient for approximate membership problems. I.
HashBased Techniques for HighSpeed Packet Processing
"... Abstract Hashing is an extremely useful technique for a variety of highspeed packetprocessing applications in routers. In this chapter, we survey much of the recent work in this area, paying particular attention to the interaction between theoretical and applied research. We assume very little bac ..."
Abstract

Cited by 9 (1 self)
 Add to MetaCart
Abstract Hashing is an extremely useful technique for a variety of highspeed packetprocessing applications in routers. In this chapter, we survey much of the recent work in this area, paying particular attention to the interaction between theoretical and applied research. We assume very little background in either the theory or applications of hashing, reviewing the fundamentals as necessary. 1
Bloom Filters via dleft Hashing and Dynamic Bit Reassignment
 Proceedings of the Allerton Conference on Communication, Control and Computing
, 2006
"... Abstract — In recent work, the authors introduced a data structure with the same functionality as a counting Bloom filter (CBF) based on fingerprints and the dleft hashing technique. This paper describes dynamic bit reassignment, an approach that allows the size of the fingerprint to flexibly chang ..."
Abstract

Cited by 8 (2 self)
 Add to MetaCart
Abstract — In recent work, the authors introduced a data structure with the same functionality as a counting Bloom filter (CBF) based on fingerprints and the dleft hashing technique. This paper describes dynamic bit reassignment, an approach that allows the size of the fingerprint to flexibly change with the load in each hash bucket, thereby reducing the probability of a false positive. This technique allows us to not only improve our dleft counting Bloom filter, but also to construct a data structure with the same functionality as a Bloom filter, including the ability to handle insertions online, that yields fewer false positives for sufficiently large filters. Our results show that our dleft Bloom filter data structure begins achieving smaller false positive rates than the standard construction at 16 bits per element. We explain the technique, describe why it is amenable to hardware implementation, and provide experimental results. I.
Straggler identification in roundtrip data streams via Newton’s identities and invertible Bloom filters
 IEEE Transactions on Knowledge and Data Engineering
, 2011
"... Abstract. We study the straggler identification problem, in which an algorithm must determine the identities of the remaining members of a set after it has had a large number of insertion and deletion operations performed on it, and now has relatively few remaining members. The goal is to do this in ..."
Abstract

Cited by 7 (1 self)
 Add to MetaCart
Abstract. We study the straggler identification problem, in which an algorithm must determine the identities of the remaining members of a set after it has had a large number of insertion and deletion operations performed on it, and now has relatively few remaining members. The goal is to do this in o(n) space, where n is the total number of identities. Straggler identification has applications, for example, in determining the unacknowledged packets in a highbandwidth multicast data stream. We provide a deterministic solution to the straggler identification problem that uses only O(d log n) bits, based on a novel application of Newton’s identities for symmetric polynomials. This solution can identify any subset of d stragglers from a set of n O(log n)bit identifiers, assuming that there are no false deletions of identities not already in the set. Indeed, we give a lower bound argument that shows that any smallspace deterministic solution to the straggler identification problem cannot be guaranteed to handle false deletions. Nevertheless, we provide a simple randomized solution using O(d log nlog(1/ǫ)) bits that can maintain a multiset and solve the straggler identification problem, tolerating false deletions, where ǫ> 0 is a userdefined parameter bounding the probability of an incorrect response. This randomized solution is based on a new type of Bloom filter, which we call the invertible Bloom filter.
A Power Management Proxy with a New BestofN Bloom Filter Design to Reduce False Positives
 In IEEE International Performance Computing and Communications Conference
, 2007
"... Bloom filters are a probabilistic data structure used to evaluate set membership. A group of hash functions are used to map elements into a Bloom filter and to test elements for membership. In this paper, we propose using multiple groups of hash functions and selecting the group that generates the B ..."
Abstract

Cited by 7 (2 self)
 Add to MetaCart
Bloom filters are a probabilistic data structure used to evaluate set membership. A group of hash functions are used to map elements into a Bloom filter and to test elements for membership. In this paper, we propose using multiple groups of hash functions and selecting the group that generates the Bloom filter instance with the smallest number of bits set to 1. We evaluate the performance of this new BestofN method using order statistics and an actual implementation. Our analysis shows that significant reduction in the probability of a false positive can be achieved. We also propose and evaluate a new method that uses a Random Number Generator (RNG) to generate multiple hashes from one initial “seed ” hash. This RNG method (motivated by a method from Kirsch and Mitzenmacher) makes the computational expense of the BestofN method very modest. The target application is a power management proxy for P2P applications executing in a resourceconstrained “SmartNIC”.
Don’t thrash: How to cache your hash on flash
 In Proceedings of the 38th International Conference on Very Large Data Bases
, 2012
"... As the Internet grows, computers collect, store, search, and index data at increasingly rapid rates. A recent IDC study estimates that more than 300 Exabytes of harddisk storage will be delivered in the next five years to ..."
Abstract

Cited by 5 (1 self)
 Add to MetaCart
As the Internet grows, computers collect, store, search, and index data at increasingly rapid rates. A recent IDC study estimates that more than 300 Exabytes of harddisk storage will be delivered in the next five years to
The Bloom Paradox: When not to Use a Bloom Filter?
"... Abstract—In this paper, we uncover the Bloom paradox in Bloom filters: sometimes, it is better to disregard the query results of Bloom filters, and in fact not to even query them, thus making them useless. We first analyze conditions under which the Bloom paradox occurs in a Bloom filter, and demons ..."
Abstract

Cited by 4 (2 self)
 Add to MetaCart
Abstract—In this paper, we uncover the Bloom paradox in Bloom filters: sometimes, it is better to disregard the query results of Bloom filters, and in fact not to even query them, thus making them useless. We first analyze conditions under which the Bloom paradox occurs in a Bloom filter, and demonstrate that it depends on the a priori probability that a given element belongs to the represented set. We show that the Bloom paradox also applies to Counting Bloom Filters (CBFs), and depends on the product of the hashed counters of each element. In addition, both for Bloom filters and CBFs, we suggest improved architectures that deal with the Bloom paradox. We also provide fundamental memory lower bounds required to support element queries with limited falsepositive and falsenegative rates. Last, using simulations, we verify our theoretical results, and show that our improved schemes can lead to a significant improvement in the performance of Bloom filters and CBFs. A. The Bloom Paradox