Rerun: Exploiting Episodes for Lightweight Memory Race Recording
Cited by 29 (0 self)
Multiprocessor deterministic replay has many potential uses in the era of multicore computing, including enhanced debugging, fault tolerance, and intrusion detection. While sources of nondeterminism in a uniprocessor can be recorded efficiently in software, it seems likely that hardware support will be needed in a multiprocessor environment where the outcome of memory races must also be recorded. We develop a memory race recording mechanism, called Rerun, that uses small hardware state (~166 bytes/core), writes a small race log (~4 bytes/kiloinstruction), and operates well as the number of cores per system scales (e.g., to 16 cores). Rerun exploits the dual of conventional wisdom in race recording: rather than record information about individual memory accesses that conflict, we record how long a thread executes without conflicting with other threads. In particular, Rerun passively creates atomic episodes. Each episode is a dynamic instruction sequence that a thread happens to execute without interacting with other threads. Rerun uses Lamport Clocks to order episodes and enable replay of an equivalent execution.
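A minimal sketch of how a Lamport-clock-ordered episode log could drive replay. The log layout (per-thread lists of `(timestamp, instruction_count)` pairs) and the function name `replay_order` are assumptions made for illustration; the abstract does not specify Rerun's log format.

```python
import heapq


def replay_order(logs):
    """Merge per-thread episode logs into a single replay order.

    `logs` maps a thread id to its episodes in program order. Each episode
    is a hypothetical (timestamp, instruction_count) pair, with timestamps
    strictly increasing within a thread.
    """
    # Tag every episode with its thread id so ties sort deterministically.
    streams = (
        [(ts, tid, count) for ts, count in episodes]
        for tid, episodes in logs.items()
    )
    # Each per-thread stream is already sorted, so a k-way merge suffices.
    return [(tid, count) for ts, tid, count in heapq.merge(*streams)]


logs = {
    0: [(1, 120), (4, 30)],
    1: [(2, 75), (5, 10)],
}
for tid, count in replay_order(logs):
    print(f"thread {tid}: run {count} instructions")
```

Breaking timestamp ties by thread id is safe under Lamport clocks in general: two events that are causally ordered never receive the same timestamp, so episodes with equal timestamps did not interact.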
Implementing signatures for transactional memory
- 40th Intl. Symp. on Microarchitecture, 2007
Cited by 23 (4 self)
Transactional Memory (TM) systems must track the read and write sets (the items read and written during a transaction) to detect conflicts among concurrent transactions. Several TMs use signatures, which summarize unbounded read/write sets in bounded hardware at a performance cost of false positives (conflicts detected when none exist). This paper examines different organizations to achieve hardware-efficient and accurate TM signatures. First, we find that implementing each signature with a single k-hash-function Bloom filter (True Bloom signature) is inefficient, as it requires multi-ported SRAMs. Instead, we advocate using k single-hash-function Bloom filters in parallel (Parallel Bloom signature), using area-efficient single-ported SRAMs. Our formal analysis shows that both organizations perform equally well in theory and our simulation-based evaluation shows this to hold approximately in practice. We also show that by choosing high-quality hash functions we can achieve signature designs noticeably more accurate than the previously proposed implementations. Finally, we adapt Pagh and Rodler's cuckoo hashing to implement Cuckoo-Bloom signatures. While this representation does not support set intersection, it mitigates false positives for the common case of small read/write sets and performs like a Bloom filter for large sets.
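The Parallel Bloom organization can be illustrated in software: k banks, each indexed by its own single hash function. This is a sketch only. The class name, the bank sizes, and the use of salted BLAKE2b in place of a hardware hash are all assumptions, not the paper's design.

```python
import hashlib


class ParallelBloomSignature:
    """k single-hash Bloom filters, each owning its own bank of bits."""

    def __init__(self, total_bits=1024, k=4):
        if total_bits % k != 0:
            raise ValueError("total_bits must be divisible by k")
        self.k = k
        self.bank_bits = total_bits // k
        # One Python integer per bank, used as a bit vector.
        self.banks = [0] * k

    def _index(self, bank, address):
        # Assumption: salted BLAKE2b stands in for a hardware hash function.
        digest = hashlib.blake2b(
            address.to_bytes(8, "little"), digest_size=8, salt=bytes([bank])
        ).digest()
        return int.from_bytes(digest, "little") % self.bank_bits

    def insert(self, address):
        # Set exactly one bit in every bank.
        for bank in range(self.k):
            self.banks[bank] |= 1 << self._index(bank, address)

    def may_contain(self, address):
        # False means definitely absent; True may be a false positive.
        return all(
            (self.banks[bank] >> self._index(bank, address)) & 1
            for bank in range(self.k)
        )

    def may_intersect(self, other):
        # A shared address sets the same bit in every bank of both
        # signatures, so one empty bank-wise AND proves the sets disjoint.
        return all(a & b for a, b in zip(self.banks, other.banks))
```

Because each bank is touched by exactly one hash function per access, each bank needs only a single port, which is the property the abstract attributes to the Parallel Bloom layout.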
Interference-Resilient Information Exchange
Cited by 13 (6 self)
This paper presents an efficient protocol to reliably exchange information in a single-hop radio network with unpredictable interference. The devices can access C communication channels. We model the interference with an adversary that can disrupt up to t of these channels simultaneously. We assume no shared secret keys or third-party infrastructure. The running time of our protocol decreases as the gap between C and t increases. Two extreme cases prove particularly interesting: the running time is linear when the number of channels C = Ω(t^2), and exponential when only C = t + 1 channels are available. We prove that exponential time is unavoidable in the latter case. At the core of our protocol lies a combinatorial function, of independent interest, and described for the first time in this paper: the multi-selector. This function determines a sequence of device channel assignments such that every sufficiently large subset of devices is partitioned, by at least one of these assignments, onto distinct channels.
Notary: Hardware Techniques to Enhance Signatures
- In Proc. of the 41st Annual IEEE/ACM International Symp. on Microarchitecture, 2008
Cited by 12 (1 self)
Hardware signatures have recently been proposed as an efficient mechanism to detect conflicts amongst concurrently running transactions in transactional memory systems (e.g., Bulk, LogTM-SE, and SigTM). Signatures use fixed hardware to represent an unbounded number of addresses, but may lead to false conflicts (detecting a conflict when none exists). Previous work recommends that signatures be implemented with parallel Bloom filters with two or four hash functions (e.g., H3). Two problems exist with current signature designs. First, H3 implementations use many XOR gates. This increases hardware area and power overheads. Second, signature false positives can result from conflicts with signature bits set by private memory addresses that do not require isolation. This paper develops Notary, a coupling of two signature enhancements to ameliorate these problems. First, we use address entropy analysis to develop Page-Block-XOR (PBX) hashing and show it performs similarly to H3 at lower hardware cost. Second, we introduce a privatization interface that explicitly allows the programmer to declare shared and private heap memory allocation. Privatization reduces false conflicts arising from private memory accesses and can lead to a reduction in the signature size used. Results from custom transistor-level layouts of H3 and PBX, along with full-system simulation of a 16-core chip-multiprocessor implementing LogTM-SE, show (a) PBX hashing performs similarly to H3 hashing while requiring up to 24% less area and 4.7% less power overhead and (b) privatization can improve execution time by up to 86% (by reducing false conflicts by up to 96%).
Stream-based randomised language models for SMT
- In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, 2009
Cited by 10 (1 self)
Randomised techniques allow very big language models to be represented succinctly. However, being batch-based they are unsuitable for modelling an unbounded stream of language whilst maintaining a constant error rate. We present a novel randomised language model which uses an online perfect hash function to efficiently deal with unbounded text streams. Translation experiments over a text stream show that our online randomised model matches the performance of batch-based LMs without incurring the computational overhead associated with full retraining. This opens up the possibility of randomised language models which continuously adapt to the massive volumes of texts published on the Web each day.
Verifying distributed erasure-coded data
- In Proceedings of the 26th ACM Symposium on Principles of Distributed Computing, 2007
Cited by 8 (1 self)
Erasure coding can reduce the space and bandwidth overheads of redundancy in fault-tolerant data storage and delivery systems. But it introduces the fundamental difficulty of ensuring that all erasure-coded fragments correspond to the same block of data. Without such assurance, a different block may be reconstructed from different subsets of fragments. This paper develops a technique for providing this assurance without the bandwidth and computational overheads associated with current approaches. The core idea is to distribute with each fragment what we call homomorphic fingerprints. These fingerprints preserve the structure of the erasure code and allow each fragment to be independently verified as corresponding to a specific block. We demonstrate homomorphic fingerprinting functions that are secure, efficient, and compact.
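The idea of a fingerprint that preserves the structure of a linear code can be sketched with polynomial evaluation over a prime field. The field (integers modulo the Mersenne prime 2^61 - 1), the function names, and the toy code coefficients are assumptions made for illustration. The abstract does not describe the paper's actual construction, which may differ.

```python
P = 2**61 - 1  # prime modulus; an assumption of this sketch


def fingerprint(block, r):
    """Evaluate the block, read as polynomial coefficients, at r modulo P."""
    acc = 0
    for symbol in reversed(block):  # Horner's rule
        acc = (acc * r + symbol) % P
    return acc


def encode_fragment(blocks, coeffs):
    """One coded fragment: a fixed linear combination of the data blocks."""
    return [
        sum(c * blk[i] for c, blk in zip(coeffs, blocks)) % P
        for i in range(len(blocks[0]))
    ]


def verify_fragment(fragment, coeffs, block_fps, r):
    """Check a fragment using only the fingerprints of the data blocks."""
    # Linearity: the fingerprint of a linear combination equals the same
    # linear combination of the individual fingerprints.
    expected = sum(c * fp for c, fp in zip(coeffs, block_fps)) % P
    return fingerprint(fragment, r) == expected


blocks = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
r = 123456789  # evaluation point; how it is chosen matters for security
fps = [fingerprint(b, r) for b in blocks]
fragment = encode_fragment(blocks, [3, 1, 4])
print(verify_fragment(fragment, [3, 1, 4], fps, r))
```

A verifier holding only the short fingerprints `fps` can check any fragment on its own, without seeing the data blocks or the other fragments. Choosing `r` so that it cannot be predicted by whoever produces the fragments is outside the scope of this sketch.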
Evolvability from Learning Algorithms
- 2008
Cited by 7 (2 self)
Valiant has recently introduced a framework for analyzing the capabilities and the limitations of the evolutionary process of random change guided by selection [27]. In his framework the process of acquiring a complex functionality is viewed as a substantially restricted form of PAC learning of an unknown function from a certain set of functions [26]. Valiant showed that classes of functions evolvable in his model are also learnable in the statistical query (SQ) model of Kearns [18] and asked whether the converse is true. We show that evolvability is equivalent to learnability by a restricted form of statistical queries. Based on this equivalence we prove that for any fixed distribution D over the instance space, every class of functions learnable by SQs over D is evolvable over D. Previously, only the evolvability of monotone conjunctions of Boolean variables over the uniform distribution was known [28]. On the other hand, we prove that the answer to Valiant's question is negative when distribution-independent evolvability is considered. To demonstrate this, we develop a technique for proving lower bounds on evolvability and use it to show that decision lists and linear threshold functions are not evolvable in a distribution-independent way. This is in contrast to the distribution-independent learnability of decision lists and linear threshold functions in the statistical query model.
Optimal Time-Space Trade-Offs for Non-Comparison-Based Sorting
- 2001
Cited by 3 (0 self)
Efficient and Robust TCP Stream Normalization
Cited by 2 (1 self)
Network intrusion detection and prevention systems are vulnerable to evasion by attackers who craft ambiguous traffic to breach the defense of such systems. A normalizer is an inline network element that thwarts evasion attempts by removing ambiguities in network traffic. A particularly challenging step in normalization is the sound detection of inconsistent TCP retransmissions, wherein an attacker sends TCP segments with different payloads for the same sequence number space to present a network monitor with ambiguous analysis. Normalizers that buffer all unacknowledged data to verify the consistency of subsequent retransmissions consume inordinate amounts of memory on high-speed links. On the other hand, normalizers that buffer only the hashes of unacknowledged segments cannot verify the consistency of the 20–30% of retransmissions that, according to our traces, do not align with the original transmissions. This paper presents the design of RoboNorm, a normalizer that buffers only the hashes of unacknowledged segments, and yet can detect all inconsistent retransmissions in any TCP byte stream. RoboNorm consumes 1–2 orders of magnitude less memory than normalizers that buffer all unacknowledged data, and is amenable to a high-speed implementation. RoboNorm is also robust to attacks that attempt to compromise its operation or exhaust its resources.
A quantum cipher with near optimal key-recycling
- BRICS, Department of Computer Science, University of Aarhus, 2005
Cited by 2 (0 self)
Assuming an insecure quantum channel and an authenticated classical channel, we propose an unconditionally secure scheme for encrypting classical messages under a shared key, where attempts to eavesdrop on the ciphertext can be detected. If no eavesdropping is detected, we can securely re-use the entire key for encrypting new messages. If eavesdropping is detected, we must discard a number of key bits corresponding to the length of the message, but can re-use almost all of the rest. We show this is essentially optimal. Thus, provided the adversary does not interfere (too much) with the quantum channel, we can securely send an arbitrary number of message bits, independently of the length of the initial key. Moreover, the key-recycling mechanism only requires one-bit feedback. While ordinary quantum key distribution with a classical one-time pad could be used instead to obtain a similar functionality, this would need more rounds of interaction and more communication.

