Results 1 - 10
of
257
Random Key Predistribution Schemes for Sensor Networks”,
- IEEE Symposium on Security and Privacy,
, 2003
"... Abstract Efficient key distribution is the basis for providing secure communication, a necessary requirement for many emerging sensor network applications. Many applications require authentic and secret communication among neighboring sensor nodes. However, establishing keys for secure communicatio ..."
Abstract
-
Cited by 832 (12 self)
- Add to MetaCart
(Show Context)
Abstract Efficient key distribution is the basis for providing secure communication, a necessary requirement for many emerging sensor network applications. Many applications require authentic and secret communication among neighboring sensor nodes. However, establishing keys for secure communication among neighboring sensor nodes in a sensor network is a challenging problem, due to the scale of sensor nets, the limited computation and communication resources of sensors, their deployment in hostile environments yet their lack of tamper-resistant hardware. The limited computation resources of sensor nodes prevent using traditional key distribution mechanisms in sensor networks, such as Diffie-Hellman based approaches. Pre-distribution of secret keys among neighbors is generally not feasible, because we do not know which sensors will be neighbors after deployment. Pre-distribution of secret keys for all pairs of nodes is not viable due to the large number of sensors and the limited memory of sensor nodes. A new key distribution approach was proposed by Eschenauer and Gligor [11] to achieve secrecy for node-to-node communication: sensor nodes receive a random subset of keys from a key pool before deployment. In the field, neighboring nodes exchange information to find one common key within their random subset and use that key as their shared secret to secure subsequent communication. In this paper, we generalize the Eschenauer-Gligor key distribution approach. First, we propose two new mechanisms, the q-composite random key predistribution scheme and the multi-path key reinforcement scheme, which substantially increases the security of key setup such that an attacker has to compromise many more nodes to achieve a high probability to compromise communication. Second, we propose a new mechanism, random-pairwise keys scheme, to enable node-to-node authentication without involving a base station and perfect resilience against node capture. We also show how we enable distributed node revocation based on this scheme. To the best of our knowledge, no previous scheme supports efficient node-to-node authentication without involving a base station and distributed node revocation. We give detailed analysis and simulation results to each proposed scheme and show under which situations a scheme should be used to achieve the best security.
Implementation and performance of Munin
- IN PROCEEDINGS OF THE 13TH ACM SYMPOSIUM ON OPERATING SYSTEMS PRINCIPLES
, 1991
"... Munin is a distributed shared memory (DSM) system that allows shared memory parallel programs to be executed efficiently on distributed memory multiprocessors. Munin is unique among existing DSM systems in its use of multiple consistency protocols and in its use of release consistency. In Munin, sha ..."
Abstract
-
Cited by 587 (22 self)
- Add to MetaCart
Munin is a distributed shared memory (DSM) system that allows shared memory parallel programs to be executed efficiently on distributed memory multiprocessors. Munin is unique among existing DSM systems in its use of multiple consistency protocols and in its use of release consistency. In Munin, shared program variables are annotated with their expected access pattern, and these annotations are then used by the runtime system to choose a consistency protocol best suited to that access pattern. Release consistency allows Munin to mask network latency and reduce the number of messages required to keep memory consistent. Munin's multiprotocol release consistency is implemented in software using a delayed update queue that buffers and merges pending outgoing writes. A sixteen-processor prototype of Munin is currently operational. We evaluate its implementation and describe the execution of two Munin programs that achieve performance within ten percent of message passing implementations of the same programs. Munin achieves this level of performance with only minor annotations to the shared memory programs.
Algorithms for Scalable Synchronization on Shared-Memory Multiprocessors
- ACM Transactions on Computer Systems
, 1991
"... Busy-wait techniques are heavily used for mutual exclusion and barrier synchronization in shared-memory parallel programs. Unfortunately, typical implementations of busy-waiting tend to produce large amounts of memory and interconnect contention, introducing performance bottlenecks that become marke ..."
Abstract
-
Cited by 573 (32 self)
- Add to MetaCart
(Show Context)
Busy-wait techniques are heavily used for mutual exclusion and barrier synchronization in shared-memory parallel programs. Unfortunately, typical implementations of busy-waiting tend to produce large amounts of memory and interconnect contention, introducing performance bottlenecks that become markedly more pronounced as applications scale. We argue that this problem is not fundamental, and that one can in fact construct busy-wait synchronization algorithms that induce no memory or interconnect contention. The key to these algorithms is for every processor to spin on separate locally-accessible ag variables, and for some other processor to terminate the spin with a single remote write operation at an appropriate time. Flag variables may be locally-accessible as a result of coherent caching, or by virtue of allocation in the local portion of physically distributed shared memory. We present a new scalable algorithm for spin locks that generates O(1) remote references per lock acquisition, independent of the number of processors attempting to acquire the lock. Our algorithm provides reasonable latency in the absence of contention, requires only a constant amount of space per lock, and requires no hardware support other than
Weak Ordering -- A New Definition
, 1990
"... A memory model for a shared memory, multiprocessor commonly and often implicitly assumed by programmers is that of sequential consistency. This model guarantees that all memory accesses will appear to execute atomically and in program order. An alternative model, weak ordering, offers greater perfor ..."
Abstract
-
Cited by 260 (14 self)
- Add to MetaCart
(Show Context)
A memory model for a shared memory, multiprocessor commonly and often implicitly assumed by programmers is that of sequential consistency. This model guarantees that all memory accesses will appear to execute atomically and in program order. An alternative model, weak ordering, offers greater performance potential. Weak ordering was first defined by Dubois, Scheurich and Briggs in terms of a set of rules for hardware that have to be made visible to software. The central hypothesis of this work is that programmers prefer to reason about sequentially consistent memory, rather than having to think about weaker memory, or even write buffers. Following this hypothesis, we re-define weak ordering as a contract between software and hardware. By this contract, software agrees to some formally specified constraints, and hardware agrees to appear sequentially consistent to at least the software that obeys those constraints. We illustrate the power of the new definition with a set of software constraints that forbid data races and an imple-mentation for cache-coherent systems chat is not allowed by the old definition.
Piranha: A scalable architecture based on single-chip multiprocessing
- SIGARCH Comput. Archit. News
, 2000
"... The microprocessor industry is currently struggling with higher development costs and longer design times that arise from exceedingly complex processors that are pushing the limits of instructionlevel parallelism. Meanwhile, such designs are especially ill suited for important commercial application ..."
Abstract
-
Cited by 244 (7 self)
- Add to MetaCart
(Show Context)
The microprocessor industry is currently struggling with higher development costs and longer design times that arise from exceedingly complex processors that are pushing the limits of instructionlevel parallelism. Meanwhile, such designs are especially ill suited for important commercial applications, such as on-line transaction processing (OLTP), which suffer from large memory stall times and exhibit little instruction-level parallelism. Given that commercial applications constitute by far the most important market for high-performance servers, the above trends emphasize the need to consider alternative processor designs that specifically target such workloads. The abundance of explicit thread-level parallelism in commercial workloads, along with advances in semiconductor integration density, identify chip multiprocessing (CMP) as potentially the most promising approach for designing processors
Transactional memory coherence and consistency
- In ISCA
, 2004
"... In this paper, we propose a new shared memory model: Transactional ..."
Abstract
-
Cited by 233 (17 self)
- Add to MetaCart
(Show Context)
In this paper, we propose a new shared memory model: Transactional
The Wisconsin Wind Tunnel: Virtual Prototyping of Parallel Computers
- In Proceedings of the 1993 ACM Sigmetrics Conference on Measurement and Modeling of Computer Systems
, 1993
"... We have developed a new technique for evaluating cache coherent, shared-memory computers. The Wisconsin Wind Tunnel (WWT) runs a parallel sharedmemory program on a parallel computer (CM-5) and uses execution-driven, distributed, discrete-event simulation to accurately calculate program execution tim ..."
Abstract
-
Cited by 226 (29 self)
- Add to MetaCart
(Show Context)
We have developed a new technique for evaluating cache coherent, shared-memory computers. The Wisconsin Wind Tunnel (WWT) runs a parallel sharedmemory program on a parallel computer (CM-5) and uses execution-driven, distributed, discrete-event simulation to accurately calculate program execution time. WWT is a virtual prototype that exploits similarities between the system under design (the target) and an existing evaluation platform (the host). The host directly executes all target program instructions and memory references that hit in the target cache. WWT's shared memory uses the CM-5 memory 's error-correcting code (ECC) as valid bits for a fine-grained extension of shared virtual memory. Only memory references that miss in the target cache trap to WWT, which simulates a cache-coherence protocol. WWT correctly interleaves target machine events and calculates target program execution time. WWT runs on parallel computers with greater speed and memory capacity than uniprocessors. WWT'...
LimitLESS Directories: A Scalable Cache Coherence Scheme
, 1991
"... Caches enhance the performance of multiprocessors by reducing network tra#c and average memory access latency. However, cache-based systems must address the problem of cache coherence. We propose the LimitLESS directory protocol to solve this problem. The LimitLESS scheme uses a combination of hardw ..."
Abstract
-
Cited by 224 (29 self)
- Add to MetaCart
(Show Context)
Caches enhance the performance of multiprocessors by reducing network tra#c and average memory access latency. However, cache-based systems must address the problem of cache coherence. We propose the LimitLESS directory protocol to solve this problem. The LimitLESS scheme uses a combination of hardware and software techniques to realize the performance of a full-map directory with the memory overhead of a limited directory. This protocol is supported by Alewife, a large-scale multiprocessor. We describe the architectural interfaces needed to implement the LimitLESS directory, and evaluate its performance through simulations of the Alewife machine.
The MIT Alewife Machine: A Large-Scale Distributed-Memory Multiprocessor
- In Proceedings of Workshop on Scalable Shared Memory Multiprocessors
, 1991
"... The Alewife multiprocessor project focuses on the architecture and design of a large-scale parallel machine. The machine uses a low-dimensional direct interconnection network to provide scalable communication bandwidth, while allowing the exploitation of locality. Despite its distributed-memory arch ..."
Abstract
-
Cited by 148 (25 self)
- Add to MetaCart
(Show Context)
The Alewife multiprocessor project focuses on the architecture and design of a large-scale parallel machine. The machine uses a low-dimensional direct interconnection network to provide scalable communication bandwidth, while allowing the exploitation of locality. Despite its distributed-memory architecture, Alewife allows efficient shared-memory programming through a multilayered approach to locality management. A new scalable cache-coherence scheme called LimitLESS directories allows the use of caches for reducing communication latency and network bandwidth requirements. Alewife also employs run-time and compile-time methods for partitioning and placement of data and processes to enhance communication locality. While the above methods attempt to minimize communication latency, communication with distant processors cannot be completely avoided. Alewife's processor, Sparcle, is designed to tolerate these latencies by rapidly switching between threads of computation. This paper describe...
A Unified Formalization of Four Shared-Memory Models
- IEEE Transactions on Parallel and Distributed Systems
, 1993
"... This paper presents a shared-memory model, data-race-free-1, that unifies four earlier models: weak ordering, release consistency (with sequentially consistent special operations), the VAX memory model, and datarace -free-0. The most intuitive and commonly assumed shared-memory model, sequential con ..."
Abstract
-
Cited by 127 (9 self)
- Add to MetaCart
(Show Context)
This paper presents a shared-memory model, data-race-free-1, that unifies four earlier models: weak ordering, release consistency (with sequentially consistent special operations), the VAX memory model, and datarace -free-0. The most intuitive and commonly assumed shared-memory model, sequential consistency, limits performance. The models of weak ordering, release consistency, the VAX, and data-race-free-0 are based on the common intuition that if programs synchronize explicitly and correctly, then sequential consistency can be guaranteed with high performance. However, each model formalizes this intuition differently and has different advantages and disadvantages with respect to the other models. Data-race-free-1 unifies the models of weak ordering, release consistency, the VAX, and data-race-free-0 by formalizing the above intuition in a manner that retains the advantages of each of the four models. A multiprocessor is data-race-free-1 if it guarantees sequential consistency to data-...