Results 1 - 10 of 3,675
Hardware reliability
"... We target the development of new methodologies for analyzing the robustness of circuits described at the Register Transfer (RT) level, with respect to errors caused by transient faults. Analyzing the potential consequences of errors usually involves fault-injection techniques, using simulation or em ..."
Abstract
- Add to MetaCart
We target the development of new methodologies for analyzing the robustness of circuits described at the Register Transfer (RT) level with respect to errors caused by transient faults. Analyzing the potential consequences of errors usually involves fault-injection techniques, using simulation- or emulation-based solutions. Our goal is to take advantage of the logical power of theorem-proving tools to obtain alternative solutions that allow reasoning purely symbolically about errors. In this paper we present our preliminary results with the ACL2 theorem prover, in the context of devices that have auto-correction features. First we give a logical definition of the error model as a conjunction of characteristic properties, from which robustness analysis can be performed. Then we improve the methodology to deal with hierarchical systems.
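As a loose illustration of the "error model as a conjunction of characteristic properties" idea, the sketch below states a single-bit, single-cycle transient fault as a conjunction of predicates. The paper itself reasons symbolically in ACL2; the concrete bit-flip model and every name here are invented for this sketch.

```python
# Loose illustration only: an error model expressed as a conjunction of
# characteristic properties. The paper works symbolically in ACL2; this
# concrete bit-flip model and all names are invented for this sketch.
def is_single_bit_flip(good: int, bad: int, width: int) -> bool:
    """The faulty value differs from the good one in exactly one bit."""
    diff = (good ^ bad) & ((1 << width) - 1)
    return diff != 0 and diff & (diff - 1) == 0

def is_transient(fault_cycles: set) -> bool:
    """The fault is present during a single clock cycle only."""
    return len(fault_cycles) == 1

def satisfies_error_model(good: int, bad: int, width: int, fault_cycles: set) -> bool:
    # The error model is simply the conjunction of its characteristic properties.
    return is_single_bit_flip(good, bad, width) and is_transient(fault_cycles)

assert satisfies_error_model(0b1010, 0b1000, width=4, fault_cycles={17})
```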
Characterizing Cloud Computing Hardware Reliability
"... Modern day datacenters host hundreds of thousands of servers that coordinate tasks in order to deliver highly available cloud computing services. These servers consist of multiple hard disks, memory modules, network cards, processors etc., each of which while carefully engineered are capable of fail ..."
Abstract
-
Cited by 57 (0 self)
- Add to MetaCart
of failing. While the probability of seeing any such failure in the lifetime (typically 3-5 years in industry) of a server can be somewhat small, these numbers get magnified across all devices hosted in a datacenter. At such a large scale, hardware component failure is the norm rather than an exception
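The scale effect the authors describe is easy to see with a back-of-the-envelope calculation; the annual failure rate and fleet size below are assumed for illustration, not figures from the paper.

```python
# Back-of-the-envelope sketch of how a small per-server failure probability
# compounds across a datacenter. The 8% annual rate and 100,000-server fleet
# are assumptions for illustration, not numbers from the paper.
p_fail_year = 0.08      # assumed probability that one server fails within a year
servers = 100_000       # assumed fleet size

expected_failures = p_fail_year * servers                 # expected failing servers per year
p_any_failure = 1 - (1 - p_fail_year) ** servers          # P(at least one failure), assuming independence

print(f"Expected servers with a failure per year: {expected_failures:,.0f}")
print(f"Probability the fleet sees at least one failure: {p_any_failure:.6f}")
```

Even with a per-server rate this small, a fleet of this size sees thousands of failures per year, which is exactly why the paper treats component failure as the norm rather than the exception.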
SLAC-PUB-10835 PEP-II Hardware Reliability
"... Abstract--Hardware reliability takes on special importance in large accelerator facilities intended to work as factories; i.e., when they are expected to deliver design performance for extended periods of time. The PEP-II “B-Factory ” at SLAC is such a facility. In this paper, we summarize PEP-II re ..."
Abstract
- Add to MetaCart
Abstract--Hardware reliability takes on special importance in large accelerator facilities intended to work as factories; i.e., when they are expected to deliver design performance for extended periods of time. The PEP-II “B-Factory ” at SLAC is such a facility. In this paper, we summarize PEP
OpenSPARC: An Open Platform for Hardware Reliability Experimentation
"... Abstract—OpenSPARC is an open source community based around hardware design and experimentation aids for the UltraSPARC TM T1 and T2 chip multi-threaded (CMT) microprocessors[1]. The UltraSPARC TM T2 processor is the industry's first "server on a chip", with 8 cores, 64 threads and on ..."
Abstract
-
Cited by 6 (0 self)
- Add to MetaCart
and on-chip networking and security. The richness of the RTL source code, tools and information in OpenSPARC has made it a comprehensive, practical and relevant platform for research in several areas of computing. This paper highlights the potential of using OpenSPARC for research in hardware reliability
Nested Transactions: An Approach to Reliable Distributed Computing
, 1981
"... Distributed computing systems are being built and used more and more frequently. This distributod computing revolution makes the reliability of distributed systems an important concern. It is fairly well-understood how to connect hardware so that most components can continue to work when others are ..."
Abstract
-
Cited by 517 (4 self)
- Add to MetaCart
Distributed computing systems are being built and used more and more frequently. This distributod computing revolution makes the reliability of distributed systems an important concern. It is fairly well-understood how to connect hardware so that most components can continue to work when others
Differential Power Analysis
, 1999
"... Cryptosystem designers frequently assume that secrets will be manipulated in closed, reliable computing environments. Unfortunately, actual computers and microchips leak information about the operations they process. This paper examines specific methods for analyzing power consumption measuremen ..."
Abstract
-
Cited by 1121 (7 self)
- Add to MetaCart
Cryptosystem designers frequently assume that secrets will be manipulated in closed, reliable computing environments. Unfortunately, actual computers and microchips leak information about the operations they process. This paper examines specific methods for analyzing power consumption
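The core statistical step in differential power analysis is partitioning traces by a predicted intermediate bit and taking a difference of means. The sketch below runs that step on synthetic traces; the trace model, the leak location, and the perfect selection function are all invented for illustration and are not the paper's code.

```python
import numpy as np

# Illustrative difference-of-means DPA on synthetic traces. The traces, the
# leak at sample 120, and the perfect selection function are invented here;
# a real attack uses measured traces and a cipher-specific power model.
rng = np.random.default_rng(0)
n_traces, n_samples = 1000, 200

secret_bit = rng.integers(0, 2, size=n_traces)          # key-dependent intermediate bit
traces = rng.normal(0.0, 1.0, size=(n_traces, n_samples))
traces[:, 120] += 0.5 * secret_bit                       # small data-dependent leak

predicted = secret_bit   # a correct key guess reproduces the intermediate bit

# Partition traces by the prediction and take the difference of means;
# a spike in the differential trace reveals where (and that) the device leaks.
diff = traces[predicted == 1].mean(axis=0) - traces[predicted == 0].mean(axis=0)
print("Largest differential peak at sample", int(np.abs(diff).argmax()))
```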
No Silver Bullet: Essence and Accidents of Software Engineering
- IEEE Computer
, 1987
"... Of all the monsters that fill the nightmares of our folklore, none terrify more than werewolves, because they transform unexpectedly from the familiar into horrors. For these, one seeks bullets of silver that can magically lay them to rest. The familiar software project, at least as seen by the nont ..."
Abstract
-
Cited by 801 (0 self)
- Add to MetaCart
by the nontechnical manager, has something of this character; it is usually innocent and straightforward, but is capable of becoming a monster of missed schedules, blown budgets, and flawed products. So we hear desperate cries for a silver bullet--something to make software costs drop as rapidly as computer hardware
The BSD Packet Filter: A New Architecture for User-level Packet Capture
, 1992
"... Many versions of Unix provide facilities for user-level packet capture, making possible the use of general purpose workstations for network monitoring. Because network monitors run as user-level processes, packets must be copied across the kernel/user-space protection boundary. This copying can be m ..."
Abstract
-
Cited by 568 (2 self)
- Add to MetaCart
filter evaluator that is up to 20 times faster than the original design. BPF also uses a straightforward buffering strategy that makes its overall performance up to 100 times faster than Sun's NIT running on the same hardware. 1 Introduction Unix has become synonymous with high quality networking
ANALYSIS OF WIRELESS SENSOR NETWORKS FOR HABITAT MONITORING
, 2004
"... We provide an in-depth study of applying wireless sensor networks (WSNs) to real-world habitat monitoring. A set of system design requirements were developed that cover the hardware design of the nodes, the sensor network software, protective enclosures, and system architecture to meet the require ..."
Abstract
-
Cited by 1490 (19 self)
- Add to MetaCart
We provide an in-depth study of applying wireless sensor networks (WSNs) to real-world habitat monitoring. A set of system design requirements were developed that cover the hardware design of the nodes, the sensor network software, protective enclosures, and system architecture to meet
Reliability, Testing, and Fault-Tolerance—Hardware reliability
"... Microring resonator-based photonic interconnects are being considered for both on-chip and off-chip communication in order to satisfy the power and bandwidth requirements of future large scale chip multiprocessors. However, microring resonators are prone to malfunction due to fabrication errors, and ..."
Abstract
- Add to MetaCart
Microring resonator-based photonic interconnects are being considered for both on-chip and off-chip communication in order to satisfy the power and bandwidth requirements of future large-scale chip multiprocessors. However, microring resonators are prone to malfunction due to fabrication errors, and they are also extremely sensitive to fluctuations in temperature. In this paper we derive a fault model for microring-based optical links that can be used by computer architects to make informed design choices. We evaluate different schemes for improving resilience, such as retransmission versus error correction, using an optical fault simulator based on our fault model. We show how meeting a target mean time between failures (MTBF) affects the choice of resilience scheme: our investigation indicates that until fault rates are in the range of 10^-21 to 10^-24 per cycle, error detection/correction schemes will be needed in order to meet a 1M hour MTBF. We also evaluate how the resilience scheme impacts the performance of the link, which will help an architect choose the appropriate scheme based on the throughput requirements of a particular design.
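To make the MTBF arithmetic above concrete, here is a rough sanity check; the 5 GHz link clock and 1,000 rings per link are assumed for illustration and are not parameters from the paper.

```python
# Rough sanity check of the MTBF arithmetic. The 5 GHz clock and 1,000 rings
# per link are assumptions for illustration, not the paper's parameters.
clock_hz = 5e9            # assumed link clock frequency
rings_per_link = 1_000    # assumed number of microrings on the link
mtbf_hours = 1e6          # target mean time between failures

cycles_in_mtbf = mtbf_hours * 3600 * clock_hz
# Raw per-ring, per-cycle fault rate that meets the target if every fault
# becomes an observable link failure (i.e., no detection/correction at all).
max_raw_rate = 1 / (cycles_in_mtbf * rings_per_link)

print(f"Cycles in {mtbf_hours:.0e} hours: {cycles_in_mtbf:.2e}")
print(f"Max tolerable raw fault rate per ring per cycle: {max_raw_rate:.1e}")
```

Under these assumptions the bare link tolerates only on the order of 10^-22 faults per ring per cycle, which sits inside the 10^-21 to 10^-24 range quoted above and is consistent with the paper's argument that detection/correction is needed at realistic fault rates.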