Interprocedural array regions analyses
, 1995
Cited by 64 (7 self)
In order to perform powerful program optimizations, an exact interprocedural analysis of array data flow is needed. For that purpose, two new types of array region are introduced. IN and OUT regions represent the sets of array elements, the values of which are imported to or exported from the current statement or procedure. Among the various applications are: compilation of communications for message-passing machines, array privatization, compile-time optimization of local memory or cache behavior in hierarchical memory machines.
Memory Exclusion: Optimizing the Performance of Checkpointing Systems
, 1996
Cited by 23 (0 self)
Checkpointing systems are a convenient way for users to make their programs fault-tolerant by intermittently saving program state to disk, and restoring that state following a failure. The main concern with checkpointing is the overhead that it adds to running time of the program. This paper describes memory exclusion, an important class of optimizations that reduce the overhead of checkpointing. These optimizations have been implemented in two checkpointers: libckpt, which works on Unix-based workstations, and libNXckpt, which works on the Intel Paragon. Both checkpointers are publicly available at no cost. We have checkpointed various long-running applications with both checkpointers and have explored the performance improvements that may be gained through memory exclusion. Results from these experiments are presented and show that the improvements are significant. We conclude that all checkpointing systems should include primitives allowing programmers and users to gain the full ben...
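As a rough illustration of the idea of memory exclusion, and not of the actual libckpt or libNXckpt interfaces, the Python sketch below models a process's data as named regions and writes only the regions that have not been marked as excluded. The class `Checkpointer`, its methods `exclude`, `include`, and `checkpoint`, and the region names are all hypothetical.

```python
import pickle


class Checkpointer:
    """Toy model of memory exclusion; every name here is hypothetical."""

    def __init__(self, memory):
        self.memory = memory      # region name -> data held by the "process"
        self.excluded = set()     # regions that are omitted from checkpoints

    def exclude(self, region):
        """Mark a region (for example dead or read-only data) as not saved."""
        self.excluded.add(region)

    def include(self, region):
        """Make a previously excluded region part of checkpoints again."""
        self.excluded.discard(region)

    def checkpoint(self):
        """Serialize only the regions that are not excluded."""
        saved = {name: data for name, data in self.memory.items()
                 if name not in self.excluded}
        return pickle.dumps(saved)


memory = {"state": list(range(1000)), "scratch": [0] * 100000}
ckpt = Checkpointer(memory)
full_size = len(ckpt.checkpoint())      # everything is saved
ckpt.exclude("scratch")                 # scratch space need not survive a failure
reduced_size = len(ckpt.checkpoint())   # only "state" is saved
```

In this toy model the saving comes entirely from the size of the excluded region; the sketch says nothing about how a real checkpointer decides which regions are safe to exclude.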
Improving the Performance of Coordinated Checkpointers on Networks of Workstations using RAID Techniques
, 1996
Cited by 22 (10 self)
Coordinated checkpointing systems are popular and general-purpose tools for implementing process migration, coarse-grained job swapping, and fault-tolerance on networks of workstations. Though simple in concept, there are several design decisions concerning the placement of checkpoint files that can impact the performance and functionality of coordinated checkpointers. Although several such checkpointers have been implemented for popular programming platforms like PVM and MPI, none have taken this issue into consideration. This paper addresses the issue of checkpoint placement and its impact on the performance and functionality of coordinated checkpointing systems. Several strategies, both old and new, are described and implemented on a network of SPARC-5 workstations running PVM. These strategies range from very simple to more complex, borrowing heavily from ideas in RAID (Redundant Arrays of Inexpensive Disks) fault tolerance. The results of this paper will serve as a guide so that f...
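The abstract does not spell out the individual placement strategies, so the following is only a minimal sketch of one well-known RAID idea, XOR parity, applied to checkpoints. It assumes that every process's checkpoint has the same length; the helper names `xor_bytes`, `parity`, and `recover` and the sample data are hypothetical.

```python
from functools import reduce


def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def parity(blocks):
    """Parity block: the XOR of all checkpoint blocks."""
    return reduce(xor_bytes, blocks)


def recover(surviving, parity_block):
    """Rebuild the single missing block from the survivors and the parity."""
    return reduce(xor_bytes, surviving, parity_block)


# Three equal-length checkpoints, one per process (made-up contents).
checkpoints = [b"proc0-st", b"proc1-st", b"proc2-st"]
p = parity(checkpoints)

lost = 1  # pretend the machine holding checkpoint 1 has failed
survivors = [c for i, c in enumerate(checkpoints) if i != lost]
restored = recover(survivors, p)
```

With a single parity block, any one lost checkpoint can be rebuilt, at the cost of storing one extra block rather than a full copy of every checkpoint.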
Compiler-Assisted Memory Exclusion for Fast Checkpointing
- IEEE TECHNICAL COMMITTEE ON OPERATING SYSTEMS AND APPLICATION ENVIRONMENTS
, 1995
Cited by 19 (4 self)
Memory exclusion is a powerful tool for optimizing the performance of checkpointing; however, it has not been completely automated with low enough overhead. In this paper we present compiler-assisted memory exclusion (CAME), a technique that uses static program analysis to optimize the performance of checkpointing. With the assistance of user-placed directives, the compiler can perform data flow analyses for dead and read-only regions of memory that can be omitted from checkpoints. The result can be a significant reduction in the size of checkpoints, thereby reducing the overhead of checkpointing.
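To make the notion of a dead region concrete, here is a minimal sketch of backward liveness analysis on straight-line code. It is a textbook simplification and not the CAME analysis itself: it handles scalar variables only, has no branches or loops, and ignores read-only detection. The function `live_before` and the sample program are hypothetical.

```python
def live_before(statements, index):
    """Return the variables that are live just before statements[index].

    Each statement is a (defs, uses) pair of sets. The code is assumed to be
    straight-line, so one backward pass from the end is enough.
    """
    live = set()
    for defs, uses in reversed(statements[index:]):
        # A variable is live before a statement if the statement reads it,
        # or if it is live afterwards and the statement does not overwrite it.
        live = (live - defs) | uses
    return live


# Hypothetical program, with a checkpoint placed just before statement 2.
program = [
    ({"a"}, set()),     # a = read()
    ({"b"}, {"a"}),     # b = a * 2
    ({"c"}, {"b"}),     # c = b + 1     <- checkpoint is taken before this
    ({"a"}, {"c"}),     # a = c
    (set(), {"a"}),     # print(a)
]
checkpoint_index = 2
all_vars = {"a", "b", "c"}

live = live_before(program, checkpoint_index)
dead = all_vars - live   # values that will be overwritten before being read
```

Here only `b` is live at the checkpoint: `a` and `c` are both overwritten before they are next read, so in this toy setting their storage could be left out of the checkpoint.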
An on-line algorithm for checkpoint placement
- IEEE Transactions on Computers
, 1997
Cited by 17 (0 self)
Checkpointing is a common technique for reducing the time to recover from faults in computer systems. By saving intermediate states of programs in reliable storage, checkpointing reduces the processing time lost to faults. The length of the intervals between checkpoints affects the execution time of programs. Long intervals lead to long re-processing time, while too frequent checkpointing leads to high checkpointing overhead. In this paper we present an on-line algorithm for placement of checkpoints. The algorithm uses on-line knowledge of the current cost of a checkpoint when it decides whether or not to place a checkpoint. We show how the execution time of a program using this algorithm can be analyzed. The total overhead of the execution time when the proposed algorithm is used is smaller than the overhead when fixed intervals are used. Although the proposed algorithm uses only on-line knowledge about the cost of checkpointing, its behavior is close to that of the off-line optimal algorithm, which uses complete knowledge of the checkpointing cost.
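The trade-off between long and short intervals can be sketched with a classic first-order model of the fixed-interval baseline. This is not the on-line algorithm of the paper, whose details are not given in the abstract. The model assumes a constant checkpoint cost, failures with a known mean time between failures, and, on average, half an interval of lost work per failure; under those assumptions the overhead is minimized by Young's approximation, the square root of twice the checkpoint cost times the mean time between failures. The function names and the numeric values are hypothetical.

```python
import math


def overhead_fraction(interval, ckpt_cost, mtbf):
    """Approximate overhead per unit of useful work (assumed first-order model).

    ckpt_cost / interval   : time spent writing checkpoints
    interval / (2 * mtbf)  : expected re-processing after failures
    """
    return ckpt_cost / interval + interval / (2.0 * mtbf)


def young_interval(ckpt_cost, mtbf):
    """Interval minimizing overhead_fraction (Young's approximation)."""
    return math.sqrt(2.0 * ckpt_cost * mtbf)


ckpt_cost = 60.0    # seconds per checkpoint (made-up value)
mtbf = 86400.0      # mean time between failures in seconds (made-up value)
best = young_interval(ckpt_cost, mtbf)
```

Evaluating `overhead_fraction` at half or double the value of `best` gives a larger overhead in both directions, which is the tension the abstract describes. An on-line scheme differs in that the checkpoint cost is not a constant known in advance.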
An Adaptive Checkpointing Protocol to Bound Recovery Time with Message Logging
- In Proceedings of the 18th IEEE Symposium on Reliable Distributed Systems
, 1999
Cited by 10 (1 self)
Numerous mathematical approaches have been proposed to determine the optimal checkpoint interval for minimizing total execution time of an application in the presence of failures. These solutions are often not applicable due to the lack of accurate data on the probability distribution of failures. Most current checkpoint libraries require application users to define a fixed time interval for checkpointing. The checkpoint interval usually implies the approximate maximum recovery time for single process applications. However, actual recovery time can be much smaller when message logging is used. Due to this faster recovery, checkpointing may be more frequent than needed and thus unnecessary execution overhead is introduced. In this paper, an adaptive checkpointing protocol is developed to accurately enforce the user-defined recovery time and to reduce excessive checkpoints. An adaptive protocol has been implemented and evaluated using a receiver-based message logging algorithm on wired a...
PREACHES - Portable Recovery and Checkpointing in Heterogeneous Systems
- Proceedings of IEEE Fault-Tolerant Computing Symposium
, 1998
Cited by 10 (2 self)
Checkpointing in a homogeneous environment, where both checkpointing and recovery are performed on the same type of machine and operating system, has been studied extensively. As heterogeneous distributed systems become pervasive, it is desirable to extend the capability of checkpointing to non-homogeneous environments. This paper describes a prototype, PREACHES, that achieves portable checkpointing of single process applications in heterogeneous systems using checkpoint propagation. The checkpoint propagation technique generates machine-dependent checkpoints for each different architecture in the heterogeneous environment. When failure occurs, the failed process can be restarted on a specified machine with the checkpoint that is appropriate for the architecture. An implementation of PREACHES on a heterogeneous network of workstations has been successfully developed based on TCP/IP communication. PREACHES also provides automatic and fast recovery for single process programs.
Using Application Knowledge to Improve Embedded Systems Dependability
- In Proceedings of the Workshop on Hot Topics in System Dependability (HotDep 2010)
, 2010
Cited by 4 (4 self)
Semiconductor experts are convinced that the rate of soft errors occurring in electronic devices will rise to levels that regularly affect everyday operation of devices. Correcting every single error implies a significant hardware and real-time overhead, especially for embedded devices. Hence, an error classification is needed to distinguish whether an error has to be corrected or not. In this paper, we present an approach using application knowledge. This knowledge is used to classify errors according to their relevance and the influence of their correction on the timing behavior of the whole system. When real-time conditions have to be met, not all errors can be fixed immediately. Using a typical soft real-time application, an H.264 video decoder, as an example, we show that error correction can be delayed. Furthermore, we show that the correction overhead will be significantly reduced if application knowledge is employed.
Compiler-Assisted Checkpoint Optimization Using SUIF
, 1995
In this paper we present compiler-assisted checkpointing, an ongoing research project whose goal is to develop techniques using static data flow analysis to optimize the performance of checkpointing. We achieve this performance gain using libckpt, a checkpointing library that uses memory exclusion to specify portions of a process's data space that need not be included as part of a checkpoint, thereby reducing the size of the checkpoint file and the time required to write that file to stable storage. This procedure has heretofore relied upon the programmer to analyze the program correctly and insert the proper libckpt memory exclusion function calls, a burdensome and unsafe practice. Our project makes use of the SUIF compiler system to analyze the target program and automatically place the correct memory exclusion function calls. We make use of interval analysis, a technique for solving data flow equations involving array references. We present our full algorithm and describ...
Miscellaneous
As cryptographic protocols execute they accumulate information such as values and keys, and evidence of properties about this information. As execution proceeds, new information becomes relevant while some old information ceases to be of use. Identifying what information is necessary at each point in a protocol run is valuable for both analysis and deployment. We formalize this necessary information as the minimal backup of a protocol. We present an analysis that determines the minimal backup at each point in a protocol run. We show that this minimal backup has many uses: it serves as a foundation for job-migration and other kinds of fault tolerance, and also helps protocol designers understand the structure of protocols and identify potential flaws. In a cryptographic context it is dangerous to reason informally. We have therefore formalized and verified this work using the Coq proof assistant. Additionally, Coq provides a certified implementation of our analysis. Concretely, our analysis and its implementation consume protocols written in a variant of the Cryptographic Protocol Programming Language, cppl.

