Results 1 - 10 of 718
Improving MapReduce Performance in Heterogeneous Environments
, 2008
"... MapReduce is emerging as an important programming model for large-scale data-parallel applications such as web indexing, data mining, and scientific simulation. Hadoop is an open-source implementation of MapReduce enjoying wide adoption and is often used for short jobs where low response time is cri ..."
Cited by 350 (19 self)
is critical. Hadoop’s performance is closely tied to its task scheduler, which implicitly assumes that cluster nodes are homogeneous and tasks make progress linearly, and uses these assumptions to decide when to speculatively re-execute tasks that appear to be stragglers. In practice, the homogeneity
Pin: building customized program analysis tools with dynamic instrumentation
- In PLDI ’05: Proceedings of the 2005 ACM SIGPLAN Conference on Programming Language Design and Implementation
, 2005
"... Robust and powerful software instrumentation tools are essential for program analysis tasks such as profiling, performance evaluation, and bug detection. To meet this need, we have developed a new instrumentation system called Pin. Our goals are to provide easy-to-use, portable, transparent, and eff ..."
Cited by 991 (35 self)
original, uninstrumented behavior. Pin uses dynamic compilation to instrument executables while they are running. For efficiency, Pin uses several techniques, including inlining, register re-allocation, liveness analysis, and instruction scheduling to optimize instrumentation. This fully automated approach
Fault-tolerant earliest-deadline-first scheduling algorithm
- In IEEE International Parallel and Distributed Processing Symposium
, 2007
"... The general approach to fault tolerance in uniprocessor systems is to maintain enough time redundancy in the schedule so that any task instance can be re-executed in presence of faults during the execution. In this paper a scheme is presented to add enough and efficient time redundancy to the Earlie ..."
Cited by 3 (0 self)
Learning by Watching: Extracting Reusable Task Knowledge from Visual Observation of Human Performance
- IEEE Transactions on Robotics and Automation
, 1994
"... A novel task instruction method for future intelligent robots is presented. In our method, a robot learns reusable task plans by watching a human perform assembly tasks. Functional units and working algorithms for visual recognition and analysis of human action sequences are presented. The overall s ..."
Cited by 298 (6 self)
in the recognized action sequence is analyzed, which results in a hierarchical task plan describing the higher level structure of the task. In another workspace with a different initial state, the system re-instantiates and executes the task plan to accomplish an equivalent goal. The effectiveness of our method
Reslice: Selective re-execution of long-retired misspeculated instructions using forward slicing
- In MICRO-38
, 2005
"... As more data value speculation mechanisms are being proposed to speed-up processors, there is growing pressure on the critical processor structures that must buffer the state of the speculative instructions. A scalable solution is to checkpoint the processor and retire speculative instructions. Howe ..."
Cited by 23 (1 self)
quickly re-execute the slice if a misprediction is declared, and merge its state with the program state. In addition, this paper develops a sufficient condition for correct slice re-execution and merge. As one possible use of ReSlice, we apply it to recover from cross-task dependence violations in a chip
Fault Tolerant Preemptive Real-Time Scheduling Algorithms Term Paper - CS3420 Fault Tolerance in Parallel and Distributed Systems
"... this paper we deal with the problem of building fault tolerant preemptive schedules for hard real-time tasks. In hard real-time systems, tasks must be completed within their deadlines. In the absence of checkpointing, an easy way of tolerating faults is to re-execute tasks when a fault is detected. ..."
Fault-Tolerant Rate-Monotonic Scheduling
- Journal of Real-Time Systems
, 1998
"... Due to the critical nature of the tasks in hard real-time systems, it is essential that faults be tolerated. Several studies have shown that space applications, which have very high reliability requirements, have also very high transient faults frequency. Therefore, tolerance to this type of faults ..."
Cited by 45 (12 self)
is essential in such applications. In this paper, we present a scheme which can be used to tolerate faults during the execution of preemptive real-time tasks. We describe a recovery scheme which can be used to re-execute tasks in the event of single and multiple transient faults and discuss conditions
Flashback: A Lightweight Extension for Rollback and Deterministic Replay for Software Debugging
- In USENIX Annual Technical Conference, General Track
, 2004
"... Unfortunately, finding software bugs is a very challenging task because many bugs are hard to reproduce. While debugging a program, it would be very useful to rollback a crashed program to a previous execution point and deterministically re-execute the "buggy" code region. However ..."
Cited by 155 (7 self)
Self-Adaptive Software: Landscape and Research Challenges
- ACM Transactions on Autonomous and Adaptive Systems
, 2009
"... Software systems dealing with distributed applications in changing environments normally require human supervision to continue operation in all conditions. These (re-)configuring, troubleshooting, and in general maintenance tasks lead to costly and time-consuming procedures during the operating phas ..."
Cited by 166 (7 self)
Task 7 : Testing Debugging
"... reproducible re-execution of dynamic states. First, the space of states of the static analyzer of PVM programs (the SAPTE sub-tool) can be effectively reduced by the user, who provides the necessary information on intended process connections. This semantic driven information helps to avoid analyzing ..."