Slack-based Techniques for Robust Schedules
Cited by 32 (4 self)
Many scheduling systems assume a static environment within which a schedule will be executed. The real world is not so stable: machines break down, operations take longer to execute than expected, and orders may be added or canceled. One approach to dealing with such disruptions is to generate robust schedules: schedules that are able to absorb some level of unexpected events without rescheduling. In this paper we investigate three techniques for generating robust schedules based on the insertion of temporal slack. Simulation-based results indicate that the two novel techniques out-perform the existing temporal protection technique both in terms of producing schedules with low simulated tardiness and in producing schedules that better predict the level of simulated tardiness. Keywords: Robustness, Uncertainty, Scheduling, Heuristics
Fault-Tolerant Rate-Monotonic Scheduling
Journal of Real-Time Systems, 1998
Cited by 29 (12 self)
Due to the critical nature of the tasks in hard real-time systems, it is essential that faults be tolerated. Several studies have shown that space applications, which have very high reliability requirements, have also very high transient faults frequency. Therefore, tolerance to this type of faults is essential in such applications. In this paper, we present a scheme which can be used to tolerate faults during the execution of preemptive real-time tasks. We describe a recovery scheme which can be used to re-execute tasks in the event of single and multiple transient faults and discuss conditions that must be met by any such recovery scheme. We then extend the Rate Monotonic Scheduling (RMS) scheme to provide tolerance for single and multiple transient faults. We derive schedulability bounds for sets of real-time tasks given the desired level of fault tolerance for each task or subset of tasks. Finally, we analyze and compare the bounds derived as a function of the amount of processing ...
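For orientation, the classic utilization bound that Rate Monotonic Scheduling analysis starts from can be sketched as follows. This is the standard Liu and Layland bound for fault-free execution, not the fault-tolerant bounds derived in the paper above; the function names are illustrative assumptions.

```python
def liu_layland_bound(n: int) -> float:
    """Classic RMS utilization bound n * (2**(1/n) - 1) for n periodic tasks."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return n * (2 ** (1.0 / n) - 1)


def rms_sufficient_test(tasks) -> bool:
    """tasks: list of (execution_time, period) pairs.

    Returns True when total utilization is within the Liu-Layland bound.
    The test is sufficient but not necessary, and it reserves no time
    for re-executing a task after a transient fault (an assumption of
    this sketch, unlike the scheme described in the abstract).
    """
    utilization = sum(c / t for c, t in tasks)
    return utilization <= liu_layland_bound(len(tasks))
```

A fault-tolerant variant would have to leave slack for re-execution, so its bound would be lower than the one computed here.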
Project scheduling under uncertainty: Survey and research potentials
European Journal of Operational Research, 2005
Cited by 22 (3 self)
The vast majority of the research efforts in project scheduling assume complete information about the scheduling problem to be solved and a static deterministic environment within which the pre-computed baseline schedule will be executed. However, in the real world, project activities are subject to considerable uncertainty, which is gradually resolved during project execution. In this survey we review the fundamental approaches for scheduling under uncertainty: reactive scheduling, stochastic project scheduling, fuzzy project scheduling, robust (proactive) scheduling and sensitivity analysis. We discuss the potentials of these approaches for scheduling under uncertainty projects with deterministic network evolution structure. © 2004 Elsevier B.V. All rights reserved.
Guaranteeing Fault Tolerance Through Scheduling In Real-Time Systems
1996
Cited by 21 (1 self)
Real-time systems are those which must execute all tasks within their timing constraints. Due to the catastrophic consequences of missing deadlines of some real-time tasks, fault tolerance is an essential component of such systems. This thesis introduces techniques to enhance the fault tolerance capability of real-time systems by incorporating time redundancy. Time redundancy is essential in ultrareliable real-time systems where correlated faults must be tolerated. It can also be used to detect and tolerate transient faults, which are a majority of the faults in computing systems. This thesis demonstrates how time redundancy can be used in conjunction with hardware and software redundancy to tolerate a variety of faults in real-time systems. This thesis considers several different system and task models, and for each model, presents a schedulability test (a utilization bound or a set of conditions) which guarantees that all tasks in the system will satisfy their timing constraints even ...
A fault-tolerant scheduling algorithm for real-time periodic tasks with possible software faults
IEEE Trans. Computers, 2003
Cited by 15 (0 self)
A hard real-time system is usually subject to stringent reliability and timing constraints since failure to produce correct results in a timely manner may lead to a disaster. One way to avoid missing deadlines is to trade the quality of computation results for timeliness, and software fault-tolerance is often achieved with the use of redundant programs. A deadline mechanism which combines these two methods is proposed to provide software fault-tolerance in hard real-time periodic task systems. Specifically, we consider the problem of scheduling a set of real-time periodic tasks each of which has two versions: primary and alternate. The primary version contains more functions (thus more complex) and produces good quality results but its correctness is more difficult to verify because of its high level of complexity and resource usage. By contrast, the alternate version contains only the minimum required functions (thus simpler) and produces less precise but acceptable results, and its correctness is easy to verify. We propose a scheduling algorithm which (i) guarantees either the primary or alternate version of each critical task to be completed in time and (ii) attempts to complete as many primaries as possible. Our basic algorithm uses a fixed priority-driven preemptive scheduling scheme to pre-allocate time intervals to the alternates, and at run-time, attempts to execute primaries first. An alternate will be executed only (1) if its primary fails due to lack of time or manifestation of bugs, or (2) when the latest time to start execution of the alternate without missing the corresponding task deadline is reached. This algorithm is shown to be effective and easy to implement. This algorithm is enhanced further to prevent early failures in executing primaries from triggering failures in the subsequent job executions, thus improving efficiency of processor usage.
Index Terms: real-time systems, deadline mechanisms, notification time, primary, alternate, backwards-RM algorithm, CAT algorithm, EIT algorithm.
Schedulability Analysis for Fault Tolerant Real-Time Systems
1997
Cited by 11 (3 self)
Predictability and fault tolerance are major requirements for complex real-time systems, which are either safety or mission critical. Traditionally fault tolerant techniques were employed to tackle the problem of ensuring correctness in the value domain only. We stress that the fault tolerance requirements and timing constraints are not orthogonal issues as they appear to be, and hence any viable approach must be an integrated one. Fault tolerance in a real-time system implies that the system is able to deliver correct results in a timely manner even in the presence of faults. Techniques employing time redundancy are commonly used for tolerating a wide class of faults such as transient faults. In these systems, it is essential that the exploitation of time redundancy for correctness does not jeopardize the timeliness attribute. Hence scheduling aspects of fault tolerant real-time systems become all the more important. The research work described in this thesis, focuses on the provision...
Resource Reclaiming in Hard Real-Time Systems with Static and Dynamic Workloads
Proc. 30th Hawaii International Conference on System Science, IEEE Computer Society Press, Vol I, 1997
Cited by 8 (4 self)
This paper addresses resource reclaiming in the context of non-preemptive priority list scheduling for hard real-time systems. Such scheduling is inherently susceptible to multiprocessor timing anomalies. We present low overhead run-time stabilization methods for a general tasking model that allows phantom tasks as a mechanism to model processor external events. A family of scheduling algorithms is defined, that guarantees run-time stabilization for systems consisting of tasks with hard and soft deadlines. The latter, i.e. soft tasks, may arrive dynamically. Stabilization is addressed in the context of dynamic and static task to processor allocation. Previous stabilization methods, focused on a priori stabilization for static workloads with dynamic task to processor allocation, thus cannot support this general scheduling model. By taking advantage of run-time information, the stabilization algorithms use the scan-window approach to prevent instability from occurring. Mechanisms are introduced that explicitly control the run-time behavior of tasks with hard deadlines. As a consequence, processor resources become available that can be used to improve processor utilization and response time of soft tasks. The resulting scan algorithms are intended for real world applications where low run-time overhead and a realistic task model are needed.
Proactive algorithms for job shop scheduling with probabilistic durations
Journal of Artificial Intelligence Research
Cited by 7 (1 self)
Most classical scheduling formulations assume a fixed and known duration for each activity. In this paper, we weaken this assumption, requiring instead that each duration can be represented by an independent random variable with a known mean and variance. The best solutions are ones which have a high probability of achieving a good makespan. We first create a theoretical framework, formally showing how Monte Carlo simulation can be combined with deterministic scheduling algorithms to solve this problem. We propose an associated deterministic scheduling problem whose solution is proved, under certain conditions, to be a lower bound for the probabilistic problem. We then propose and investigate a number of techniques for solving such problems based on combinations of Monte Carlo simulation, solutions to the associated deterministic problem, and either constraint programming or tabu search. Our empirical results demonstrate that a combination of the use of the associated deterministic problem and Monte Carlo simulation results in algorithms that scale best both in terms of problem size and uncertainty. Further experiments point to the correlation between the quality of the deterministic solution and the quality of the probabilistic solution as a major factor responsible for this success.
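The Monte Carlo step mentioned in this abstract can be illustrated with a minimal sketch: for a fixed ordering of activities, sample the durations repeatedly and count how often the makespan meets a target. The normal distribution, the truncation at zero, and all names below are assumptions made for illustration; the abstract only states that each duration has a known mean and variance.

```python
import random


def makespan(durations, predecessors):
    """Earliest-finish makespan of a precedence graph.

    durations: dict mapping activity -> duration, listed in a
               topological order (predecessors appear first)
    predecessors: dict mapping activity -> activities that must finish
                  first; a fixed machine sequence can be encoded as
                  extra precedence entries
    """
    finish = {}
    for act, dur in durations.items():
        start = max((finish[p] for p in predecessors.get(act, [])), default=0.0)
        finish[act] = start + dur
    return max(finish.values())


def probability_within(means, stddevs, predecessors, deadline,
                       samples=10_000, seed=0):
    """Estimate P(makespan <= deadline) by simple Monte Carlo sampling."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    hits = 0
    for _ in range(samples):
        # Assumption: independent normal durations, truncated at zero.
        sampled = {a: max(0.0, rng.gauss(means[a], stddevs[a])) for a in means}
        if makespan(sampled, predecessors) <= deadline:
            hits += 1
    return hits / samples
```

With all standard deviations set to zero the estimate collapses to the deterministic case, which gives a quick sanity check of the precedence encoding.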
Fault-Tolerant RT-Mach (FT-RTMach) and its Application to Real-Time Train Control
Software Practice and Experience, 1999
Cited by 4 (1 self)
Even though real-time systems have the stringent constraint of completing tasks before their deadlines, many existing real-time operating systems do not implement fault tolerance capabilities. In this paper we summarize a fault-tolerant real-time scheduling policy for dynamic tasks with ready times and deadlines. Our focus in this paper is the implementation, which includes fault-tolerant scheduling, re-scheduling, and recovery mechanisms in the FT-RT-Mach operating system, a fault-tolerant version of RT-Mach. A real-time train control application is then implemented using the FT-RT-Mach operating system.
FLARe: a Fault-tolerant Lightweight Adaptive Real-time Middleware for Distributed Real-time and Embedded Systems
2007
Cited by 3 (3 self)
A key challenge for middleware that supports both real-time and fault-tolerance properties is to maintain both system availability and timeliness, even in the presence of processor and/or process failures and fluctuations in system load. This paper presents FLARe, a new fault-tolerant, lightweight, adaptive, real-time middleware. FLARe provides a novel fail-over strategy that is (1) load-aware, i.e., selects fail-over targets based on current CPU utilizations to prevent post-recovery overload and maintain real-time performance, (2) proactive, i.e., provides clients with failover targets before a failure occurs to enable faster, localized, and predictable failure recovery; and (3) adaptive, i.e., dynamically adjusts the failover targets in response to failures and load fluctuations. Empirical results on a Linux cluster demonstrate that FLARe’s adaptive approach outperforms static fail-over approaches by adapting efficiently and effectively to failures and load changes.
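The load-aware idea in this abstract, choosing a fail-over target from current CPU utilizations so that recovery does not overload a host, might look roughly like the sketch below. It is not FLARe's actual algorithm; the function name, the `threshold` parameter, and the data layout are hypothetical.

```python
def pick_failover_target(cpu_utilization, failed_host, extra_load, threshold=0.7):
    """Pick the least-loaded surviving host that can absorb the failed work.

    cpu_utilization: dict mapping host name -> current utilization in [0, 1]
    failed_host: host that has failed and must be excluded
    extra_load: utilization the failed work would add to its new host
    threshold: hypothetical utilization cap used to avoid post-recovery overload
    Returns a host name, or None when no host stays within the cap.
    """
    candidates = {
        host: load
        for host, load in cpu_utilization.items()
        if host != failed_host and load + extra_load <= threshold
    }
    if not candidates:
        return None
    return min(candidates, key=candidates.get)
```

A proactive variant would recompute this choice whenever utilizations change and hand the result to clients before any failure occurs.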

