Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning
Artificial Intelligence, 1999
Learning, planning, and representing knowledge at multiple levels of temporal abstraction are key, longstanding challenges for AI. In this paper we consider how these challenges can be addressed within the mathematical framework of reinforcement learning and Markov decision processes (MDPs). We extend the usual notion of action in this framework to include options: closed-loop policies for taking action over a period of time. Examples of options include picking up an object, going to lunch, and traveling to a distant city, as well as primitive actions such as muscle twitches and joint torques. Overall, we show that options enable temporally abstract knowledge and action to be included in the reinforcement learning framework in a natural and general way. In particular, we show that options may be used interchangeably with primitive actions in planning methods such as dynamic programming and in learning methods such as Q-learning.
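The idea of treating options and primitive actions interchangeably can be sketched in a few lines of Python. This is an illustrative sketch, not the paper's code: it assumes an option is represented by an initiation set, a closed-loop policy, and a termination probability, and it uses the usual semi-MDP form of the Q-learning update. All names (`Option`, `primitive_as_option`, `smdp_q_update`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Hashable, Iterable, Tuple

State = Hashable


@dataclass
class Option:
    """Assumed representation of an option (names are illustrative)."""
    name: str
    initiation_set: FrozenSet[State]          # states where the option may start
    policy: Callable[[State], Hashable]       # closed-loop: state -> primitive action
    termination: Callable[[State], float]     # state -> probability of stopping


def primitive_as_option(action: Hashable, states: Iterable[State]) -> Option:
    """Wrap a primitive action as a one-step option that always terminates."""
    return Option(
        name=f"prim:{action}",
        initiation_set=frozenset(states),
        policy=lambda s: action,
        termination=lambda s: 1.0,
    )


def smdp_q_update(
    q: Dict[Tuple[State, str], float],
    state: State,
    option_name: str,
    discounted_return: float,   # sum of gamma**i * r_i collected while the option ran
    k: int,                     # number of primitive steps the option lasted
    next_state: State,
    next_option_names: Iterable[str],
    alpha: float = 0.1,
    gamma: float = 0.9,
) -> float:
    """One Q-learning update applied at option termination (semi-MDP form)."""
    best_next = max(
        (q.get((next_state, o), 0.0) for o in next_option_names), default=0.0
    )
    target = discounted_return + (gamma ** k) * best_next
    old = q.get((state, option_name), 0.0)
    q[(state, option_name)] = old + alpha * (target - old)
    return q[(state, option_name)]
```

With `k = 1` and a wrapped primitive action, the update reduces to ordinary one-step Q-learning, which is one way to read the claim that the two kinds of action are interchangeable.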
A Unified Framework for Hybrid Control: Model and Optimal Control Theory
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 1998
Complex natural and engineered systems typically possess a hierarchical structure, characterized by continuous-variable dynamics at the lowest level and logical decision-making at the highest. Virtually all control systems today, from flight control to the factory floor, perform computer-coded checks and issue logical as well as continuous-variable control commands. The interaction of these different types of dynamics and information leads to a challenging set of "hybrid" control problems. We propose a very general framework that systematizes the notion of a hybrid system, combining differential equations and automata, governed by a hybrid controller that issues continuous-variable commands and makes logical decisions. We first identify the phenomena that arise in real-world hybrid systems. Then, we introduce a mathematical model of hybrid systems as interacting collections of dynamical systems, evolving on continuous-variable state spaces and subject to continuous controls and discrete transitions. The model captures the identified phenomena, subsumes previous models, yet retains enough structure on which to pose and solve meaningful control problems. We develop a theory for synthesizing hybrid controllers for hybrid plants in an optimal control framework. In particular, we demonstrate the existence of optimal (relaxed) and near-optimal (precise) controls and derive "generalized quasi-variational inequalities" that the associated value function satisfies. We summarize algorithms for solving these inequalities based on a generalized Bellman equation, impulse control, and linear programming.
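To make "combining differential equations and automata" concrete, here is a toy thermostat simulated in Python. It is an illustrative example only, not the model proposed in the paper: the two modes, the flow equations, the switching thresholds, and the forward-Euler step are all assumptions chosen for brevity.

```python
def simulate_thermostat(x0=20.0, mode="on", dt=0.01, steps=2000,
                        low=18.0, high=22.0):
    """Toy hybrid system: continuous temperature x, discrete mode on/off.

    Assumed flows: dx/dt = 30 - x when the heater is on, dx/dt = -x when off.
    Assumed guards: switch off once x >= high, switch on once x <= low.
    """
    x = x0
    trace = []
    for _ in range(steps):
        # Continuous evolution within the current mode (forward Euler step).
        dx = (30.0 - x) if mode == "on" else -x
        x += dt * dx
        # Discrete transition, taken as soon as a guard condition holds.
        if mode == "on" and x >= high:
            mode = "off"
        elif mode == "off" and x <= low:
            mode = "on"
        trace.append((x, mode))
    return trace


if __name__ == "__main__":
    trace = simulate_thermostat()
    temps = [x for x, _ in trace]
    print(f"temperature stayed within [{min(temps):.2f}, {max(temps):.2f}]")
```

The logical part (the mode) decides which differential equation is active, and the continuous part (the temperature) decides when the mode changes. That two-way coupling is the interaction the abstract refers to.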
Quantized Feedback Stabilization of Linear Systems
IEEE Trans. Automat. Control, 2000
This paper addresses feedback stabilization problems for linear time-invariant control systems with saturating quantized measurements. We propose a new control design methodology, which relies on the possibility of changing the sensitivity of the quantizer while the system evolves. The equation that describes the evolution of the sensitivity with time (discrete rather than continuous in most cases) is interconnected with the given system (either continuous or discrete), resulting in a hybrid system. When applied to systems that are stabilizable by linear time-invariant feedback, this approach yields global asymptotic stability. Index Terms: Feedback stabilization, hybrid system, linear control system, quantized measurement. I. INTRODUCTION THIS PAPER deals with quantized feedback stabilization problems for linear time-invariant control systems. A quantizer, as defined here, acts as a functional that maps a real-valued function into a piecewise constant function taking on a finite...
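A saturating quantizer with adjustable sensitivity can be illustrated as follows. This is a minimal sketch under stated assumptions, not the paper's definition: it assumes a scalar, uniform quantizer with `2 * M + 1` levels, and the names `quantize`, `mu` (sensitivity), and `M` (range in levels) are hypothetical.

```python
import math


def quantize(x: float, mu: float, M: int) -> float:
    """Uniform quantizer with sensitivity mu that saturates at +/- M * mu.

    Rounds x to the nearest multiple of mu (ties rounded up), then clips
    the level index to the finite range [-M, M].
    """
    level = math.floor(x / mu + 0.5)
    level = max(-M, min(M, level))
    return mu * level


if __name__ == "__main__":
    x = 10.0
    # Fine sensitivity: the measurement saturates and says little about x.
    print(quantize(x, mu=0.5, M=5))   # 2.5 (saturated at M * mu)
    # Coarser sensitivity ("zooming out"): x is back inside the range.
    print(quantize(x, mu=4.0, M=5))   # 12.0
```

Increasing `mu` widens the range at the cost of resolution, and decreasing it does the reverse. Varying `mu` while the system runs is the kind of sensitivity adjustment the abstract describes.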
Conflict Resolution for Air Traffic Management: A Study in Multiagent Hybrid Systems
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 1998
Air Traffic Management (ATM) of the future allows for the possibility of free flight, in which aircraft choose their own optimal routes, altitudes, and velocities. The safe resolution of trajectory conflicts between aircraft is necessary to the success of such a distributed control system. In this paper, we present a method to synthesize provably safe conflict resolution maneuvers. The method models the aircraft and the maneuver as a hybrid control system and calculates the maximal set of safe initial conditions for each aircraft so that separation is assured in the presence of uncertainties in the actions of the other aircraft. Examples of maneuvers using both speed and heading changes are worked out in detail.
Supervisory Control of Families of Linear Set-Point Controllers - Part 2: Robustness
IEEE Trans. Automat. Contr., 1998
A simply structured high-level controller called a 'supervisor' has recently been proposed in [1] for the purpose of orchestrating the switching of a sequence of candidate set-point controllers into feedback with an imprecisely modeled SISO process so as to cause the output of the process to approach and track a constant reference input. The process is assumed to be modeled by a SISO linear system whose transfer function is in the union of a number of subclasses, each subclass being small enough so that one of the candidate controllers would solve the set-point tracking problem, were the process's transfer function to be one of the subclass's members. In [1] it is shown that in the absence of unmodelled process dynamics the proposed supervisor can successfully perform its function (i.e., achieve a zero steady-state tracking error) even if process disturbances are present, provided they are constant. This paper proves that without any further modification, the same supervisor can also perform this function in the face of norm-bounded unmodelled dynamics and moreover that none of the signals within the overall system can grow without bound in response to bounded disturbance and noise inputs, be they constant or not.
Systems with finite communication bandwidth constraints—I: State estimation problems
Stanford University, Stanford, CA, 1997
In this paper a new class of feedback control problems is introduced. Unlike classical models, the systems considered here have communication channel constraints. As a result, the issue of coding and communication protocol becomes an integral part of the analysis. Since these systems cannot be asymptotically stabilized if the underlying dynamics are unstable, a weaker stability concept called containability is introduced. A key result connects containability with an inequality involving the communication data rate and the rate of change of the state. Index Terms: Asymptotic stability, containability, feedback control, Kraft inequality.
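The Kraft inequality named in the index terms is the standard condition on codeword lengths of a prefix code: for lengths l_1, ..., l_n over an alphabet of size D, the sum of D^(-l_i) must not exceed 1. The Python sketch below only checks that textbook condition. How the paper relates it to containability is not stated in the abstract and is not modeled here, and the function names are hypothetical.

```python
from fractions import Fraction
from typing import Iterable


def kraft_sum(lengths: Iterable[int], alphabet_size: int = 2) -> Fraction:
    """Exact value of sum(D ** -l) over the given codeword lengths."""
    return sum(
        (Fraction(1, alphabet_size ** l) for l in lengths), Fraction(0)
    )


def satisfies_kraft(lengths: Iterable[int], alphabet_size: int = 2) -> bool:
    """True if some prefix code with these codeword lengths can exist."""
    return kraft_sum(lengths, alphabet_size) <= 1


if __name__ == "__main__":
    print(satisfies_kraft([1, 2, 3, 3]))  # True: e.g. 0, 10, 110, 111
    print(satisfies_kraft([1, 1, 2]))     # False: the sum is 5/4
```

Exact rational arithmetic is used so that the boundary case, where the sum equals 1, is not affected by floating-point rounding.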
Perspectives and Results on the Stability and Stabilizability of Hybrid Systems
PROCEEDINGS OF THE IEEE, 2000
This paper introduces the concept of a hybrid system and some of the challenges associated with the stability of such systems, including the issues of guaranteeing stability of switched stable systems and finding conditions for the existence of switched controllers for stabilizing switched unstable systems. In this endeavor, this paper surveys the major results in the (Lyapunov) stability of finite-dimensional hybrid systems and then discusses the stronger, more specialized results of switched linear (stable and unstable) systems. A section detailing how some of the results can be formulated as linear matrix inequalities is given. Stability analyses on the regulation of the angle of attack of an aircraft and on the PI control of a vehicle with an automatic transmission are given. Other examples are included to illustrate various results in this paper.
Controllers for Reachability Specifications for Hybrid Systems
Automatica, 1999
The problem of systematically synthesizing hybrid controllers which satisfy multiple control objectives is considered. We present a technique, based on the principles of optimal control, for determining the class of least restrictive controllers that satisfies the most important objective (which we refer to as safety). The system performance with respect to lower priority objectives (which we refer to as efficiency) can then be optimized within this class. We motivate our approach by showing how the proposed synthesis technique simplifies to well-known results from supervisory control and pursuit-evasion games when restricted to purely discrete and purely continuous systems, respectively. We then illustrate the application of this technique to two examples, one hybrid (the steam boiler benchmark problem), and one primarily continuous (a flight vehicle management system with discrete flight modes). 1 Introduction Hybrid systems, or systems that involve the interaction of discrete and co...
Dynamical Properties of Hybrid Automata
IEEE Transactions on Automatic Control, 2003
Hybrid automata provide a language for modeling and analyzing digital and analogue computations in real-time systems. Hybrid automata are studied here from a dynamical systems perspective. Necessary and sufficient conditions for existence and uniqueness of solutions are derived and a class of hybrid automata whose solutions depend continuously on the initial state is characterized. The results on existence, uniqueness, and continuity serve as a starting point for stability analysis. Lyapunov's theorem on stability via linearization and LaSalle's invariance principle are generalized to hybrid automata.
Stability theory for hybrid dynamical systems
IEEE Transactions on Automatic Control, 1998
Hybrid systems, which are capable of exhibiting simultaneously several kinds of dynamic behavior in different parts of a system (e.g., continuous-time dynamics, discrete-time dynamics, jump phenomena, switching and logic commands, and the like), are of great current interest. In the present paper we first formulate a model for hybrid dynamical systems which covers a very large class of systems and which is suitable for the qualitative analysis of such systems. Next, we introduce the notion of an invariant set (e.g., equilibrium) for hybrid dynamical systems and we define several types of (Lyapunov-like) stability concepts for an invariant set. We then establish sufficient conditions for uniform stability, uniform asymptotic stability, exponential stability, and instability of an invariant set of hybrid dynamical systems. Under some mild additional assumptions, we also establish necessary conditions for some of the above stability types (converse theorems). We also establish sufficient conditions for the uniform boundedness of the motions of hybrid dynamical systems (Lagrange stability). To demonstrate the applicability of the developed theory, we present specific examples of hybrid dynamical systems and conduct a stability analysis of some of them: a class of sampled-data feedback control systems with a nonlinear (continuous-time) plant and a linear (discrete-time) controller, and a class of systems with impulse effects. Index Terms: Asymptotic stability, boundedness, dynamical system, equilibrium, exponential stability, hybrid, hybrid dynamical