Results 1-10 of 135
Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning
Artificial Intelligence, 1999
Cited by 560 (37 self)
Learning, planning, and representing knowledge at multiple levels of temporal abstraction are key, longstanding challenges for AI. In this paper we consider how these challenges can be addressed within the mathematical framework of reinforcement learning and Markov decision processes (MDPs). We extend the usual notion of action in this framework to include options: closed-loop policies for taking action over a period of time. Examples of options include picking up an object, going to lunch, and traveling to a distant city, as well as primitive actions such as muscle twitches and joint torques. Overall, we show that options enable temporally abstract knowledge and action to be included in the reinforcement learning framework in a natural and general way. In particular, we show that options may be used interchangeably with primitive actions in planning methods such as dynamic programming and in learning methods such as Q-learning.
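The idea in this abstract, that an option can stand in for a primitive action inside Q-learning, can be illustrated with a minimal Python sketch. It is not taken from the paper: the `Option` container, the `run_option` and `smdp_q_update` helpers, and the five-state chain environment are all hypothetical names and assumptions chosen for illustration. The update shown is the usual SMDP-style rule, in which the bootstrap term is discounted by the number of primitive steps the option ran.

```python
from dataclasses import dataclass
from typing import Callable, Hashable

State = Hashable
Action = Hashable


@dataclass(frozen=True)
class Option:
    """A closed-loop policy plus a termination condition (hypothetical container)."""
    name: str
    policy: Callable[[State], Action]      # state -> primitive action
    terminates: Callable[[State], bool]    # True once the option should stop


def run_option(step, state, option, gamma, max_steps=1000):
    """Follow the option until it terminates.

    Returns (final_state, discounted_reward, number_of_primitive_steps).
    """
    total, discount, k = 0.0, 1.0, 0
    while k < max_steps:
        state, reward = step(state, option.policy(state))
        total += discount * reward
        discount *= gamma
        k += 1
        if option.terminates(state):
            break
    return state, total, k


def smdp_q_update(q, state, option_name, reward, k, next_state,
                  option_names, alpha, gamma):
    """One SMDP-style Q-learning update for an option that ran for k steps."""
    best_next = max(q.get((next_state, o), 0.0) for o in option_names)
    old = q.get((state, option_name), 0.0)
    q[(state, option_name)] = old + alpha * (reward + gamma ** k * best_next - old)
    return q[(state, option_name)]


# Assumed toy environment: a chain 0..4 with reward 1.0 on reaching state 4.
def chain_step(state, action):
    nxt = min(4, max(0, state + action))
    return nxt, (1.0 if nxt == 4 else 0.0)


go_right = Option("go_right", policy=lambda s: +1, terminates=lambda s: s == 4)
# A primitive action is the special case of an option that stops after one step.
step_right = Option("step_right", policy=lambda s: +1, terminates=lambda s: True)

q = {}
end_state, ret, k = run_option(chain_step, 0, go_right, gamma=0.9)
new_value = smdp_q_update(q, 0, "go_right", ret, k, end_state,
                          ["go_right", "step_right"], alpha=0.5, gamma=0.9)
```

Because `step_right` is just an option that always terminates after one step, the same update covers both primitive actions and temporally extended behavior, which is the interchangeability the abstract refers to.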
Stability theory for hybrid dynamical systems
IEEE Transactions on Automatic Control, 1998
Cited by 135 (8 self)
Hybrid systems which are capable of exhibiting simultaneously several kinds of dynamic behavior in different parts of a system (e.g., continuous-time dynamics, discrete-time dynamics, jump phenomena, switching and logic commands, and the like) are of great current interest. In the present paper we first formulate a model for hybrid dynamical systems which covers a very large class of systems and which is suitable for the qualitative analysis of such systems. Next, we introduce the notion of an invariant set (e.g., equilibrium) for hybrid dynamical systems and we define several types of (Lyapunov-like) stability concepts for an invariant set. We then establish sufficient conditions for uniform stability, uniform asymptotic stability, exponential stability, and instability of an invariant set of hybrid dynamical systems. Under some mild additional assumptions, we also establish necessary conditions for some of the above stability types (converse theorems). In addition to the above, we also establish sufficient conditions for the uniform boundedness of the motions of hybrid dynamical systems (Lagrange stability). To demonstrate the applicability of the developed theory, we present specific examples of hybrid dynamical systems and we conduct a stability analysis of some of these examples (a class of sampled-data feedback control systems with a nonlinear (continuous-time) plant and a linear (discrete-time) controller, and a class of systems with impulse effects). Index Terms: Asymptotic stability, boundedness, dynamical system, equilibrium, exponential stability, hybrid, hybrid dynamical
Effective Synthesis of Switching Controllers for Linear Systems
2000
Cited by 108 (8 self)
In this work we suggest a novel methodology for synthesizing switching controllers for continuous and hybrid systems whose dynamics are defined by linear differential equations. We formulate the synthesis problem as finding the conditions upon which a controller should switch the behavior of the system from one "mode" to another in order to avoid a set of bad states, and propose an abstract algorithm which solves the problem by an iterative computation of reachable states. We have implemented a concrete version of the algorithm, which uses a new approximation scheme for reachability analysis of linear systems.
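The abstract describes an iterative computation over sets of states that yields the conditions under which a controller should switch modes to avoid bad states. The paper works with linear differential equations and an approximation scheme for reachability; the Python sketch below is only a finite-state analogue of that fixed-point idea, under the assumption that the system has already been abstracted to a finite set of states. The function name `safe_switching_set` and the toy transition table are hypothetical.

```python
def safe_switching_set(states, modes, successors, bad):
    """Largest set of states from which some mode choice always avoids `bad`.

    successors(state, mode) returns the set of possible next states; an empty
    set is taken to mean the mode is unavailable in that state (an assumption
    of this sketch).
    Returns (safe_states, allowed_modes_per_safe_state).
    """
    safe = set(states) - set(bad)
    while True:
        allowed = {}
        for s in safe:
            # Keep a mode only if every possible successor stays in the safe set.
            ok = {m for m in modes
                  if successors(s, m) and successors(s, m) <= safe}
            if ok:
                allowed[s] = ok
        if set(allowed) == safe:      # fixed point reached
            return safe, allowed
        safe = set(allowed)           # discard states with no safe mode


# Assumed toy abstraction: four states, two modes, state 3 is bad.
toy_table = {
    (0, "a"): {0}, (0, "b"): {1},
    (1, "a"): {2}, (1, "b"): {3},
    (2, "a"): {3}, (2, "b"): {3},
    (3, "a"): {3}, (3, "b"): {3},
}
safe_states, switching_rule = safe_switching_set(
    states=range(4),
    modes=["a", "b"],
    successors=lambda s, m: toy_table.get((s, m), set()),
    bad={3},
)
```

The returned `switching_rule` maps each safe state to the modes that keep the system safe, which plays the role of the switching conditions in this simplified setting.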
Differential Dynamic Logic for Hybrid Systems
2007
Cited by 76 (44 self)
Hybrid systems are models for complex physical systems and are defined as dynamical systems with interacting discrete transitions and continuous evolutions along differential equations. With the goal of developing a theoretical and practical foundation for deductive verification of hybrid systems, we introduce a dynamic logic for hybrid programs, which is a program notation for hybrid systems. As a verification technique that is suitable for automation, we introduce a free variable proof calculus with a novel combination of real-valued free variables and Skolemisation for lifting quantifier elimination for real arithmetic to dynamic logic. The calculus is compositional, i.e., it reduces properties of hybrid programs to properties of their parts. Our main result proves that this calculus axiomatises the transition behaviour of hybrid systems completely relative to differential equations. In a case study with cooperating traffic agents of the European Train Control System, we further show that our calculus is well-suited for verifying realistic hybrid systems with parametric system dynamics.
Bisimilar Linear Systems
2001
Cited by 74 (11 self)
The notion of bisimulation in theoretical computer science is one of the main complexity reduction methods for the analysis and synthesis of labeled transition systems. Bisimulations are special quotients of the state space that preserve many important properties expressible in temporal logics, and, in particular, reachability. In this paper, the framework of bisimilar transition systems is applied to various transition systems that are generated by linear control systems. Given a discrete-time or continuous-time linear system, and a finite observation map, we characterize linear quotient maps that result in quotient transition systems that are bisimilar to the original system. Interestingly, the characterizations for discrete-time systems are more restrictive than for continuous-time systems, due to the existence of an atomic time step. We show that computing the coarsest bisimulation, which results in maximum complexity reduction, corresponds to computing the maximal controlled or reachability invariant subspace inside the kernel of the observations map. These results establish strong connections between complexity reduction concepts in control theory and computer science.
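For readers unfamiliar with the computer-science side of this connection, the coarsest bisimulation of a finite transition system can be computed by partition refinement. The Python sketch below shows that finite-state procedure only; it is not the linear-algebraic characterization developed in the paper, and the function name `coarsest_bisimulation`, the unlabeled transition relation, and the toy system are assumptions made for illustration.

```python
def coarsest_bisimulation(states, successors, observe):
    """Coarsest partition that respects observations and transitions.

    successors: dict mapping a state to the set of its successor states.
    observe:    function mapping a state to its observation.
    Returns a set of frozensets, one per equivalence class.
    """
    states = list(states)
    # Start from the partition induced by the observation map.
    block = {s: observe(s) for s in states}
    while True:
        # Two states stay together only if they are in the same block and
        # can reach exactly the same set of blocks in one step.
        signature = {
            s: (block[s], frozenset(block[t] for t in successors.get(s, ())))
            for s in states
        }
        ids = {}
        refined = {s: ids.setdefault(signature[s], len(ids)) for s in states}
        if len(ids) == len(set(block.values())):
            break                      # no block was split: partition is stable
        block = refined
    groups = {}
    for s in states:
        groups.setdefault(refined[s], set()).add(s)
    return {frozenset(g) for g in groups.values()}


# Assumed toy system: states 1 and 2 both step directly to the goal state 3.
toy_successors = {0: {1}, 1: {3}, 2: {3}, 3: {3}}
partition = coarsest_bisimulation(
    range(4), toy_successors, lambda s: "goal" if s == 3 else "other"
)
```

In the toy system, states 1 and 2 end up in the same class because they carry the same observation and have identical one-step behavior, while state 0 is split off because it cannot reach the goal in one step.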
Specification and Verification of Dynamics in Agent Models
Cited by 65 (51 self)
Within many domains, among which biological, cognitive, and social areas, multiple interacting processes occur among agents with dynamics that are hard to handle. This paper presents the predicate logical Temporal Trace Language (TTL) for the formal specification and analysis of dynamic properties of agents and multi-agent systems. This language supports the specification of both qualitative and quantitative aspects, and therefore subsumes specification languages based on differential equations and qualitative, logical approaches. A software environment has been developed for TTL, which supports editing TTL properties and enables the formal verification of properties against a set of traces. The TTL environment proved its value in a number of projects within different biological, cognitive and social domains.
Temporal Abstraction in Reinforcement Learning
2000
Cited by 64 (2 self)
Decision making usually involves choosing among different courses of action over a broad range of time scales. For instance, a person planning a trip to a distant location makes high-level decisions regarding what means of transportation to use, but also chooses low-level actions, such as the movements for getting into a car. The problem of picking an appropriate time scale for reasoning and learning has been explored in artificial intelligence, control theory and robotics. In this dissertation we develop a framework that allows novel solutions to this problem, in the context of Markov Decision Processes (MDPs) and reinforcement learning. In this dissertation, we present a general framework for prediction, control and learning at multipl...
Between MDPs and semi-MDPs: Learning, planning, and representing knowledge at multiple temporal scales
Journal of Artificial Intelligence Research, 1998
Cited by 63 (7 self)
Learning, planning, and representing knowledge at multiple levels of temporal abstraction are key challenges for AI. In this paper we develop an approach to these problems based on the mathematical framework of reinforcement learning and Markov decision processes (MDPs). We extend the usual notion of action to include options: whole courses of behavior that may be temporally extended, stochastic, and contingent on events. Examples of options include picking up an object, going to lunch, and traveling to a distant city, as well as primitive actions such as muscle twitches and joint torques. Options may be given a priori, learned by experience, or both. They may be used interchangeably with actions in a variety of planning and learning methods. The theory of semi-Markov decision processes (SMDPs) can be applied to model the consequences of options and as a basis for planning and learning methods using them. In this paper we develop these connections, building on prior work by Bradtke and Duff (1995), Parr (in prep.) and others. Our main novel results concern the interface between the MDP and SMDP levels of analysis. We show how a set of options can be altered by changing only their termination conditions
Computing differential invariants of hybrid systems as fixedpoints
2008
Cited by 57 (20 self)
We introduce a fixedpoint algorithm for verifying safety properties of hybrid systems with differential equations whose right-hand sides are polynomials in the state variables. In order to verify nontrivial systems without solving their differential equations and without numerical errors, we use a continuous generalization of induction, for which our algorithm computes the required differential invariants. As a means for combining local differential invariants into global system invariants in a sound way, our fixedpoint algorithm works with a compositional verification logic for hybrid systems. To improve the verification power, we further introduce a saturation procedure that refines the system dynamics successively with differential invariants until safety becomes provable. By complementing our symbolic verification algorithm with a robust version of numerical falsification, we obtain a fast and sound verification procedure. We verify roundabout maneuvers in air traffic management and collision avoidance in train control.