Results 1-10 of 137
Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning
, 1999
Cited by 569 (38 self)
Learning, planning, and representing knowledge at multiple levels of temporal abstraction are key, longstanding challenges for AI. In this paper we consider how these challenges can be addressed within the mathematical framework of reinforcement learning and Markov decision processes (MDPs). We extend the usual notion of action in this framework to include options: closed-loop policies for taking action over a period of time. Examples of options include picking up an object, going to lunch, and traveling to a distant city, as well as primitive actions such as muscle twitches and joint torques. Overall, we show that options enable temporally abstract knowledge and action to be included in the reinforcement learning framework in a natural and general way. In particular, we show that options may be used interchangeably with primitive actions in planning methods such as dynamic programming and in learning methods such as Q-learning. Formally, a set of options defined
Stability theory for hybrid dynamical systems
 IEEE Transactions on Automatic Control
, 1998
Cited by 135 (8 self)
Hybrid systems which are capable of exhibiting simultaneously several kinds of dynamic behavior in different parts of a system (e.g., continuous-time dynamics, discrete-time dynamics, jump phenomena, switching and logic commands, and the like) are of great current interest. In the present paper we first formulate a model for hybrid dynamical systems which covers a very large class of systems and which is suitable for the qualitative analysis of such systems. Next, we introduce the notion of an invariant set (e.g., equilibrium) for hybrid dynamical systems and we define several types of (Lyapunov-like) stability concepts for an invariant set. We then establish sufficient conditions for uniform stability, uniform asymptotic stability, exponential stability, and instability of an invariant set of hybrid dynamical systems. Under some mild additional assumptions, we also establish necessary conditions for some of the above stability types (converse theorems). In addition to the above, we also establish sufficient conditions for the uniform boundedness of the motions of hybrid dynamical systems (Lagrange stability). To demonstrate the applicability of the developed theory, we present specific examples of hybrid dynamical systems and we conduct a stability analysis of some of these examples (a class of sampled-data feedback control systems with a nonlinear (continuous-time) plant and a linear (discrete-time) controller, and a class of systems with impulse effects). Index Terms: Asymptotic stability, boundedness, dynamical system, equilibrium, exponential stability, hybrid, hybrid dynamical
Effective Synthesis of Switching Controllers for Linear Systems
, 2000
Cited by 110 (8 self)
In this work we suggest a novel methodology for synthesizing switching controllers for continuous and hybrid systems whose dynamics are defined by linear differential equations. We formulate the synthesis problem as finding the conditions upon which a controller should switch the behavior of the system from one "mode" to another in order to avoid a set of bad states, and propose an abstract algorithm which solves the problem by an iterative computation of reachable states. We have implemented a concrete version of the algorithm, which uses a new approximation scheme for reachability analysis of linear systems.
Differential Dynamic Logic for Hybrid Systems
, 2007
Cited by 78 (46 self)
Hybrid systems are models for complex physical systems and are defined as dynamical systems with interacting discrete transitions and continuous evolutions along differential equations. With the goal of developing a theoretical and practical foundation for deductive verification of hybrid systems, we introduce a dynamic logic for hybrid programs, which is a program notation for hybrid systems. As a verification technique that is suitable for automation, we introduce a free variable proof calculus with a novel combination of real-valued free variables and Skolemisation for lifting quantifier elimination for real arithmetic to dynamic logic. The calculus is compositional, i.e., it reduces properties of hybrid programs to properties of their parts. Our main result proves that this calculus axiomatises the transition behaviour of hybrid systems completely relative to differential equations. In a case study with cooperating traffic agents of the European Train Control System, we further show that our calculus is well-suited for verifying realistic hybrid systems with parametric system dynamics.
Bisimilar Linear Systems
, 2001
Cited by 68 (14 self)
The notion of bisimulation in theoretical computer science is one of the main complexity reduction methods for the analysis and synthesis of labeled transition systems. Bisimulations are special quotients of the state space that preserve many important properties expressible in temporal logics, and, in particular, reachability. In this paper, the framework of bisimilar transition systems is applied to various transition systems that are generated by linear control systems. Given a discrete-time or continuous-time linear system, and a finite observation map, we characterize linear quotient maps that result in quotient transition systems that are bisimilar to the original system. Interestingly, the characterizations for discrete-time systems are more restrictive than for continuous-time systems, due to the existence of an atomic time step. We show that computing the coarsest bisimulation, which results in maximum complexity reduction, corresponds to computing the maximal controlled or reachability invariant subspace inside the kernel of the observations map. These results establish strong connections between complexity reduction concepts in control theory and computer science.
Temporal Abstraction in Reinforcement Learning
, 2000
Cited by 65 (2 self)
Decision making usually involves choosing among different courses of action over a broad range of time scales. For instance, a person planning a trip to a distant location makes high-level decisions regarding what means of transportation to use, but also chooses low-level actions, such as the movements for getting into a car. The problem of picking an appropriate time scale for reasoning and learning has been explored in artificial intelligence, control theory and robotics. In this dissertation we develop a framework that allows novel solutions to this problem, in the context of Markov Decision Processes (MDPs) and reinforcement learning. In this dissertation, we present a general framework for prediction, control and learning at multipl...
Specification and Verification of Dynamics in Agent Models
Cited by 64 (50 self)
Within many domains, among which biological, cognitive, and social areas, multiple interacting processes occur among agents with dynamics that are hard to handle. This paper presents the predicate logical Temporal Trace Language (TTL) for the formal specification and analysis of dynamic properties of agents and multiagent systems. This language supports the specification of both qualitative and quantitative aspects, and therefore subsumes specification languages based on differential equations and qualitative, logical approaches. A software environment has been developed for TTL, which supports editing TTL properties and enables the formal verification of properties against a set of traces. The TTL environment proved its value in a number of projects within different biological, cognitive and social domains.
Between MDPs and semi-MDPs: Learning, planning, and representing knowledge at multiple temporal scales
 Journal of Artificial Intelligence Research
, 1998
Cited by 63 (7 self)
Learning, planning, and representing knowledge at multiple levels of temporal abstraction are key challenges for AI. In this paper we develop an approach to these problems based on the mathematical framework of reinforcement learning and Markov decision processes (MDPs). We extend the usual notion of action to include options: whole courses of behavior that may be temporally extended, stochastic, and contingent on events. Examples of options include picking up an object, going to lunch, and traveling to a distant city, as well as primitive actions such as muscle twitches and joint torques. Options may be given a priori, learned by experience, or both. They may be used interchangeably with actions in a variety of planning and learning methods. The theory of semi-Markov decision processes (SMDPs) can be applied to model the consequences of options and as a basis for planning and learning methods using them. In this paper we develop these connections, building on prior work by Bradtke and Duff (1995), Parr (in prep.) and others. Our main novel results concern the interface between the MDP and SMDP levels of analysis. We show how a set of options can be altered by changing only their termination conditions
Automotive engine control and hybrid systems: challenges and opportunities
 PROCEEDINGS OF THE IEEE
, 2000
Cited by 60 (16 self)
The design of engine control systems has been traditionally carried out using a mix of heuristic techniques validated by simulation and prototyping using approximate average-value models. However, the ever increasing demands on passengers' comfort, safety, emissions, and fuel consumption imposed by car manufacturers and regulations call for more robust techniques and the use of cycle-accurate models. We argue that these models must be hybrid because of the combination of time-domain and event-based behaviors. In this paper, we present a hybrid model of the engine in which both continuous and discrete time-domain as well as event-based phenomena are modeled in a separate but integrated manner. Based on this model, we formalize the specification of the overall engine control by defining a number of hybrid control problems. To cope with the difficulties arising in the design of hybrid controllers, a design methodology is proposed. This methodology consists of a relaxation