Results 1–10 of 16
A Heuristic Search Approach to Planning with Continuous Resources in Stochastic Domains
"... We consider the problem of optimal planning in stochastic domains with resource constraints, where the resources are continuous and the choice of action at each step depends on resource availability. We introduce the HAO * algorithm, a generalization of the AO * algorithm that performs search in a h ..."
Abstract

Cited by 8 (0 self)
We consider the problem of optimal planning in stochastic domains with resource constraints, where the resources are continuous and the choice of action at each step depends on resource availability. We introduce the HAO* algorithm, a generalization of the AO* algorithm that performs search in a hybrid state space that is modeled using both discrete and continuous state variables, where the continuous variables represent monotonic resources. Like other heuristic search algorithms, HAO* leverages knowledge of the start state and an admissible heuristic to focus computational effort on those parts of the state space that could be reached from the start state by following an optimal policy. We show that this approach is especially effective when resource constraints limit how much of the state space is reachable. Experimental results demonstrate its effectiveness in the domain that motivates our research: automated planning for planetary exploration rovers.
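The resource-bounded pruning this abstract describes can be sketched as a best-first search over (state, remaining-resource) pairs that discards any branch whose admissible estimate of the resource still needed exceeds what remains. The routine, its names, and the toy chain problem in the usage note are illustrative assumptions, not the paper's HAO* implementation:

```python
import heapq

def resource_bounded_search(start, goal, neighbors, h_resource, budget):
    """Hypothetical sketch: best-first search that prunes branches which an
    admissible resource heuristic proves cannot reach the goal.
    neighbors(s) -> iterable of (next_state, resource_cost)."""
    frontier = [(h_resource(start), start, budget)]
    best = {start: budget}  # most resource ever left on arrival at a state
    while frontier:
        _, s, left = heapq.heappop(frontier)
        if s == goal:
            return budget - left  # resource actually consumed
        for nxt, cost in neighbors(s):
            remaining = left - cost
            # Prune: even an optimistic estimate of the resource still
            # needed exceeds what would remain after taking this action.
            if remaining < h_resource(nxt):
                continue
            if best.get(nxt, -1) < remaining:
                best[nxt] = remaining
                heapq.heappush(frontier, (h_resource(nxt), nxt, remaining))
    return None  # goal unreachable within the resource budget
```

On a chain 0 → 1 → 2 → 3 with unit resource cost per step and the exact heuristic h(s) = 3 − s, a budget of 3 reaches the goal while a budget of 2 is pruned at the first expansion.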
Strategic Advice Provision in Repeated Human-Agent Interactions
"... This paper addresses the problem of automated advice provision in settings that involve repeated interactions between people and computer agents. This problem arises in many real world applications such as route selection systems and office assistants. To succeed in such settings agents must reason ..."
Abstract

Cited by 7 (6 self)
This paper addresses the problem of automated advice provision in settings that involve repeated interactions between people and computer agents. This problem arises in many real-world applications such as route selection systems and office assistants. To succeed in such settings agents must reason about how their actions in the present influence people’s future actions. This work models such settings as a family of repeated bilateral games of incomplete information called “choice selection processes”, in which players may share certain goals, but are essentially self-interested. The paper describes several possible models of human behavior that were inspired by behavioral economic theories of people’s play in repeated interactions. These models were incorporated into several agent designs to repeatedly generate offers to people playing the game. These agents were evaluated in extensive empirical investigations including hundreds of subjects that interacted with computers in different choice selection processes. The results revealed that an agent that combined a hyperbolic discounting model of human behavior with a social utility function was able to outperform alternative agent designs, including an agent that approximated the optimal strategy using continuous MDPs and an agent using epsilon-greedy strategies to describe people’s behavior. We show that this approach was able to generalize to new people as well as choice selection processes that were not used for training. Our results demonstrate that combining computational approaches with behavioral economics models of people in repeated interactions facilitates the design of advice provision strategies for a large class of real-world settings.
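The behavioral-economics component mentioned above can be illustrated with a minimal sketch of hyperbolic discounting combined with a social utility term. The functional forms and the parameter names k and alpha are assumptions for illustration, not the paper's fitted model:

```python
def hyperbolic_discount(value, delay, k=1.0):
    """Perceived value of a payoff received `delay` rounds in the future;
    hyperbolic decay falls off faster than exponential at short delays."""
    return value / (1.0 + k * delay)

def social_utility(own_payoff, other_payoff, alpha=0.3):
    """Selfish payoff plus a weighted concern for the other player's payoff."""
    return own_payoff + alpha * other_payoff
```

With k = 1, a payoff of 10 is worth 10 now but only 5 one round later, which captures the steep near-term impatience hyperbolic models are chosen for.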
Symbolic dynamic programming for discrete and continuous state MDPs
 In UAI 2011
, 2011
"... Many realworld decisiontheoretic planning problems can be naturally modeled with discrete and continuous state Markov decision processes (DCMDPs). While previous work has addressed automated decisiontheoretic planning for DCMDPs, optimal solutions have only been defined so far for limited setti ..."
Abstract

Cited by 4 (3 self)
Many real-world decision-theoretic planning problems can be naturally modeled with discrete and continuous state Markov decision processes (DC-MDPs). While previous work has addressed automated decision-theoretic planning for DC-MDPs, optimal solutions have only been defined so far for limited settings, e.g., DC-MDPs having hyper-rectangular piecewise linear value functions. In this work, we extend symbolic dynamic programming (SDP) techniques to provide optimal solutions for a vastly expanded class of DC-MDPs. To address the inherent combinatorial aspects of SDP, we introduce the XADD — a continuous variable extension of the algebraic decision diagram (ADD) — that maintains compact representations of the exact value function. Empirically, we demonstrate an implementation of SDP with XADDs on various DC-MDPs, showing the first optimal automated solutions to DC-MDPs with linear and nonlinear piecewise partitioned value functions and showing the advantages of constraint-based pruning for XADDs.
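The case-statement view of a piecewise value function that structures like the XADD represent compactly can be sketched as follows. The list-of-cases representation and the pairwise-intersection max below are a naive illustration under assumed toy partitions; a real XADD shares substructure in a decision diagram and keeps the partitions symbolic:

```python
def case_max(f_cases, g_cases):
    """Pointwise max of two piecewise functions, each given as a list of
    (test, value_fn) cases; the result's partitions are the pairwise
    intersections of the inputs' partitions."""
    out = []
    for tf, vf in f_cases:
        for tg, vg in g_cases:
            out.append((lambda x, tf=tf, tg=tg: tf(x) and tg(x),
                        lambda x, vf=vf, vg=vg: max(vf(x), vg(x))))
    return out

def evaluate(cases, x):
    """Return the value of the first case whose test covers x."""
    for test, value in cases:
        if test(x):
            return value(x)
    raise ValueError("x falls outside every partition")

# Illustrative piecewise functions: |x| and the constant 1.
abs_cases = [(lambda x: x < 0, lambda x: -x),
             (lambda x: x >= 0, lambda x: x)]
one_cases = [(lambda x: True, lambda x: 1.0)]
```

`evaluate(case_max(abs_cases, one_cases), x)` then computes max(|x|, 1) case by case, which is the operation a symbolic backup performs over whole partitions at once.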
Stochastic Model Predictive Control of Time-Variant Nonlinear Systems with Imperfect State Information
"... Abstract — In many technical systems, the system state, which is to be controlled, is not directly accessible, but has to be estimated from observations. Furthermore, the uncertainties arising from this procedure are typically neglected in the controller. To remedy this deficiency, in this paper, we ..."
Abstract

Cited by 3 (3 self)
In many technical systems, the system state, which is to be controlled, is not directly accessible, but has to be estimated from observations. Furthermore, the uncertainties arising from this procedure are typically neglected in the controller. To remedy this deficiency, in this paper we present a novel approach to stochastic nonlinear model predictive control (NMPC) for heavily noise-affected systems with not directly accessible, i.e., hidden, states, extending the stochastic NMPC framework presented in [1]. An important property of our novel method is that, in contrast to classical approaches, time-variant system and measurement equations as well as time-variant step rewards can be considered. By extending the techniques from [1] with virtual future observations and combining this with a novel tree search algorithm, called probabilistic branch-and-bound search (PBAB), a solution to this challenging problem with feasible computational demand becomes possible.
Symbolic Dynamic Programming for Continuous State and Action MDPs
"... Many realworld decisiontheoretic planning problems are naturally modeled using both continuous state and action (CSA) spaces, yet little work has provided exact solutions for the case of continuous actions. In this work, we propose a symbolic dynamic programming (SDP) solution to obtain the optima ..."
Abstract

Cited by 2 (1 self)
Many real-world decision-theoretic planning problems are naturally modeled using both continuous state and action (CSA) spaces, yet little work has provided exact solutions for the case of continuous actions. In this work, we propose a symbolic dynamic programming (SDP) solution to obtain the optimal closed-form value function and policy for CSA-MDPs with multivariate continuous state and actions, discrete noise, piecewise linear dynamics, and piecewise linear (or restricted piecewise quadratic) reward. Our key contribution over previous SDP work is to show how the continuous action maximization step in the dynamic programming backup can be evaluated optimally and symbolically — a task which amounts to symbolic constrained optimization subject to unknown state parameters; we further integrate this technique to work with an efficient and compact data structure for SDP — the extended algebraic decision diagram (XADD). We demonstrate empirical results on a didactic nonlinear planning example and two domains from operations research to show the first automated exact solution to these problems.
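The closed-form action maximization described above rests on a simple fact that can be sketched directly: a function linear in the action attains its maximum over a closed interval at an endpoint, so each linear case can be resolved exactly rather than by discretizing the action. The function and its parameter names are illustrative assumptions, not the paper's XADD operation:

```python
def max_linear_over_interval(slope, intercept, lo, hi):
    """Argmax and max of slope * a + intercept for action a in [lo, hi]:
    a linear function is maximized at an endpoint of the interval."""
    a_star = hi if slope >= 0 else lo
    return a_star, slope * a_star + intercept
```

In the symbolic setting the slope and intercept are expressions over the (unknown) state, so the endpoint comparison itself becomes a case split rather than a numeric test.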
Improving Adjustable Autonomy Strategies for Time-Critical Domains
"... As agents begin to perform complex tasks alongside humans as collaborative teammates, it becomes crucial that the resulting humanmultiagent teams adapt to timecritical domains. In such domains, adjustable autonomy has proven useful by allowing for a dynamic transfer of control of decision making be ..."
Abstract

Cited by 2 (1 self)
As agents begin to perform complex tasks alongside humans as collaborative teammates, it becomes crucial that the resulting human-multiagent teams adapt to time-critical domains. In such domains, adjustable autonomy has proven useful by allowing for a dynamic transfer of control of decision making between human and agents. However, existing adjustable autonomy algorithms commonly discretize time, which not only results in high algorithm runtimes but also translates into inaccurate transfer-of-control policies. In addition, existing techniques fail to address decision-making inconsistencies often encountered in human-multiagent decision making. To address these limitations, we present a novel approach for Resolving Inconsistencies in Adjustable Autonomy in Continuous Time (RIAACT) that makes three contributions: First, we apply a continuous-time planning paradigm to adjustable autonomy, resulting in high-accuracy transfer-of-control policies. Second, our new adjustable autonomy framework both models and plans for the resolving of inconsistencies between human and agent decisions. Third, we introduce a new model, the Interruptible Action Time-dependent Markov Decision Problem (IATMDP), which allows for actions to be interrupted at any point in continuous time. We show how to solve IATMDPs efficiently and leverage them to plan for the resolving of inconsistencies in RIAACT. Furthermore, these contributions have been realized and evaluated in a complex disaster response simulation system.
ALARMS: Alerting and Reasoning Management System for Next Generation Aircraft Hazards
"... The Next Generation Air Transportation System will introduce new, advanced sensor technologies into the cockpit. With the introduction of such systems, the responsibilities of the pilot are expected to dramatically increase. In the ALARMS (ALerting And Reasoning Management System) project for NASA, ..."
Abstract

Cited by 1 (0 self)
The Next Generation Air Transportation System will introduce new, advanced sensor technologies into the cockpit. With the introduction of such systems, the responsibilities of the pilot are expected to dramatically increase. In the ALARMS (ALerting And Reasoning Management System) project for NASA, we focus on a key challenge of this environment: the quick and efficient handling of aircraft sensor alerts. It is infeasible to alert the pilot to the state of all subsystems at all times. Furthermore, there is uncertainty as to the true hazard state despite the evidence of the alerts, and there is uncertainty as to the effect and duration of actions taken to address these alerts.
Planning in Hybrid Structured Stochastic Domains
, 2006
"... Efficient representations and solutions for large structured decision problems with continuous and discrete variables are among the important challenges faced by the designers of automated decision support systems. In this work, we describe a novel hybrid factored Markov decision process (MDP) mod ..."
Abstract

Cited by 1 (0 self)
Efficient representations and solutions for large structured decision problems with continuous and discrete variables are among the important challenges faced by the designers of automated decision support systems. In this work, we describe a novel hybrid factored Markov decision process (MDP) model that allows for a compact representation of these problems, and a hybrid approximate linear programming (HALP) framework that permits their efficient solutions. The central idea of HALP is to approximate the optimal value function of an MDP by a linear combination of basis functions and optimize its weights by linear programming. We study both theoretical and practical aspects of this approach, and demonstrate its scale-up potential on several hybrid optimization problems.
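The approximation at the heart of HALP, writing the value function as a weighted sum of basis functions, can be sketched as below. The polynomial basis and the hand-picked weights are assumptions for illustration; HALP itself optimizes the weights with a linear program, which this sketch does not reproduce:

```python
def approx_value(weights, basis, x):
    """V(x) ~= sum_i w_i * phi_i(x), a linear-in-weights approximation."""
    return sum(w * phi(x) for w, phi in zip(weights, basis))

# Assumed polynomial basis and illustrative weights, not fitted values.
basis = [lambda x: 1.0, lambda x: x, lambda x: x * x]
weights = [0.5, -1.0, 2.0]
```

Because V is linear in the weights, the Bellman constraints of the MDP become linear constraints on w, which is what makes the linear-programming formulation possible.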
Stochastic Optimal Control based on Value-Function Approximation using Sinc Interpolation
"... Abstract: An efficient approach for solving stochastic optimal control problems is to employ dynamic programming (DP). For continuousvalued nonlinear systems, the corresponding DP recursion generally cannot be solved in closed form. Thus, a typical approach is to discretize the DP value functions i ..."
Abstract

Cited by 1 (1 self)
An efficient approach for solving stochastic optimal control problems is to employ dynamic programming (DP). For continuous-valued nonlinear systems, the corresponding DP recursion generally cannot be solved in closed form. Thus, a typical approach is to discretize the DP value functions in order to be able to carry out the calculation. Especially for multidimensional systems, either a large number of discretization points is necessary or the quality of approximation degrades. This problem can be alleviated by interpolating the discretized value function. In this paper, we present an approach based on optimal low-pass interpolation employing sinc functions (sine cardinal). For the important case of systems with Gaussian mixture noise (including the special case of Gaussian noise), we show how the calculations required for this approach, especially the nontrivial calculation of the expected value of a Gaussian mixture random variable transformed by a sinc function, can be carried out analytically. We illustrate the effectiveness of the proposed interpolation scheme with an example from the field of Stochastic Nonlinear Model Predictive Control (SNMPC).
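The interpolation scheme itself can be sketched in a few lines: a value function known only at uniformly spaced grid points x_k = k * delta is reconstructed as V(x) ≈ Σ_k V(x_k) sinc((x − x_k)/delta). The grid and sample values below are illustrative; the paper's actual contribution, the analytic expectation of this interpolant under Gaussian mixture noise, is not reproduced here:

```python
import math

def sinc(t):
    """Normalized sinc: sin(pi t) / (pi t), with sinc(0) = 1."""
    return 1.0 if t == 0.0 else math.sin(math.pi * t) / (math.pi * t)

def sinc_interpolate(samples, delta, x):
    """Whittaker-Shannon interpolation; samples[k] is the value at k * delta."""
    return sum(v * sinc((x - k * delta) / delta)
               for k, v in enumerate(samples))
```

At the grid points the interpolant reproduces the samples exactly, since the normalized sinc vanishes at every nonzero integer.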
Making Good Decisions Quickly
"... Abstract—Several disciplines, including artificial intelligence, operations research and many others, study how to make good decisions. In this overview article, we argue that the key to making progress in our research area is to combine their ideas, which often requires serious technical advances t ..."
Abstract
Several disciplines, including artificial intelligence, operations research and many others, study how to make good decisions. In this overview article, we argue that the key to making progress in our research area is to combine their ideas, which often requires serious technical advances to reconcile their different assumptions and methods in a way that results in synergy among them. To illustrate this point, we give a broad overview of our ongoing research on search and planning (with a large number of students and colleagues, both at the University of Southern California and elsewhere) to demonstrate how to combine ideas from different decision-making disciplines. For example, we describe how to combine ideas from artificial intelligence, operations research, and utility theory to create the foundations for building decision support systems that fit the risk preferences of human decision makers in high-stake one-shot decision situations better than current systems. We also describe how to combine ideas from artificial intelligence, economics, theoretical computer science and operations research to build teams of robots that use auctions to distribute tasks autonomously among themselves, and give several more examples.
Index Terms—agents, ant robotics, artificial intelligence, auction-based coordination, decision theory, dynamic programming, economics, free-space assumption, goal-directed navigation, greedy online planning, heuristic search, high-stake one-shot decision making, incremental heuristic search, Markov decision processes, multiagent systems, nonlinear utility functions, operations research, planning, real-time heuristic search, reinforcement learning, risk preferences, robotics, scarce resources, sequential single-item auctions, terrain coverage, utility theory.