Results 1–10 of 63
LQG-MP: Optimized Path Planning for Robots with Motion Uncertainty and Imperfect State Information
, 2010
Abstract

Cited by 46 (12 self)
This paper presents LQG-MP (linear-quadratic Gaussian motion planning), a new approach to robot motion planning that takes into account the sensors and the controller that will be used during execution of the robot’s path. LQG-MP is based on the linear-quadratic controller with Gaussian models of uncertainty, and explicitly characterizes in advance (i.e., before execution) the a priori probability distributions of the state of the robot along its path. These distributions can be used to assess the quality of the path, for instance by computing the probability of avoiding collisions. Many methods can be used to generate the needed ensemble of candidate paths from which the best path is selected; in this paper we report results using Rapidly-exploring Random Trees (RRTs). We study the performance of LQG-MP with simulation experiments in three scenarios: A) a kinodynamic car-like robot, B) multi-robot planning with differential-drive robots, and C) a 6-DOF serial manipulator. We also apply Kalman smoothing to make paths C^k-continuous while avoiding obstacles, and apply LQG-MP to precomputed roadmaps using a variant of Dijkstra’s algorithm to efficiently find near-optimal paths.
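The core of the abstract's idea — characterizing the a priori state distribution along a path under an LQR controller and a Kalman filter — can be sketched as a covariance propagation for the joint system of true state and state estimate. This is a simplified linear time-invariant sketch (the paper linearizes about the nominal path); the function name and interface are illustrative, not the authors' API.

```python
import numpy as np

def lqg_mp_covariances(A, B, C, Q, R, L_gains, n_steps):
    """Propagate the a priori covariance of the true state along a path,
    assuming dynamics x' = A x + B u + w, observations z = C x + v,
    control u = -L x_hat from an LQR gain, and a Kalman filter estimate."""
    n = A.shape[0]
    P = np.zeros((n, n))           # Kalman estimation-error covariance
    S = np.zeros((2 * n, 2 * n))   # joint covariance of [true state; estimate]
    covs = []
    for t in range(n_steps):
        # Kalman gain from the predicted error covariance.
        P_pred = A @ P @ A.T + Q
        K = P_pred @ C.T @ np.linalg.inv(C @ P_pred @ C.T + R)
        P = (np.eye(n) - K @ C) @ P_pred
        L = L_gains[t]
        # Closed-loop dynamics of the stacked vector [x; x_hat].
        F = np.block([[A, -B @ L],
                      [K @ C @ A, A - B @ L - K @ C @ A]])
        G = np.block([[np.eye(n), np.zeros((n, R.shape[0]))],
                      [K @ C, K]])
        W = np.block([[Q, np.zeros((n, R.shape[0]))],
                      [np.zeros((R.shape[0], n)), R]])
        S = F @ S @ F.T + G @ W @ G.T
        covs.append(S[:n, :n])     # marginal covariance of the true state
    return covs
```

Each returned covariance can then be intersected with obstacle geometry to estimate collision probability along the candidate path.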
Model-based Bayesian reinforcement learning in partially observable domains. ISAIM
, 2008
Abstract

Cited by 38 (0 self)
Bayesian reinforcement learning in partially observable domains is notoriously difficult, in part due to the unknown form of the beliefs and the optimal value function. We show that beliefs represented by mixtures of products of Dirichlet distributions are closed under belief updates for factored domains. Belief monitoring algorithms that use this mixture representation are proposed. We also show that the optimal value function is a linear combination of products of Dirichlets for factored domains. Finally, we extend BEETLE, a point-based value iteration algorithm for Bayesian RL in fully observable domains, to partially observable domains.
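The closure property the abstract describes — a mixture of Dirichlets stays a mixture of Dirichlets after a belief update — can be illustrated with a flat (non-factored) sketch. Each mixture component holds Dirichlet counts over next states for each (state, action) pair; observing a transition reweights the components by their predictive probability and increments the conjugate counts. The interface below is hypothetical, not the paper's code.

```python
import numpy as np

def update_dirichlet_mixture(components, s, a, s_next):
    """One belief update for a mixture-of-Dirichlets belief over unknown
    transition probabilities T(s' | s, a). `components` is a list of
    (weight, counts) pairs where counts[s, a] holds Dirichlet parameters
    over next states."""
    new = []
    for w, counts in components:
        alpha = counts[s, a]
        pred = alpha[s_next] / alpha.sum()   # Dirichlet predictive probability
        counts2 = counts.copy()
        counts2[s, a, s_next] += 1.0         # conjugate count update
        new.append((w * pred, counts2))
    z = sum(w for w, _ in new)               # renormalize mixture weights
    return [(w / z, c) for w, c in new]
```

Note the output has the same form as the input — the closure the paper proves (for factored domains, where counts factor across variables).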
A Scalable Method for Solving High-Dimensional Continuous POMDPs Using Local Approximation
 Conf. on Uncertainty in Artificial Intelligence
, 2010
Abstract

Cited by 31 (1 self)
Partially Observable Markov Decision Processes (POMDPs) are typically solved by finding an approximate global solution to a corresponding belief-MDP. In this paper, we offer a new planning algorithm for POMDPs with continuous state, action and observation spaces. Since such domains have an inherent notion of locality, we can find an approximate solution using local optimization methods. We parameterize the belief distribution as a Gaussian mixture, and use the Extended Kalman Filter (EKF) to approximate the belief update. Since the EKF is a first-order filter, we can marginalize over the observations analytically. By using feedback control and state estimation during policy execution, we recover a behavior that is effectively conditioned on incoming observations despite the unconditioned planning. Local optimization provides no guarantees of global optimality, but it allows us to tackle domains that are at least an order of magnitude larger than the current state of the art. We demonstrate the scalability of our algorithm by considering a simulated hand-eye coordination domain with 16 continuous state dimensions and 6 continuous action dimensions.
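The EKF belief update the abstract relies on — the deterministic belief dynamics that local optimization plans over — is standard and can be sketched generically. Here `f`/`h` are the nonlinear dynamics and observation models, `F`/`H` their Jacobians, and `Q`/`R` the noise covariances; the function name is illustrative.

```python
import numpy as np

def ekf_belief_update(mu, Sigma, u, z, f, h, F, H, Q, R):
    """One EKF step mapping a Gaussian belief (mu, Sigma) to its successor
    after applying control u and receiving observation z."""
    # Predict through the dynamics model.
    mu_bar = f(mu, u)
    Fx = F(mu, u)
    S_bar = Fx @ Sigma @ Fx.T + Q
    # Correct with the observation.
    Hx = H(mu_bar)
    K = S_bar @ Hx.T @ np.linalg.inv(Hx @ S_bar @ Hx.T + R)
    mu_new = mu_bar + K @ (z - h(mu_bar))
    Sigma_new = (np.eye(len(mu)) - K @ Hx) @ S_bar
    return mu_new, Sigma_new
```

Because the update is first order, the covariance recursion does not depend on the realized observation, which is what lets the paper marginalize over observations analytically during planning.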
A decision-theoretic model of assistance
 In IJCAI
, 2007
Abstract

Cited by 24 (3 self)
There is a growing interest in intelligent assistants for a variety of applications, from organizing tasks for knowledge workers to helping people with dementia. In this paper, we present and evaluate a decision-theoretic framework that captures the general notion of assistance. The objective is to observe a goal-directed agent and to select assistive actions in order to minimize the overall cost. We model the problem as an assistant POMDP where the hidden state corresponds to the agent’s unobserved goals. This formulation allows us to exploit domain models for both estimating the agent’s goals and selecting assistive actions. In addition, the formulation naturally handles uncertainty, varying action costs, and customization to specific agents via learning. We argue that in many domains myopic heuristics will be adequate for selecting actions in the assistant POMDP and present two such heuristics. We evaluate our approach in two domains where human subjects perform tasks in game-like computer environments. The results show that the assistant substantially reduces user effort with only a modest computational effort.
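The goal-estimation step in an assistant POMDP reduces to a Bayesian filter over the hidden goal: each observed user action reweights the goal distribution by how likely that action is under each goal. A minimal sketch, assuming a hypothetical `goal_policy(g, s, a)` model of P(action | state, goal):

```python
def update_goal_belief(belief, state, action, goal_policy):
    """Bayes update of the distribution over the user's hidden goal after
    observing the user take `action` in `state`. `belief` maps goals to
    probabilities; `goal_policy(g, s, a)` models P(a | s, g)."""
    new = {g: p * goal_policy(g, state, action) for g, p in belief.items()}
    z = sum(new.values())          # normalizing constant
    return {g: p / z for g, p in new.items()}
```

The assistant can then score candidate assistive actions myopically against this goal distribution, which is the kind of heuristic the abstract argues is often adequate.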
Bayesian Reinforcement Learning in Continuous POMDPs with Application to Robot Navigation
Abstract

Cited by 23 (3 self)
We consider the problem of optimal control in continuous and partially observable environments when the parameters of the model are not known exactly. Partially Observable Markov Decision Processes (POMDPs) provide a rich mathematical model for such environments, but most approaches require a known model to solve them. This is a limitation in practice, as the exact model parameters are often difficult to specify. We adopt a Bayesian approach where a posterior distribution over the model parameters is maintained and updated through experience with the environment. We propose a particle filter algorithm to maintain the posterior distribution and an online planning algorithm, based on trajectory sampling, to plan the best action to perform under the current posterior. The resulting approach selects control actions which optimally trade off between 1) exploring the environment to learn the model, 2) identifying the system’s state, and 3) exploiting its knowledge in order to maximize long-term rewards. Our preliminary results on a simulated robot navigation problem show that our approach is able to learn good models of the sensors and actuators, and performs as well as if it had the true model.
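The particle filter over the joint posterior can be sketched as particles carrying a (state, model-parameter) pair: states are propagated under each particle's own model, then particles are reweighted by the observation likelihood and resampled, so surviving particles concentrate on good models. The interface (`propagate`, `obs_lik`) is a hypothetical stand-in, not the paper's code.

```python
import numpy as np

def particle_step(particles, a, z, propagate, obs_lik, rng):
    """One step of a particle filter over joint (state, theta) hypotheses.
    theta is fixed within each particle, so resampling gradually prunes
    implausible model parameters."""
    # Propagate each particle's state under its own model parameters.
    moved = [(propagate(x, a, theta, rng), theta) for x, theta in particles]
    # Weight by the observation likelihood under each particle's model.
    w = np.array([obs_lik(z, x, theta) for x, theta in moved], dtype=float)
    w /= w.sum()
    # Resample to avoid weight degeneracy.
    idx = rng.choice(len(moved), size=len(moved), p=w)
    return [moved[i] for i in idx]
```

Online planning then samples trajectories through this particle set to score candidate actions under the current posterior.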
Optimal design of sequential real-time communication systems
 IEEE Trans. Inf. Theory
, 2009
Abstract

Cited by 22 (6 self)
Optimal design of sequential real-time communication of a Markov source over a noisy channel is investigated. In such a system, the delay between the source output and its reconstruction at the receiver should equal a fixed pre-specified amount. An optimal communication strategy must minimize the total expected symbol-by-symbol distortion between the source output and its reconstruction. Design techniques or performance bounds for such real-time communication systems are unknown. In this paper a systematic methodology, based on the concepts of information structures and information states, to search for an optimal real-time communication strategy is presented. This methodology trades off complexity in communication length (linear in contrast to doubly exponential) with complexity in alphabet sizes (doubly exponential in contrast to exponential). As the communication length is usually orders of magnitude bigger
Monte Carlo Value Iteration for Continuous-State POMDPs
 Workshop on the Algorithmic Foundations of Robotics
, 2010
Abstract

Cited by 20 (3 self)
Partially observable Markov decision processes (POMDPs) have been successfully applied to various robot motion planning tasks under uncertainty. However, most existing POMDP algorithms assume a discrete state space, while the natural state space of a robot is often continuous. This paper presents Monte Carlo Value Iteration (MCVI) for continuous-state POMDPs. MCVI samples both a robot’s state space and the corresponding belief space, and avoids inefficient a priori discretization of the state space as a grid. Both theoretical results and preliminary experimental results indicate that MCVI is a promising new approach for robot motion planning under uncertainty.
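The sampling idea in the abstract — replacing the integral over a continuous state space with samples from the belief — can be sketched as a Monte Carlo backup: for each action, draw states from the belief, step a generative model forward once, and average the immediate reward plus the discounted value of the successor. `simulate` and `value_fn` below are hypothetical stand-ins for the simulator and the current policy-graph value; this is a sketch of the sampling principle, not MCVI's full policy-graph construction.

```python
def mc_backup(sample_state, actions, simulate, value_fn, n=200, gamma=0.95):
    """Monte Carlo backup at a belief: estimate Q(b, a) by sampling states
    from the belief (via sample_state) and simulating one step, then return
    the best action and its value estimate."""
    best_a, best_q = None, -float("inf")
    for a in actions:
        total = 0.0
        for _ in range(n):
            s = sample_state()              # draw a state from the belief
            r, s_next = simulate(s, a)      # one generative-model step
            total += r + gamma * value_fn(s_next)
        q = total / n
        if q > best_q:
            best_a, best_q = a, q
    return best_a, best_q
```

No grid over the state space is ever built; accuracy is controlled by the sample count `n` instead.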
Efficient Planning under Uncertainty with Macro-actions
, 2014
Abstract

Cited by 18 (0 self)
LQG-MP: Optimized Path Planning for Robots with Motion Uncertainty and Imperfect State Information
Abstract

Cited by 16 (1 self)
This paper presents LQG-MP (linear-quadratic Gaussian motion planning), a new approach to robot motion planning that takes into account the sensors and the controller that will be used during execution of the robot’s path. LQG-MP is based on the linear-quadratic controller with Gaussian models of uncertainty, and explicitly characterizes in advance (i.e., before execution) the a priori probability distributions of the state of the robot along its path. These distributions can be used to assess the quality of the path, for instance by computing the probability of avoiding collisions. Many methods can be used to generate the needed ensemble of candidate paths from which the best path is selected; in this paper we report results using the RRT algorithm. We study the performance of LQG-MP with simulation experiments in three scenarios involving a kinodynamic car-like robot, multi-robot planning with differential-drive robots, and a 6-DOF manipulator.
Randomized Belief-Space Replanning in Partially-Observable Continuous Spaces
Abstract

Cited by 14 (1 self)
We present a sample-based replanning strategy for driving partially-observable, high-dimensional robotic systems to a desired goal. At each time step, it uses forward simulation of randomly-sampled open-loop controls to construct a belief-space search tree rooted at its current belief state. Then, it executes the action at the root that leads to the best node in the tree. As a node quality metric, we use Monte Carlo simulation to estimate the likelihood of success under the QMDP belief-space feedback policy, which encourages the robot to take information-gathering actions as needed to reach the goal. The technique is demonstrated on target-finding and localization examples in up to 5-D state spaces.
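The QMDP feedback policy used here as a node-quality metric is a standard construction: solve the fully observable MDP for Q-values, then at execution time weight those Q-values by the current belief and pick the best action. A minimal sketch for a discrete belief (the continuous case in the paper would weight Q-values over sampled states instead):

```python
import numpy as np

def qmdp_action(belief, Q_mdp):
    """QMDP action selection: belief is a length-S probability vector,
    Q_mdp is an S x A array of fully-observable Q-values; return the
    action maximizing the belief-weighted Q-value."""
    return int(np.argmax(belief @ Q_mdp))
```

QMDP ignores the value of future information, which is why the paper wraps it in Monte Carlo rollouts: the rollouts reveal when uncertainty actually prevents reaching the goal, pushing the search toward information-gathering branches.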