Results 1-10 of 37
A decision-theoretic model of assistance
In IJCAI, 2007
Abstract

Cited by 24 (3 self)
There is a growing interest in intelligent assistants for a variety of applications, from organizing tasks for knowledge workers to helping people with dementia. In this paper, we present and evaluate a decision-theoretic framework that captures the general notion of assistance. The objective is to observe a goal-directed agent and to select assistive actions in order to minimize the overall cost. We model the problem as an assistant POMDP where the hidden state corresponds to the agent’s unobserved goals. This formulation allows us to exploit domain models for both estimating the agent’s goals and selecting assistive actions. In addition, the formulation naturally handles uncertainty, varying action costs, and customization to specific agents via learning. We argue that in many domains myopic heuristics will be adequate for selecting actions in the assistant POMDP and present two such heuristics. We evaluate our approach in two domains where human subjects perform tasks in game-like computer environments. The results show that the assistant substantially reduces user effort with only a modest computational effort.
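The goal-estimation step this abstract describes can be sketched as a simple Bayes filter over hypothesized goals; the goals, actions, and likelihood numbers below are illustrative assumptions, not the paper's model:

```python
# Bayes-filter sketch of the assistant's goal estimation: maintain a posterior
# over hypothesized goals given the actions observed so far. The goals,
# actions, and likelihood numbers are illustrative assumptions.

def update_goal_belief(belief, action, likelihood):
    """P(goal | action) is proportional to P(action | goal) * P(goal)."""
    posterior = {g: likelihood[g].get(action, 1e-9) * p for g, p in belief.items()}
    total = sum(posterior.values())
    return {g: p / total for g, p in posterior.items()}

# Two hypothesized goals, and how likely each observed action is under each.
likelihood = {"fetch_coffee": {"go_kitchen": 0.8, "go_desk": 0.2},
              "get_printout": {"go_kitchen": 0.1, "go_desk": 0.9}}
belief = {"fetch_coffee": 0.5, "get_printout": 0.5}
belief = update_goal_belief(belief, "go_kitchen", likelihood)
# After seeing "go_kitchen", the coffee goal dominates (about 0.89).
```

An assistive action could then be chosen myopically against this posterior, in the spirit of the heuristics the paper evaluates.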
A switching planner for combined task and observation planning
In Twenty-Fifth Conference on Artificial Intelligence (AAAI-11), 2011
Abstract

Cited by 17 (6 self)
From an automated planning perspective the problem of practical mobile robot control in realistic environments poses many important and contrary challenges. On the one hand, the planning process must be lightweight, robust, and timely. Over the lifetime of the robot it must always respond quickly with new plans that accommodate exogenous events, changing objectives, and the underlying unpredictability of the environment. On the other hand, in order to promote efficient behaviours the planning process must perform computationally expensive reasoning about contingencies and possible revisions of subjective beliefs according to quantitatively modelled uncertainty in acting and sensing. Towards addressing these challenges, we develop a continual planning approach that switches between using a fast satisficing “classical” planner, to decide on the overall strategy, and decision-theoretic planning to solve small abstract subproblems where deeper consideration of the sensing model is both practical, and can significantly impact overall performance. We evaluate our approach in large problems from a realistic robot exploration domain.
Scaling up: Solving POMDPs through value-based clustering
In Proceedings of AAAI
Abstract

Cited by 11 (3 self)
Partially Observable Markov Decision Processes (POMDPs) provide an appropriately rich model for agents operating under partial knowledge of the environment. Since finding an optimal POMDP policy is intractable, approximation techniques have been a main focus of research, among them point-based algorithms, which scale up relatively well up to thousands of states. An important decision in a point-based algorithm is the order of backup operations over belief states. Prioritization techniques for ordering the sequence of backup operations reduce the number of needed backups considerably, but involve significant overhead. This paper suggests a new way to order backups, based on a soft clustering of the belief space. Our novel soft clustering method relies on the solution of the underlying MDP. Empirical evaluation verifies that our method rapidly computes a good order of backups, showing orders of magnitude improvement in runtime over a number of benchmarks.
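The backup operation whose ordering this paper optimizes can be sketched as follows for a toy POMDP; the model (T, O, R, gamma) is an illustrative assumption, not taken from the paper:

```python
import numpy as np

# Minimal point-based backup at a single belief point. The toy model
# (T[a,s,s'], O[a,s',o], R[a,s], gamma) is an illustrative assumption.

def backup(b, Gamma, T, O, R, gamma):
    """Return the backed-up alpha vector at belief b, given the current set Gamma."""
    nA = T.shape[0]
    nO = O.shape[2]
    best_alpha, best_val = None, -np.inf
    for a in range(nA):
        alpha_a = R[a].astype(float)
        for o in range(nO):
            # g[i][s] = sum over s' of T(s,a,s') * O(a,s',o) * Gamma[i](s')
            g = np.array([T[a] @ (O[a, :, o] * al) for al in Gamma])
            alpha_a = alpha_a + gamma * g[np.argmax(g @ b)]
        val = alpha_a @ b
        if val > best_val:
            best_alpha, best_val = alpha_a, val
    return best_alpha

# Toy model: 2 states, 2 actions, 2 observations.
T = np.array([[[1.0, 0.0], [0.0, 1.0]],   # action 0 keeps the state
              [[0.5, 0.5], [0.5, 0.5]]])  # action 1 resets it uniformly
O = np.array([[[0.9, 0.1], [0.1, 0.9]],   # action 0 gives informative readings
              [[0.5, 0.5], [0.5, 0.5]]])  # action 1 gives noise
R = np.array([[1.0, 0.0],                 # action 0 pays off in state 0
              [0.0, 0.0]])
Gamma = [np.zeros(2)]                     # start from the zero value function
b = np.array([0.5, 0.5])
alpha = backup(b, Gamma, T, O, R, gamma=0.95)  # with Gamma = {0}, this equals R[0]
```

A prioritization scheme such as the paper's clustering decides at which belief points, and in what order, this backup is applied.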
Optimally solving Dec-POMDPs as continuous-state MDPs
In Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence, 2013
Abstract

Cited by 11 (4 self)
Optimally solving decentralized partially observable Markov decision processes (Dec-POMDPs) is a hard combinatorial problem. Current algorithms search through the space of full histories for each agent. Because of the doubly exponential growth in the number of policies in this space as the planning horizon increases, these methods quickly become intractable. However, in real-world problems, computing policies over the full history space is often unnecessary. True histories experienced by the agents often lie near a structured, low-dimensional manifold embedded in the history space. We show that by transforming a Dec-POMDP into a continuous-state MDP, we are able to find and exploit these low-dimensional representations. Using this novel transformation, we can then apply powerful techniques for solving POMDPs and continuous-state MDPs. By combining a general search algorithm and dimension reduction based on feature selection, we introduce a novel approach to optimally solve problems with significantly longer planning horizons than previous methods.
Efficient touch-based localization through submodularity
In IEEE ICRA, 2013
Abstract

Cited by 8 (6 self)
Many robotic systems deal with uncertainty by performing a sequence of information gathering actions. In this work, we focus on the problem of efficiently constructing such a sequence by drawing an explicit connection to submodularity. Ideally, we would like a method that finds the optimal sequence, taking the minimum amount of time while providing sufficient information. Finding this sequence, however, is generally intractable. As a result, many well-established methods select actions greedily. Surprisingly, this often performs well. Our work first explains this high performance: we note that a commonly used metric, reduction of Shannon entropy, is submodular under certain assumptions, rendering the greedy solution comparable to the optimal plan in the offline setting. However, reacting online to observations can increase performance. Recently developed notions of adaptive submodularity provide guarantees for a greedy algorithm in this online setting. In this work, we develop new methods based on adaptive submodularity for selecting a sequence of information gathering actions online. In addition to providing guarantees, we can capitalize on submodularity to attain additional computational speedups. We demonstrate the effectiveness of these methods in simulation and on a robot.
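The greedy rule the abstract analyzes, picking the action with the largest expected reduction of Shannon entropy, can be sketched in the offline setting like this; the hypotheses, tests, and likelihood numbers are illustrative assumptions:

```python
import math

# Offline greedy information gathering by expected entropy reduction over a
# discrete hypothesis set. Hypotheses, tests, and likelihoods are illustrative.

def entropy(belief):
    return -sum(p * math.log2(p) for p in belief.values() if p > 0)

def expected_entropy_after(belief, test, likelihood):
    """Expected posterior entropy after observing one binary test."""
    total = 0.0
    for outcome in (True, False):
        joint = {h: (likelihood[test][h] if outcome else 1 - likelihood[test][h]) * p
                 for h, p in belief.items()}
        p_out = sum(joint.values())
        if p_out > 0:
            posterior = {h: v / p_out for h, v in joint.items()}
            total += p_out * entropy(posterior)
    return total

def greedy_plan(belief, tests, likelihood, k):
    """Repeatedly pick the test with the lowest expected posterior entropy."""
    chosen, remaining = [], list(tests)
    for _ in range(k):
        best = min(remaining, key=lambda t: expected_entropy_after(belief, t, likelihood))
        chosen.append(best)
        remaining.remove(best)
    return chosen

# Two hypotheses about an object's pose; one informative touch, one noisy probe.
belief = {"left": 0.5, "right": 0.5}
likelihood = {"touch_left": {"left": 0.9, "right": 0.1},
              "noisy_probe": {"left": 0.55, "right": 0.45}}
plan = greedy_plan(belief, ["noisy_probe", "touch_left"], likelihood, k=1)
# The informative touch is selected first.
```

The adaptive variant the paper builds on would re-run this selection after each observed outcome, updating the belief between picks.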
Canadian Traveler Problem with Remote Sensing
Abstract

Cited by 6 (1 self)
The Canadian Traveler Problem (CTP) is a navigation problem where a graph is initially known, but some edges may be blocked with a known probability. The task is to minimize the travel effort of reaching the goal. We generalize CTP to allow for remote sensing actions, now requiring minimization of the sum of the travel cost and the remote sensing cost. Finding optimal policies for both versions is intractable. We provide optimal solutions for special case graphs. We then develop a framework that utilizes heuristics to determine when and where to sense the environment in order to minimize total costs. Several such heuristics, based on the expected total cost, are introduced. Empirical evaluations show the benefits of our heuristics and support some of the theoretical results.
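The when-to-sense decision underlying such expected-total-cost heuristics can be illustrated with a one-edge comparison; the costs and the single-uncertain-edge setup are illustrative assumptions, not the paper's benchmarks:

```python
# Value-of-remote-sensing comparison for one uncertain edge, in the spirit of
# the expected-total-cost heuristics described above. All numbers and the
# single-edge setup are illustrative assumptions.

def expected_cost_no_sensing(c_to_edge, c_shortcut, c_detour, p_blocked):
    """Walk to the uncertain edge; if it is blocked, backtrack and take the detour."""
    return c_to_edge + (1 - p_blocked) * c_shortcut + p_blocked * (c_to_edge + c_detour)

def expected_cost_with_sensing(c_sense, c_to_edge, c_shortcut, c_detour, p_blocked):
    """Remote-sense first, then commit to whichever route is known to be open."""
    return c_sense + (1 - p_blocked) * (c_to_edge + c_shortcut) + p_blocked * c_detour

# Sensing pays off here because backtracking after a blocked edge is expensive.
no_sense = expected_cost_no_sensing(c_to_edge=10, c_shortcut=5, c_detour=30, p_blocked=0.5)
sense = expected_cost_with_sensing(c_sense=1, c_to_edge=10, c_shortcut=5,
                                   c_detour=30, p_blocked=0.5)
# no_sense = 32.5, sense = 23.5: the heuristic would choose to sense.
```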
Controller Compilation and Compression for Resource Constrained Applications
Abstract

Cited by 4 (0 self)
Recent advances in planning techniques for partially observable Markov decision processes have focused on online search techniques and offline point-based value iteration. While these techniques allow practitioners to obtain policies for fairly large problems, they assume that a non-negligible amount of computation can be done between each decision point. In contrast, the recent proliferation of mobile and embedded devices has led to a surge of applications that could benefit from state-of-the-art planning techniques if they can operate under severe constraints on computational resources. To that effect, we describe two techniques to compile policies into controllers that can be executed by a mere table lookup at each decision point. The first approach compiles policies induced by a set of alpha vectors (such as those obtained by point-based techniques) into approximately equivalent controllers, while the second approach performs a simulation to compile arbitrary policies into approximately equivalent controllers. We also describe an approach to compress controllers by removing redundant and dominated nodes, often yielding smaller and yet better controllers. The compilation and compression techniques are demonstrated on benchmark problems as well as a mobile application to help Alzheimer's patients wayfind.
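The execution model this paper targets, a policy compiled into a finite-state controller driven by a table lookup at each decision point, can be sketched as follows; the node and observation names are illustrative assumptions:

```python
# A compiled policy as a finite-state controller: each decision point is a
# constant-time table lookup, the execution model described above. The node
# and observation names are illustrative assumptions.

class Controller:
    def __init__(self, action_of, successor, start):
        self.action_of = action_of   # node -> action to emit at that node
        self.successor = successor   # (node, observation) -> next node
        self.node = start

    def step(self, observation=None):
        """Advance on the last observation (if any) and emit an action: O(1)."""
        if observation is not None:
            self.node = self.successor[(self.node, observation)]
        return self.action_of[self.node]

# Two-node controller: keep listening until a 'beep' is heard, then act.
ctrl = Controller(
    action_of={"n0": "listen", "n1": "open_door"},
    successor={("n0", "silence"): "n0", ("n0", "beep"): "n1",
               ("n1", "silence"): "n1", ("n1", "beep"): "n1"},
    start="n0",
)
actions = [ctrl.step(), ctrl.step("silence"), ctrl.step("beep")]
```

Because execution needs only these two tables, such a controller fits the severe resource constraints of the mobile and embedded settings the abstract mentions.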
Structured Probabilistic Modelling for Dialogue Management
2014
Cited by 2 (1 self)
Learning and solving partially observable Markov Decision Processes
2007
Abstract

Cited by 2 (0 self)
Partially Observable Markov Decision Processes (POMDPs) provide a rich representation for agents acting in a stochastic domain under partial observability. POMDPs optimally balance key properties such as the need for information and the sum of collected rewards. However, POMDPs are difficult to use for two reasons: first, it is difficult to obtain the environment dynamics, and second, even given the environment dynamics, solving POMDPs optimally is intractable. This dissertation deals with both difficulties. We begin with a number of methods for learning POMDPs. Methods for learning POMDPs are usually categorized as either model-free or model-based. We show how model-free methods fail to provide good policies as noise in the environment increases. We continue to suggest how to transform model-free into model-based methods, thus improving their solution. This transformation is demonstrated first in an offline process, after the model-free method has computed a policy, and then in an online setting, where a model of the environment is learned together with a policy through interactions with the environment. The second part of the dissertation focuses on ways to solve predefined POMDPs.
A Fresh Look at Sensor-Based Navigation: Navigation with Sensing Costs
Abstract

Cited by 1 (0 self)
Most work on navigation minimizes the travel effort or computational effort of the navigating agent, while assuming that unknown components of the environment are sensed by the agent at no cost. We introduce a framework for navigation in which the agent needs to minimize a global cost function that includes both the travel cost and the sensing cost. At each point in time, the agent needs to decide whether to perform sense queries or to move towards the target. We develop the SN (Sensing-based Navigation) framework, which utilizes heuristic functions to determine when and where to sense the environment in order to minimize total costs. We develop several such heuristics, based on the expected total cost. Experimental results show the benefits of our heuristics over existing work, and demonstrate the generality of the SN framework.