Results 1 - 10 of 59
Planning with Incomplete Information as Heuristic Search in Belief Space
, 2000
Cited by 174 (23 self)
The formulation of planning as heuristic search with heuristics derived from problem representations has turned out to be a fruitful approach for classical planning. In this paper, we pursue a similar idea in the context of planning with incomplete information. Planning with incomplete information can be formulated as a problem of search in belief space, where belief states can be either sets of states or more generally probability distributions over states. While the formulation (as the formulation of classical planning as heuristic search) is not particularly novel, the contribution of this paper is to make it explicit, to test it over a number of domains, and to extend it to tasks like planning with sensing where the standard search algorithms do not apply. The resulting planner appears to be competitive with the most recent conformant and contingent planners (e.g., cgp, sgp, and cmbp) while at the same time being more general as it can handle probabilistic actions and se...
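As a rough illustration of search in belief space, the sketch below treats a belief state as a set of possible world states and looks for a conformant plan, that is, an action sequence that reaches the goal from every state in the initial belief. It is a minimal sketch and not the paper's planner: plain breadth-first search stands in for heuristic search, and the corridor domain and the names `progress`, `conformant_plan`, and `TRANSITIONS` are hypothetical.

```python
from collections import deque


def progress(belief, action, transitions):
    """Apply `action` in every state of the belief and collect the successors."""
    successors = set()
    for state in belief:
        successors |= transitions[(state, action)]
    return frozenset(successors)


def conformant_plan(initial_belief, goal_states, actions, transitions):
    """Breadth-first search whose nodes are belief states (sets of states)."""
    start = frozenset(initial_belief)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        belief, plan = queue.popleft()
        # The goal must hold in every state that is still considered possible.
        if belief <= goal_states:
            return plan
        for action in actions:
            nxt = progress(belief, action, transitions)
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, plan + [action]))
    return None


# Hypothetical toy domain: a corridor with cells 0..3 and walls at both ends.
# The agent does not know where it starts and has no sensors.
CELLS = range(4)
ACTIONS = ["left", "right"]
TRANSITIONS = {}
for cell in CELLS:
    TRANSITIONS[(cell, "left")] = {max(cell - 1, 0)}
    TRANSITIONS[(cell, "right")] = {min(cell + 1, 3)}

plan = conformant_plan(set(CELLS), frozenset({0}), ACTIONS, TRANSITIONS)
print(plan)
```

Moving left three times collapses the uncertainty against the wall, so the belief shrinks to the single goal cell even though the agent never observes its position.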
Admissible Heuristics for Optimal Planning
- In Proceedings of AIPS-00
, 2000
Cited by 128 (16 self)
hsp and hspr are two recent planners that search the state-space using a heuristic function extracted from Strips encodings. hsp does a forward search from the initial state recomputing the heuristic in every state, while hspr does a regression search from the goal computing a suitable representation of the heuristic only once. Both planners have shown good performance, often producing solutions that are competitive in time and number of actions with the solutions found by Graphplan and sat planners. hsp and hspr, however, are not optimal planners. This is because the heuristic function is not admissible and the search algorithms are not optimal. In this paper we address this problem. We formulate a new admissible heuristic for planning, use it to guide an IDA* search, and empirically evaluate the resulting optimal planner over a number of domains. The main contribution is the idea underlying the heuristic that yields not one but a whole family of polynomial and admissible heuristics that trade accuracy for efficiency. The formulation is general and sheds some light on the heuristics used in hsp and Graphplan, and their relation. It exploits the factored (Strips) representation of planning problems, mapping shortest-path problems in state-space into suitably defined shortest-path problems in atom-space. The formulation applies with little variation to sequential and parallel planning, and problems with different action costs.
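To make the idea of shortest paths in atom-space concrete, the sketch below computes a per-atom cost estimate in which a set of atoms costs as much as its most expensive member. This is the heuristic commonly known as h_max; it is shown here only as one simple admissible estimate of this kind and is not claimed to be the exact formulation of the paper. The function name `h_max`, the action tuple layout, and the example domain are assumptions, and action costs are assumed non-negative.

```python
def h_max(state, goal, actions):
    """Admissible cost estimate computed over atoms instead of states.

    `actions` is a list of (preconditions, add_effects, cost) tuples whose
    first two fields are sets of atoms. Delete effects are ignored, and a set
    of atoms is charged the cost of its most expensive atom.
    """
    cost = {atom: 0 for atom in state}
    changed = True
    while changed:  # relax atom costs until a fixed point is reached
        changed = False
        for pre, add, action_cost in actions:
            if all(p in cost for p in pre):
                reach = action_cost + max((cost[p] for p in pre), default=0)
                for atom in add:
                    if reach < cost.get(atom, float("inf")):
                        cost[atom] = reach
                        changed = True
    if not all(g in cost for g in goal):
        return float("inf")  # some goal atom is unreachable even when relaxed
    return max((cost[g] for g in goal), default=0)


# Hypothetical example: b and c each need one action, and d needs both.
EXAMPLE_ACTIONS = [
    ({"a"}, {"b"}, 1),
    ({"a"}, {"c"}, 1),
    ({"b", "c"}, {"d"}, 1),
]
print(h_max({"a"}, {"d"}, EXAMPLE_ACTIONS))
```

In this example the estimate is 2 while the cheapest real plan needs three actions, which shows the trade-off: the estimate is cheap and never overestimates, but it can be loose.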
Model Checking for Probability and Time: From Theory to Practice
- In Proc. Logic in Computer Science
, 2003
Cited by 41 (1 self)
Probability features increasingly often in software and hardware systems: it is used in distributed co-ordination and routing problems, to model fault-tolerance and performance, and to provide adaptive resource management strategies. Probabilistic model checking is an automatic procedure for establishing if a desired property holds in a probabilistic model, aimed at verifying probabilistic specifications such as "leader election is eventually resolved with probability 1", "the chance of shutdown occurring is at most 0.01%", and "the probability that a message will be delivered within 30ms is at least 0.75". A probabilistic model checker calculates the probability of a given temporal logic property being satisfied, as opposed to validity. In contrast to conventional model checkers, which rely on reachability analysis of the underlying transition system graph, probabilistic model checking additionally involves numerical solutions of linear equations and linear programming problems. This paper reports our experience with implementing PRISM (www.cs.bham.ac.uk/dxp/prism/), a Probabilistic Symbolic Model Checker, demonstrates its usefulness in analysing real-world probabilistic protocols, and outlines future challenges for this research direction.
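The numerical side mentioned above can be illustrated with the simplest case: the probability of eventually reaching a set of target states in a discrete-time Markov chain. The sketch below solves the corresponding linear equations by fixed-point iteration in plain Python. It does not use PRISM or its input language; the function `reach_probability`, the fixed iteration count, and the four-state chain are assumptions made for illustration.

```python
def reach_probability(transitions, targets, iterations=1000):
    """Approximate, for each state, the probability of eventually reaching `targets`.

    `transitions` maps each state to a dict {successor: probability}.
    Iterating from zero converges to the least solution of
    x_s = sum_t P(s, t) * x_t, with x_s = 1 for target states.
    """
    prob = {s: (1.0 if s in targets else 0.0) for s in transitions}
    for _ in range(iterations):
        prob = {
            s: 1.0 if s in targets else sum(p * prob[t] for t, p in succ.items())
            for s, succ in transitions.items()
        }
    return prob


# Hypothetical chain: from s0 the run fails or moves on to s1 with equal
# probability; from s1 it succeeds or falls back to s0 with equal probability.
CHAIN = {
    "s0": {"s1": 0.5, "fail": 0.5},
    "s1": {"goal": 0.5, "s0": 0.5},
    "goal": {"goal": 1.0},
    "fail": {"fail": 1.0},
}
probabilities = reach_probability(CHAIN, {"goal"})
print(probabilities["s0"], probabilities["s1"])
```

Solving the two equations x0 = 0.5 x1 and x1 = 0.5 + 0.5 x0 by hand gives 1/3 and 2/3, which the iteration approaches. A property such as "the goal is reached with probability at least 0.3" would then be checked by comparing the computed value against the bound.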
GPT: A Tool for Planning with Uncertainty and Partial Information
- In Proc. IJCAI01 Workshop on Planning with Uncertainty and Incomplete Information
, 2001
Cited by 33 (9 self)
We describe the GPT system and its utilization over a number of examples. GPT (General Planning Tool) is an integrated software tool for modeling, analyzing and solving a wide range of planning problems dealing with uncertainty and partial information, that has been used by us and others for research and teaching. Our approach is based on different state models that can handle various types of action dynamics (deterministic and probabilistic) and sensor feedback (null, partial, and complete). The system consists mainly of a high-level language for expressing actions, sensors, and goals, and a bundle of algorithms based on heuristic search for solving them. The language is one of GPT's strengths since it presents the user with a consistent and unified framework for the planning task. These descriptions are then solved by appropriate algorithms chosen from the bun
Faster Heuristic Search Algorithms for Planning with Uncertainty and Full Feedback
- Proc. 18th International Joint Conf. on Artificial Intelligence
, 2003
Cited by 33 (5 self)
Recent algorithms like RTDP and LAO* combine the strength of Heuristic Search (HS) and Dynamic Programming (DP) methods by exploiting knowledge of the initial state and an admissible heuristic function for producing optimal policies without evaluating the entire space. In this paper, we introduce and analyze three new HS/DP algorithms.
Revenue management under a general discrete choice model of consumer behavior. Working paper, Universitat Pompeu Fabra
, 2001
Cited by 24 (1 self)
Customer choice behavior, such as “buy-up” and “buy-down”, is an important phenomenon in a wide range of revenue management contexts. Yet most revenue management methodologies ignore this phenomenon, or at best approximate it in a heuristic way. In this paper, we provide an exact and quite general analysis of this problem. Specifically, we analyze a single-leg yield management problem in which the buyers’ choice behavior is modeled explicitly. The choice model is perfectly general and simply specifies the probability of purchasing each fare product as a function of the set of fare products offered. The control problem is to decide which subset of fare products to offer at each point in time. We show that the optimal policy is of a simple form. Namely, it consists of 1) identifying the ordered family of “nondominated” subsets S1, ..., Sm, and 2) at each point in time opening one of these sets Sk, where the optimal index k is increasing in the remaining capacity x. That is, the more capacity we have available, the further the optimal set is along this sequence. Moreover, we show that the optimal policy is nested if and only if the ordered sets are increasing, that is S1 ⊆ S2 ⊆ ... ⊆ Sm, and we give a complete characterization of when nesting by fare order is optimal. We then show that two important models, the independent demand model and the multinomial logit model (MNL), satisfy this latter condition and hence nested-by-fare-order policies are optimal in these cases. We also develop an estimation procedure for this setting based on the expectation-maximization (EM) method that jointly estimates arrival rates and choice model parameters when no-purchase outcomes are unobservable. Numerical results are given to illustrate both the model and estimation procedure.
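The control problem described above, choosing which subset of fare products to offer at each point in time, can be sketched as a small dynamic program. The version below is an illustration under stated assumptions and not the paper's analysis: time is split into periods with at most one arrival each, choice probabilities follow a multinomial logit model, and every subset is enumerated by brute force instead of using the nondominated sets. The names `mnl_choice` and `solve_single_leg` and all numbers are hypothetical.

```python
from itertools import combinations


def mnl_choice(offer, weights, no_purchase_weight=1.0):
    """Multinomial logit purchase probabilities for the products in `offer`."""
    total = no_purchase_weight + sum(weights[j] for j in offer)
    return {j: weights[j] / total for j in offer}


def solve_single_leg(fares, weights, capacity, periods, arrival_prob):
    """Backward induction over periods; returns (value, policy).

    value[x] is the expected revenue with x seats left at the start of the
    horizon, and policy[t][x] is the subset to offer in period t with x seats.
    """
    products = list(fares)
    offers = [
        frozenset(c)
        for r in range(len(products) + 1)
        for c in combinations(products, r)
    ]
    value = [0.0] * (capacity + 1)  # unsold seats are worthless after the horizon
    policy = []
    for _ in range(periods):
        new_value = [0.0] * (capacity + 1)
        best = [frozenset()] * (capacity + 1)
        for x in range(1, capacity + 1):
            marginal = value[x] - value[x - 1]  # value of keeping one seat
            best_gain = 0.0
            for offer in offers:
                probs = mnl_choice(offer, weights)
                gain = arrival_prob * sum(
                    p * (fares[j] - marginal) for j, p in probs.items()
                )
                if gain > best_gain:
                    best_gain, best[x] = gain, offer
            new_value[x] = value[x] + best_gain
        value = new_value
        policy.append(best)
    policy.reverse()  # policy[0] now refers to the first period of the horizon
    return value, policy


value, policy = solve_single_leg(
    fares={"Y": 100.0, "Q": 60.0},
    weights={"Y": 1.0, "Q": 2.0},
    capacity=1,
    periods=1,
    arrival_prob=0.5,
)
print(value[1], sorted(policy[0][1]))
```

Brute-force enumeration grows exponentially in the number of fare products, which is why a structural result restricting attention to an ordered family of subsets matters in practice.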
Learning Generalized Policies in Planning Using Concept Languages
, 2000
Cited by 21 (0 self)
In this paper we are concerned with the problem of learning how to solve planning problems in one domain given a number of solved instances. This problem is formulated as the problem of inferring a function that operates over all instances in the domain and maps states and goals into actions. We call such functions generalized policies and the question that we address is how to learn suitable representations of generalized policies from data. This question has been addressed recently by Roni Khardon [16]. Khardon represents generalized policies using an ordered list of existentially quantified rules that are inferred from a training set using a version of Rivest's learning algorithm [22]. Here, we follow Khardon's approach but represent generalized policies in a different way using a concept language. We show through a number of experiments in the blocks-world that the concept language yields a better policy using a smaller set of examples and no background knowle...
An adaptive sampling algorithm for solving Markov decision processes
- Operations Research
, 2005
Cited by 15 (5 self)
Based on recent results for multi-armed bandit problems, we propose an adaptive sampling algorithm that approximates the optimal value of a finite horizon Markov decision process (MDP) with infinite state space but finite action space and bounded rewards. The algorithm adaptively chooses which action to sample as the sampling process proceeds, and it is proven that the estimate produced by the algorithm is asymptotically unbiased and the worst possible bias is bounded by a quantity that converges to zero at rate $O\left(\frac{H \ln N}{N}\right)$, where $H$ is the horizon length and $N$ is the total number of samples that are used per state sampled in each stage. The worst-case running-time complexity of the algorithm is $O((|A|N)^H)$, independent of the state space size, where $|A|$ is the size of the action space. The algorithm can be used to create an approximate receding horizon control to solve infinite horizon MDPs.
Dynamic Node Activation in Networks of Rechargeable Sensors
Cited by 11 (2 self)
We consider a network of rechargeable sensors, deployed redundantly in a random sensing environment, and address the problem of how sensor nodes should be activated dynamically so as to maximize a generalized system performance objective. The optimal sensor activation problem is a very difficult decision question, and under Markovian assumptions on the sensor discharge/recharge periods, it represents a complex semi-Markov decision problem. With the goal of developing a practical, distributed but efficient solution to this complex, global optimization problem, we first consider the activation question for a set of sensor nodes whose coverage areas overlap completely. For this scenario, we show analytically that there exists a simple threshold activation policy that achieves a performance of at least 3/4 of the optimum over all possible policies. We extend this threshold policy to a general network setting where the coverage areas of different sensors could have partial or no overlap with each other, and show by simulations that the performance of our policy is very close to that of the globally optimal policy. Our policy is fully distributed, and requires the sensor nodes to only keep track of the node activation states in their immediate neighborhood. We also consider the effects of spatial correlation on the performance of the threshold activation policy, and the choice of the optimal threshold. Index Terms: Rechargeable sensors, sensor activation, spatial correlation.
Planning and Control in Artificial Intelligence: A Unifying Perspective
- Applied Intelligence
, 2001
Cited by 9 (0 self)
The problem of selecting actions in environments that are dynamic and not completely predictable or observable is a central problem in intelligent behavior. In AI, this translates into the problem of designing controllers that can map sequences of observations into actions so that certain goals are achieved. Three main approaches have been used in AI for designing such controllers: the programming approach, where the controller is programmed by hand in a suitable high-level procedural language, the planning approach, where the control is automatically derived from a suitable description of actions and goals, and the learning approach, where the control is derived from a collection of experiences. The three approaches can exhibit successes and limitations. The focus of this paper is on the planning approach. More specifically, we present an approach to planning based on various state models that can handle various types of action dynamics (deterministic and probabilistic) ...

