Results 1–10 of 96
Planning with Incomplete Information as Heuristic Search in Belief Space
, 2000
Abstract

Cited by 203 (31 self)
The formulation of planning as heuristic search with heuristics derived from problem representations has turned out to be a fruitful approach for classical planning. In this paper, we pursue a similar idea in the context of planning with incomplete information. Planning with incomplete information can be formulated as a problem of search in belief space, where belief states can be either sets of states or, more generally, probability distributions over states. While the formulation (like the formulation of classical planning as heuristic search) is not particularly novel, the contribution of this paper is to make it explicit, to test it over a number of domains, and to extend it to tasks like planning with sensing where the standard search algorithms do not apply. The resulting planner appears to be competitive with the most recent conformant and contingent planners (e.g., cgp, sgp, and cmbp) while at the same time being more general, as it can handle probabilistic actions and se...
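The belief-space formulation described in this abstract can be illustrated with a toy conformant search (the domain and all names below are hypothetical, not the paper's planner): belief states are sets of possible world states, a nondeterministic action maps a belief to the union of its possible successors, and a plan counts as a solution only if the goal holds in every state of the final belief.

```python
from collections import deque

def progress(belief, action):
    """Apply a nondeterministic action: union of the possible successors."""
    return frozenset(s2 for s in belief for s2 in action(s))

def conformant_bfs(init_belief, actions, goal):
    """Breadth-first search in belief space for a conformant plan."""
    start = frozenset(init_belief)
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        belief, plan = frontier.popleft()
        if all(goal(s) for s in belief):
            return plan
        for name, act in actions.items():
            nb = progress(belief, act)
            if nb not in seen:
                seen.add(nb)
                frontier.append((nb, plan + [name]))
    return None

# Toy domain: an agent at an unknown position in {0,1,2,3}; "right" moves
# right and saturates at 3, so three moves reach 3 from every initial state.
actions = {"right": lambda s: {min(s + 1, 3)}}
plan = conformant_bfs({0, 1, 2, 3}, actions, lambda s: s == 3)
```

The paper's contribution is precisely to guide this kind of search with heuristics extracted from the problem representation rather than blind breadth-first expansion.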
Admissible Heuristics for Optimal Planning
 In Proceedings of AIPS-00
, 2000
Abstract

Cited by 169 (21 self)
HSP and HSPr are two recent planners that search the state space using a heuristic function extracted from Strips encodings. HSP does a forward search from the initial state, recomputing the heuristic in every state, while HSPr does a regression search from the goal, computing a suitable representation of the heuristic only once. Both planners have shown good performance, often producing solutions that are competitive in time and number of actions with the solutions found by Graphplan and SAT planners. HSP and HSPr, however, are not optimal planners. This is because the heuristic function is not admissible and the search algorithms are not optimal. In this paper we address this problem. We formulate a new admissible heuristic for planning, use it to guide an IDA* search, and empirically evaluate the resulting optimal planner over a number of domains. The main contribution is the idea underlying the heuristic, which yields not one but a whole family of polynomial and admissible heuristics that trade accuracy for efficiency. The formulation is general and sheds some light on the heuristics used in HSP and Graphplan, and their relation. It exploits the factored (Strips) representation of planning problems, mapping shortest-path problems in state space into suitably defined shortest-path problems in atom space. The formulation applies with little variation to sequential and parallel planning, and to problems with different action costs.
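One simple member of the family of admissible heuristics alluded to in this abstract is the max-atom heuristic (often written h_max). The sketch below is an assumed toy implementation over a Strips-like encoding, not the authors' code: atom costs are computed as a least fixpoint of action applications, and the cost of the goal set is the maximum cost of its atoms, which never overestimates the true plan cost.

```python
def hmax(init, goal, actions):
    """actions: list of (preconditions, add_effects, cost), all over atoms."""
    cost = {a: 0 for a in init}
    changed = True
    while changed:                       # least-fixpoint computation
        changed = False
        for pre, add, c in actions:
            if all(p in cost for p in pre):
                base = max((cost[p] for p in pre), default=0) + c
                for a in add:
                    if cost.get(a, float("inf")) > base:
                        cost[a] = base
                        changed = True
    # cost of the goal set = max over its atoms (hence "max")
    return max((cost.get(g, float("inf")) for g in goal), default=0)

# Toy Strips-like problem: a chain p0 -> p1 -> p2 with unit action costs.
acts = [({"p0"}, {"p1"}, 1), ({"p1"}, {"p2"}, 1)]
h = hmax({"p0"}, {"p2"}, acts)
```

Taking the sum instead of the max over goal atoms gives the informative but inadmissible heuristic used by HSP, which is the trade-off the abstract refers to.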
Model Checking for Probability and Time: From Theory to Practice
 In Proc. Logic in Computer Science
, 2003
Abstract

Cited by 47 (1 self)
Probability features increasingly often in software and hardware systems: it is used in distributed coordination and routing problems, to model fault tolerance and performance, and to provide adaptive resource management strategies. Probabilistic model checking is an automatic procedure for establishing whether a desired property holds in a probabilistic model, aimed at verifying probabilistic specifications such as "leader election is eventually resolved with probability 1", "the chance of shutdown occurring is at most 0.01%", and "the probability that a message will be delivered within 30ms is at least 0.75". A probabilistic model checker calculates the probability of a given temporal logic property being satisfied, as opposed to its validity. In contrast to conventional model checkers, which rely on reachability analysis of the underlying transition system graph, probabilistic model checking additionally involves numerical solution of linear equations and linear programming problems. This paper reports our experience with implementing PRISM (www.cs.bham.ac.uk/dxp/prism/), a Probabilistic Symbolic Model Checker, demonstrates its usefulness in analysing real-world probabilistic protocols, and outlines future challenges for this research direction.
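The numerical core the abstract mentions (solving linear equations for the probability of a property) can be sketched for the simplest case: the probability of eventually reaching a target state in a small discrete-time Markov chain, computed by iterating the fixpoint equations to convergence. This is an illustrative toy, not PRISM's symbolic engine, which uses decision-diagram data structures and more refined solvers.

```python
def reach_prob(P, target, tol=1e-12):
    """P: dict state -> {successor: probability}; returns, for each state,
    Pr[eventually reach target], by iterating x = P x with x[target] = 1."""
    x = {s: (1.0 if s == target else 0.0) for s in P}
    while True:
        nx = {s: (1.0 if s == target else
                  sum(p * x[t] for t, p in P[s].items())) for s in P}
        if max(abs(nx[s] - x[s]) for s in P) < tol:
            return nx
        x = nx

# Toy chain: from s0, reach the goal with probability 0.5, else get absorbed.
P = {"s0": {"goal": 0.5, "sink": 0.5},
     "goal": {"goal": 1.0},
     "sink": {"sink": 1.0}}
probs = reach_prob(P, "goal")
```

Bounded-time properties like "delivered within 30ms" reduce to a finite number of such matrix-vector iterations rather than a fixpoint.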
Revenue management under a general discrete choice model of consumer behavior. Working paper, Universitat Pompeu Fabra
, 2001
Abstract

Cited by 44 (5 self)
Customer choice behavior, such as “buy-up” and “buy-down”, is an important phenomenon in a wide range of revenue management contexts. Yet most revenue management methodologies ignore this phenomenon or at best approximate it in a heuristic way. In this paper, we provide an exact and quite general analysis of this problem. Specifically, we analyze a single-leg yield management problem in which the buyers’ choice behavior is modeled explicitly. The choice model is perfectly general and simply specifies the probability of purchasing each fare product as a function of the set of fare products offered. The control problem is to decide which subset of fare products to offer at each point in time. We show that the optimal policy is of a simple form. Namely, it consists of 1) identifying the ordered family of “non-dominated” subsets S1, ..., Sm, and 2) at each point in time opening one of these sets Sk, where the optimal index k is increasing in the remaining capacity x. That is, the more capacity we have available, the further the optimal set is along this sequence. Moreover, we show that the optimal policy is nested if and only if the ordered sets are increasing, that is, S1 ⊆ S2 ⊆ ... ⊆ Sm, and we give a complete characterization of when nesting by fare order is optimal. We then show that two important models, the independent demand model and the multinomial logit model (MNL), satisfy this latter condition and hence nested-by-fare-order policies are optimal in these cases. We also develop an estimation procedure for this setting based on the expectation-maximization (EM) method that jointly estimates arrival rates and choice model parameters when no-purchase outcomes are unobservable. Numerical results are given to illustrate both the model and the estimation procedure.
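To make the choice model concrete, here is a minimal sketch of the multinomial logit (MNL) model named in the abstract, with hypothetical fares and preference weights: the probability of buying product j from offer set S is v_j divided by v_0 plus the sum of v_i over the offered products, with v_0 the no-purchase weight. It shows why restricting the offer set can raise expected revenue, which is the essence of the control problem.

```python
def mnl_probs(v, v0, offered):
    """MNL purchase probabilities given the set of offered fare products."""
    denom = v0 + sum(v[j] for j in offered)
    return {j: v[j] / denom for j in offered}

def expected_revenue(fares, v, v0, offered):
    """Expected revenue from one arriving customer under offer set `offered`."""
    p = mnl_probs(v, v0, offered)
    return sum(fares[j] * p[j] for j in offered)

# Hypothetical two-fare example: customers prefer the discount fare (weight 3).
fares = {"full": 300.0, "discount": 120.0}
v = {"full": 1.0, "discount": 3.0}
r_full = expected_revenue(fares, v, 1.0, {"full"})            # offer full only
r_both = expected_revenue(fares, v, 1.0, {"full", "discount"})
```

Here offering only the full fare earns 150 per customer versus 132 when both are open, because the discount fare cannibalizes full-fare demand; the paper's non-dominated sets formalize which such subsets can ever be worth offering.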
Faster Heuristic Search Algorithms for Planning with Uncertainty and Full Feedback
 Proc. 18th International Joint Conf. on Artificial Intelligence
, 2003
Abstract

Cited by 42 (6 self)
Recent algorithms like RTDP and LAO* combine the strengths of Heuristic Search (HS) and Dynamic Programming (DP) methods by exploiting knowledge of the initial state and an admissible heuristic function to produce optimal policies without evaluating the entire space. In this paper, we introduce and analyze three new HS/DP algorithms.
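A minimal sketch of the RTDP scheme this abstract builds on (the interface and names below are assumptions, not the paper's code): repeat trials from the initial state, and at each visited state perform a Bellman backup of the value estimate before sampling a successor of the greedy action. With an admissible heuristic as the initial value, updates concentrate on reachable, relevant states.

```python
import random

def rtdp(s0, goal, actions, succ, h, trials=10, seed=0):
    """succ(s, a) -> list of (prob, next_state); every step costs 1."""
    rng = random.Random(seed)
    V = {}
    def q(s, a):
        return 1 + sum(p * V.get(t, h(t)) for p, t in succ(s, a))
    for _ in range(trials):
        s = s0
        while s != goal:
            a = min(actions, key=lambda act: q(s, act))
            V[s] = q(s, a)                  # Bellman backup along the trial
            r = rng.random()                # sample the next state
            for p, t in succ(s, a):
                r -= p
                if r <= 0:
                    s = t
                    break
    return V

# Toy MDP: states 0..3, goal 3, one action that deterministically moves right.
succ = lambda s, a: [(1.0, s + 1)]
V = rtdp(0, 3, ["go"], succ, h=lambda s: 0.0)
```

On this toy chain the estimates converge to the true costs-to-go (3, 2, 1) within a few trials; the paper's contribution is faster variants of exactly this trial-and-backup loop.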
GPT: A Tool for Planning with Uncertainty and Partial Information
 In Proc. IJCAI-01 Workshop on Planning with Uncertainty and Incomplete Information
, 2001
Abstract

Cited by 35 (9 self)
We describe the GPT system and its utilization over a number of examples. GPT (General Planning Tool) is an integrated software tool for modeling, analyzing, and solving a wide range of planning problems dealing with uncertainty and partial information, and it has been used by us and others for research and teaching. Our approach is based on different state models that can handle various types of action dynamics (deterministic and probabilistic) and sensor feedback (null, partial, and complete). The system consists mainly of a high-level language for expressing actions, sensors, and goals, and a bundle of algorithms based on heuristic search for solving them. The language is one of GPT's strengths, since it presents the user with a consistent and unified framework for the planning task. These descriptions are then solved by appropriate algorithms chosen from the bun...
Taxes and quotas for a stock pollutant with multiplicative uncertainty
 J. Public Econ
Abstract

Cited by 32 (12 self)
We compare taxes and quotas when firms and the regulator have asymmetric information about the slope of firms' abatement costs. Damages are caused by a stock pollutant. We calibrate the model using cost and damage estimates of greenhouse gases. Taxes dominate quotas, as with additive uncertainty. This model with multiplicative uncertainty allows us to compare expected stock levels under the two policies, and to investigate how stock size and the magnitude of uncertainty affect the policy ranking.
Learning Generalized Policies in Planning Using Concept Languages
, 2000
Abstract

Cited by 26 (0 self)
In this paper we are concerned with the problem of learning how to solve planning problems in one domain, given a number of solved instances. This problem is formulated as the problem of inferring a function that operates over all instances in the domain and maps states and goals into actions. We call such functions generalized policies, and the question that we address is how to learn suitable representations of generalized policies from data. This question has been addressed recently by Roni Khardon [16]. Khardon represents generalized policies using an ordered list of existentially quantified rules that are inferred from a training set using a version of Rivest's learning algorithm [22]. Here, we follow Khardon's approach but represent generalized policies in a different way, using a concept language. We show through a number of experiments in the blocks world that the concept language yields a better policy using a smaller set of examples and no background knowle...
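The rule-list representation of a generalized policy described in this abstract can be sketched as follows; the blocks-world features and rules below are hypothetical illustrations, not Khardon's or the paper's learned rules. The policy scans an ordered list of (condition, action) rules over the state and goal, and the first rule whose condition holds names the action to take, on any instance of the domain.

```python
def make_policy(rules):
    """rules: ordered list of (condition, action); the first match fires."""
    def policy(state, goal):
        for cond, action in rules:
            if cond(state, goal):
                return action
        return None
    return policy

# Toy rules: if holding a block, put it down; otherwise pick up a clear
# block that is not yet where the goal wants it.
rules = [
    (lambda s, g: s.get("holding") is not None, "putdown"),
    (lambda s, g: any(s["on"].get(b) != g["on"].get(b) for b in s["clear"]),
     "pickup"),
]
policy = make_policy(rules)
act = policy({"holding": None, "clear": {"A"}, "on": {"A": "B"}},
             {"on": {"A": "table"}})
```

The learning problem is then to induce the conditions (as concept-language expressions) and their ordering from solved instances, rather than writing them by hand as done here.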
An adaptive sampling algorithm for solving Markov decision processes
 Operations Research
, 2005
Abstract

Cited by 23 (6 self)
Based on recent results for multi-armed bandit problems, we propose an adaptive sampling algorithm that approximates the optimal value of a finite-horizon Markov decision process (MDP) with infinite state space but finite action space and bounded rewards. The algorithm adaptively chooses which action to sample as the sampling process proceeds, and it is proven that the estimate produced by the algorithm is asymptotically unbiased and that the worst possible bias is bounded by a quantity that converges to zero at rate O(H ln N / N), where H is the horizon length and N is the total number of samples that are used per state sampled in each stage. The worst-case running-time complexity of the algorithm is O((|A|N)^H), independent of the state space size, where |A| is the size of the action space. The algorithm can be used to create an approximate receding-horizon control to solve infinite-horizon MDPs.
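A simplified sketch of the adaptive-sampling idea in this abstract (the interface and constants are assumptions, not the paper's exact algorithm): at each state and stage, actions are selected by a UCB-style bandit index, each sampled action triggers a recursive estimate of the next stage, and the returned value averages all sampled returns, weighted by visit counts. The O((|A|N)^H) running time is visible in the recursion: each of the N samples per stage spawns a full estimate one stage deeper.

```python
import math, random

def estimate_v(s, stage, H, A, sample, N, rng):
    """Recursive UCB-guided estimate of the optimal value at (s, stage)."""
    if stage == H:
        return 0.0
    counts = {a: 0 for a in A}
    sums = {a: 0.0 for a in A}
    for n in range(1, N + 1):
        if n <= len(A):                  # initialize: sample each action once
            a = A[n - 1]
        else:                            # then choose by the UCB index
            a = max(A, key=lambda b: sums[b] / counts[b]
                    + math.sqrt(2 * math.log(n) / counts[b]))
        r, s2 = sample(s, a, rng)        # one sampled transition and reward
        sums[a] += r + estimate_v(s2, stage + 1, H, A, sample, N, rng)
        counts[a] += 1
    # estimator: visit-count-weighted average of the per-action sample means
    return sum(sums.values()) / N

# Toy one-stage MDP: action "a" pays 1, action "b" pays 0; optimal value is 1.
A = ["a", "b"]
sample = lambda s, a, rng: ((1.0 if a == "a" else 0.0), s)
v = estimate_v("s0", 0, 1, A, sample, N=50, rng=random.Random(0))
```

On this toy problem the estimate is biased below the optimal value 1.0, because the suboptimal action is still sampled a logarithmic number of times, which is exactly the O(H ln N / N) bias the abstract bounds.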
Dynamic Node Activation in Networks of Rechargeable Sensors
Abstract

Cited by 23 (4 self)
Abstract—We consider a network of rechargeable sensors, deployed redundantly in a random sensing environment, and address the problem of how sensor nodes should be activated dynamically so as to maximize a generalized system performance objective. The optimal sensor activation problem is a very difficult decision question, and under Markovian assumptions on the sensor discharge/recharge periods, it represents a complex semi-Markov decision problem. With the goal of developing a practical, distributed, but efficient solution to this complex global optimization problem, we first consider the activation question for a set of sensor nodes whose coverage areas overlap completely. For this scenario, we show analytically that there exists a simple threshold activation policy that achieves a performance of at least 3/4 of the optimum over all possible policies. We extend this threshold policy to a general network setting where the coverage areas of different sensors could have partial or no overlap with each other, and show by simulations that the performance of our policy is very close to that of the globally optimal policy. Our policy is fully distributed, and requires the sensor nodes to keep track only of the node activation states in their immediate neighborhoods. We also consider the effects of spatial correlation on the performance of the threshold activation policy, and the choice of the optimal threshold. Index Terms—Rechargeable sensors, sensor activation, spatial correlation.
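The shape of the threshold activation policy described in this abstract can be sketched in a few lines (the threshold value and predicate names here are hypothetical, not the paper's derived optimum): a charged node activates only while the number of already-active neighbors covering the same region is below a threshold, so the rule needs only local neighborhood state.

```python
def should_activate(energy_ok, active_neighbors, k):
    """Distributed rule: activate iff charged and coverage is below threshold k."""
    return energy_ok and active_neighbors < k

# With threshold k=2, a charged node activates only while fewer than two
# overlapping neighbors are already active.
decisions = [should_activate(True, n, k=2) for n in range(4)]
```

Choosing k, and adjusting it for spatially correlated discharge/recharge, is the analytical content of the paper; the 3/4-optimality guarantee applies to the fully overlapping case.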