Results 1 - 10 of 56
Efficient Solution Algorithms for Factored MDPs
2003
Cited by 99 (3 self)
This paper addresses the problem of planning under uncertainty in large Markov Decision Processes (MDPs). Factored MDPs represent a complex state space using state variables and the transition model using a dynamic Bayesian network. This representation often allows an exponential reduction in the representation size of structured MDPs, but the complexity of exact solution algorithms for such MDPs can grow exponentially in the representation size. In this paper, we present two approximate solution algorithms that exploit structure in factored MDPs. Both use an approximate value function represented as a linear combination of basis functions, where each basis function involves only a small subset of the domain variables. A key contribution of this paper is that it shows how the basic operations of both algorithms can be performed efficiently in closed form, by exploiting both additive and context-specific structure in a factored MDP. A central element of our algorithms is a novel linear program decomposition technique, analogous to variable elimination in Bayesian networks, which reduces an exponentially large LP to a provably equivalent, polynomial-sized one. One algorithm uses approximate linear programming, and the second approximate dynamic programming. Our dynamic programming algorithm is novel in that it uses an approximation based on max-norm, a technique that more directly minimizes the terms that appear in error bounds for approximate MDP algorithms. We provide experimental results on problems with over 10^40 states, demonstrating a promising indication of the scalability of our approach, and compare our algorithm to an existing state-of-the-art approach, showing, in some problems, exponential gains in computation time.
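The value-function representation described in this abstract (a linear combination of basis functions, each over a small subset of the state variables) can be illustrated with a minimal sketch. The variables, scopes, basis functions, and weights below are hypothetical and chosen only for illustration; they are not taken from the paper, and the brute-force maximization at the end is feasible only because the toy state space is tiny.

```python
from itertools import product

# Hypothetical toy problem: 4 binary state variables, so 2**4 = 16 joint states.
NUM_VARS = 4

# Each basis function is a (scope, function) pair. The scope lists the indices
# of the few state variables the function reads. All entries are illustrative.
BASIS = [
    ((), lambda: 1.0),                           # constant basis function
    ((0,), lambda x0: float(x0)),                # depends on variable 0 only
    ((1, 2), lambda x1, x2: float(x1 and x2)),   # depends on variables 1 and 2
    ((3,), lambda x3: float(x3)),                # depends on variable 3 only
]

# One weight per basis function (assumed values, not learned here).
WEIGHTS = [0.5, 2.0, 1.5, -1.0]


def value(state):
    """Approximate value V(x) = sum_i w_i * h_i(x restricted to scope_i)."""
    return sum(
        w * h(*(state[i] for i in scope))
        for w, (scope, h) in zip(WEIGHTS, BASIS)
    )


# Enumerating all states is only possible in a toy problem like this one.
all_states = list(product((0, 1), repeat=NUM_VARS))
best_state = max(all_states, key=value)
```

The point of the representation is that the number of weights grows with the number of basis functions rather than with the number of joint states, so the same structure remains storable when full enumeration is not.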
Generalizing plans to new environments in relational MDPs
In International Joint Conference on Artificial Intelligence (IJCAI-03), 2003
Cited by 74 (2 self)
A longstanding goal in planning research is the ability to generalize plans developed for some set of environments to a new but similar environment, with minimal or no replanning. Such generalization can both reduce planning time and allow us to tackle larger domains than the ones tractable for direct planning. In this paper, we present an approach to the generalization problem based on a new framework of relational Markov Decision Processes (RMDPs). An RMDP can model a set of similar environments by representing objects as instances of different classes. In order to generalize plans to multiple environments, we define an approximate value function specified in terms of classes of objects and, in a multiagent setting, by classes of agents. This class-based approximate value function is optimized relative to a sampled subset of environments, and computed using an efficient linear programming method. We prove that a polynomial number of sampled environments suffices to achieve performance close to the performance achievable when optimizing over the entire space. Our experimental results show that our method generalizes plans successfully to new, significantly larger, environments, with minimal loss of performance relative to environment-specific planning. We demonstrate our approach on a real strategic computer war game.
Convex approximations of chance constrained programs
SIAM Journal on Optimization, 2006
Cited by 38 (3 self)
We consider a chance constrained problem, where one seeks to minimize a convex objective over solutions satisfying, with a given close to one probability, a system of randomly perturbed convex constraints. This problem may happen to be computationally intractable; our goal is to build its computationally tractable approximation, i.e., an efficiently solvable deterministic optimization program with the feasible set contained in the chance constrained problem. We construct a general class of such convex conservative approximations of the corresponding chance constrained problem. Moreover, under the assumptions that the constraints are affine in the perturbations and the entries in the perturbation vector are independent-of-each-other random variables, we build a large deviation-type approximation, referred to as “Bernstein approximation,” of the chance constrained problem. This approximation is convex and efficiently solvable. We propose a simulation-based scheme for bounding the optimal value in the chance constrained problem and report numerical experiments aimed at comparing the Bernstein and well-known scenario approximation approaches. Finally, we extend our construction to the case of ambiguous chance constrained problems, where the random perturbations are independent with the collection of distributions known to belong to a given convex compact set rather than to be known exactly, while the chance constraint should be satisfied for every distribution given by this set.
Uncertain convex programs: Randomized solutions and confidence levels
Mathematical Programming, 2005
Cited by 28 (0 self)
Many engineering problems can be cast as optimization problems subject to convex constraints that are parameterized by an uncertainty or ‘instance’ parameter. A recently emerged successful paradigm for attacking these problems is robust optimization, where one seeks a solution which simultaneously satisfies all possible constraint instances. In practice, however, the robust approach is effective only for problem families with rather simple dependence on the instance parameter (such as affine or polynomial), and leads in general to conservative answers, since the solution is usually computed by transforming the original semi-infinite problem into a standard one, by means of relaxation techniques. In this paper, we take an alternative ‘randomized’ or ‘scenario’ approach: by randomly sampling the uncertainty parameter, we substitute the original infinite constraint set with a finite set of N constraints. We show that the resulting randomized solution fails to satisfy only a small portion of the original constraints, provided that a sufficient number of samples is drawn. Our key result is to provide an efficient explicit bound on the measure (probability or volume) of the original constraints that are possibly violated by the randomized solution. This volume rapidly decreases to zero as N is increased.
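The scenario idea in this abstract can be seen in a one-dimensional toy problem, sketched below under assumptions that are not from the paper: minimize x subject to x >= delta for every delta in [0, 1], with delta drawn uniformly. The robust solution is x = 1; the scenario solution replaces the infinite constraint family with N sampled constraints. The function names are hypothetical.

```python
import random


def scenario_solution(samples):
    """Smallest x that satisfies x >= delta for every sampled delta."""
    return max(samples)


def violation_probability(x):
    """P(delta > x) for delta ~ Uniform(0, 1): the measure of violated constraints."""
    return min(1.0, max(0.0, 1.0 - x))


# Assumed setup: N independent uniform samples of the uncertainty parameter.
rng = random.Random(0)
N = 1000
samples = [rng.random() for _ in range(N)]

x_n = scenario_solution(samples)        # randomized (scenario) solution
violation = violation_probability(x_n)  # fraction of original constraints violated
```

In this toy case the violation exceeds a level eps exactly when all N samples fall below 1 - eps, which happens with probability (1 - eps)^N, so the violated measure shrinks quickly as N grows. This is only an illustration of the qualitative claim, not the bound derived in the paper.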
Linear Program Approximations for Factored Continuous-State Markov Decision Processes
In Advances in Neural Information Processing Systems 16, 2003
Cited by 24 (10 self)
Approximate linear programming (ALP) has emerged recently as one of the most promising methods for solving complex factored MDPs with finite state spaces. In this work we show that ALP solutions are not limited only to MDPs with finite state spaces, but that they can also be applied successfully to factored continuous-state MDPs (CMDPs). We show how one can build an ALP-based approximation for such a model and contrast it to existing solution methods. We argue that this approach offers a robust alternative for solving high dimensional continuous-state space problems. The point is supported by experiments on three CMDP problems with 24-25 continuous state factors.
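For orientation, the finite-state ALP that this abstract takes as its starting point is commonly written as below. This is the standard textbook form, not a formula quoted from the paper; the symbols are assumptions introduced here: basis functions $h_i$, weights $w_i$, state-relevance weights $\alpha(x) > 0$, reward $R$, transition model $P$, and discount factor $\gamma$.

```latex
\begin{aligned}
\min_{w} \quad & \sum_{x} \alpha(x) \sum_{i} w_i \, h_i(x) \\
\text{s.t.} \quad & \sum_{i} w_i \, h_i(x) \;\ge\; R(x, a)
  + \gamma \sum_{x'} P(x' \mid x, a) \sum_{i} w_i \, h_i(x')
  \qquad \forall x,\ \forall a .
\end{aligned}
```

There is one constraint per state-action pair, which is why the constraint set becomes unmanageable in large or continuous state spaces; in the continuous-state setting the sums over states would be replaced by integrals.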
Second Order Cone Programming Approaches for Handling Missing and Uncertain Data
Journal of Machine Learning Research, 2006
Cited by 22 (6 self)
We propose a novel second order cone programming formulation for designing robust classifiers which can handle uncertainty in observations. Similar formulations are also derived for designing regression functions which are robust to uncertainties in the regression setting. The proposed formulations are independent of the underlying distribution, requiring only the existence of second order moments. These formulations are then specialized to the case of missing values in observations for both classification and regression problems. Experiments show that the proposed formulations outperform imputation.
Solving Factored MDPs with Continuous and Discrete Variables
In Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence, 2004
Cited by 21 (7 self)
Although many real-world stochastic planning problems are more naturally formulated by hybrid models with both discrete and continuous variables, current state-of-the-art methods cannot adequately address these problems. We present the first framework that can exploit problem structure for modeling and solving hybrid problems efficiently. We formulate these problems as hybrid Markov decision processes (MDPs with continuous and discrete state and action variables), which we assume can be represented in a factored way using a hybrid dynamic Bayesian network (hybrid DBN). This formulation also allows us to apply our methods to collaborative multiagent settings. We present a new linear program approximation method that exploits the structure of the hybrid MDP and lets us compute approximate value functions more efficiently. In particular, we describe a new factored discretization of continuous variables that avoids the exponential blow-up of traditional approaches. We provide theoretical bounds on the quality of such an approximation and on its scale-up potential. We support our theoretical arguments with experiments on a set of control problems with up to 28-dimensional continuous state space and 22-dimensional action space.
Ambiguous Chance Constrained Problems And Robust Optimization
Mathematical Programming, 2004
Cited by 17 (1 self)
In this paper we study ambiguous chance constrained problems where the distributions of the random parameters in the problem are themselves uncertain. We primarily focus on the special case where the uncertainty set $\mathcal{Q}$ of the distributions is of the form $\mathcal{Q} = \{Q : \rho_p(Q, Q_0) \le \beta\}$, where $\rho_p$ denotes the Prohorov metric. The ambiguous chance constrained problem is approximated by a robust sampled problem where each constraint is a robust constraint centered at a sample drawn according to the central measure $Q_0$. The main contribution of this paper is to show that the robust sampled problem is a good approximation for the ambiguous chance constrained problem with high probability. This result is established using the Strassen-Dudley Representation Theorem that states that when the distributions of two random variables are close in the Prohorov metric one can construct a coupling of the random variables such that the samples are close with high probability. We also show that the robust sampled problem can be solved efficiently both in theory and in practice.
An MCMC Approach to Solving Hybrid Factored MDPs
In Proceedings of the 19th International Joint Conference on Artificial Intelligence, 2005
Cited by 15 (7 self)
Hybrid approximate linear programming (HALP) has recently emerged as a promising framework for solving large factored Markov decision processes (MDPs) with discrete and continuous state and action variables. Our work addresses its major computational bottleneck: constraint satisfaction in large structured domains of discrete and continuous variables. We analyze this problem and propose a novel Markov chain Monte Carlo (MCMC) method for finding the most violated constraint of a relaxed HALP. This method does not require the discretization of continuous variables, searches the space of constraints intelligently based on the structure of factored MDPs, and its space complexity is linear in the number of variables. We test the method on a set of large control problems and demonstrate improvements over alternative approaches.
A unifying framework for computational reinforcement learning theory
2009
Cited by 13 (6 self)
Computational learning theory studies mathematical models that allow one to formally analyze and compare the performance of supervised-learning algorithms such as their sample complexity. While existing models such as PAC (Probably Approximately Correct) have played an influential role in understanding the nature of supervised learning, they have not been as successful in reinforcement learning (RL). Here, the fundamental barrier is the need for active exploration in sequential decision problems. An RL agent tries to maximize long-term utility by exploiting its knowledge about the problem, but this knowledge has to be acquired by the agent itself through exploring the problem that may reduce short-term utility. The need for active exploration is common in many problems in daily life, engineering, and sciences. For example, a Backgammon program strives to take good moves to maximize the probability of winning a game, but sometimes it may try novel and possibly harmful moves to discover how the opponent reacts in the hope of discovering a better game-playing strategy. It has been known since the early days of RL that a good tradeoff between exploration and exploitation is critical for the agent to learn fast (i.e., to reach near-optimal strategies

