Hard Constrained Semi-Markov Decision Processes
In multiple-criteria Markov Decision Processes (MDPs), where multiple costs are incurred at every decision point, current methods minimise the expected primary cost criterion while constraining the expectations of the other cost criteria to some critical values. However, systems are often
Duality between Probability and Optimization
, 1997
this paper. The link between weak convergence and the epigraph convergence used in convex analysis is established. The Cramér transform used in the large deviation literature is defined as the composition of the Laplace transform, the logarithm, and the Fenchel transform. It transforms convolution
Asymptotic Properties of Constrained Markov Decision Processes
, 1991
We present in this paper several asymptotic properties of constrained Markov Decision Processes (MDPs) with a countable state space. We treat both the discounted and the expected average cost, with unbounded cost. We are interested in (1) the convergence of finite horizon MDPs to the infinite
Bounds for Markov Decision Processes
, 2011
We consider the problem of producing lower bounds on the optimal cost-to-go function of a Markov decision problem. We present two approaches to this problem: one based on the methodology of approximate linear programming (ALP) and another based on the so-called martingale duality approach. We show
Solving Uncertain Markov Decision Processes
 Carnegie Mellon University
, 2001
stochastic dynamic game is proposed, and the security equilibrium solution of the game is shown to correspond to the value function under the worst model and the optimal controller. The authors demonstrate that the uncertain model approach can be used to solve a class of nearly Markovian Decision
Integrating value functions and policy search for continuous Markov Decision Processes
Value function approaches for Markov decision processes have been used successfully to find optimal policies for a large number of problems. Recent findings have demonstrated that policy search can be used effectively in reinforcement learning when standard value function techniques become
Markov Decision Processes in Finance
, 2006
The market is arbitrage-free without transaction costs and the underlying asset price process is assumed to possess a Markov chain structure. Under these assumptions, stochastic dynamic programming is exploited to price the European-type option. By using the utility concept
Denumerable Constrained Markov Decision Problems And Finite Approximations
, 1992
The purpose of this paper is twofold: first, to establish the theory of discounted constrained Markov Decision Processes with countable state and action spaces and general multichain structure; second, to introduce finite approximation methods. We define the occupation measures and obtain
Applying Markov Decision Process
, 2011
considered the spatiotemporal environment under bounded rationality using Markov Decision Process modeling to generalize patterns of agent behavior by analyzing the determinants of value functions, and of factors that modify policy-action-induced cognitive abilities. Since detecting patterns are central
