Results 1 - 9 of 9
Pathwise Optimization for Optimal Stopping Problems
, 2010
Abstract

Cited by 8 (1 self)
We introduce the pathwise optimization (PO) method, a new convex optimization procedure to produce upper and lower bounds on the optimal value (the ‘price’) of a high-dimensional optimal stopping problem. The PO method builds on a dual characterization of optimal stopping problems as optimization problems over the space of martingales, which we dub the martingale duality approach. We demonstrate via numerical experiments that the PO method produces upper bounds of a quality comparable with state-of-the-art approaches, but in a fraction of the time required for those approaches. As a by-product, it yields lower bounds (and suboptimal exercise policies) that are substantially superior to those produced by state-of-the-art methods. The PO method thus constitutes a practical and desirable approach to high-dimensional pricing problems. Further, we develop an approximation theory relevant to martingale duality approaches in general and the PO method in particular. Our analysis provides a guarantee on the quality of upper bounds resulting from these approaches, and identifies three key determinants of their performance: the quality of an input value function approximation, the square root of the effective time horizon of the problem, and a certain spectral measure of ‘predictability’ of the underlying Markov chain. As a corollary to this analysis we develop approximation guarantees specific to the PO method. Finally, we view the PO method and several approximate dynamic programming (ADP) methods for high-dimensional pricing problems through a common lens and in doing so show that the PO method dominates those alternatives.
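As a rough illustration of the martingale duality idea this abstract refers to: for any martingale M with M_0 = 0, the optimal stopping value is bounded above by E[max_t (g(X_t) - M_t)]. The sketch below estimates such a dual upper bound by Monte Carlo for a toy stopping problem on a symmetric random walk; the payoff, the value-function surrogate `vhat`, and all constants are illustrative assumptions, not the paper's setup.

```python
import random

random.seed(0)

T, N, X0 = 10, 2000, 100    # horizon, Monte Carlo paths, start state

def payoff(x):              # put-style payoff; illustrative assumption
    return max(100.0 - x, 0.0)

def vhat(x):                # crude value-function surrogate (assumption)
    return payoff(x)

def cond_exp_vhat(x):       # exact one-step conditional expectation on a +/-1 walk
    return 0.5 * (vhat(x + 1) + vhat(x - 1))

total = 0.0
for _ in range(N):
    x, m = X0, 0.0
    best = payoff(x) - m
    for t in range(T):
        step = random.choice((-1, 1))
        m += vhat(x + step) - cond_exp_vhat(x)  # martingale increment, mean zero
        x += step
        best = max(best, payoff(x) - m)         # pathwise max of penalized payoff
    total += best
upper_bound = total / N
print(upper_bound)  # sample average of the pathwise maximum: a valid dual upper bound
```

A better value-function surrogate produces a tighter martingale penalty and hence a tighter bound, which is the lever the PO method optimizes over.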
Imputing a Convex Objective Function
Abstract

Cited by 7 (1 self)
We consider an optimizing process (or parametric optimization problem), i.e., an optimization problem that depends on some parameters. We present a method for imputing or estimating the objective function, based on observations of optimal or nearly optimal choices of the variable for several values of the parameter, and prior knowledge (or assumptions) about the objective. Applications include estimation of consumer utility functions from purchasing choices, estimation of value functions in control problems given observations of an optimal (or just good) controller, and estimation of cost functions in a flow network.
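A minimal sketch of the imputation idea, under an assumed toy model (not the paper's formulation): suppose the agent minimizes f(x) = w1*(x - p)**2 + w2*x**2 with unknown positive weights. Stationarity gives x*(p) = w1/(w1 + w2) * p, so observed (parameter, choice) pairs identify the weights up to scale via a single least-squares slope.

```python
# Ground-truth weights used only to generate synthetic observations.
true_w1, true_w2 = 3.0, 1.0
params = [0.5, 1.0, 1.5, 2.0, 2.5]
observations = [(p, true_w1 / (true_w1 + true_w2) * p) for p in params]

# Least-squares fit of the slope x* = s * p, then normalize w1 + w2 = 1.
slope = sum(p * x for p, x in observations) / sum(p * p for p, _ in observations)
w1_hat, w2_hat = slope, 1.0 - slope
print(w1_hat, w2_hat)  # recovers (0.75, 0.25), i.e. (3, 1) up to scale
```

With noisy or only approximately optimal observations, the same stationarity residuals would be minimized rather than driven to zero, which is the regime the paper addresses.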
A Dynamic Traveling Salesman Problem with Stochastic Arc Costs
 Operations Research
"... {toriello, wbhaskel, poremba} at usc dot edu ..."
Nonparametric Approximate Dynamic Programming via the Kernel Method
, 2012
Abstract

Cited by 1 (0 self)
This paper presents a novel and practical nonparametric approximate dynamic programming (ADP) algorithm that enjoys graceful, dimension-independent approximation and sample complexity guarantees. In particular, we establish both theoretically and computationally that our proposal can serve as a viable replacement for state-of-the-art parametric ADP algorithms, freeing the designer from carefully specifying an approximation architecture. We accomplish this by ‘kernelizing’ a recent mathematical program for ADP (the ‘smoothed’ approximate LP) proposed by Desai et al. (2011). Our theoretical guarantees establish that the quality of the approximation produced by our procedure improves gracefully with sampling effort. Via a computational study on a controlled queueing network, we show that our nonparametric procedure outperforms the state-of-the-art parametric ADP approaches and established heuristics.
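To give a feel for the nonparametric ingredient, here is a hedged sketch using a Nadaraya-Watson kernel smoother as a stand-in for the paper's kernelized machinery: it shows how a kernel lets sampled states define the value-function estimate directly, with no hand-chosen basis functions, but it is not the actual smoothed-ALP method. The sampled states, targets, and bandwidth are all illustrative assumptions.

```python
import math

def gauss_kernel(x, xi, h=0.5):
    # Gaussian similarity between query state x and sampled state xi.
    return math.exp(-((x - xi) ** 2) / (2.0 * h * h))

# Sampled states and noisy value targets from a hypothetical simulator.
samples = [(0.0, 0.1), (0.5, 0.4), (1.0, 1.1), (1.5, 2.2), (2.0, 3.9)]

def v_hat(x):
    # Kernel-weighted average of sampled targets: the "architecture"
    # is induced entirely by the data and the kernel.
    weights = [gauss_kernel(x, xi) for xi, _ in samples]
    return sum(w * y for w, (_, y) in zip(weights, samples)) / sum(weights)

print(v_hat(1.0))  # smoothed estimate near the sampled target 1.1
```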
Equivalence of an Approximate Linear Programming Bound with the Held-Karp Bound for the Traveling Salesman Problem
"... atoriello at isye dot gatech dot edu ..."
Large-Scale Markov Decision Problems with KL Control Cost and its Application to Crowdsourcing
Abstract
We study average- and total-cost Markov decision problems with large state spaces. Since the computational and statistical cost of finding the optimal policy scales with the size of the state space, we focus on searching for near-optimality in a low-dimensional family of policies. In particular, we show that for problems with a Kullback-Leibler divergence cost function, we can recast policy optimization as a convex optimization problem and solve it approximately using a stochastic subgradient algorithm. This method scales in complexity with the family of policies but not the state space. We show that the performance of the resulting policy is close to the best in the low-dimensional family. We demonstrate the efficacy of our approach by optimizing a policy for budget allocation in crowd labeling, an important crowdsourcing application.
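The scalable ingredient the abstract describes is projected stochastic subgradient descent on a convex objective over a low-dimensional policy parameter. A minimal sketch, on an assumed toy objective rather than the paper's KL-cost MDP: minimize E[|theta - xi|], whose minimizer is the median of xi (here 2.0). Each update touches only one sampled scenario, so the cost per step is independent of any underlying state-space size.

```python
import random

random.seed(1)

theta, theta_sum, steps = 0.0, 0.0, 2000
for t in range(1, steps + 1):
    xi = random.choice((1.0, 2.0, 3.0))                       # one sampled scenario
    g = 1.0 if theta > xi else (-1.0 if theta < xi else 0.0)  # subgradient of |theta - xi|
    theta = min(max(theta - g / t ** 0.5, 0.0), 4.0)          # 1/sqrt(t) step, project onto [0, 4]
    theta_sum += theta
theta_avg = theta_sum / steps                                 # Polyak-style averaged iterate
print(theta_avg)  # concentrates near the minimizer 2.0
```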
Risk-Sensitive and Efficient Reinforcement Learning Algorithms
Abstract
The research thesis was done under the supervision of Prof. Shie Mannor in
Quadratic approximate . . . input-affine systems
, 2012
Abstract
We consider the use of quadratic approximate value functions for stochastic control problems with input-affine dynamics and convex stage cost and constraints. Evaluating the approximate dynamic programming policy in such cases requires the solution of an explicit convex optimization problem, such as a quadratic program, which can be carried out efficiently. We describe a simple and general method for approximate value iteration that also relies on our ability to solve convex optimization problems, in this case typically a semidefinite program. Although we have no theoretical guarantee on the performance attained using our method, we observe that very good performance can be obtained in practice.
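A hedged scalar sketch of the policy-evaluation step the abstract describes: with a quadratic approximate value function V(x) = p*x**2 and input-affine dynamics x+ = a*x + b*u, the one-step ADP policy minimizes a convex quadratic in u, which in this scalar toy case has a closed form (in general it would be a quadratic program). All constants below are illustrative assumptions.

```python
a, b = 1.2, 1.0   # dynamics: x_next = a*x + b*u
q, r = 1.0, 0.1   # stage cost: q*x**2 + r*u**2
p = 2.0           # quadratic value-function coefficient (assumed given)

def adp_policy(x):
    # argmin_u  r*u**2 + p*(a*x + b*u)**2   (the q*x**2 term is constant in u)
    return -p * a * b * x / (r + p * b * b)

x = 5.0
u = adp_policy(x)
x_next = a * x + b * u
print(u, x_next)
```

The first-order condition 2*r*u + 2*p*b*(a*x + b*u) = 0 is satisfied exactly, which is what makes the policy cheap to evaluate online.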