Results 1-3 of 3
Numerical Analysis in the Twentieth Century
in Numerical Analysis: Historical Developments in the 20th Century, C. Brezinski and L. Wuytack, Editors, North-Holland, 2001
Abstract

Cited by 3 (0 self)
This paper attracted much attention, while similar results obtained by William Karush in his Master's Thesis in 1939 [154], under the supervision of Lawrence M. Graves at the University of Chicago, and by Fritz John (1910-1995) in 1948 [147] were almost totally ignored (John's paper was even rejected).
Characterization and computation of restless bandit marginal productivity indices
in SMCtools '07: Proc. 2007 Workshop on Tools for Solving Structured Markov Chains
Abstract

Cited by 2 (1 self)
Appl. Probab. 25A, 287-298] yields a practical scheduling rule for the versatile yet intractable multi-armed restless bandit problem, involving the optimal dynamic priority allocation to multiple stochastic projects, modeled as restless bandits, i.e., binary-action (active/passive) (semi-)Markov decision processes. A growing body of evidence shows that such a rule is nearly optimal in a wide variety of applications, which raises the need to efficiently compute the Whittle index and more general marginal productivity index (MPI) extensions in large-scale models. For such a purpose, this paper extends to restless bandits the parametric linear programming (LP) approach deployed in [J. Niño-Mora, A (2/3)n^3 fast-pivoting algorithm for the Gittins index and optimal stopping of a Markov chain, INFORMS J. Comp., in press], which yielded a fast Gittins-index algorithm. Yet the extension is not straightforward, as the MPI is only defined for the limited range of so-called indexable bandits, which motivates the quest for methods to establish indexability. This paper furnishes algorithmic and analytical tools to realize the potential of MPI policies in large-scale applications, presenting the following contributions: (i) a complete algorithmic ...
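The Whittle index the abstract refers to can be illustrated with a small sketch: for a single restless bandit, it is the passive subsidy at which the active and passive actions become equally attractive in a given state. The following is a minimal bisection-plus-value-iteration sketch with hypothetical names, assuming the bandit is indexable; it is not the paper's method, whose adaptive-greedy MPI algorithm computes the index exactly and far more efficiently.

```python
import numpy as np

def whittle_index(P_act, P_pas, r_act, r_pas, beta, s, n_vi=500, n_bis=60):
    """Approximate the Whittle index of state s by bisection on the passive
    subsidy lam: the index is the lam at which the active and passive
    Q-values coincide in state s. Assumes indexability (otherwise the
    crossing point need not be well defined)."""
    def q_gap(lam):
        V = np.zeros(len(r_act))
        for _ in range(n_vi):  # value iteration for the lam-subsidized MDP
            qa = r_act + beta * (P_act @ V)
            qp = r_pas + lam + beta * (P_pas @ V)
            V = np.maximum(qa, qp)
        qa = r_act + beta * (P_act @ V)
        qp = r_pas + lam + beta * (P_pas @ V)
        return qa[s] - qp[s]  # > 0: active still preferred, subsidy too low
    span = np.abs(r_act).max() + np.abs(r_pas).max() + 1.0
    lo, hi = -span, span
    for _ in range(n_bis):
        mid = 0.5 * (lo + hi)
        if q_gap(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

As a sanity check, for a degenerate one-state bandit that pays r under the active action and 0 under the passive one (with identical self-loop dynamics), the subsidy equalizing the two actions is exactly r.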
Computing a Classic Index for Finite-Horizon Bandits, 2011
Abstract

Cited by 2 (0 self)
This paper considers the efficient exact computation of the counterpart of the Gittins index for a finite-horizon discrete-state bandit, which measures for each initial state the average productivity, given by the maximum ratio of expected total discounted reward earned to expected total discounted time expended that can be achieved through a number of successive plays stopping by the given horizon. Besides characterizing optimal policies for the finite-horizon one-armed bandit problem, such an index provides a suboptimal heuristic index rule for the intractable finite-horizon multi-armed bandit problem, which represents the natural extension of the Gittins index rule (optimal in the infinite-horizon case). Although such a finite-horizon index was introduced in classic work in the 1950s, investigation of its efficient exact computation has received scant attention. This paper introduces a recursive adaptive-greedy algorithm using only arithmetic operations that computes the index in (pseudo-)polynomial time in the problem parameters (number of project states and time horizon length). In the special case of a project with limited transitions per state, the complexity is either reduced or depends only on the length of the time horizon. The proposed algorithm is benchmarked in a computational study against the conventional calibration method.
Key words: dynamic programming, Markov; bandits, finite-horizon; index policies; analysis of algorithms; computational complexity
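The ratio definition above admits a brute-force illustration: the index of a state is the largest per-play charge lam at which playing at least once, and stopping within T plays, still has nonnegative excess value over stopping immediately. The sketch below uses that calibration-style bisection with hypothetical names; it is illustrative only, and is precisely the kind of conventional calibration approach the paper's recursive adaptive-greedy algorithm is benchmarked against, not the paper's algorithm itself.

```python
import numpy as np

def finite_horizon_index(P, r, beta, T, s, n_bis=60):
    """Approximate the finite-horizon index of state s by bisection on a
    per-play charge lam. excess(lam) is the optimal charged value when at
    most T plays remain; the index is the largest lam with excess > 0."""
    def excess(lam):
        W = np.zeros(len(r))   # excess value with 0 plays remaining
        for _ in range(T):     # backward induction over remaining plays
            W = np.maximum(0.0, r - lam + beta * (P @ W))
        return W[s]            # > 0: some stopping rule beats charge lam
    lo, hi = r.min() - 1.0, r.max() + 1.0
    for _ in range(n_bis):
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For instance, a two-state project that pays 1 in its first state and then moves permanently to a zero-reward absorbing state is best played exactly once from the first state, so the discounted reward-to-time ratio there is 1.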