Results 1-10 of 40
Assortment planning: Review of literature and industry practice
Retail Supply Chain Management, 2005
Cited by 44 (3 self)
This paper is an invited chapter to appear in Retail Supply Chain Management, Eds. N. Agrawal and S. A.
Combinatorial Multi-Armed Bandit: General Framework, Results and Applications
Cited by 29 (4 self)
We define a general framework for a large class of combinatorial multi-armed bandit (CMAB) problems, where simple arms with unknown distributions form super arms. In each round, a super arm is played and the outcomes of its related simple arms are observed, which helps the selection of super arms in future rounds. The reward of the super arm depends on the outcomes of played arms, and it only needs to satisfy two mild assumptions, which allow a large class of nonlinear reward instances. We assume the availability of an (α, β)-approximation oracle that takes the means of the distributions of arms and outputs a super arm that with probability β generates an α fraction of the optimal expected reward. The objective of a CMAB algorithm is to minimize (α, β)-approximation regret, which is the difference in total expected reward between the αβ fraction of expected reward when always playing the optimal super arm, and the expected reward of playing super arms according to the algorithm. We provide CUCB algorithm that achieves O(log n) regret, where n is the number of rounds played, and we further provide distribution-independent bounds for a large class of reward functions. Our regret analysis is tight in that it matches the bound for classical MAB problem up to a constant factor, and it significantly improves the regret bound ... (Proceedings of the 30th ...)
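The (α, β)-approximation regret described in this abstract can be written compactly as follows. The symbols are assumptions introduced here for illustration and are not taken from the listing: opt_μ is the optimal expected reward under the true mean vector μ, r_μ(S) is the expected reward of super arm S, and S_t is the super arm the algorithm plays in round t.

```latex
\mathrm{Reg}_{\alpha,\beta}(n)
  \;=\; n \cdot \alpha \cdot \beta \cdot \mathrm{opt}_{\mu}
  \;-\; \mathbb{E}\!\left[\sum_{t=1}^{n} r_{\mu}(S_t)\right]
```

With an exact oracle (α = β = 1) this reduces to the usual regret against always playing the optimal super arm.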
Dynamic Assortment Optimization with a Multinomial Logit Choice Model and Capacity Constraint
2008
Cited by 23 (3 self)
The paper considers a stylized model of a dynamic assortment optimization problem, where given a limited capacity constraint, we must decide the assortment of products to offer to customers to maximize the profit. Our model is motivated by the problem faced by retailers of stocking products on a shelf with limited capacities and by the problem of placing a limited number of ads on a web page. We assume that each customer chooses to purchase the product (or to click on the ad) that maximizes her utility. We use the multinomial logit choice model to represent demand. However, we do not know the demand for each product. We can learn the demand distribution by offering different product assortments, observing resulting selections, and inferring the demand distribution from past selections and assortment decisions. We present an adaptive policy for joint parameter estimation and assortment optimization. To evaluate our proposed policy, we define a benchmark profit as the maximum expected profit that we can earn if we know the underlying demand distribution in advance. We show that the running average expected profit generated by our policy converges to the benchmark profit and establish its convergence rate. Numerical experiments based on sales data from an online retailer indicate that our policy performs well, generating over 90% of the optimal profit after less than two days of sales.
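To make the benchmark in this abstract concrete, here is a minimal sketch of the static problem with known demand parameters: multinomial logit choice probabilities plus a brute-force search over assortments that respect the capacity limit. It is not the paper's adaptive policy. The function names, the no-purchase option with utility 0, and the exhaustive search are all assumptions made for illustration.

```python
from itertools import combinations
from math import exp


def mnl_choice_probs(assortment, utilities):
    """Purchase probability of each offered product under an MNL model.

    Assumption: the no-purchase option has mean utility 0, i.e. weight 1.
    """
    weights = {i: exp(utilities[i]) for i in assortment}
    denom = 1.0 + sum(weights.values())
    return {i: w / denom for i, w in weights.items()}


def expected_profit(assortment, utilities, profits):
    """Expected profit per customer when `assortment` is offered."""
    probs = mnl_choice_probs(assortment, utilities)
    return sum(profits[i] * p for i, p in probs.items())


def best_assortment(utilities, profits, capacity):
    """Enumerate every assortment of size <= capacity and keep the best one.

    Exhaustive search is only practical for small catalogues.
    """
    products = list(utilities)
    best, best_val = (), 0.0
    for k in range(1, capacity + 1):
        for subset in combinations(products, k):
            val = expected_profit(subset, utilities, profits)
            if val > best_val:
                best, best_val = subset, val
    return best, best_val
```

The value returned by `best_assortment` plays the role of the benchmark profit: what could be earned per customer if the demand parameters were known in advance.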
Dynamic pricing for nonperishable products with demand learning
Cited by 21 (0 self)
A retailer is endowed with a finite inventory of a nonperishable product. Demand for this product is driven by a price-sensitive Poisson process that depends on an unknown parameter, θ, a proxy for the market size. If θ is high then the retailer can take advantage of a large market charging premium prices, but if θ is small then price markdowns can be applied to encourage sales. The retailer has a prior belief on the value of θ which he updates as time and available information (prices and sales) evolve. We also assume that the retailer faces an opportunity cost when selling this nonperishable product. This opportunity cost is given by the long-term average discounted profits that the retailer can make if he switches and starts selling a different assortment of products. The retailer's objective is to maximize the discounted long-term average profits of his operation using dynamic pricing policies. We consider two cases. In the first case, the retailer is constrained to sell the entire initial stock of the nonperishable product before a different assortment is considered. In the second case, the retailer is able to stop selling the nonperishable product at any time to switch to a different menu of products. In both cases, the retailer's pricing policy trades off immediate revenues and future profits based on active demand learning. We formulate the retailer's problem as a (Poisson) intensity control problem and derive structural properties of an optimal solution which we use to propose a simple approximated solution. This solution combines a pricing policy and a stopping rule (if stopping is an option) depending on the inventory position and the retailer's belief about the value of θ. We use numerical computations, together with asymptotic analysis, to evaluate the performance of our proposed solution.
Inventory Management of a Fast-Fashion Retail Network
2007
Cited by 15 (5 self)
Working in collaboration with Spain-based retailer Zara, we address the problem of distributing over time a limited amount of inventory across all the stores in a fast-fashion retail network. Challenges specific to that environment include very short product lifecycles, and store policies whereby a reference is removed from display whenever one of its key sizes stocks out. We first formulate and analyze a stochastic model predicting the sales of a reference in a single store during a replenishment period as a function of demand forecasts, the inventory of each size initially available and the store inventory management policy just stated. Secondly, we formulate a mixed-integer program embedding a piecewise linear approximation of the first model applied to every store in the network and allowing to compute store shipment quantities maximizing overall predicted sales, subject to inventory availability and other constraints. We report the implementation of this optimization model by Zara to support its inventory distribution process, and the ensuing controlled field experiment performed to assess the impact of that model relative to the prior procedure used to determine weekly shipment quantities. The results of that experiment suggest that the new allocation process tested increases sales, reduces transhipments, ...
A Learning Approach for Interactive Marketing to a Customer Segment
doi: 10.1287/opre.1070.0427
The Impact of Quick Response in Inventory-Based Competition
2007
Cited by 8 (0 self)
We propose a multiperiod extension of the competitive newsvendor model of Lippman and McCardle (1997) to investigate the impact of quick response under competition. For this purpose, we consider two retailers that compete in terms of inventory: customers that face a stockout at their first-choice store will look for the product at the other store. Consequently, the total demand that each retailer faces depends on the competitor’s inventory level. We allow for asymmetric reordering capabilities, and we are particularly interested in the case when one of the firms has a lower ordering cost but can only produce at the beginning of the selling season, whereas the second firm has higher costs but can replenish stock in a quick response manner taking advantage of any incremental knowledge about demand (if it is available). We visualize this problem as the competition between a traditional make-to-stock retailer that builds up inventory before the season starts versus a retailer with a responsive supply chain that can react to early demand information. We provide conditions for this game to have a unique pure-strategy subgame-perfect equilibrium, which then allows us to perform numerical comparative statics. Our results confirm in a competitive setting the intuitive fact that quick response is more beneficial when demand uncertainty is higher, or exhibits a higher correlation over time. Finally, we find that part of the competitive advantage from quick response arises from the asymmetry in response capabilities.
Dynamic Assortment Customization with Limited Inventories
2010
Cited by 7 (0 self)
We consider a retailer with limited inventories of identically priced, substitutable products. Customers arrive sequentially and the firm decides which subset of the products to offer to each arriving customer depending on the customer’s preferences, the inventory levels and the remaining time in the season. We show that the optimal assortment policy is to offer all available products if the customer base is homogeneous with respect to their product preferences. However, with multiple customer segments characterized by different product preferences, it may be optimal to limit the choice set of some customers. That is, it may be optimal not to offer products with low inventories to some customer segments and reserve them for future customers (who may have a stronger preference for those products). For the case of two products and two customer segments and for a special case with multiple products and multiple customer segments, we show that the optimal assortment policy is a threshold policy under which a product is offered to a customer segment if its inventory level is higher than a threshold value. The threshold levels are decreasing in time and increasing in the inventory levels of other products. For the general case, we perform a large numerical study, and confirm that the optimal policy continues to be of the threshold type. We find that the revenue impact of assortment customization can be significant, especially when customer heterogeneity is high and the starting inventory levels of the products are asymmetric. This demonstrates the use of assortment customization as another lever for revenue maximization in addition to pricing.
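The threshold rule described in this abstract can be sketched as a small lookup. This is only an illustration of the rule's shape: the function name, the `(product, segment)` keying, and the default threshold of 0 are hypothetical, and the thresholds are taken as given for the current time and inventory state rather than computed from the paper's model.

```python
def offered_products(inventory, thresholds, segment):
    """Products shown to `segment` under a threshold policy.

    A product is offered only if it is in stock and its inventory exceeds
    the threshold for this (product, segment) pair. Pairs missing from
    `thresholds` default to 0, meaning "offer whenever in stock".
    """
    return [
        product
        for product, stock in inventory.items()
        if stock > 0 and stock > thresholds.get((product, segment), 0)
    ]
```

For example, with `inventory = {"A": 5, "B": 1}` and a threshold of 2 on `("B", "s1")`, segment `s1` is offered only product A, while a segment with no threshold on B is offered both, so the scarce unit of B is reserved for customers who are assumed to value it more.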
Computing a Classic Index for Finite-Horizon Bandits
2011
Cited by 6 (1 self)
This paper considers the efficient exact computation of the counterpart of the Gittins index for a finite-horizon discrete-state bandit, which measures for each initial state the average productivity, given by the maximum ratio of expected total discounted reward earned to expected total discounted time expended that can be achieved through a number of successive plays stopping by the given horizon. Besides characterizing optimal policies for the finite-horizon one-armed bandit problem, such an index provides a suboptimal heuristic index rule for the intractable finite-horizon multi-armed bandit problem, which represents the natural extension of the Gittins index rule (optimal in the infinite-horizon case). Although such a finite-horizon index was introduced in classic work in the 1950s, investigation of its efficient exact computation has received scant attention. This paper introduces a recursive adaptive-greedy algorithm using only arithmetic operations that computes the index in (pseudo-)polynomial time in the problem parameters (number of project states and time horizon length). In the special case of a project with limited transitions per state, the complexity is either reduced or depends only on the length of the time horizon. The proposed algorithm is benchmarked in a computational study against the conventional calibration method.
Combinatorial multi-armed bandit and its extension to probabilistically triggered arms
Journal of Machine Learning Research, 2016
Cited by 5 (3 self)
We define a general framework for a large class of combinatorial multi-armed bandit (CMAB) problems, where subsets of base arms with unknown distributions form super arms. In each round, a super arm is played and the base arms contained in the super arm are played and their outcomes are observed. We further consider the extension in which more base arms could be probabilistically triggered based on the outcomes of already triggered arms. The reward of the super arm depends on the outcomes of all played arms, and it only needs to satisfy two mild assumptions, which allow a large class of nonlinear reward instances. We assume the availability of an offline (α, β)-approximation oracle that takes the means of the outcome distributions of arms and outputs a super arm that with probability β generates an α fraction of the optimal expected reward. The objective of an online learning algorithm for CMAB is to minimize (α, β)-approximation regret, which is the difference in total expected reward between the αβ fraction of expected reward when always playing the optimal super arm, and the expected reward of playing super arms according to the algorithm. We provide CUCB algorithm that achieves O(log n) distribution-dependent regret, where n is the number of rounds played, and we further provide distribution-independent bounds for a large class of reward functions. Our regret analysis is tight in that it matches the bound of UCB1 algorithm (up to a constant factor) for the classical MAB problem, and it significantly improves the regret bound in an earlier paper on combinatorial bandits. A preliminary version of this paper has appeared in ICML
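The play-observe-update loop this abstract describes can be sketched as below. This is a simplified illustration under stated assumptions, not the authors' implementation: outcomes are Bernoulli, the reward is the sum of the played arms' outcomes, the oracle is an exact top-k selection (so α = β = 1), and there are no probabilistically triggered arms. The confidence radius sqrt(3 ln t / (2 T_i)) is the form commonly given for CUCB; the names `cucb` and `top_k_oracle` are hypothetical.

```python
import math
import random


def top_k_oracle(scores, k):
    """Exact oracle for a sum-of-outcomes reward: the k highest-scoring arms."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def cucb(true_means, k, rounds, seed=0):
    """Simulate a CUCB-style learner on Bernoulli base arms (requires k <= m)."""
    rng = random.Random(seed)
    m = len(true_means)
    counts = [0] * m        # T_i: number of times base arm i was observed
    mean_est = [0.0] * m    # empirical mean outcome of base arm i
    total_reward = 0.0

    for t in range(1, rounds + 1):
        unplayed = [i for i in range(m) if counts[i] == 0]
        if unplayed:
            # Initialization: play a super arm containing an unobserved arm.
            i = unplayed[0]
            super_arm = [i] + [j for j in range(m) if j != i][: k - 1]
        else:
            # Optimistic means: empirical mean plus a confidence radius.
            adjusted = [
                mean_est[i] + math.sqrt(3 * math.log(t) / (2 * counts[i]))
                for i in range(m)
            ]
            super_arm = top_k_oracle(adjusted, k)

        # Play the super arm and observe the outcome of every arm in it.
        for i in super_arm:
            outcome = 1.0 if rng.random() < true_means[i] else 0.0
            counts[i] += 1
            mean_est[i] += (outcome - mean_est[i]) / counts[i]
            total_reward += outcome

    return counts, mean_est, total_reward
```

Swapping `top_k_oracle` for another offline solver is the point of the oracle abstraction: the learner only supplies optimistic mean estimates and leaves the combinatorial optimization to the oracle.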