## On-line Learning and the Metrical Task System Problem (1997)


Venue: Machine Learning

Citations: 36 (7 self)

### BibTeX

```bibtex
@INPROCEEDINGS{Blum97on-linelearning,
  author    = {Avrim Blum and Carl Burch},
  title     = {On-line Learning and the Metrical Task System Problem},
  booktitle = {Machine Learning},
  year      = {1997},
  pages     = {45--53}
}
```


### Abstract

We relate two problems that have been explored in two distinct communities. The first is the problem of combining expert advice, studied extensively in the computational learning theory literature, and in particular the problem of tracking the best expert in the clean "decision-theoretic" setting. The second is the Metrical Task System (MTS) problem, studied extensively in the on-line algorithms literature, and in particular variations on the setting of the uniform metric space. We show that these problems exhibit several interesting similarities and demonstrate how algorithms designed for each can be used to achieve good bounds and new approaches for solving the other. Specific contributions of this paper include:

- An analysis showing how two recent algorithms for the MTS problem can be applied to the setting of tracking the best expert, providing good bounds with an approach of a much different flavor than the well-known multiplicative weighted-expert algorithms.
- A version ...
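As background for the expert-advice side of the abstract, here is a minimal sketch of the standard randomized multiplicative-weights ("Hedge") update that the paper contrasts its MTS-derived algorithms against. The function name and the choice of β = 0.5 are illustrative, not taken from the paper.

```python
import random

def hedge_step(weights, losses, beta=0.5):
    """One trial of the standard Hedge algorithm: pick an expert with
    probability proportional to its weight, then shrink each expert's
    weight by beta raised to that expert's loss."""
    total = sum(weights)
    probs = [w / total for w in weights]
    choice = random.choices(range(len(weights)), weights=probs)[0]
    new_weights = [w * beta ** l for w, l in zip(weights, losses)]
    return choice, new_weights

# Two experts; expert 0 keeps incurring loss, so its weight decays.
w = [1.0, 1.0]
for _ in range(10):
    _, w = hedge_step(w, [1.0, 0.0])
print(w[0] < w[1])  # True: weight has shifted to the better expert
```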

### Citations

2470 | A decision-theoretic generalization of online learning and an application to boosting
- Freund, Schapire
- 1997
Citation Context: ...whether and where to move for the next minute. We could imagine modeling the question of when and where such a process should move in the "decision-theoretic" experts framework of Freund and Schapire [FS95] and Chung [Chu94] as follows. We view the loads on the machine as losses, where an unloaded machine has loss 0, indicating that the process would have wasted 0 time if it had been on that machine, an...

698 | The weighted majority algorithm
- Littlestone, Warmuth
- 1994
Citation Context: ...blem is typically analyzed with the goal of performing nearly as well as the best expert on the given sequence of trials. We consider the partitioning bound, a stronger goal (considered by [HW95] and [LW94] for specific classes of loss functions) of performing nearly as well as the best sequence of experts. Specifically, given a partition of the trial sequence into k segments, we define the loss of the ...
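The partitioning bound described in this snippet has a concrete offline counterpart: the loss of the best schedule of experts that uses at most k segments. A small dynamic program (an illustrative helper, not code from the paper) computes it:

```python
def best_partition_loss(losses, k):
    """losses[t][i] = loss of expert i on trial t.
    Returns the minimum total loss over schedules that partition the
    trial sequence into at most k segments, following one expert per
    segment (i.e., at most k-1 switches)."""
    n = len(losses[0])
    INF = float("inf")
    # dp[s][i] = best loss so far using s+1 segments, currently on expert i
    dp = [[INF] * n for _ in range(k)]
    for i in range(n):
        dp[0][i] = losses[0][i]
    for t in range(1, len(losses)):
        new = [[INF] * n for _ in range(k)]
        for s in range(k):
            for i in range(n):
                stay = dp[s][i]                           # keep same expert
                switch = min(dp[s - 1]) if s > 0 else INF  # open new segment
                best = min(stay, switch)
                if best < INF:
                    new[s][i] = best + losses[t][i]
        dp = new
    return min(min(row) for row in dp)
```

For example, with losses `[[0,1],[0,1],[1,0],[1,0]]`, one switch (k = 2) achieves loss 0, while a single expert (k = 1) must pay 2.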

323 | Probabilistic approximation of metric spaces and its algorithmic applications
- Bartal
- 1996
Citation Context: ...task sequence σ; if A is randomized, then for a fixed σ, cost_A(σ) is a random variable. We say that algorithm A has competitive ratio a if, for some b, for all task sequences σ, E[cost_A(σ)] ≤ a · cost_OPT(σ) + b, (2) where OPT is the optimal off-line algorithm (the optimal strategy in hindsight). This definition is called the oblivious adversary model since the order of quantifiers (∀σ, E[cost(σ)]) can be viewed as o...

321 | How to use expert advice
- Cesa-Bianchi, Freund, et al.
- 1997
Citation Context: ...e for w is if the penalty for the expert's loss applies maximally to the required shared amount, when the penalties all come after the sharing. For convenience, define Π = Π_t (1 − (1 − α)(1 − c) ...), (6) so inequality (4) can be written as W ≥ ... Π. Using the fact that ..., and plugging into inequality (5), we get ... We can now solve for Π: Π ≥ ... This gives us ...

203 | Tracking the best expert
- Herbster, Warmuth
- 1998
Citation Context: ...g. This problem is typically analyzed with the goal of performing nearly as well as the best expert on the given sequence of trials. We consider the partitioning bound, a stronger goal (considered by [HW95] and [LW94] for specific classes of loss functions) of performing nearly as well as the best sequence of experts. Specifically, given a partition of the trial sequence into k segments, we define the l...

188 | An Optimal Online Algorithm for Metrical Task Systems
- Borodin, Linial, et al.
- 1987
Citation Context: ...(1 + 1/(2r)) L + (r + 1/2) k as desired. 3.2 Marking For the MTS problem on a uniform metric space of more than two states, the standard algorithm is the Marking algorithm of Borodin, Linial, and Saks [BLS92] and Fiat et al. [FKL+91]. Algorithm Marking [BLS92]: We maintain a counter for each state. At the beginning of each phase, the counters are reset to 0, and the algorithm occupies a random state. Gi...
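The Marking algorithm described in this snippet (counters per state, phases, random moves among unmarked states) can be sketched as a single step function. The details below are a plausible reading of the truncated description, not a verbatim transcription: a state is "marked" once its counter reaches the uniform move cost.

```python
import random

def marking_step(counters, state, costs, threshold=1.0):
    """One step of (a plausible reading of) the Marking algorithm on a
    uniform metric space: accumulate each state's cost; if the current
    state becomes marked (counter >= threshold), jump to a uniformly
    random unmarked state; once every state is marked, reset all
    counters and start a new phase from a random state.
    Returns (new_state, moves) where moves is 0 or 1."""
    n = len(counters)
    moves = 0
    for i in range(n):
        counters[i] += costs[i]
    unmarked = [i for i in range(n) if counters[i] < threshold]
    if not unmarked:                      # phase over: reset counters
        for i in range(n):
            counters[i] = 0.0
        state = random.randrange(n)
        moves = 1
    elif counters[state] >= threshold:    # current state marked: move
        state = random.choice(unmarked)
        moves = 1
    return state, moves
```

With three states and cost vector (1, 0, 0) from state 0, the algorithm marks state 0 and moves to state 1 or 2; a subsequent cost vector (0, 1, 1) marks everything and triggers a new phase.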

167 | Competitive paging algorithms
- Fiat, Karp, et al.
- 1991
Citation Context: ...e experiments of Section 6) Work-Function can perform well. Marking A simple randomized algorithm for the uniform metric space is the Marking algorithm of Borodin, Linial, and Saks [5] and Fiat et al. [9]. Algorithm Marking. We maintain a counter for each state. At the beginning of each phase, the counters are reset to 0, and the algorithm occupies a random state. Given a cost vector, we increment th...

48 | A polylog(n)-competitive algorithm for metrical task systems
- Bartal, Blum, et al.
- 1997
Citation Context: ...sulting in a much better bound for the Experts-DTF problem. 3.3 Odd-Exponent The good performance of Linear suggests using a generalization for more than two experts. Bartal, Blum, Burch, and Tomkins [BBBT97] analyze a generalization that can also be applied in an experts setting: Algorithm Odd-Exponent [BBBT97]: Let t be an odd integer, and let L_i represent the reduced loss of expert i. Then place p_i =...

20 | Randomized algorithms for metrical task systems
- Irani, Seiden
- 1998

13 | Approximate methods for sequential decision making using expert advice
- Chung
- 1994
Citation Context: ...to move for the next minute. We could imagine modeling the question of when and where such a process should move in the "decision-theoretic" experts framework of Freund and Schapire [FS95] and Chung [Chu94] as follows. We view the loads on the machine as losses, where an unloaded machine has loss 0, indicating that the process would have wasted 0 time if it had been on that machine, and a heavily loaded...

13 | Unfair problems and randomized algorithms for metrical task systems
- Seiden
- 1996
Citation Context: ...tal cost is Σ_i max{0, p_i^t − p_i^{t+1}} + p^{t+1} · ℓ^t. For a refined analysis of the MTS problem, we use the r-unfair competitive ratio considered in [BKRS92] and formalized explicitly in [Sei96]. Here the on-line algorithm pays the same amount as before, but OPT pays r times more for movement. That is, the off-line player pays r·d_{i,j} + ℓ_j for a task. This parameter r is known to the on-line...
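The per-step costs in this snippet translate directly into code. The two helpers below follow the formulas as reconstructed above (uniform space, unit distances for the on-line player); the function names are illustrative:

```python
def online_step_cost(p_old, p_new, costs):
    """On-line cost of one step, per the snippet: move probability mass
    from p_old to p_new (movement cost = sum_i max(0, p_old[i] - p_new[i])
    on a uniform space), then pay the expected task cost under p_new."""
    move = sum(max(0.0, a - b) for a, b in zip(p_old, p_new))
    task = sum(p * c for p, c in zip(p_new, costs))
    return move + task

def offline_step_cost(r, distance, task_cost):
    """r-unfair off-line cost for one step: the off-line player pays
    r * d_{i,j} for movement plus the task cost in its destination."""
    return r * distance + task_cost
```

For example, shifting from (0.5, 0.5) to (0.2, 0.8) against cost vector (1, 0) costs 0.3 in movement plus 0.2 in expected task cost.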

12 |
A decomposition theorem and lower bounds for randomized server problems
- Blum, Karloff, et al.
Citation Context: ...s distribution to p^{t+1}. Then, its total cost is Σ_i max{0, p_i^t − p_i^{t+1}} + p^{t+1} · ℓ^t. For a refined analysis of the MTS problem, we use the r-unfair competitive ratio considered in [BKRS92] and formalized explicitly in [Sei96]. Here the on-line algorithm pays the same amount as before, but OPT pays r times more for movement. That is, the off-line player pays r·d_{i,j} + ℓ_j for a task. Thi...

12 | M.S.: On-line choice of on-line algorithms
- Azar, Broder, et al.
- 1993
Citation Context: ...ow the standard randomized Weighted Majority (or Hedge) algorithm can be used for the problem of "combining on-line algorithms on-line", giving much stronger guarantees than the results of Azar et al. [1] when the algorithms being combined occupy a state space of bounded diameter. A generalization of the above, showing how (a simplified version of) Herbster and Warmuth's weight-sharing algorithm can b...

4 |
Process migration in distributed systems: A comparative survey
- Eskicioglu
- 1990
Citation Context: ...ould be lost for movement between machines. In research process migration systems, the time for a process to move is roughly proportional to its size. For a 100-KB process, the time is about a second [Esk90]. Our distance corresponds to large but reasonable memory usage. Our simulations compared the performance of nine algorithms, including four simple control algorithms: Uniform The algorithm picks a ra...

4 | Practical and Theoretical Issues in Prefetching and Caching
- Tomkins
- 1997
Citation Context: ...lementary, meaning that it is zero in all states except one, and in that state, the cost is δ, where δ can be chosen to be arbitrarily small. A proof of this folklore result is in the appendix of [Tom97]. Consider now some δ-elementary task vector, with non-zero cost in state i. Say that before processing this task, the algorithm had weight w_i at state i and the total weight was W, so the probabi...

1 | Unfair problems and randomized algorithms for metrical task systems (Information and Computation)
- Seiden