Regret minimization under partial monitoring
- Mathematics of Operations Research, 2004
Cited by 24 (5 self)
We consider repeated games in which the player, instead of observing the action chosen by the opponent in each game round, receives feedback generated by the combined choice of the two players. We study Hannan consistent players for these games; that is, randomized playing strategies whose per-round regret vanishes with probability one as the number n of game rounds goes to infinity. We prove a general lower bound of Ω(n^(−1/3)) on the convergence rate of the regret, and exhibit a specific strategy that attains this rate on any game for which a Hannan consistent player exists.
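The notion of Hannan consistency used in this abstract can be written out as a formula. The following is a sketch in assumed notation that does not appear in the listing: I_t and J_t are the actions of the player and the opponent in round t, ℓ is a loss function, and N is the number of player actions.

```latex
% Assumed notation (not from the listing): I_t = player's action in round t,
% J_t = opponent's action in round t, \ell = loss function, N = number of player actions.
\[
  \limsup_{n \to \infty} \frac{1}{n}
  \left( \sum_{t=1}^{n} \ell(I_t, J_t)
         - \min_{i = 1, \dots, N} \sum_{t=1}^{n} \ell(i, J_t) \right)
  \le 0
  \quad \text{with probability one.}
\]
```

The quantity inside the limit is the per-round regret; the Ω(n^(−1/3)) bound quoted above concerns how fast this quantity can be driven to zero.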
The communication complexity of uncoupled Nash equilibrium procedures
- Games and Economic Behavior, 2006
Cited by 11 (1 self)
We study the question of how long it takes players to reach a Nash equilibrium in uncoupled setups, where each player initially knows only his own payoff function. We derive lower bounds on the communication complexity of reaching a Nash equilibrium, i.e., on the number of bits that need to be transmitted, and thus also on the required number of steps. Specifically, we show lower bounds that are exponential in the number of players in each one of the following cases: (1) reaching a pure Nash equilibrium; (2) reaching a pure Nash equilibrium in a Bayesian setting; and (3) reaching a mixed Nash equilibrium. We then show that, in contrast, the communication complexity of reaching a correlated equilibrium is polynomial in the number of players.
Nonparametric kernel-based sequential investment strategies
- Mathematical Finance, 2006
Cited by 9 (2 self)
The purpose of this paper is to introduce sequential investment strategies that guarantee an optimal rate of growth of the capital, under minimal assumptions on the behavior of the market. The new strategies are analyzed both theoretically and empirically. The theoretical results show that the asymptotic rate of growth matches the optimal one that one could achieve with a full knowledge of the statistical properties of the underlying process generating the market, under the only assumption that the market is stationary and ergodic. The empirical results show that the performance of the proposed investment strategies measured on past NYSE and currency exchange data is solid, and sometimes even spectacular.
KEY WORDS: sequential investment, universal portfolios, kernel estimation
Efficient algorithms for online game playing and universal portfolio management
- 2005
Cited by 5 (4 self)
We introduce a new algorithm and a new analysis technique that is applicable to a variety of online optimization scenarios, including regret minimization for Lipschitz regret functions, universal portfolio management, online convex optimization and online utility maximization. In addition to being more efficient and deterministic, our algorithm applies to a more general setting (e.g. when the payoff function is unknown). For the general online game playing setting it is the first to attain logarithmic regret, as opposed to previous algorithms attaining polynomial regret. The algorithm extends a natural online method studied in the 1950s, called “follow the leader”, thus answering in the affirmative a conjecture about universal portfolios made by Cover and Ordentlich and independently by Kalai and Vempala. The techniques also lead to a derandomization of an algorithm by Hannan, and Kalai and Vempala. Our analysis shows a surprising connection between interior point methods and online optimization by using the follow the leader method.
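For orientation, the classical "follow the leader" rule that this abstract says the algorithm extends can be sketched for a finite set of actions. This is only the plain textbook rule, not the paper's algorithm; the function names `follow_the_leader` and `regret` and the loss-table input format are hypothetical choices made for this illustration.

```python
from typing import List, Sequence


def follow_the_leader(loss_rounds: Sequence[Sequence[float]]) -> List[int]:
    """Plain follow-the-leader over a finite action set (illustrative only).

    loss_rounds[t][a] is the loss of action a in round t; it is revealed
    only after the round-t choice has been made.
    """
    if not loss_rounds:
        return []
    num_actions = len(loss_rounds[0])
    cumulative = [0.0] * num_actions
    choices: List[int] = []
    for losses in loss_rounds:
        # Play the action with the smallest cumulative loss so far
        # (ties are broken in favour of the lowest index).
        leader = min(range(num_actions), key=lambda a: cumulative[a])
        choices.append(leader)
        # Only now observe this round's losses and update the totals.
        for a in range(num_actions):
            cumulative[a] += losses[a]
    return choices


def regret(loss_rounds: Sequence[Sequence[float]], choices: Sequence[int]) -> float:
    """Loss incurred by the choices minus the loss of the best fixed action."""
    incurred = sum(losses[c] for losses, c in zip(loss_rounds, choices))
    num_actions = len(loss_rounds[0])
    best_fixed = min(
        sum(losses[a] for losses in loss_rounds) for a in range(num_actions)
    )
    return incurred - best_fixed


if __name__ == "__main__":
    rounds = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
    picks = follow_the_leader(rounds)
    print(picks, regret(rounds, picks))
```

On an alternating loss sequence like the one in the usage example, the plain rule is wrong in every round, which is the standard motivation for regularized or otherwise modified variants.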
Switching Strategies for Sequential Decision Problems With Multiplicative Loss With Application to Portfolios
Cited by 2 (1 self)
A wide variety of problems in signal processing can be formulated such that decisions are made by sequentially taking convex combinations of vector-valued observations and these convex combinations are then multiplicatively compounded over time. A “universal” approach to such problems might attempt to sequentially achieve the performance of the best fixed convex combination, as might be achievable noncausally, by observing all of the outcomes in advance. By permitting different piecewise-fixed strategies within contiguous regions of time, the best algorithm in this broader class would be able to switch between different fixed strategies to optimize performance to the changing behavior of each individual sequence of outcomes. Without knowledge of the data length or the number of switches necessary, the algorithms developed in this paper can achieve the performance of the best piecewise-fixed strategy that can choose both the partitioning of the sequence of outcomes in time as well as the best strategy within each time segment. We compete with an exponential number of such partitions, using only complexity linear in the data length, and demonstrate that the regret with respect to the best such algorithm is at most O(ln(n)) in the exponent, where n is the data length. Finally, we extend these results to include finite collections of candidate algorithms, rather than convex combinations, and further investigate the use of an arbitrary side-information sequence.
Index Terms: convex combinations, portfolio, sequential decisions, side information, switching, universal.
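The performance measure described in this abstract, convex combinations of vector-valued outcomes compounded multiplicatively over time, can be illustrated with a short sketch. It only evaluates a given fixed or piecewise-fixed strategy after the fact; it is not the paper's switching algorithm, and the names `compounded_wealth` and `piecewise_wealth` are hypothetical.

```python
from typing import List, Sequence, Tuple


def compounded_wealth(
    weights: Sequence[float], outcomes: Sequence[Sequence[float]]
) -> float:
    """Multiplicatively compound a fixed convex combination over all rounds.

    weights must be nonnegative and sum to one; outcomes[t][k] is the k-th
    component of the round-t observation (for portfolios, a price relative).
    """
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must form a convex combination")
    wealth = 1.0
    for x in outcomes:
        # Per-round gain is the convex combination <weights, x>.
        wealth *= sum(w * xi for w, xi in zip(weights, x))
    return wealth


def piecewise_wealth(
    segments: List[Tuple[Sequence[float], Sequence[Sequence[float]]]]
) -> float:
    """Compound a piecewise-fixed strategy: one weight vector per time segment."""
    wealth = 1.0
    for weights, outcomes in segments:
        wealth *= compounded_wealth(weights, outcomes)
    return wealth


if __name__ == "__main__":
    outcomes = [[2.0, 0.5], [0.5, 2.0]]
    print(compounded_wealth([0.5, 0.5], outcomes))
    print(piecewise_wealth([([1.0, 0.0], outcomes[:1]), ([0.0, 1.0], outcomes[1:])]))
```

In the usage example the best fixed combination is outperformed by a strategy that switches once, which is the gap the piecewise-fixed comparison class is meant to capture.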

