Results 1 - 10 of 6,589
Supplement: Non-Stochastic Bandit Slate Problems
Cited by 13 (0 self)
Recall our special variant of Hedge: we are allowed to use only distributions $p(t)$ from some fixed convex subset $P$ of the simplex of all distributions. The goal then is to minimize regret relative to an arbitrary distribution $p \in P$. Such a version of Hedge is given in Figure 1, and a statement of its performance below. This algorithm is implicit in the work of [4, 6].

Algorithm MW(P)
Initialization: An arbitrary probability distribution $p(1) \in P$ on the experts, and some $\eta > 0$.
For $t = 1, 2, \ldots, T$:
1. Choose distribution $p(t)$ over experts, and observe the cost vector $\ell(t)$.
2. Compute the probability vector $\hat{p}(t+1)$ using the following multiplicative update rule: for every expert $i$,
$$\hat{p}_i(t+1) = p_i(t) \exp(-\eta \ell_i(t)) / Z(t) \qquad (1)$$
where $Z(t) = \sum_i p_i(t) \exp(-\eta \ell_i(t))$ is the normalization factor.
3. Set $p(t+1)$ to be the projection of $\hat{p}(t+1)$ on the set $P$ using the RE as a distance function, i.e. $p(t+1) = \arg\min_{p \in P} \mathrm{RE}(p \,\|\, \hat{p}(t+1))$.
Figure 1: The Multiplicative Weights Algorithm with Restricted Distributions

Theorem 1.1. Assume that $\eta > 0$ is chosen so that for all $t$ and $i$, $\eta \ell_i(t) \ge -1$. Then algorithm MW(P) generates distributions $p(1), \ldots, p(T) \in P$, such that for any $p \in P$,
$$\sum_{t=1}^{T} \ell(t) \cdot p(t) - \sum_{t=1}^{T} \ell(t) \cdot p \;\le\; \eta \sum_{t=1}^{T} (\ell(t))^2 \cdot p(t) + \frac{\mathrm{RE}(p \,\|\, p(1))}{\eta}.$$
Here, $(\ell(t))^2$ is the vector that is the coordinatewise square of $\ell(t)$.

Proof. We use the relative entropy between $p$ and $p(t)$, $\mathrm{RE}(p \,\|\, p(t)) := \sum_i p_i \ln(p_i / p_i(t))$, as a "potential" function. We have $\mathrm{RE}(p \,\|\, \hat{p}(t+1)) - \mathrm{RE}(p \,\|\, p(t)) = \sum_i p_i \ln \frac{p_i(t)}{\ldots}$
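The three steps of Figure 1 can be sketched in Python as follows. This is only an illustrative sketch, not the authors' implementation: the abstract leaves the convex set P abstract, so the code assumes one particular choice, the "capped simplex" {p : sum(p) = 1, p_i <= cap}, for which the relative-entropy projection can be computed by clipping coordinates at the cap and rescaling the rest. The names `project_capped`, `mw_restricted`, and the parameter `cap` are hypothetical, and the update assumes every coordinate of the distribution stays strictly positive.

```python
import math


def project_capped(q, cap):
    """Relative-entropy projection of distribution q onto the ASSUMED set
    {p : sum(p) = 1, p_i <= cap}. Coordinates above the cap are fixed at
    the cap; the remaining ones are rescaled proportionally."""
    n = len(q)
    assert cap * n >= 1.0 - 1e-12, "cap too small: the set would be empty"
    capped = [False] * n
    while True:
        free_mass = 1.0 - cap * sum(capped)   # mass left for uncapped coords
        free_sum = sum(q[i] for i in range(n) if not capped[i])
        scale = free_mass / free_sum          # assumes q is strictly positive
        p = [cap if capped[i] else q[i] * scale for i in range(n)]
        over = [i for i in range(n) if not capped[i] and p[i] > cap + 1e-12]
        if not over:
            return p
        for i in over:
            capped[i] = True


def mw_restricted(losses, eta, cap):
    """Run the restricted multiplicative-weights loop on a list of cost
    vectors and return the distributions p(1), ..., p(T) that were played."""
    n = len(losses[0])
    p = [1.0 / n] * n                         # uniform start, lies in the set
    played = []
    for loss in losses:
        played.append(p)                      # step 1: play p(t), observe cost
        w = [p[i] * math.exp(-eta * loss[i]) for i in range(n)]
        z = sum(w)                            # normalization factor Z(t)
        p_hat = [wi / z for wi in w]          # step 2: multiplicative update
        p = project_capped(p_hat, cap)        # step 3: project back onto the set
    return played


if __name__ == "__main__":
    # Expert 0 always incurs cost 1, the other two incur cost 0.
    for dist in mw_restricted([[1.0, 0.0, 0.0]] * 5, eta=0.5, cap=0.4):
        print([round(x, 4) for x in dist])
```

With the cap in place, the two good experts cannot absorb more than 0.4 of the mass each, so the bad expert keeps at least 0.2 no matter how many rounds are played; without the projection step its weight would decay toward zero.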
The Nonstochastic Multiarmed Bandit Problem
 SIAM JOURNAL OF COMPUTING
, 2002
"... In the multiarmed bandit problem, a gambler must decide which arm of K nonidentical slot machines to play in a sequence of trials so as to maximize his reward. This classical problem has received much attention because of the simple model it provides of the tradeoff between exploration (trying out ..."
Cited by 492 (34 self)
Bandit based Monte-Carlo Planning
 In: ECML06. Number 4212 in LNCS
, 2006
"... For large state-space Markovian Decision Problems Monte-Carlo planning is one of the few viable approaches to find near-optimal solutions. In this paper we introduce a new algorithm, UCT, that applies bandit ideas to guide Monte-Carlo planning. In finite-horizon or discounted MDPs the algo ..."
Cited by 433 (7 self)
A comparative analysis of selection schemes used in genetic algorithms
 Foundations of Genetic Algorithms
, 1991
Cited by 512 (32 self)
This paper considers a number of selection schemes commonly used in modern genetic algorithms. Specifically, proportionate reproduction, ranking selection, tournament selection, and Genitor (or "steady state") selection are compared on the basis of solutions to deterministic difference or differential equations, which are verified through computer simulations. The analysis provides convenient approximate or exact solutions as well as useful convergence time and growth ratio estimates. The paper recommends practical application of the analyses and suggests a number of paths for more detailed analytical investigation of selection techniques. Keywords: proportionate selection, ranking selection, tournament selection, Genitor, takeover time, time complexity, growth ratio.
The nonstochastic multiarmed bandit problem
"... In the multiarmed bandit problem, a gambler must decide which arm of K nonidentical slot machines to play in a sequence of trials so as to maximize his reward. This classical problem has received much attention because of the simple model it provides of the tradeoff between exploration (try ..."
Bayes Factors
, 1995
Cited by 1766 (74 self)
In a 1935 paper, and in his book Theory of Probability, Jeffreys developed a methodology for quantifying the evidence in favor of a scientific theory. The centerpiece was a number, now called the Bayes factor, which is the posterior odds of the null hypothesis when the prior probability on the null is one-half. Although there has been much discussion of Bayesian hypothesis testing in the context of criticism of P values, less attention has been given to the Bayes factor as a practical tool of applied statistics. In this paper we review and discuss the uses of Bayes factors in the context of five scientific applications in genetics, sports, ecology, sociology and psychology.
Wireless Communications
, 2005
"... Copyright © 2005 by Cambridge University Press. This material is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University ..."
Cited by 1129 (32 self)
Endogenously Chosen Boards of Directors and Their Monitoring of the CEO
 AMERICAN ECONOMIC REVIEW
, 1998
Cited by 431 (18 self)
This paper develops a model in which the effectiveness of the board's monitoring of the CEO depends on the board's structure or composition. The independence of new directors is determined through a bargaining process between the existing directors and the CEO. The CEO's bargaining position, and thus his influence over the board-selection process, depends on an updated estimate of the CEO's ability based on his prior performance. Many empirical findings about board structure and performance arise as equilibrium phenomena in this model. We also explore the implications of this model for proposed regulations of corporate governance structures.