Results 1 – 8 of 8
A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting
, 1997
Abstract

Cited by 2307 (59 self)
In the first part of the paper we consider the problem of dynamically apportioning resources among a set of options in a worst-case online framework. The model we study can be interpreted as a broad, abstract extension of the well-studied online prediction model to a general decision-theoretic setting. We show that the multiplicative weight-update rule of Littlestone and Warmuth [20] can be adapted to this model, yielding bounds that are slightly weaker in some cases, but applicable to a considerably more general class of learning problems. We show how the resulting learning algorithm can be applied to a variety of problems, including gambling, multiple-outcome prediction, repeated games, and prediction of points in R^n. In the second part of the paper we apply the multiplicative weight-update technique to derive a new boosting algorithm. This boosting algorithm does not require any prior knowledge about the performance of the weak learning algorithm. We also study generalizations of...
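The multiplicative weight-update rule this abstract refers to can be sketched in a few lines. The following is an illustrative toy implementation (the learning rate `eta` and the uniform initialization are assumptions, not prescribed by the abstract): each option's weight is shrunk exponentially in its observed loss, so mass concentrates on options that perform well.

```python
import math

def hedge(losses, eta=0.5):
    """Multiplicative weight-update over a loss sequence.

    losses: list of rounds; each round is a list of per-option losses in [0, 1].
    Returns the sequence of normalized weight vectors played at each round.
    """
    n = len(losses[0])
    weights = [1.0] * n
    history = []
    for round_losses in losses:
        total = sum(weights)
        history.append([w / total for w in weights])
        # Shrink each option's weight multiplicatively by its observed loss.
        weights = [w * math.exp(-eta * l) for w, l in zip(weights, round_losses)]
    return history

# Option 0 always incurs loss 0, option 1 always loss 1:
# the distribution shifts toward option 0 over the rounds.
hist = hedge([[0.0, 1.0]] * 5)
print(hist[-1])
```

After five rounds the first option holds most of the probability mass, which is the behavior the worst-case regret bounds in the paper quantify.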
A Primal-Dual Perspective of Online Learning Algorithms
Abstract

Cited by 19 (5 self)
We describe a novel framework for the design and analysis of online learning algorithms based on the notion of duality in constrained optimization. We cast a subfamily of universal online bounds as an optimization problem. Using the weak duality theorem we reduce the process of online learning to the task of incrementally increasing the dual objective function. The amount by which the dual increases serves as a new and natural notion of progress for analyzing online learning algorithms. We are thus able to tie the primal objective value and the number of prediction mistakes using the increase in the dual.
Online learning with imperfect monitoring
 In Proceedings of the 16th Annual Conference on Learning Theory
, 2003
Abstract

Cited by 8 (3 self)
We study online play of repeated matrix games in which the observations of past actions of the other player and the obtained reward are partial and stochastic. We define the Partial Observation Bayes Envelope (POBE) as the best reward against the worst-case stationary strategy of the opponent that agrees with past observations. Our goal is to have the (unobserved) average reward above the POBE. For the case where the observations (but not necessarily the rewards) depend on the opponent's play alone, an algorithm for attaining the POBE is derived. This algorithm is based on an application of approachability theory combined with a worst-case view over the unobserved rewards. We also suggest a simplified solution concept for general signaling structures. This concept may fall short of the POBE.
Adaptive Strategies and Regret Minimization in Arbitrarily Varying Markov Environments
 In Proc. of 14th COLT
, 2001
Abstract

Cited by 4 (0 self)
We consider the problem of maximizing the average reward in a controlled Markov environment, which also contains some arbitrarily varying elements. This problem is captured by a two-person stochastic game model involving the reward-maximizing agent and a second player, which is free to use an arbitrary (non-stationary and unpredictable) control strategy. While the minimax value of the associated zero-sum game provides a guaranteed performance level, the fact that the second player's behavior is observed as the game unfolds opens up the opportunity to improve upon this minimax value if the second player is not playing a worst-case strategy. This basic idea has been formalized in the context of repeated matrix games by the classical notions of regret minimization with respect to the Bayes envelope, where an attainable performance goal is defined in terms of the empirical frequencies of the opponent's actions. This paper presents an extension of these ideas to problems with Markovian dynamics, under appropriate recurrence conditions. The Bayes envelope is first defined in a natural way in terms of the observed state-action frequencies. As this envelope may not be attained in general, we define a proper convexification thereof as an attainable solution concept. In the specific case of single-controller games, where the opponent alone controls the state transitions, the Bayes envelope itself turns out to be convex and attainable. Some concrete examples are shown to fit in this framework.
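The Bayes envelope mentioned in this abstract is, in the repeated-matrix-game case, simply the best-response payoff against the opponent's empirical action frequencies. The following is a minimal illustrative sketch of that baseline for a fully observed matrix game (the Markovian extension the paper develops is not captured here; the payoff matrix and frequencies are made-up examples).

```python
def bayes_envelope(payoff, opp_freq):
    """Best-response payoff against the opponent's empirical action frequencies.

    payoff[i][j]: our reward when we play action i and the opponent plays j.
    opp_freq[j]: empirical frequency of opponent action j (sums to 1).
    """
    # Expected payoff of each of our pure actions against the empirical mix.
    expected = [sum(p * f for p, f in zip(row, opp_freq)) for row in payoff]
    return max(expected)

# Matching pennies from our point of view: +1 on a match, -1 otherwise.
payoff = [[1, -1], [-1, 1]]
# Against an opponent who played action 0 in 70% of rounds, the best
# response (always play 0) earns 0.7 - 0.3 = 0.4 on average.
print(bayes_envelope(payoff, [0.7, 0.3]))
```

A regret-minimizing player aims for an average reward approaching this envelope, rather than settling for the minimax value (which here is 0).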
On the measure of conflicts: Shapley inconsistency values
 Artificial Intelligence
Abstract

Cited by 3 (1 self)
There are relatively few proposals for inconsistency measures for propositional belief bases. However, inconsistency measures are potentially as important as information measures for artificial intelligence, and more generally for computer science. In particular, they can be useful to define various operators for belief revision, belief merging, and negotiation. The measures that have been proposed so far can be split into two classes. The first class of measures takes into account the number of formulae required to produce an inconsistency: the more formulae required to produce an inconsistency, the less inconsistent the base. The second class takes into account the proportion of the language that is affected by the inconsistency: the more propositional variables affected, the more inconsistent the base. Both approaches are sensible, but there is no proposal for combining them. We address this need in this paper: our proposal takes into account both the number of variables affected by the inconsistency and the distribution of the inconsistency among the formulae of the base. Our idea is to use existing inconsistency measures in order to define a game in coalitional form, and then to use the Shapley value to obtain an inconsistency measure that indicates the responsibility/contribution of each formula to the overall inconsistency in the base. This allows us to provide a more reliable image of the belief base and of the inconsistency in it. * This paper is a revised and extended version of the paper "Shapley Inconsistency Values" presented at KR'06.
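The Shapley value this abstract builds on can be computed exactly for small games by averaging each player's marginal contribution over all orderings. The sketch below uses a made-up toy "inconsistency game" on three formulas (the paper's actual inconsistency measures are not reproduced here); it illustrates how the Shapley value splits responsibility between the two mutually contradictory formulas and assigns none to the innocent one.

```python
from itertools import permutations

def shapley_values(players, value):
    """Exact Shapley values of a coalitional game, by averaging marginal
    contributions over all player orderings (exponential; fine for small bases)."""
    phi = {p: 0.0 for p in players}
    count = 0
    for order in permutations(players):
        coalition = set()
        for p in order:
            before = value(frozenset(coalition))
            coalition.add(p)
            phi[p] += value(frozenset(coalition)) - before
        count += 1
    return {p: v / count for p, v in phi.items()}

# Toy coalitional game on formulas {a, not_a, b}: a coalition has
# inconsistency 1 iff it contains both a and its negation.
def inconsistent(coalition):
    return 1.0 if {"a", "not_a"} <= coalition else 0.0

print(shapley_values(["a", "not_a", "b"], inconsistent))
```

Here `a` and `not_a` each receive value 0.5 and `b` receives 0, matching the intuition that only the contradictory pair is responsible for the inconsistency.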
A Note on Kuhn’s Theorem
, 2007
Abstract

Cited by 2 (1 self)
We revisit Kuhn’s classic theorem on mixed and behavior strategies in games. We frame Kuhn’s work in terms of two questions in decision theory: What is the relationship between global and local assessment of uncertainty? What is the relationship between global and local optimality of strategies?
A Complete Axiomatization of Differential Game Logic for Hybrid Games
, 2013
Abstract

Cited by 1 (1 self)
We introduce differential game logic (dGL) for specifying and verifying properties of hybrid games, i.e., games on hybrid systems combining discrete and continuous dynamics. Unlike hybrid systems, hybrid games allow choices in the system dynamics to be resolved adversarially by different players with different objectives. The logic dGL can be used to study the existence of winning strategies for such hybrid games. We present a simple sound and complete axiomatization of dGL relative to the fixpoint logic of differential equations. We prove hybrid games to be determined and their winning regions to require higher closure ordinals, and we identify separating ...