Bayesian inverse reinforcement learning
- in 20th Int. Joint Conf. Artificial Intelligence, 2007
- Cited by 31 (0 self)
Inverse Reinforcement Learning (IRL) is the problem of learning the reward function underlying a Markov Decision Process given the dynamics of the system and the behaviour of an expert. IRL is motivated by situations where knowledge of the rewards is a goal by itself (as in preference elicitation) and by the task of apprenticeship learning (learning policies from an expert). In this paper we show how to combine prior knowledge and evidence from the expert’s actions to derive a probability distribution over the space of reward functions. We present efficient algorithms that find solutions for the reward learning and apprenticeship learning tasks that generalize well over these distributions. Experimental results show strong improvement for our methods over previous heuristic-based approaches.
Mathematical foundations of the Markov chain Monte Carlo method
- in Probabilistic Methods for Algorithmic Discrete Mathematics, 1998
- Cited by 29 (1 self)
7.2 was jointly undertaken with Vivek Gore, and is published here for the first time. I also thank an anonymous referee for carefully reading and providing helpful comments on a draft of this chapter. 1. Introduction. The classical Monte Carlo method is an approach to estimating quantities that are hard to compute exactly. The quantity z of interest is expressed as the expectation z = Exp Z of a random variable (r.v.) Z for which some efficient sampling procedure is available. By taking the mean of some sufficiently large set of independent samples of Z, one may obtain an approximation to z. For example, suppose $S = \{(x, y) \in [0, 1]^2 : p_i(x, y) \le 0 \text{ for all } i\}$ ...
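The classical Monte Carlo method described in this abstract can be illustrated with a small sketch. It assumes the region S is a subset of the unit square cut out by constraints of the form p(x, y) <= 0, and it estimates the area of S as the mean of independent indicator samples. The function name `estimate_area` and the quarter-disc example are hypothetical choices made for illustration and are not taken from the chapter.

```python
import random


def estimate_area(constraints, num_samples=100_000, seed=0):
    """Estimate the area of {(x, y) in [0,1]^2 : p(x, y) <= 0 for all p}.

    Z is the indicator that a uniform point of the unit square lies in S,
    so the mean of independent samples of Z approximates z = area(S).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_samples):
        x, y = rng.random(), rng.random()
        if all(p(x, y) <= 0 for p in constraints):
            hits += 1
    return hits / num_samples


# Hypothetical example region: the quarter disc x^2 + y^2 <= 1 (area pi/4).
quarter_disc = [lambda x, y: x * x + y * y - 1.0]
area = estimate_area(quarter_disc)
```

The estimate is only as good as the sample size allows: the standard error shrinks like one over the square root of `num_samples`, which is why this direct approach becomes impractical when S occupies a tiny fraction of the sampling domain.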
On The Complexity Of Computing Mixed Volumes
- SIAM J. Comput., 1998
- Cited by 28 (1 self)
This paper gives various (positive and negative) results on the complexity of the problem of computing and approximating mixed volumes of polytopes and more general convex bodies in arbitrary dimension. On the negative side, we present several #P-hardness results that focus on the difference of computing mixed volumes versus computing the volume of polytopes. We show that computing the volume of zonotopes is #P-hard (while each corresponding mixed volume can be computed easily) but also give examples showing that computing mixed volumes is hard even when computing the volume is easy. On the positive side, we derive a randomized algorithm for computing the mixed volumes $V(\overbrace{K_1, \ldots, K_1}^{m_1}, \overbrace{K_2, \ldots, K_2}^{m_2}, \ldots, \overbrace{K_s, \ldots, K_s}^{m_s})$ of well-presented convex bodies $K_1, \ldots, K_s$, where $m_1, \ldots, m_s \in \mathbb{N}_0$ and $m_1 \ge n - \psi(n)$ with $\psi(n) = o(\log n / \log \log n)$. The algorithm is an interpolation method based on polynomial-time ra...
The geometry of logconcave functions and an O∗(n³) sampling algorithm
- Cited by 27 (9 self)
The class of logconcave functions in R^n is a common generalization of Gaussians and of indicator functions of convex sets. Motivated by the problem of sampling from a logconcave density function, we study their geometry and introduce a technique for “smoothing” them out. This leads to an efficient sampling algorithm (by a random walk) with no assumptions on the local smoothness of the density function. After appropriate preprocessing, the algorithm produces a point from approximately the right distribution in time O∗(n⁴), and in amortized time O∗(n³) if many sample points are needed (where the asterisk indicates that dependence on the error parameter and factors of log n are not shown).
Exact Mixing in an Unknown Markov Chain
- Electronic Journal of Combinatorics, 1995
- Cited by 20 (2 self)
We give a simple stopping rule which will stop an unknown, irreducible n-state Markov chain at a state whose probability distribution is exactly the stationary distribution of the chain. The expected stopping time of the rule is bounded by a polynomial in the maximum mean hitting time of the chain. Our stopping rule can be made deterministic unless the chain itself has no random transitions.
Efficient algorithms for universal portfolios
- Proceedings of the 41st Annual Symposium on the Foundations of Computer Science, 2000
- Cited by 20 (8 self)
A constant rebalanced portfolio is an investment strategy that keeps the same distribution of wealth among a set of stocks from day to day. There has been much work on Cover's Universal algorithm, which is competitive with the best constant rebalanced portfolio determined in hindsight (3, 9, 2, 8, 16, 4, 5, 6). While this algorithm has good performance guarantees, all known implementations are exponential in the number of stocks, restricting the number of stocks used in experiments (9, 4, 2, 5, 6). We present an efficient implementation of the Universal algorithm that is based on non-uniform random walks that are rapidly mixing (1, 14, 7). This same implementation also works for non-financial applications of the Universal algorithm, such as data compression (6) and language modeling (11).
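To make the definition of a constant rebalanced portfolio concrete, the following sketch computes the wealth such a strategy achieves on a sequence of daily price relatives. It illustrates the definition only; it is not Cover's Universal algorithm, and it is not the random-walk implementation the paper proposes. The function name `crp_wealth` and the two-asset market are hypothetical.

```python
def crp_wealth(weights, price_relatives):
    """Final wealth of a constant rebalanced portfolio, starting from 1.0.

    weights: fraction of wealth held in each asset at the start of every day.
    price_relatives: one tuple per day giving, for each asset, the ratio of
        closing price to opening price.
    """
    wealth = 1.0
    for day in price_relatives:
        # Rebalancing to the same weights each day makes daily growth
        # a weighted average of the assets' price relatives.
        wealth *= sum(w * x for w, x in zip(weights, day))
    return wealth


# Hypothetical market: cash (always 1.0) and a stock that alternately
# halves and doubles, so buying and holding the stock gains nothing.
market = [(1.0, 0.5), (1.0, 2.0)] * 10
balanced = crp_wealth((0.5, 0.5), market)
stock_only = crp_wealth((0.0, 1.0), market)
```

In this toy market the evenly balanced portfolio grows by a factor of 0.75 × 1.5 = 1.125 over every two days, while holding only the stock ends where it started. The best such weighting in hindsight is the benchmark that the Universal algorithm competes against.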
Geometric random walks: a survey
- Combinatorial and Computational Geometry, 2005
- Cited by 18 (1 self)
The developing theory of geometric random walks is outlined here. Three aspects are discussed: general methods for estimating convergence (the “mixing” rate), isoperimetric inequalities in R^n and their intimate connection to random walks, and algorithms for fundamental problems (volume computation and convex optimization) that are based on sampling by random walks.
Simulated Annealing for Convex Optimization
- Mathematics of Operations Research, 2004
Fast algorithms for logconcave functions: sampling, rounding, integration and optimization
- Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science, 2006
- Cited by 17 (4 self)
We prove that the hit-and-run random walk is rapidly mixing for an arbitrary logconcave distribution starting from any point in the support. This extends the work of [26], where this was shown for an important special case, and settles the main conjecture formulated there. From this result, we derive asymptotically faster algorithms in the general oracle model for sampling, rounding, integration and maximization of logconcave functions, improving or generalizing the main results of [24, 25, 1] and [16] respectively. The algorithms for integration and optimization both use sampling and are surprisingly similar.
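As a rough illustration of the hit-and-run walk named in this abstract, the sketch below runs it for one very simple logconcave target: the uniform distribution on the unit ball, where the chord through the current point can be computed in closed form. This special case is an assumption made to keep the example short. The paper treats arbitrary logconcave distributions, which would require sampling from the density restricted to each chord rather than uniformly. The function name `hit_and_run_ball` is hypothetical.

```python
import math
import random


def hit_and_run_ball(start, num_steps, seed=0):
    """Hit-and-run walk for the uniform distribution on the unit ball.

    Each step picks a uniformly random direction through the current point
    and moves to a uniformly random point on the chord of the ball along
    that direction.
    """
    rng = random.Random(seed)
    x = list(start)
    samples = []
    for _ in range(num_steps):
        # Uniform random direction: normalise a standard Gaussian vector.
        u = [rng.gauss(0.0, 1.0) for _ in x]
        norm = math.sqrt(sum(ui * ui for ui in u))
        u = [ui / norm for ui in u]
        # Chord endpoints solve |x + t u|^2 = 1, i.e. t^2 + 2 b t + c = 0.
        b = sum(xi * ui for xi, ui in zip(x, u))
        c = sum(xi * xi for xi in x) - 1.0
        half_width = math.sqrt(max(0.0, b * b - c))  # guard against rounding
        t = rng.uniform(-b - half_width, -b + half_width)
        x = [xi + t * ui for xi, ui in zip(x, u)]
        samples.append(tuple(x))
    return samples


samples = hit_and_run_ball(start=(0.0, 0.0, 0.0), num_steps=2000)
```

Successive points of the walk are correlated, so in practice one would discard an initial stretch and thin the remainder. How many steps are needed is exactly the mixing-time question the paper addresses.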

