Results 1 - 10 of 100
Reinforcement learning: a survey
 Journal of Artificial Intelligence Research
, 1996
Cited by 1309 (22 self)
This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both the historical basis of the field and a broad selection of current work are summarized. Reinforcement learning is the problem faced by an agent that learns behavior through trial-and-error interactions with a dynamic environment. The work described here has a resemblance to work in psychology, but differs considerably in the details and in the use of the word "reinforcement." The paper discusses central issues of reinforcement learning, including trading off exploration and exploitation, establishing the foundations of the field via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state. It concludes with a survey of some implemented systems and an assessment of the practical utility of current methods for reinforcement learning.
On the complexity of solving Markov decision problems
 IN PROC. OF THE ELEVENTH INTERNATIONAL CONFERENCE ON UNCERTAINTY IN ARTIFICIAL INTELLIGENCE
, 1995
Cited by 130 (10 self)
Markov decision problems (MDPs) provide the foundations for a number of problems of interest to AI researchers studying automated planning and reinforcement learning. In this paper, we summarize results regarding the complexity of solving MDPs and the running time of MDP solution algorithms. We argue that, although MDPs can be solved efficiently in theory, more study is needed to reveal practical algorithms for solving large problems quickly. To encourage future research, we sketch some alternative methods of analysis that rely on the structure of MDPs.
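The abstract above refers to MDP solution algorithms without naming them. As an illustration only, here is a minimal sketch of value iteration, one standard such algorithm, on a hypothetical two-state MDP. The transition table `P`, reward table `R`, discount `gamma`, and tolerance `tol` are all assumptions made for this example and are not taken from the paper.

```python
def value_iteration(P, R, gamma=0.5, tol=1e-10):
    """Approximate the optimal value function of a finite discounted MDP.

    P[s][a] is a list of (probability, next_state) pairs and
    R[s][a] is the immediate reward for taking action a in state s.
    """
    values = [0.0] * len(P)
    while True:
        # Bellman optimality backup applied to every state.
        new_values = [
            max(
                R[s][a] + gamma * sum(p * values[t] for p, t in P[s][a])
                for a in range(len(P[s]))
            )
            for s in range(len(P))
        ]
        # Stop once the largest change in any state is below the tolerance.
        if max(abs(x - y) for x, y in zip(new_values, values)) < tol:
            return new_values
        values = new_values


# Hypothetical MDP: action 0 stays in place, action 1 moves to the other state.
P = [
    [[(1.0, 0)], [(1.0, 1)]],  # transitions from state 0
    [[(1.0, 1)], [(1.0, 0)]],  # transitions from state 1
]
R = [
    [0.0, 1.0],  # rewards in state 0
    [2.0, 0.0],  # rewards in state 1
]

optimal_values = value_iteration(P, R)
```

In this toy problem, staying in state 1 forever is worth 2 / (1 - 0.5) = 4, and moving there from state 0 is worth 1 + 0.5 * 4 = 3, which is what the iteration converges to.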
Tiling Multidimensional Iteration Spaces for Multicomputers
, 1992
Cited by 103 (20 self)
This paper addresses the problem of compiling perfectly nested loops for multicomputers (distributed memory machines). The relatively high communication start-up costs in these machines render frequent communication very expensive. Motivated by this, we present a method of aggregating a number of loop iterations into tiles where the tiles execute atomically: a processor executing the iterations belonging to a tile receives all the data it needs before executing any one of the iterations in the tile, executes all the iterations in the tile, and then sends the data needed by other processors. Since synchronization is not allowed during the execution of a tile, partitioning the iteration space into tiles must not result in deadlock. We first show the equivalence between the problem of finding partitions and the problem of determining the cone for a given set of dependence vectors. We then present an approach to partitioning the iteration space into deadlock-free tiles so that communicati...
An Algorithmic Theory of Lattice Points in Polyhedra
, 1999
Cited by 93 (6 self)
We discuss topics related to lattice points in rational polyhedra, including efficient enumeration of lattice points, “short” generating functions for lattice points in rational polyhedra, relations to classical and higher-dimensional Dedekind sums, complexity of the Presburger arithmetic, efficient computations with rational functions, and others. Although the main slant is algorithmic, structural results are discussed, such as relations to the general theory of valuations on polyhedra and connections with the theory of toric varieties. The paper surveys known results and presents some new results and connections.
Generation of Efficient Nested Loops from Polyhedra
 International Journal of Parallel Programming
, 2000
Cited by 72 (3 self)
Automatic parallelization in the polyhedral model is based on affine transformations from an original computation domain (iteration space) to a target space-time domain, often with a different transformation for each variable. Code generation is an often ignored step in this process that has a significant impact on the quality of the final code. It involves making a trade-off between code size and control code simplification/optimization. Previous methods of code generation are based on loop splitting; however, they have non-optimal behavior when working on parameterized programs. We present a general parameterized method for code generation based on dual representation of polyhedra. Our algorithm uses a simple recursion on the dimensions of the domains, and enables fine control over the trade-off between code size and control overhead.
Effective lattice point counting in rational convex polytopes
 JOURNAL OF SYMBOLIC COMPUTATION
, 2003
Cited by 67 (11 self)
This paper discusses algorithms and software for the enumeration of all lattice points inside a rational convex polytope: we describe LattE, a computer package for lattice point enumeration which contains the first implementation of A. Barvinok's algorithm [8]. We report on computational experiments with multiway contingency tables, knapsack-type problems, rational polygons, and flow polytopes. We prove that these symbolic-algebraic ideas surpass the traditional branch-and-bound enumeration and that in some instances LattE is the only software capable of counting. Using LattE, we have also computed new formulas of Ehrhart (quasi-)polynomials for interesting families of polytopes (hypersimplices, truncated cubes, etc.). We end with a survey of other "algebraic-analytic" algorithms, including a "polar" variation of Barvinok's algorithm which is very fast when the number of facet-defining inequalities is much smaller than the number of vertices.
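To make the notions of lattice point counting and Ehrhart polynomials concrete, the following sketch counts the integer points in the n-th dilate of the standard triangle (x >= 0, y >= 0, x + y <= n) by naive enumeration and compares the result with the known closed form C(n + 2, 2). This is neither Barvinok's algorithm nor LattE's interface; the function names are hypothetical and the triangle is chosen only because its Ehrhart polynomial is easy to check by hand.

```python
from math import comb


def count_by_enumeration(n):
    """Count integer points (x, y) with x >= 0, y >= 0, x + y <= n by brute force."""
    return sum(
        1
        for x in range(n + 1)
        for y in range(n + 1)
        if x + y <= n
    )


def ehrhart_standard_triangle(n):
    """Closed-form count for the same triangle: (n + 1)(n + 2) / 2 = C(n + 2, 2)."""
    return comb(n + 2, 2)


# The brute-force count grows with the area of the polytope,
# while the closed form is evaluated in constant time.
counts_agree = all(
    count_by_enumeration(n) == ehrhart_standard_triangle(n)
    for n in range(20)
)
```

The enumeration cost grows with the number of points being counted, which is the limitation that formula-based methods such as those described in the abstract are meant to avoid.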
The Mapping of Linear Recurrence Equations on Regular Arrays
 Journal of VLSI Signal Processing
, 1989
Cited by 66 (7 self)
The parallelization of many algorithms can be obtained using space-time transformations which are applied on nested do-loops or on recurrence equations. In this paper, we analyze systems of linear recurrence equations, a generalization of uniform recurrence equations. The first part of the paper describes a method for automatically determining whether such a system can be scheduled by an affine timing function, independent of the size parameter of the algorithm. In the second part, we describe a powerful method that makes it possible to transform linear recurrences into uniform recurrence equations. Both parts rely on results on integral convex polyhedra. Our results are illustrated on the Gauss elimination algorithm and on the Gauss-Jordan diagonalization algorithm.
1 Introduction
Designing efficient algorithms for parallel architectures is one of the main difficulties of the current research in computer science. As the architecture of supercomputers evolves towards massive parallelism...
An Automata-Theoretic Approach to Presburger Arithmetic Constraints (Extended Abstract)
 In Proc. Static Analysis Symposium, LNCS 983
, 1995
Cited by 47 (4 self)
This paper introduces a finite-automata-based representation of Presburger arithmetic definable sets of integer vectors. The representation consists of concurrent automata operating on the binary encodings of the elements of the represented sets. This representation has several advantages. First, being automata-based, it is operational in nature and hence leads directly to algorithms; for instance, all usual operations on sets of integer vectors translate naturally to operations on automata. Second, the use of concurrent automata makes it compact. Third, it is insensitive to the representation size of integers. Our representation can be used whenever arithmetic constraints are needed. To il...
The Witness Algorithm: Solving Partially Observable Markov Decision Processes
, 1994
Cited by 45 (3 self)
This paper describes the POMDP framework and presents some well-known results from the field. It then presents a novel method called the witness algorithm for solving POMDP problems and analyzes its computational complexity. We argue that the witness algorithm is superior to existing algorithms for solving POMDPs in an important complexity-theoretic sense.
Nonunimodular Transformations of Nested Loops
 IN PROC. SUPERCOMPUTING 92
, 1992
Cited by 45 (12 self)
This paper presents a linear algebraic approach to modeling loop transformations. The approach unifies apparently unrelated recent developments in supercompiler technology. Specifically, we show the relationship between the dependence abstraction called dependence cones and fully permutable loop nests. Compound transformations are modeled as matrices. The nonsingular linear transformations presented here subsume the class of unimodular transformations. Nonunimodular transformations (with determinant 1) create "holes" in the transformed iteration space. We change the step size of loops in order to "step aside from these holes" when traversing the transformed iteration space. For the class of nonunimodular loop transformations, we present algorithms for deriving the loop bounds, the array access expressions and step sizes of loops in the nest. The algorithms are based on the Hermite Normal Form of the transformation matrix. We illustrate the use of this approach in several problems such a...
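The "holes" and adjusted step sizes mentioned in the abstract can be seen in a small hand-worked case. The sketch below assumes a hypothetical transformation (u, v) = (i + j, j - i), whose matrix has determinant 2, applied to an n-by-n iteration space. The loop bounds and the step of 2 are derived by hand for this one matrix, not by the Hermite Normal Form procedure the paper describes.

```python
def transform(i, j):
    """Hypothetical nonunimodular transformation with determinant 2."""
    return (i + j, j - i)


def transformed_points(n):
    """Image of the original n-by-n iteration space under the transformation."""
    return {transform(i, j) for i in range(n) for j in range(n)}


def strided_traversal(n):
    """Visit the transformed space directly, using step 2 in v to skip holes.

    Since u + v = 2j is always even, points where u + v is odd are holes.
    Both lower bounds have the same parity as u, so stepping by 2 from the
    lower bound lands only on valid points.
    """
    points = []
    last = 2 * (n - 1)
    for u in range(0, last + 1):
        lo = max(-u, u - last)
        hi = min(u, last - u)
        for v in range(lo, hi + 1, 2):
            points.append((u, v))
    return points


visited = strided_traversal(4)
expected = transformed_points(4)
```

With a step of 1 in the inner loop, the traversal would also visit points such as (1, 0) that correspond to no original iteration; the step of 2 is what avoids them here.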