Results 1 - 10
of
21
Rule-based Evolutionary Online Learning Systems: LEARNING BOUNDS, CLASSIFICATION, AND PREDICTION
, 2004
"... Rule-based evolutionary online learning systems, often referred to as Michigan-style learning classifier systems (LCSs), were proposed nearly thirty years ago (Holland, 1976; Holland, 1977) originally calling them cognitive systems. LCSs combine the strength of reinforcement learning with the genera ..."
Abstract
-
Cited by 32 (8 self)
- Add to MetaCart
Rule-based evolutionary online learning systems, often referred to as Michigan-style learning classifier systems (LCSs), were proposed nearly thirty years ago (Holland, 1976; Holland, 1977) originally calling them cognitive systems. LCSs combine the strength of reinforcement learning with the generalization capabilities of genetic algorithms promising a flexible, online generalizing, solely reinforcement dependent learning system. However, despite several initial successful applications of LCSs and their interesting relations with animal learning and cognition, understanding of the systems remained somewhat obscured. Questions concerning learning complexity or convergence remained unanswered. Performance in different problem types, problem structures, concept spaces, and hypothesis spaces stayed nearly unpredictable. This thesis has the following three major objectives: (1) to establish a facetwise theory approach for LCSs that promotes system analysis, understanding, and design; (2) to analyze, evaluate, and enhance the XCS classifier system (Wilson, 1995) by the means of the facetwise approach establishing a fundamental XCS learning theory; (3) to identify both the major advantages of an LCS-based learning approach as well as the most promising potential application areas. Achieving these three objectives leads to a rigorous understanding
Approximate linear programming for first-order MDPs
- In Proc. UAI05, 509– 517
, 2005
"... We introduce a new approximate solution technique for first-order Markov decision processes (FOMDPs). Representing the value function linearly w.r.t. a set of first-order basis functions, we compute suitable weights by casting the corresponding optimization as a first-order linear program and show h ..."
Abstract
-
Cited by 26 (9 self)
- Add to MetaCart
We introduce a new approximate solution technique for first-order Markov decision processes (FOMDPs). Representing the value function linearly w.r.t. a set of first-order basis functions, we compute suitable weights by casting the corresponding optimization as a first-order linear program and show how off-the-shelf theorem prover and LP software can be effectively used. This technique allows one to solve FOMDPs independent of a specific domain instantiation; furthermore, it allows one to determine bounds on approximation error that apply equally to all domain instantiations. We apply this solution technique to the task of elevator scheduling with a rich feature space and multi-criteria additive reward, and demonstrate that it outperforms a number of intuitive, heuristicallyguided policies. 1
Relational reinforcement learning: An overview
- In Proceedings of the ICML’04 Workshop on Relational Reinforcement Learning
, 2004
"... Relational reinforcement learning (RRL) is both a young and an old eld. In this paper, we trace the history of the eld to related disciplines, outline some current work and promising new directions, and survey the research issues and opportunities that lie ahead. 1. ..."
Abstract
-
Cited by 23 (3 self)
- Add to MetaCart
Relational reinforcement learning (RRL) is both a young and an old eld. In this paper, we trace the history of the eld to related disciplines, outline some current work and promising new directions, and survey the research issues and opportunities that lie ahead. 1.
Online learning and exploiting relational models in reinforcement learning
- In M. Veloso (Ed.), Proceedings of the 20th International Joint Conference on Artificial Intelligence (p
, 2007
"... In recent years, there has been a growing interest in using rich representations such as relational languages for reinforcement learning. However, while expressive languages have many advantages in terms of generalization and reasoning, extending existing approaches to such a relational setting is a ..."
Abstract
-
Cited by 13 (2 self)
- Add to MetaCart
In recent years, there has been a growing interest in using rich representations such as relational languages for reinforcement learning. However, while expressive languages have many advantages in terms of generalization and reasoning, extending existing approaches to such a relational setting is a non-trivial problem. In this paper, we present a first step towards the online learning and exploitation of relational models. We propose a representation for the transition and reward function that can be learned online and present a method that exploits these models by augmenting Relational Reinforcement Learning algorithms with planning techniques. The benefits and robustness of our approach are evaluated experimentally. 1
First order decision diagrams for relational MDPs
- In Proceedings of the International Joint Conference of Artificial Intelligence
, 2007
"... Markov decision processes capture sequential decision making under uncertainty, where an agent must choose actions so as to optimize long term reward. The paper studies efficient reasoning mechanisms for Relational Markov Decision Processes (RMDP) where world states have an internal relational struc ..."
Abstract
-
Cited by 10 (3 self)
- Add to MetaCart
Markov decision processes capture sequential decision making under uncertainty, where an agent must choose actions so as to optimize long term reward. The paper studies efficient reasoning mechanisms for Relational Markov Decision Processes (RMDP) where world states have an internal relational structure that can be naturally described in terms of objects and relations among them. Two contributions are presented. First, the paper develops First Order Decision Diagrams (FODD), a new compact representation for functions over relational structures, together with a set of operators to combine FODDs, and novel reduction techniques to keep the representation small. Second, the paper shows how FODDs can be used to develop solutions for RMDPs, where reasoning is performed at the abstract level and the resulting optimal policy is independent of domain size (number of objects) or instantiation. In particular, a variant of the value iteration algorithm is developed by using special operations over FODDs, and the algorithm is shown to converge to the optimal policy. 1.
A heuristic search algorithm for solving first-order MDPs
- In UAI
, 2005
"... We present a heuristic search algorithm for solving first-order MDPs (FOMDPs). Our approach combines first-order state abstraction that avoids evaluating states individually, and heuristic search that avoids evaluating all states. Firstly, we apply state abstraction directly on the FOMDP avoiding pr ..."
Abstract
-
Cited by 9 (0 self)
- Add to MetaCart
We present a heuristic search algorithm for solving first-order MDPs (FOMDPs). Our approach combines first-order state abstraction that avoids evaluating states individually, and heuristic search that avoids evaluating all states. Firstly, we apply state abstraction directly on the FOMDP avoiding propositionalization. Such kind of abstraction is referred to as firstorder state abstraction. Secondly, guided by an admissible heuristic, the search is restricted only to those states that are reachable from the initial state. We demonstrate the usefullness of the above techniques for solving FOMDPs on a system, referred to as FC-Planner, that entered the probabilistic track of the International Planning Competition (IPC’2004). 1
Approximate solution techniques for factored first-order MDPs
- In (ICAPS-07), 288
, 2007
"... Most traditional approaches to probabilistic planning in relationally specified MDPs rely on grounding the problem w.r.t. specific domain instantiations, thereby incurring a combinatorial blowup in the representation. An alternative approach is to lift a relational MDP to a firstorder MDP (FOMDP) sp ..."
Abstract
-
Cited by 9 (3 self)
- Add to MetaCart
Most traditional approaches to probabilistic planning in relationally specified MDPs rely on grounding the problem w.r.t. specific domain instantiations, thereby incurring a combinatorial blowup in the representation. An alternative approach is to lift a relational MDP to a firstorder MDP (FOMDP) specification and develop solution approaches that avoid grounding. Unfortunately, state-of-the-art FOMDPs are inadequate for specifying factored transition models or additive rewards that scale with the domain size—structure that is very natural in probabilistic planning problems. To remedy these deficiencies, we propose an extension of the FOMDP formalism known as a factored FOMDP and present generalizations of symbolic dynamic programming and linear-value approximation solutions to exploit its structure. Along the way, we also make contributions to the field of first-order probabilistic inference (FOPI) by demonstrating novel first-order structures that can be exploited without domain grounding. We present empirical results to demonstrate that we can obtain solutions whose complexity scales polynomially in the logarithm of the domain size—results that are impossible to obtain with any previously proposed solution method.
Practical solution techniques for first-order mdps
- Artificial Intelligence
"... Many traditional solution approaches to relationally specified decision-theoretic planning problems (e.g., those stated in the probabilistic planning domain description language, or PPDDL) ground the specification with respect to a specific instantiation of domain objects and apply a solution approa ..."
Abstract
-
Cited by 9 (1 self)
- Add to MetaCart
Many traditional solution approaches to relationally specified decision-theoretic planning problems (e.g., those stated in the probabilistic planning domain description language, or PPDDL) ground the specification with respect to a specific instantiation of domain objects and apply a solution approach directly to the resulting ground Markov decision process (MDP). Unfortunately, the space and time complexity of these grounded solution approaches are polynomial in the number of domain objects and exponential in the predicate arity and the number of nested quantifiers in the relational problem specification. An alternative to grounding a relational planning problem is to tackle the problem directly at the relational level. In this article, we propose one such approach that translates an expressive subset of the PPDDL representation to a first-order MDP (FOMDP) specification and then derives a domain-independent policy without grounding at any intermediate step. However, such generality does not come without its own set of challenges—the purpose of this article is to explore practical solution techniques for solving FOMDPs. To demonstrate the applicability of our techniques, we present proof-of-concept results of our first-order approximate linear programming (FOALP) planner on problems from the probabilistic track
Transfer learning for reinforcement learning through goal and policy parametrization
- In ICML Workshop on Structural Knowledge Transfer for Machine Learning
, 2006
"... Relational reinforcement learning has allowed results from reinforcement learning tasks to be re-used in other, closely related, tasks. This transfer of knowledge is made possible by the use of parameters in the representations of the task-description and the learned policy. In this paper, we will g ..."
Abstract
-
Cited by 7 (1 self)
- Add to MetaCart
Relational reinforcement learning has allowed results from reinforcement learning tasks to be re-used in other, closely related, tasks. This transfer of knowledge is made possible by the use of parameters in the representations of the task-description and the learned policy. In this paper, we will give a description of the current state of the art of transfer learning with relational reinforcement learning, make some observations about the usefulness and limitations of this current state and discuss some directions for future research. We also present a first small step along one of those directions. 1.
Approximate Inference for Planning in Stochastic Relational Worlds
"... Relational world models that can be learned from experience in stochastic domains have received significant attention recently. However, efficient planning using these models remains a major issue. We propose to convert learned noisy probabilistic relational rules into a structured dynamic Bayesian ..."
Abstract
-
Cited by 7 (4 self)
- Add to MetaCart
Relational world models that can be learned from experience in stochastic domains have received significant attention recently. However, efficient planning using these models remains a major issue. We propose to convert learned noisy probabilistic relational rules into a structured dynamic Bayesian network representation. Predicting the effects of action sequences using approximate inference allows for planning in complex worlds. We evaluate the effectiveness of our approach for online planning in a 3D simulated blocksworld with an articulated manipulator and realistic physics. Empirical results show that our method can solve problems where existing methods fail. 1.

