Results 1-10 of 33
Learning to Take Actions
, 1998
Abstract

Cited by 49 (8 self)
We formalize a model for supervised learning of action strategies in dynamic stochastic domains and show that PAC-learning results on Occam algorithms hold in this model as well. We then identify a class of rule-based action strategies for which polynomial-time learning is possible. The representation of strategies is a generalization of decision lists; strategies include rules with existentially quantified conditions, simple recursive predicates, and small internal state, but are syntactically restricted. We also study the learnability of hierarchically composed strategies where a subroutine already acquired can be used as a basic action in a higher-level strategy. We prove some positive results in this setting, but also show that in some cases the hierarchical learning problem is computationally hard. 1 Introduction We formalize a model for supervised learning of action strategies in dynamic stochastic domains, and study the learnability of strategies represented by rule-based syste...
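The decision-list flavor of strategy the abstract mentions can be illustrated with a minimal sketch: an ordered list of rules where the first rule whose condition holds in the current state selects the action. The conditions, actions, and `make_strategy` helper below are illustrative assumptions, not the paper's actual syntax.

```python
# Minimal sketch of a rule-based action strategy in the decision-list
# style: rules are tried in order, and the first matching condition
# fires. All names here are illustrative, not from the paper.

def make_strategy(rules, default):
    """rules: ordered list of (condition, action); first match fires."""
    def strategy(state):
        for cond, action in rules:
            if cond(state):
                return action
        return default
    return strategy

# Toy domain: a state is a set of facts.
strategy = make_strategy(
    [(lambda s: 'fire' in s, 'flee'),
     (lambda s: 'food' in s, 'eat')],
    default='wander')

print(strategy({'food'}), strategy({'fire', 'food'}), strategy(set()))
# eat flee wander
```

Rule order matters: because 'flee' precedes 'eat', a state containing both facts triggers fleeing, which is exactly the priority semantics of a decision list.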
Pruning Duplicate Nodes in Depth-First Search
 In AAAI National Conference
, 1993
Abstract

Cited by 37 (3 self)
Best-first search algorithms require exponential memory, while depth-first algorithms require only linear memory. On graphs with cycles, however, depth-first searches do not detect duplicate nodes, and hence may generate asymptotically more nodes than best-first searches. We present a technique for reducing the asymptotic complexity of depth-first search by eliminating the generation of duplicate nodes. The automatic discovery and application of a finite state machine (FSM) that enforces pruning rules in a depth-first search has significantly extended the power of search in several domains. We have implemented and tested the technique on a grid, the Fifteen Puzzle, the Twenty-Four Puzzle, and two versions of Rubik's Cube. In each case, the effective branching factor of the depth-first search is reduced, reducing the asymptotic time complexity. Introduction: The Problem Search techniques are fundamental to artificial intelligence. Best-first search algorithms such as breadth-first se...
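The grid case can be sketched concretely. The paper discovers its pruning FSM automatically; the sketch below hand-codes a known duplicate-free FSM for a 2-D grid (all vertical moves before any horizontal move, never mixing directions along an axis) so that depth-first search generates each cell exactly once.

```python
# Sketch: FSM-pruned depth-first search on a 2-D grid. The FSM is
# hand-coded here (the paper learns it automatically): after a vertical
# move you may continue vertically in the same direction or switch to
# horizontal; after a horizontal move you may only repeat it.
MOVES = {'U': (0, 1), 'D': (0, -1), 'L': (-1, 0), 'R': (1, 0)}
ALLOWED = {None: 'UDLR', 'U': 'ULR', 'D': 'DLR', 'L': 'L', 'R': 'R'}

def dfs(depth, prune):
    generated = []
    def rec(x, y, last, d):
        if d == 0:
            return
        for m in (ALLOWED[last] if prune else 'UDLR'):
            dx, dy = MOVES[m]
            generated.append((x + dx, y + dy))
            rec(x + dx, y + dy, m, d - 1)
    rec(0, 0, None, depth)
    return generated

naive = dfs(5, prune=False)
pruned = dfs(5, prune=True)
print(len(naive), len(pruned), len(set(pruned)))  # 1364 60 60
```

With the FSM, the 60 generated nodes are exactly the 60 distinct non-root cells within Manhattan distance 5; without it, the same search generates 1364 nodes, most of them duplicates, which is the asymptotic gap the abstract describes.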
Learning Goal-Decomposition Rules using Exercises
 IN PROCEEDINGS OF THE 14TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING
Abstract

Cited by 37 (6 self)
Exercises are problems ordered in increasing order of difficulty. Teaching problem-solving through exercises is a widely used pedagogic technique. A computational reason for this is that the knowledge gained by solving simple problems is useful in efficiently solving more difficult problems. We adopt this approach of learning from exercises to acquire search-control knowledge in the form of goal-decomposition rules (d-rules). D-rules are first-order, and are learned using a new "generalize-and-test" algorithm which is based on inductive logic programming techniques. We demonstrate the feasibility of the approach by applying it in two planning domains.
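A goal-decomposition rule in the spirit described above can be sketched as a mapping from a goal to an ordered list of subgoals. The blocks-world predicates and the `decompose` helper below are illustrative assumptions, not the paper's rule syntax.

```python
# Sketch: a d-rule maps a goal to an ordered subgoal list. The rule
# below (achieve On(x, y) via Clear(x), Clear(y), Stack(x, y)) is a
# toy blocks-world example, not taken from the paper.
drules = {
    'On': lambda x, y: [('Clear', x), ('Clear', y), ('Stack', x, y)],
}

def decompose(goal):
    pred, *args = goal
    if pred in drules:
        return drules[pred](*args)
    return [goal]   # primitive goal: no decomposition

print(decompose(('On', 'A', 'B')))
# [('Clear', 'A'), ('Clear', 'B'), ('Stack', 'A', 'B')]
```

Because the rule is written over variables x and y rather than concrete blocks, the same d-rule applies to any instance, which is what makes the first-order representation worth learning.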
A Formal Framework for Speedup Learning from Problems and Solutions
 Journal of Artificial Intelligence Research
, 1996
Abstract

Cited by 20 (0 self)
Speedup learning seeks to improve the computational efficiency of problem solving with experience. In this paper, we develop a formal framework for learning efficient problem solving from random problems and their solutions. We apply this framework to two different representations of learned knowledge, namely control rules and macro-operators, and prove theorems that identify sufficient conditions for learning in each representation. Our proofs are constructive in that they are accompanied with learning algorithms. Our framework captures both empirical and explanation-based speedup learning in a unified fashion. We illustrate our framework with implementations in two domains: symbolic integration and Eight Puzzle. This work integrates many strands of experimental and theoretical work in machine learning, including empirical learning of control rules, macro-operator learning, Explanation-Based Learning (EBL), and Probably Approximately Correct (PAC) Learning. 1. Introduction ...
Toward an Experimental Science of Planning
 PROCEEDINGS OF THE WORKSHOP ON INNOVATIVE APPROACHES TO PLANNING, SCHEDULING, AND CONTROL
, 1990
Abstract

Cited by 17 (1 self)
In this paper we outline an experimental method for the study of planning. We argue that experimentation should occupy a central role in planning research, identify some dependent measures of planning behavior, and note some independent variables that can influence this behavior. We also discuss some issues of experimental design and different stages that may occur in the development of an experimental science of planning.
Learning generalized plans using abstract counting
 In Proc. of AAAI
Abstract

Cited by 17 (12 self)
Given the complexity of planning, it is often beneficial to create plans that work for a wide class of problems. This facilitates reuse of existing plans for different instances drawn from the same problem or from an infinite family of similar problems. We define a class of such planning problems called generalized planning problems and present a novel approach for transforming classical plans into generalized plans. These algorithm-like plans include loops and work for problem instances having varying numbers of objects that must be manipulated to reach the goal. Our approach takes as input a classical plan for a certain problem instance. It outputs a generalized plan along with a classification of the problem instances where it is guaranteed to work. We illustrate the utility of our approach through results of a working implementation on various practical examples.
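The loop structure of a generalized plan can be illustrated with a minimal sketch: a classical plan for one instance is replaced by a loop over an abstract count (e.g. objects remaining), so the same plan works for any instance size. The action names and the tower-unstacking scenario are illustrative assumptions, not the paper's examples.

```python
# Sketch: a classical plan solves one fixed instance; a generalized
# plan with a loop covers every tower height. Names are illustrative.
classical_plan = ['unstack', 'putdown', 'unstack', 'putdown',
                  'unstack', 'putdown']   # plan for a 3-block tower

def generalized_plan(tower_height):
    plan = []
    while tower_height > 0:          # abstract count: blocks remaining
        plan += ['unstack', 'putdown']
        tower_height -= 1
    return plan

print(generalized_plan(3) == classical_plan)  # True
```

The accompanying classification the abstract mentions would state the precondition under which the loop plan is guaranteed to work (here, any instance that is a single tower of n blocks).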
Hierarchical Explanation-Based Reinforcement Learning
 In Proceedings of the Fourteenth International Conference on Machine Learning
, 1997
Abstract

Cited by 16 (4 self)
Explanation-Based Reinforcement Learning (EBRL) was introduced by Dietterich and Flann as a way of combining the ability of Reinforcement Learning (RL) to learn optimal plans with the generalization ability of Explanation-Based Learning (EBL) (Dietterich & Flann, 1995). We extend this work to domains where the agent must order and achieve a sequence of subgoals in an optimal fashion. Hierarchical EBRL can effectively learn optimal policies in some of these sequential task domains even when the subgoals weakly interact with each other. We also show that when a planner that can achieve the individual subgoals is available, our method converges even faster. 1 Introduction Reinforcement Learning (RL) has emerged as the method of choice for building autonomous agents that improve their performance with experience. One obstacle to scaling this approach to large problems is the lack of a robust and justifiable method to generalize from one experience to another. Dietterich and Flann (Dietter...
A Selective Macro-learning Algorithm and its Application to the N×N Sliding-Tile Puzzle
 Journal of Artificial Intelligence Research
, 1998
Abstract

Cited by 11 (4 self)
One of the most common mechanisms used for speeding up problem solvers is macro-learning. Macros are sequences of basic operators acquired during problem solving. Macros are used by the problem solver as if they were basic operators. The major problem that macro-learning presents is the vast number of macros that are available for acquisition. Macros increase the branching factor of the search space and can severely degrade problem-solving efficiency. To make macro learning useful, a program must be selective in acquiring and utilizing macros. This paper describes a general method for selective acquisition of macros. Solvable training problems are generated in increasing order of difficulty. The only macros acquired are those that take the problem solver out of a local minimum to a better state. The utility of the method is demonstrated in several domains, including the domain of N×N sliding-tile puzzles. After learning on small puzzles, the system is able to efficiently solve puz...
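The selection rule described above, acquiring only macros that escape a local minimum of the heuristic, can be sketched as a pass over a solution trajectory. The heuristic values, operator names, and `select_macros` helper are toy stand-ins, not the paper's implementation.

```python
# Sketch of selective macro acquisition: scan a solution path and
# record an operator sequence as a macro only when it leads from a
# local minimum of the heuristic h to a strictly better state.
# States, operators, and h below are toy stand-ins.

def select_macros(states, ops, h):
    macros, i = [], 0
    while i < len(states) - 1:
        if h(states[i + 1]) >= h(states[i]):   # stuck: next step doesn't improve
            j = i + 1
            while j < len(states) and h(states[j]) >= h(states[i]):
                j += 1
            if j < len(states):                # the sequence escaped the minimum
                macros.append(tuple(ops[i:j]))
                i = j
            else:
                break
        else:
            i += 1
    return macros

# Toy run: h along the path is 5,4,4,5,3,2 -- the plateau starting at
# position 1 is escaped by the three-operator macro (b, c, d).
states = [0, 1, 2, 3, 4, 5]
h = lambda s: [5, 4, 4, 5, 3, 2][s]
ops = ['a', 'b', 'c', 'd', 'e']
print(select_macros(states, ops, h))  # [('b', 'c', 'd')]
```

The single-step improvements ('a', 'e') are deliberately not acquired: basic operators already handle them, so recording them would only inflate the branching factor the abstract warns about.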
Loopdistill: Learning domain-specific planners from example plans
In ICAPS Workshop on Planning and Scheduling
, 2007
Abstract

Cited by 10 (1 self)
Keywords: Planning, Learning, Domain-specific planning, Program synthesis, Looping. Automated problem solving involves the ability to select actions from a specific state to reach objectives. Classical planning research has addressed this problem in a domain-independent manner: the same algorithm generates a complete plan for any domain specification. While this generality is in principle desirable, it comes at a cost which domain-independent planners incur either in high search efforts or in tedious hand-coded domain knowledge. Previous approaches to efficient general-purpose planning have focused on reducing the search involved in an existing general-purpose planning algorithm. Others abandoned the general-purpose goal and developed special-purpose planners highly optimized in efficiency for the specific aspects of a particular problem-solving domain. An interesting alternative is to use example plans in a particular domain to demonstrate how to solve problems in that
An evaluation criterion for macro learning and some results (Technical Report TR9901)
 Mindmaker Ltd., Budapest 1121, Konkoly Th. M
, 1999
Abstract

Cited by 5 (0 self)
It is known that a well-chosen set of macros makes it possible to considerably speed up the solution of planning problems. Recently, macros have been considered in the planning framework built on Markovian decision problems. However, so far no systematic approach was put forth to investigate the utility of macros within this framework. In this article we begin to systematically study this problem by introducing the concept of multi-task MDPs defined with a distribution over the tasks. We propose an evaluation criterion for macro-sets that is based on the expected planning speedup due to the usage of a macro-set, where the expectation is taken over the set of tasks. The consistency of the empirical speedup maximization algorithm is shown in the finite case. For acyclic systems, the expected planning speedup is shown to be proportional to the amount of "time-compression" due to the macros. Based on these observations a heuristic algorithm for learning of macros is proposed. The algorithm is shown to return macros identical with those that one would like to design by hand in the case of a particular navigation-like multi-task MDP. Some related questions, in particular the problem of breaking up MDPs into multiple tasks, factorizing MDPs and learning generalizations over actions to enhance the amount of transfer, are also considered in brief at the end of the paper.
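The evaluation criterion described above, expected planning speedup over a task distribution, can be sketched numerically. The per-task speedup ratio used below is our reading of the criterion; the task names and costs are toy assumptions, not the paper's.

```python
# Sketch: score a macro-set by the expectation, over a task
# distribution, of the per-task planning speedup. The speedup-ratio
# form and all names here are illustrative assumptions.

def expected_speedup(tasks, cost_without, cost_with):
    """tasks: list of (task, probability); cost_*: task -> planning cost."""
    return sum(p * cost_without(t) / cost_with(t) for t, p in tasks)

# Toy multi-task distribution: two equally likely tasks; the macro-set
# helps only on t1.
base = {'t1': 100.0, 't2': 50.0}
with_macros = {'t1': 20.0, 't2': 50.0}
tasks = [('t1', 0.5), ('t2', 0.5)]
print(expected_speedup(tasks, base.get, with_macros.get))  # 3.0
```

Averaging over the distribution is what makes the criterion selective: a macro-set that slows down frequent tasks scores poorly even if it dramatically speeds up rare ones.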