Results 1  10
of
53
Approximate Policy Iteration with a Policy Language Bias
 Journal of Artificial Intelligence Research
, 2003
"... We explore approximate policy iteration (API), replacing the usual costfunction learning step with a learning step in policy space. We give policylanguage biases that enable solution of very large relational Markov decision processes (MDPs) that no previous technique can solve. ..."
Abstract

Cited by 119 (14 self)
 Add to MetaCart
We explore approximate policy iteration (API), replacing the usual costfunction learning step with a learning step in policy space. We give policylanguage biases that enable solution of very large relational Markov decision processes (MDPs) that no previous technique can solve.
Learning Action Strategies for Planning Domains
 ARTIFICIAL INTELLIGENCE
, 1997
"... This paper reports on experiments where techniques of supervised machine learning are applied to the problem of planning. The input to the learning algorithm is composed of a description of a planning domain, planning problems in this domain, and solutions for them. The output is an efficient algori ..."
Abstract

Cited by 73 (3 self)
 Add to MetaCart
This paper reports on experiments where techniques of supervised machine learning are applied to the problem of planning. The input to the learning algorithm is composed of a description of a planning domain, planning problems in this domain, and solutions for them. The output is an efficient algorithm  a strategy  for solving problems in that domain. We test the strategy on an independent set of planning problems from the same domain, so that success is measured by its ability to solve complete problems. A system, L2Act, has been developed in order to perform these experiments. We have experimented with the blocks world domain, and the logistics domain, using strategies in the form of a generalization of decision lists, where the rules on the list are existentially quantified first order expressions. The learning algorithm is a variant of Rivest`s [39] algorithm, improved with several techniques that reduce its time complexity. As the experiments demonstrate, generalization is a...
Learning to reason
 Journal of the ACM
, 1994
"... Abstract. We introduce a new framework for the study of reasoning. The Learning (in order) to Reason approach developed here views learning as an integral part of the inference process, and suggests that learning and reasoning should be studied together. The Learning to Reason framework combines the ..."
Abstract

Cited by 57 (24 self)
 Add to MetaCart
Abstract. We introduce a new framework for the study of reasoning. The Learning (in order) to Reason approach developed here views learning as an integral part of the inference process, and suggests that learning and reasoning should be studied together. The Learning to Reason framework combines the interfaces to the world used by known learning models with the reasoning task and a performance criterion suitable for it. In this framework, the intelligent agent is given access to its favorite learning interface, and is also given a grace period in which it can interact with this interface and construct a representation KB of the world W. The reasoning performance is measured only after this period, when the agent is presented with queries � from some query language, relevant to the world, and has to answer whether W implies �. The approach is meant to overcome the main computational difficulties in the traditional treatment of reasoning which stem from its separation from the “world”. Since the agent interacts with the world when constructing its knowledge representation it can choose a representation that is useful for the task at hand. Moreover, we can now make explicit the dependence of the reasoning performance on the environment the agent interacts with. We show how previous results from learning theory and reasoning fit into this framework and
A neuroidal architecture for cognitive computation
 Journal of the ACM
, 2000
"... Abstract. An architecture is described for designing systems that acquire and manipulate large amounts of unsystematized, or socalled commonsense, knowledge. Its aim is to exploit to the full those aspects of computational learning that are known to offer powerful solutions in the acquisition and m ..."
Abstract

Cited by 35 (4 self)
 Add to MetaCart
Abstract. An architecture is described for designing systems that acquire and manipulate large amounts of unsystematized, or socalled commonsense, knowledge. Its aim is to exploit to the full those aspects of computational learning that are known to offer powerful solutions in the acquisition and maintenance of robust knowledge bases. The architecture makes explicit the requirements on the basic computational tasks that are to be performed and is designed to make these computationally tractable even for very large databases. The main claims are that (i) the basic learning and deduction tasks are provably tractable and (ii) tractable learning offers viable approaches to a range of issues that have been previously identified as problematic for artificial intelligence systems that are programmed. Among the issues that learning offers to resolve are robustness to inconsistencies, robustness to incomplete information and resolving among alternatives. Attributeefficient learning algorithms, which allow learning from few examples in large dimensional systems, are fundamental to the approach. Underpinning the overall architecture is a new principled approach to manipulating relations in learning systems. This approach, of independently quantified arguments, allows propositional learning algorithms to be applied systematically to learning relational concepts in polynomial time and in a modular fashion.
Robust Logics
"... Suppose that we wish to learn from examples and counterexamples a criterion for recognizing whether an assembly of wooden blocks constitutes an arch. Suppose also that we have preprogrammed recognizers for various relationships e.g. ontopof(x; y), above(x; y), etc. and believe that some possibl ..."
Abstract

Cited by 31 (6 self)
 Add to MetaCart
Suppose that we wish to learn from examples and counterexamples a criterion for recognizing whether an assembly of wooden blocks constitutes an arch. Suppose also that we have preprogrammed recognizers for various relationships e.g. ontopof(x; y), above(x; y), etc. and believe that some possibly complex expression in terms of these base relationships should suffice to approximate the desired notion of an arch. How can we formulate such a relational learning problem so as to exploit the benefits that are demonstrably available in propositional learning, such as attributeefficient learning by linear separators, and errorresilient learning? We believe that learning in a general setting that allows for multiple objects and relations in this way is a fundamental key to resolving the following dilemma that arises in the design of intelligent systems: Mathematical logic is an attractive language of description because it has clear semantics and sound proof procedures. However, as a basis for large programmed systems it leads to brittleness because, in practice, consistent usage of the various predicate names throughout a system cannot be guaranteed, except in application areas such as mathematics where the viability of the axiomatic method has been demonstrated independently. In this paper we develop the following approach to circumventing this dilemma. We suggest that brittleness can be overcome by using a new kind of logic in which each statement is learnable. By allowing the system to learn rules empirically from the environment, relative to any particular programs it may have for recognizing some base predicates, we enable the system to acquire a set of statements approximately consistent with each other and with the world, without the need for a globally knowledgeable and consistent programmer. We illustrate
Learning to Reason with a Restricted View
, 1998
"... The Learning to Reason framework combines the study of Learning and Reasoning into a single task. Within it, learning is done specifically for the purpose of reasoning with the learned knowledge. Computational considerations show that this is a useful paradigm; in some cases learning and reasoning p ..."
Abstract

Cited by 29 (15 self)
 Add to MetaCart
The Learning to Reason framework combines the study of Learning and Reasoning into a single task. Within it, learning is done specifically for the purpose of reasoning with the learned knowledge. Computational considerations show that this is a useful paradigm; in some cases learning and reasoning problems that are intractable when studied separately become tractable when performed as a task of Learning to Reason. In this paper we study Learning to Reason problems where the interaction with the world supplies the learner only partial information in the form of partial assignments. Several natural interpretations of partial assignments are considered and learning and reasoning algorithms using these are developed. The results presented exhibit a tradeoff between learnability, the strength of the oracles used in the interface, and the range of reasoning queries the learner is guaranteed to answer correctly.
Relating reinforcement learning performance to classification performance
 In 22nd International Conference on Machine Learning (ICML
, 2005
"... We prove a quantitative connection between the expected sum of rewards of a policy and binary classification performance on created subproblems. This connection holds without any unobservable assumptions (no assumption of independence, small mixing time, fully observable states, or even hidden state ..."
Abstract

Cited by 26 (4 self)
 Add to MetaCart
We prove a quantitative connection between the expected sum of rewards of a policy and binary classification performance on created subproblems. This connection holds without any unobservable assumptions (no assumption of independence, small mixing time, fully observable states, or even hidden states) and the resulting statement is independent of the number of states or actions. The statement is critically dependent on the size of the rewards and prediction performance of the created classifiers. We also provide some general guidelines for obtaining good classification performance on the created subproblems. In particular, we discuss possible methods for generating training examples for a classifier learning algorithm. 1.
Relational Representations that Facilitate Learning
, 2000
"... Given a collection of objects in the world, along with some relations that hold among them, a fundamental problem is how to learn denitions of some relations and concepts of interest in terms of the given relations. These denitions might be quite complex and, inevitably, might require the use ..."
Abstract

Cited by 22 (10 self)
 Add to MetaCart
Given a collection of objects in the world, along with some relations that hold among them, a fundamental problem is how to learn denitions of some relations and concepts of interest in terms of the given relations. These denitions might be quite complex and, inevitably, might require the use of quanti ed expressions. Attempts to use rst order languages for these purposes are hampered by the fact that relational inference is intractable and, consequently, so is the problem of learning relational denitions. This work develops an expressive relational representation language that allows the use of propositional learning algorithms when learning relational denitions. The representation serves as an intermediate level between a raw description of observations in the world and a propositional learning system that attempts to learn denitions for concepts and relations. It allows for hierarchical composition of relational expressions that can be evaluated ecientl...
A Formal Framework for Speedup Learning from Problems and Solutions
 JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH
, 1996
"... Speedup learning seeks to improve the computational efficiency of problem solving with experience. In this paper, we develop a formal framework for learning efficient problem solving from random problems and their solutions. We apply this framework to two different representations of learned know ..."
Abstract

Cited by 20 (0 self)
 Add to MetaCart
Speedup learning seeks to improve the computational efficiency of problem solving with experience. In this paper, we develop a formal framework for learning efficient problem solving from random problems and their solutions. We apply this framework to two different representations of learned knowledge, namely control rules and macrooperators, and prove theorems that identify sufficient conditions for learning in each representation. Our proofs are constructive in that they are accompanied with learning algorithms. Our framework captures both empirical and explanationbased speedup learning in a unified fashion. We illustrate our framework with implementations in two domains: symbolic integration and Eight Puzzle. This work integrates many strands of experimental and theoretical work in machine learning, including empirical learning of control rules, macrooperator learning, ExplanationBased Learning (EBL), and Probably Approximately Correct (PAC) Learning.
Practical solution techniques for firstorder mdps
 Artificial Intelligence
"... Many traditional solution approaches to relationally specified decisiontheoretic planning problems (e.g., those stated in the probabilistic planning domain description language, or PPDDL) ground the specification with respect to a specific instantiation of domain objects and apply a solution approa ..."
Abstract

Cited by 20 (1 self)
 Add to MetaCart
Many traditional solution approaches to relationally specified decisiontheoretic planning problems (e.g., those stated in the probabilistic planning domain description language, or PPDDL) ground the specification with respect to a specific instantiation of domain objects and apply a solution approach directly to the resulting ground Markov decision process (MDP). Unfortunately, the space and time complexity of these grounded solution approaches are polynomial in the number of domain objects and exponential in the predicate arity and the number of nested quantifiers in the relational problem specification. An alternative to grounding a relational planning problem is to tackle the problem directly at the relational level. In this article, we propose one such approach that translates an expressive subset of the PPDDL representation to a firstorder MDP (FOMDP) specification and then derives a domainindependent policy without grounding at any intermediate step. However, such generality does not come without its own set of challenges—the purpose of this article is to explore practical solution techniques for solving FOMDPs. To demonstrate the applicability of our techniques, we present proofofconcept results of our firstorder approximate linear programming (FOALP) planner on problems from the probabilistic track