Formal Theory of Creativity, Fun, and Intrinsic Motivation (1990–2010)
Cited by 75 (15 self)
The simple but general formal theory of fun & intrinsic motivation & creativity (1990) is based on the concept of maximizing intrinsic reward for the active creation or discovery of novel, surprising patterns allowing for improved prediction or data compression. It generalizes the traditional field of active learning, and is related to old but less formal ideas in aesthetics theory and developmental psychology. It has been argued that the theory explains many essential aspects of intelligence including autonomous development, science, art, music, humor. This overview first describes theoretically optimal (but not necessarily practical) ways of implementing the basic computational principles on exploratory, intrinsically motivated agents or robots, encouraging them to provoke event sequences exhibiting previously unknown but learnable algorithmic regularities. Emphasis is put on the importance of limited computational resources for online prediction and compression. Discrete and continuous time formulations are given. Previous practical but nonoptimal implementations (1991, 1995, 1997–2002) are reviewed, as well as several recent variants by others (2005). A simplified typology addresses current confusion concerning the precise nature of intrinsic motivation.
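The idea of rewarding improved data compression can be illustrated with a minimal sketch. It is not the paper's implementation: the Laplace-smoothed frequency model, the function names `code_length` and `compression_progress`, and the choice to re-encode the whole history at every step are all assumptions made for illustration. The intrinsic reward at each step is the number of bits saved on the history once the model has learned from the newest observation.

```python
import math
from collections import Counter


def code_length(history, counts, alphabet):
    """Bits needed to encode `history` under a Laplace-smoothed frequency model."""
    total = sum(counts.values()) + len(alphabet)
    return sum(-math.log2((counts[s] + 1) / total) for s in history)


def compression_progress(stream, alphabet):
    """Return one intrinsic reward per observation.

    Each reward is the reduction in the code length of the history so far
    that results from updating the model with the newest observation.
    """
    counts = Counter()
    history = []
    rewards = []
    for symbol in stream:
        history.append(symbol)
        before = code_length(history, counts, alphabet)  # old model
        counts[symbol] += 1                              # learning step
        after = code_length(history, counts, alphabet)   # improved model
        rewards.append(before - after)
    return rewards


rewards = compression_progress("a" * 50, alphabet="ab")
```

On this constant stream the reward stays positive but shrinks as the regularity becomes fully learned, which matches the intuition that a pattern stops being interesting once it is well compressed.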
The Push3 execution stack and the evolution of control
 In Proc. Gen. and Evol. Comp. Conf
, 2005
Cited by 33 (9 self)
The Push programming language was developed for use in genetic and evolutionary computation systems, as the representation within which evolving programs are expressed. It has been used in the production of several significant results, including results that were awarded a gold medal in the Human Competitive Results competition at GECCO-2004. One of Push’s attractive features in this context is its transparent support for the expression and evolution of modular architectures and complex control structures, achieved through explicit code self-manipulation. The latest version of Push, Push3, enhances this feature by permitting explicit manipulation of an execution stack that contains the expressions that are queued for execution in the interpreter. This paper provides a brief introduction to Push and to execution stack manipulation in Push3. It then presents a series of examples in which Push3 was used with a simple genetic programming system (PushGP) to evolve programs with nontrivial control structures.
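A toy interpreter can make the notion of an explicit execution stack concrete. The sketch below is only loosely modeled on the description above and is not Push3 itself: the instruction names `int.add`, `int.mul`, and `exec.dup` are hypothetical, only integers are supported, and instructions that lack arguments are treated as no-ops by assumption.

```python
def run(program, max_steps=1000):
    """Interpret a toy Push-like program and return the final integer stack."""
    exec_stack = [program]  # the top of the stack is the end of the list
    int_stack = []
    steps = 0
    while exec_stack and steps < max_steps:
        steps += 1
        item = exec_stack.pop()
        if isinstance(item, list):
            # Unpack a nested expression so that its first element runs next.
            exec_stack.extend(reversed(item))
        elif isinstance(item, int):
            int_stack.append(item)
        elif item == "int.add" and len(int_stack) >= 2:
            b, a = int_stack.pop(), int_stack.pop()
            int_stack.append(a + b)
        elif item == "int.mul" and len(int_stack) >= 2:
            b, a = int_stack.pop(), int_stack.pop()
            int_stack.append(a * b)
        elif item == "exec.dup" and exec_stack:
            # Duplicate whatever is queued to run next: a control structure
            # expressed purely by manipulating the execution stack.
            exec_stack.append(exec_stack[-1])
        # Anything else (unknown or under-supplied instruction) is a no-op.
    return int_stack


result = run([2, "exec.dup", [3, "int.mul"]])
```

Here `exec.dup` causes the queued block `[3, "int.mul"]` to run twice, so the program multiplies 2 by 3 two times without any dedicated loop construct.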
Inductive Synthesis of Functional Programs: An Explanation Based Generalization Approach
 Journal of Machine Learning Research
, 2006
Cited by 31 (12 self)
We describe an approach to the inductive synthesis of recursive equations from input/output examples which is based on the classical two-step approach to induction of functional Lisp programs of Summers (1977). In a first step, I/O examples are rewritten to traces which explain the outputs given the respective inputs based on a datatype theory. These traces can be integrated into one conditional expression which represents a non-recursive program.
Universal Algorithmic Intelligence: A mathematical top-down approach
 Artificial General Intelligence
, 2005
Cited by 30 (6 self)
Artificial intelligence; algorithmic probability; sequential decision theory; rational
On Universal Prediction and Bayesian Confirmation
 Theoretical Computer Science
, 2007
Cited by 30 (14 self)
The Bayesian framework is a well-studied and successful framework for inductive reasoning, which includes hypothesis testing and confirmation, parameter estimation, sequence prediction, classification, and regression. But standard statistical guidelines for choosing the model class and prior are not always available or can fail, in particular in complex situations. Solomonoff completed the Bayesian framework by providing a rigorous, unique, formal, and universal choice for the model class and the prior. I discuss in breadth how and in which sense universal (non-i.i.d.) sequence prediction solves various (philosophical) problems of traditional Bayesian sequence prediction. I show that Solomonoff’s model possesses many desirable properties: strong total and future bounds, and weak instantaneous bounds, and in contrast to most classical continuous prior densities has no zero p(oste)rior problem, i.e. can confirm universal hypotheses, is reparametrization and regrouping invariant, and avoids the old-evidence and updating problem. It even performs well
Dynamic Algorithm Portfolios
 ANNALS OF MATHEMATICS AND ARTIFICIAL INTELLIGENCE
, 2006
Cited by 29 (5 self)
Traditional Meta-Learning requires long training times, and is often focused on optimizing performance quality, neglecting computational complexity. Algorithm Portfolios are more robust, but present similar limitations. We reformulate algorithm selection as a time allocation problem: all candidate algorithms are run in parallel, and their relative priorities are continually updated based on runtime information, with the aim of minimizing the time to reach a desired performance level. Each algorithm's priority is set based on its current time to solution, estimated according to a parametric model that is trained and used while solving a sequence of problems, gradually increasing its impact on the priority attribution. The use of
A Monte-Carlo AIXI Approximation
, 2009
Cited by 28 (9 self)
This paper describes a computationally feasible approximation to the AIXI agent, a universal reinforcement learning agent for arbitrary environments. AIXI is scaled down in two key ways: First, the class of environment models is restricted to all prediction suffix trees of a fixed maximum depth. This allows a Bayesian mixture of environment models to be computed in time proportional to the logarithm of the size of the model class. Secondly, the finite-horizon expectimax search is approximated by an asymptotically convergent Monte Carlo Tree Search technique. This scaled down AIXI agent is empirically shown to be effective on a wide class of toy problem domains, ranging from simple fully observable games to small POMDPs. We explore the limits of this approximate agent and propose a general heuristic framework for scaling this technique to much larger problems.
Gödel machines: Fully self-referential optimal universal self-improvers
 Goertzel and C. Pennachin, Artificial General Intelligence
, 2006
Cited by 27 (13 self)
Summary. We present the first class of mathematically rigorous, general, fully self-referential, self-improving, optimally efficient problem solvers. Inspired by Kurt Gödel’s celebrated self-referential formulas (1931), such a problem solver rewrites any part of its own code as soon as it has found a proof that the rewrite is useful, where the problem-dependent utility function and the hardware and the entire initial code are described by axioms encoded in an initial proof searcher which is also part of the initial code. The searcher systematically and efficiently tests computable proof techniques (programs whose outputs are proofs) until it finds a provably useful, computable self-rewrite. We show that such a self-rewrite is globally optimal (no local maxima!), since the code first had to prove that it is not useful to continue the proof search for alternative self-rewrites. Unlike previous non-self-referential methods based on hardwired proof searchers, ours not only boasts an optimal order of complexity but can optimally reduce any slowdowns hidden by the O()-notation, provided the utility of such speed-ups is provable at all.
Learning dynamic algorithm portfolios
 ANN MATH ARTIF INTELL (2006) 47:295–328
, 2006
Cited by 23 (1 self)
Algorithm selection can be performed using a model of runtime distribution, learned during a preliminary training phase. There is a tradeoff between the performance of model-based algorithm selection, and the cost of learning the model. In this paper, we treat this tradeoff in the context of bandit problems. We propose a fully dynamic and online algorithm selection technique, with no separate training phase: all candidate algorithms are run in parallel, while a model incrementally learns their runtime distributions. A redundant set of time allocators uses the partially trained model to propose machine time shares for the algorithms. A bandit problem solver mixes the model-based shares with a uniform share, gradually increasing the impact of the best time allocators as the model improves. We present experiments with a set of SAT solvers on a mixed SAT-UNSAT benchmark; and with a set of solvers for the Auction Winner Determination problem.
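The mixing of a model-based time share with a uniform share can be sketched as follows. This is a simplified illustration rather than the authors' method: the bandit problem solver is replaced by a hand-set `trust` weight, the model-based share is assumed to be inversely proportional to a predicted time to solution, and the names `mixed_shares` and `time_to_solution` are hypothetical.

```python
def mixed_shares(predicted_times, trust):
    """Blend a uniform time share with a model-based one.

    predicted_times: hypothetical model estimates of each algorithm's
        time to solution (positive numbers).
    trust: weight in [0, 1] placed on the model-based share; 0 means
        uniform allocation only. A fixed number here, by assumption.
    """
    if not 0.0 <= trust <= 1.0:
        raise ValueError("trust must lie in [0, 1]")
    n = len(predicted_times)
    uniform = [1.0 / n] * n
    inverse = [1.0 / t for t in predicted_times]
    norm = sum(inverse)
    model = [w / norm for w in inverse]  # faster predicted => larger share
    return [(1.0 - trust) * u + trust * m for u, m in zip(uniform, model)]


def time_to_solution(true_times, shares):
    """Wall-clock time until the first algorithm finishes when all run in
    parallel, each at a speed proportional to its share of machine time."""
    return min(t / s for t, s in zip(true_times, shares) if s > 0)


true_times = [1.0, 3.0]
cautious = mixed_shares(true_times, trust=0.0)  # untrained model: uniform
confident = mixed_shares(true_times, trust=1.0)  # fully trusted model
```

With an accurate model, raising `trust` shifts machine time toward the faster algorithm and shortens the time to solution, while `trust=0.0` keeps the allocation robust when the model is still untrained.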
Feature reinforcement learning: Part I. Unstructured MDPs
 Journal of Artificial General Intelligence
, 2009
Cited by 23 (9 self)
www.hutter1.net
General-purpose, intelligent, learning agents cycle through sequences of observations, actions, and rewards that are complex, uncertain, unknown, and non-Markovian. On the other hand, reinforcement learning is well-developed for small finite state Markov decision processes (MDPs). Up to now, extracting the right state representations out of bare observations, that is, reducing the general agent setup to the MDP framework, is an art that involves significant effort by designers. The primary goal of this work is to automate the reduction process and thereby significantly expand the scope of many existing reinforcement learning algorithms and the agents that employ them. Before we can think of mechanizing this search for suitable MDPs, we need a formal objective criterion. The main contribution of this article is to develop such a criterion. I also integrate the various parts into one learning algorithm. Extensions to more realistic dynamic Bayesian networks are developed in Part