A Monte Carlo AIXI Approximation
 J. Artif. Intell. Res
Cited by 21 (11 self)
This paper describes a computationally feasible approximation to the AIXI agent, a universal reinforcement learning agent for arbitrary environments. AIXI is scaled down in two key ways: First, the class of environment models is restricted to all prediction suffix trees of a fixed maximum depth. This allows a Bayesian mixture of environment models to be computed in time proportional to the logarithm of the size of the model class. Secondly, the finite-horizon expectimax search is approximated by an asymptotically convergent Monte Carlo Tree Search technique. This scaled-down AIXI agent is empirically shown to be effective on a wide class of toy problem domains, ranging from simple fully observable games to small POMDPs. We explore the limits of this approximate agent and propose a general heuristic framework for scaling this technique to much larger problems.
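As a rough illustration of what a "Bayesian mixture of environment models" means, the sketch below mixes three hypothetical Bernoulli predictors by brute force. It is not the paper's method: the abstract refers to a mixture over prediction suffix trees computed far more efficiently, and the names `mixture_predict`, `mixture_update`, and the parameter list `thetas` are assumptions made here for illustration only.

```python
def mixture_predict(weights, thetas):
    """Mixture probability that the next bit is 1.

    weights: posterior weight of each model (sums to 1).
    thetas:  each model's fixed probability of emitting a 1 (hypothetical models).
    """
    return sum(w * t for w, t in zip(weights, thetas))


def mixture_update(weights, thetas, bit):
    """Bayes update: reweight each model by the likelihood it gave to `bit`."""
    likelihoods = [t if bit == 1 else 1.0 - t for t in thetas]
    unnormalized = [w * l for w, l in zip(weights, likelihoods)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Three hypothetical models with a uniform prior.
thetas = [0.1, 0.5, 0.9]
weights = [1.0 / 3.0] * 3

# Observe ten 1s; the weight should concentrate on the theta = 0.9 model.
for _ in range(10):
    weights = mixture_update(weights, thetas, 1)

prediction = mixture_predict(weights, thetas)
```

The cost of this naive update grows linearly with the number of models, which is what makes the logarithmic-time mixture mentioned in the abstract notable.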
Bias-Optimal Incremental Problem Solving
 In Advances in Neural Information Processing Systems 15
, 2003
Cited by 19 (9 self)
Given is a problem sequence and a probability distribution (the bias) on programs computing solution candidates. We present an optimally fast way of incrementally solving each task in the sequence. Bias shifts are computed by program prefixes that modify the distribution on their suffixes by reusing successful code for previous tasks (stored in nonmodifiable memory). No tested program gets more runtime than its probability times the total search time. In illustrative experiments, ours becomes the first general system to learn a universal solver for arbitrary n-disk Towers of Hanoi tasks (minimal solution size 2^n - 1). It demonstrates the advantages of incremental learning by profiting from previously solved, simpler tasks involving samples of a simple context-free language.
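The minimal solution size of 2^n - 1 quoted above can be checked with the textbook recursive Towers of Hanoi solution. The sketch below is only that standard hand-written recursion plus a validity check; it is not the learned solver or the search procedure from the paper, and the function names and peg labels are assumptions chosen here.

```python
def hanoi_moves(n, source="A", target="C", spare="B"):
    """Return the (from_peg, to_peg) moves that solve n-disk Towers of Hanoi."""
    if n == 0:
        return []
    return (
        hanoi_moves(n - 1, source, spare, target)    # clear the top n-1 disks
        + [(source, target)]                         # move the largest disk
        + hanoi_moves(n - 1, spare, target, source)  # restack the n-1 disks
    )


def is_valid_solution(n, moves):
    """Simulate the moves and check that every move is legal and the puzzle is solved."""
    pegs = {"A": list(range(n, 0, -1)), "B": [], "C": []}
    for src, dst in moves:
        if not pegs[src]:
            return False  # nothing to move from this peg
        disk = pegs[src].pop()
        if pegs[dst] and pegs[dst][-1] < disk:
            return False  # larger disk placed on a smaller one
        pegs[dst].append(disk)
    return pegs["C"] == list(range(n, 0, -1)) and not pegs["A"] and not pegs["B"]


# Solution lengths for n = 1..6 disks.
lengths = [len(hanoi_moves(n)) for n in range(1, 7)]
```

The solution length doubles (plus one) with each added disk, which is why a solver that generalizes to arbitrary n cannot simply memorize move sequences.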
Gödel Machines: Self-Referential Universal Problem Solvers Making Provably Optimal Self-Improvements
, 2003
Cited by 19 (8 self)
An old dream of computer scientists is to build an optimally efficient universal problem solver. We show how to solve arbitrary computational problems in an optimal fashion inspired by Kurt Gödel's celebrated self-referential formulas (1931). Our Gödel machine's initial software includes an axiomatic description of: the Gödel machine's hardware, the problem-specific utility function (such as the expected future reward of a robot), known aspects of the environment, costs of actions and computations, and the initial software itself (this is possible without introducing circularity). It also includes a typically suboptimal initial problem-solving policy and an asymptotically optimal proof searcher searching the space of computable proof techniques, that is, programs whose outputs are proofs. Unlike previous approaches, the self-referential Gödel machine will rewrite any part of its software, including axioms and proof searcher, as soon as it has found a proof that this will improve its future performance, given its typically limited computational resources. We show that self-rewrites are globally optimal (no local minima!), since provably none of all the alternative rewrites and proofs (those that could be found by continuing the proof search) are worth waiting for.
The New AI: General & Sound & Relevant for Physics
 ARTIFICIAL GENERAL INTELLIGENCE (ACCEPTED 2002)
, 2003
Cited by 18 (9 self)
Most traditional artificial intelligence (AI) systems of the past 50 years are either very limited, or based on heuristics, or both. The new millennium, however, has brought substantial progress in the field of theoretically optimal and practically feasible algorithms for prediction, search, inductive inference based on Occam’s razor, problem solving, decision making, and reinforcement learning in environments of a very general type. Since inductive inference is at the heart of all inductive sciences, some of the results are relevant not only for AI and computer science but also for physics, provoking nontraditional predictions based on Zuse’s thesis of the computer-generated universe.
Progress in Incremental Machine Learning
, 2003
Cited by 12 (1 self)
We will describe recent developments in a system for machine learning that we've been working on for some time (Sol 86, Sol 89). It is meant to be a "Scientist's Assistant" of great power and versatility in many areas of science and mathematics. It differs from other ambitious work in this area in that we are not so much interested in knowledge itself, as we are in how it is acquired: how machines may learn. To start off, the system will learn to solve two very general kinds of problems. Most, but perhaps not all, problems in science and engineering are of these two kinds.
On the foundations of universal sequence prediction
 In Proc. 3rd Annual Conference on Theory and Applications of Models of Computation (TAMC’06), volume 3959 of LNCS
, 2006
Cited by 11 (4 self)
Solomonoff completed the Bayesian framework by providing a rigorous, unique, formal, and universal choice for the model class and the prior. We discuss in breadth how and in which sense universal (non-i.i.d.) sequence prediction solves various (philosophical) problems of traditional Bayesian sequence prediction. We show that Solomonoff’s model possesses many desirable properties: it has fast convergence and strong bounds and, in contrast to most classical continuous prior densities, has no zero p(oste)rior problem, i.e. it can confirm universal hypotheses, is reparametrization and regrouping invariant, and avoids the old-evidence and updating problem. It even performs well (actually better) in non-computable environments.
POWERPLAY: Training an Increasingly General Problem Solver by Continually Searching for the Simplest Still Unsolvable Problem
, 2011
Cited by 11 (4 self)
Most of computer science focuses on automatically solving given computational problems. I focus on automatically inventing or discovering problems in a way inspired by the playful behavior of animals and humans, to train a more and more general problem solver from scratch in an unsupervised fashion. At any given time, the novel algorithmic framework POWERPLAY searches the space of possible pairs of new tasks and modifications of the current problem solver, until it finds a more powerful problem solver that provably solves all previously learned tasks plus the new one, while the unmodified predecessor does not. The new task and its corresponding task-solving skill are those first found and validated. Newly invented tasks may require making previously learned skills more efficient. The greedy search of typical POWERPLAY variants orders candidate pairs of tasks and solver modifications by their conditional computational complexity, given the stored experience so far. This biases the search towards pairs that can be described compactly and validated quickly. Standard problem solver architectures of personal computers or neural networks tend to generalize by solving numerous tasks outside the self-invented training set; POWERPLAY’s ongoing search for novelty keeps fighting to extend beyond the generalization abilities of its present solver. The continually increasing repertoire of problem solving procedures can be exploited
Feature Markov decision processes
 In Proc. 2nd Conf. on Artificial General Intelligence (AGI’09)
, 2009
Cited by 8 (6 self)
General purpose intelligent learning agents cycle through (complex, non-MDP) sequences of observations, actions, and rewards. On the other hand, reinforcement learning is well-developed for small finite state Markov Decision Processes (MDPs). So far it is an art performed by human designers to extract the right state representation out of the bare observations, i.e. to reduce the agent setup to the MDP framework. Before we can think of mechanizing this search for suitable MDPs, we need a formal objective criterion. The main contribution of this article is to develop such a criterion. I also integrate the various parts into one learning algorithm. Extensions to more realistic dynamic Bayesian networks are developed in the companion article [Hut09].
Adaptive Online Time Allocation to Search Algorithms
 MACHINE LEARNING: ECML 2004. PROCEEDINGS OF THE 15TH EUROPEAN CONFERENCE ON MACHINE LEARNING
, 2004
Cited by 7 (6 self)
Given is a search problem or a sequence of search problems, as well as a set of potentially useful search algorithms. We propose a general framework for online allocation of computation time to search algorithms based on experience with their performance so far. In an example instantiation, we use simple linear extrapolation of performance for allocating time to various simultaneously running genetic algorithms characterized by different parameter values. Despite the large number of searchers tested in parallel, on various tasks this rather general approach compares favorably to a more specialized state-of-the-art heuristic; in one case it is nearly two orders of magnitude faster.
Self-Programming: Operationalizing Autonomy
Cited by 6 (6 self)
Lacking an operational definition of autonomy has considerably weakened the concept's impact in systems engineering. Most current “autonomous” systems are built to operate in conditions more or less fully described a priori, which is insufficient for achieving highly autonomous systems that adapt efficiently to unforeseen situations. In an effort to clarify the nature of autonomy we propose an operational definition of autonomy: a self-programming process. We introduce Ikon Flux, a proto-architecture for self-programming systems and we describe how it meets key requirements for the construction of such systems.
Structural Autonomy as Self-Programming
We aim at the construction of machines able to adapt to unforeseen situations in open-ended environments.