Results 1  10
of
36
Minimum Description Length Induction, Bayesianism, and Kolmogorov Complexity
 IEEE Transactions on Information Theory
, 1998
"... The relationship between the Bayesian approach and the minimum description length approach is established. We sharpen and clarify the general modeling principles MDL and MML, abstracted as the ideal MDL principle and defined from Bayes's rule by means of Kolmogorov complexity. The basic conditi ..."
Abstract

Cited by 81 (7 self)
 Add to MetaCart
The relationship between the Bayesian approach and the minimum description length approach is established. We sharpen and clarify the general modeling principles MDL and MML, abstracted as the ideal MDL principle and defined from Bayes's rule by means of Kolmogorov complexity. The basic condition under which the ideal principle should be applied is encapsulated as the Fundamental Inequality, which in broad terms states that the principle is valid when the data are random, relative to every contemplated hypothesis and also these hypotheses are random relative to the (universal) prior. Basically, the ideal principle states that the prior probability associated with the hypothesis should be given by the algorithmic universal probability, and the sum of the log universal probability of the model plus the log of the probability of the data given the model should be minimized. If we restrict the model class to the finite sets then application of the ideal principle turns into Kolmogorov's mi...
The Speed Prior: A New Simplicity Measure Yielding NearOptimal Computable Predictions
 Proceedings of the 15th Annual Conference on Computational Learning Theory (COLT 2002), Lecture Notes in Artificial Intelligence
, 2002
"... Solomonoff's optimal but noncomputable method for inductive inference assumes that observation sequences x are drawn from an recursive prior distribution p(x). Instead of using the unknown p() he predicts using the celebrated universal enumerable prior M() which for all exceeds any recursiv ..."
Abstract

Cited by 61 (21 self)
 Add to MetaCart
Solomonoff's optimal but noncomputable method for inductive inference assumes that observation sequences x are drawn from an recursive prior distribution p(x). Instead of using the unknown p() he predicts using the celebrated universal enumerable prior M() which for all exceeds any recursive p(), save for a constant factor independent of x. The simplicity measure M() naturally implements "Occam's razor " and is closely related to the Kolmogorov complexity of . However, M assigns high probability to certain data that are extremely hard to compute. This does not match our intuitive notion of simplicity. Here we suggest a more plausible measure derived from the fastest way of computing data. In absence of contrarian evidence, we assume that the physical world is generated by a computational process, and that any possibly infinite sequence of observations is therefore computable in the limit (this assumption is more radical and stronger than Solomonoff's).
Hierarchies Of Generalized Kolmogorov Complexities And Nonenumerable Universal Measures Computable In The Limit
 INTERNATIONAL JOURNAL OF FOUNDATIONS OF COMPUTER SCIENCE
, 2000
"... The traditional theory of Kolmogorov complexity and algorithmic probability focuses on monotone Turing machines with oneway writeonly output tape. This naturally leads to the universal enumerable SolomonoLevin measure. Here we introduce more general, nonenumerable but cumulatively enumerable m ..."
Abstract

Cited by 44 (21 self)
 Add to MetaCart
(Show Context)
The traditional theory of Kolmogorov complexity and algorithmic probability focuses on monotone Turing machines with oneway writeonly output tape. This naturally leads to the universal enumerable SolomonoLevin measure. Here we introduce more general, nonenumerable but cumulatively enumerable measures (CEMs) derived from Turing machines with lexicographically nondecreasing output and random input, and even more general approximable measures and distributions computable in the limit. We obtain a natural hierarchy of generalizations of algorithmic probability and Kolmogorov complexity, suggesting that the "true" information content of some (possibly in nite) bitstring x is the size of the shortest nonhalting program that converges to x and nothing but x on a Turing machine that can edit its previous outputs. Among other things we show that there are objects computable in the limit yet more random than Chaitin's "number of wisdom" Omega, that any approximable measure of x is small for any x lacking a short description, that there is no universal approximable distribution, that there is a universal CEM, and that any nonenumerable CEM of x is small for any x lacking a short enumerating program. We briey mention consequences for universes sampled from such priors.
Algorithmic Theories Of Everything
, 2000
"... The probability distribution P from which the history of our universe is sampled represents a theory of everything or TOE. We assume P is formally describable. Since most (uncountably many) distributions are not, this imposes a strong inductive bias. We show that P(x) is small for any universe x lac ..."
Abstract

Cited by 35 (15 self)
 Add to MetaCart
(Show Context)
The probability distribution P from which the history of our universe is sampled represents a theory of everything or TOE. We assume P is formally describable. Since most (uncountably many) distributions are not, this imposes a strong inductive bias. We show that P(x) is small for any universe x lacking a short description, and study the spectrum of TOEs spanned by two Ps, one reflecting the most compact constructive descriptions, the other the fastest way of computing everything. The former derives from generalizations of traditional computability, Solomonoff’s algorithmic probability, Kolmogorov complexity, and objects more random than Chaitin’s Omega, the latter from Levin’s universal search and a natural resourceoriented postulate: the cumulative prior probability of all x incomputable within time t by this optimal algorithm should be 1/t. Between both Ps we find a universal cumulatively enumerable measure that dominates traditional enumerable measures; any such CEM must assign low probability to any universe lacking a short enumerating program. We derive Pspecific consequences for evolving observers, inductive reasoning, quantum physics, philosophy, and the expected duration of our universe.
On Universal Prediction and Bayesian Confirmation
 Theoretical Computer Science
, 2007
"... The Bayesian framework is a wellstudied and successful framework for inductive reasoning, which includes hypothesis testing and confirmation, parameter estimation, sequence prediction, classification, and regression. But standard statistical guidelines for choosing the model class and prior are not ..."
Abstract

Cited by 31 (15 self)
 Add to MetaCart
The Bayesian framework is a wellstudied and successful framework for inductive reasoning, which includes hypothesis testing and confirmation, parameter estimation, sequence prediction, classification, and regression. But standard statistical guidelines for choosing the model class and prior are not always available or can fail, in particular in complex situations. Solomonoff completed the Bayesian framework by providing a rigorous, unique, formal, and universal choice for the model class and the prior. I discuss in breadth how and in which sense universal (noni.i.d.) sequence prediction solves various (philosophical) problems of traditional Bayesian sequence prediction. I show that Solomonoff’s model possesses many desirable properties: Strong total and future bounds, and weak instantaneous bounds, and in contrast to most classical continuous prior densities has no zero p(oste)rior problem, i.e. can confirm universal hypotheses, is reparametrization and regrouping invariant, and avoids the oldevidence and updating problem. It even performs well
The New AI: General & Sound & Relevant for Physics
 ARTIFICIAL GENERAL INTELLIGENCE (ACCEPTED 2002)
, 2003
"... Most traditional artificial intelligence (AI) systems of the past 50 years are either very limited, or based on heuristics, or both. The new millennium, however, has brought substantial progress in the field of theoretically optimal and practically feasible algorithms for prediction, search, induct ..."
Abstract

Cited by 18 (9 self)
 Add to MetaCart
Most traditional artificial intelligence (AI) systems of the past 50 years are either very limited, or based on heuristics, or both. The new millennium, however, has brought substantial progress in the field of theoretically optimal and practically feasible algorithms for prediction, search, inductive inference based on Occam’s razor, problem solving, decision making, and reinforcement learning in environments of a very general type. Since inductive inference is at the heart of all inductive sciences, some of the results are relevant not only for AI and computer science but also for physics, provoking nontraditional predictions based on Zuse’s thesis of the computergenerated universe.
Sequence prediction based on monotone complexity
 In Proc. 16th Annual Conference on Learning Theory (COLT’03), volume 2777 of LNAI
, 2003
"... This paper studies sequence prediction based on the monotone Kolmogorov complexity Km=−log m, i.e. based on universal deterministic/onepart MDL. m is extremely close to Solomonoff’s prior M, the latter being an excellent predictor in deterministic as well as probabilistic environments, where perfor ..."
Abstract

Cited by 14 (14 self)
 Add to MetaCart
(Show Context)
This paper studies sequence prediction based on the monotone Kolmogorov complexity Km=−log m, i.e. based on universal deterministic/onepart MDL. m is extremely close to Solomonoff’s prior M, the latter being an excellent predictor in deterministic as well as probabilistic environments, where performance is measured in terms of convergence of posteriors or losses. Despite this closeness to M, it is difficult to assess the prediction quality of m, since little is known about the closeness of their posteriors, which are the important quantities for prediction. We show that for deterministic computable environments, the “posterior ” and losses of m converge, but rapid convergence could only be shown onsequence; the offsequence behavior is unclear. In probabilistic environments, neither the posterior nor the losses converge, in general.
Relations between varieties of Kolmogorov complexity
 Mathematical Systems Theory
, 1996
"... Abstract. There are several sorts of Kolmogorov complexity, better to say several Kolmogorov complexities: decision complexity, simple complexity, prefix complexity, monotonic complexity, a priori complexity. The last three can and the first two cannot be used for defining randomness of an infinite ..."
Abstract

Cited by 13 (2 self)
 Add to MetaCart
(Show Context)
Abstract. There are several sorts of Kolmogorov complexity, better to say several Kolmogorov complexities: decision complexity, simple complexity, prefix complexity, monotonic complexity, a priori complexity. The last three can and the first two cannot be used for defining randomness of an infinite binary sequence. All those five versions of Kolmogorov complexity were considered, from a unified point of view, in a paper by the first author which appeared in Watanabe’s book [23]. Upper and lower bounds for those complexities and also for their differences were announced in that paper without proofs. (Some of those bounds are mentioned in Section 4.4.5 of [16].) The purpose of this paper (which can be read independently of [23]) is to give proofs for the bounds from [23]. The terminology used in this paper is somehow nonstandard: we call “Kolmogorov entropy ” what is usually called “Kolmogorov complexity. ” This is a Moscow tradition suggested by Kolmogorov himself. By this tradition the term “complexity ” relates to any mode of description and “entropy ” is the complexity related to an optimal mode (i.e., to a mode that, roughly speaking, gives the shortest descriptions).