Results 1–10 of 82
Algorithmic information theory
IBM Journal of Research and Development, 1977
"... This paper reviews algorithmic information theory, which is an attempt to apply informationtheoretic and probabilistic ideas to recursive function theory. Typical concerns in this approach are, for example, the number of bits of information required to specify an algorithm, or the probability that ..."
Abstract

Cited by 320 (19 self)
 Add to MetaCart
This paper reviews algorithmic information theory, which is an attempt to apply information-theoretic and probabilistic ideas to recursive function theory. Typical concerns in this approach are, for example, the number of bits of information required to specify an algorithm, or the probability that a program whose bits are chosen by coin flipping produces a given output. During the past few years the definitions of algorithmic information theory have been reformulated. The basic features of the new formalism are presented here and certain results of R. M. Solovay are reported.
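As a compact sketch of the two quantities this abstract mentions (the notation is the standard one for a prefix-free universal machine U, not quoted from the paper itself): the number of bits needed to specify x, and the probability that a coin-flip program outputs x, are

    K(x) = min { |p| : U(p) = x },        P(x) = Σ_{p : U(p) = x} 2^{-|p|},

and summing the second quantity over all halting programs gives Chaitin's halting probability Ω = Σ_{U(p) halts} 2^{-|p|}.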
The Dimensions of Individual Strings and Sequences
Information and Computation, 2003
"... A constructive version of Hausdorff dimension is developed using constructive supergales, which are betting strategies that generalize the constructive supermartingales used in the theory of individual random sequences. This constructive dimension is used to assign every individual (infinite, binary ..."
Abstract

Cited by 93 (10 self)
 Add to MetaCart
A constructive version of Hausdorff dimension is developed using constructive supergales, which are betting strategies that generalize the constructive supermartingales used in the theory of individual random sequences. This constructive dimension is used to assign every individual (infinite, binary) sequence S a dimension, which is a real number dim(S) in the interval [0, 1]. Sequences that ...
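For orientation, a hedged sketch of the definition at work here (the gale notation is standard in this line of work; the details go beyond the truncated excerpt): for s ≥ 0, an s-supergale is a function d : {0,1}* → [0, ∞) satisfying

    d(w) ≥ 2^{-s} [ d(w0) + d(w1) ]

for every finite string w; d succeeds on a sequence S if limsup_{n→∞} d(S ↾ n) = ∞, and the constructive dimension is dim(S) = inf { s : some constructive s-supergale succeeds on S }.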
Minimum Description Length Induction, Bayesianism, and Kolmogorov Complexity
IEEE Transactions on Information Theory, 1998
"... The relationship between the Bayesian approach and the minimum description length approach is established. We sharpen and clarify the general modeling principles MDL and MML, abstracted as the ideal MDL principle and defined from Bayes's rule by means of Kolmogorov complexity. The basic condition un ..."
Abstract

Cited by 67 (7 self)
 Add to MetaCart
The relationship between the Bayesian approach and the minimum description length approach is established. We sharpen and clarify the general modeling principles MDL and MML, abstracted as the ideal MDL principle and defined from Bayes's rule by means of Kolmogorov complexity. The basic condition under which the ideal principle should be applied is encapsulated as the Fundamental Inequality, which in broad terms states that the principle is valid when the data are random relative to every contemplated hypothesis, and these hypotheses are in turn random relative to the (universal) prior. Basically, the ideal principle states that the prior probability associated with the hypothesis should be given by the algorithmic universal probability, and that the sum of the log universal probability of the model plus the log of the probability of the data given the model should be minimized. If we restrict the model class to the finite sets, then application of the ideal principle turns into Kolmogorov's mi...
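In symbols (a schematic reading of the abstract, with m the universal distribution and K prefix Kolmogorov complexity; the exact formulation is in the paper): Bayes's rule selects the hypothesis H maximizing P(H | D) ∝ P(H) · P(D | H), and ideal MDL substitutes the universal prior P(H) = m(H) ≈ 2^{-K(H)}, so that maximizing the posterior becomes minimizing the two-part code length

    -log m(H) - log P(D | H)  ≈  K(H) + K(D | H),

where the second approximation is valid precisely when the data are random relative to H, which is the condition the Fundamental Inequality encapsulates.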
Learning Simple Concepts Under Simple Distributions
SIAM Journal on Computing, 1991
"... We aim at developing a learning theory where `simple' concepts are easily learnable. In Valiant's learning model, many concepts turn out to be too hard (like NP hard) to learn. Relatively few concept classes were shown to be learnable polynomially. In daily life, it seems that things we care to le ..."
Abstract

Cited by 56 (3 self)
 Add to MetaCart
We aim at developing a learning theory where 'simple' concepts are easily learnable. In Valiant's learning model, many concepts turn out to be too hard (like NP-hard) to learn. Relatively few concept classes have been shown to be polynomially learnable. In daily life, it seems that the things we care to learn are usually learnable. To model the intuitive notion of learning more closely, we do not require that the learning algorithm learns (polynomially) under all distributions, but only under all simple distributions. A distribution is simple if it is dominated by an enumerable distrib...
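Schematically (spelling out the domination notion the excerpt is cut off on; the terminology follows the standard Li–Vitányi treatment rather than the truncated text): a distribution P is simple if there exist an enumerable (semi)measure μ and a constant c > 0 with

    c · μ(x) ≥ P(x)   for all x,

and the canonical dominating distribution is the universal one, m(x) = 2^{-K(x) + O(1)}, which dominates every enumerable distribution.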
Algorithmic Statistics
IEEE Transactions on Information Theory, 2001
"... While Kolmogorov complexity is the accepted absolute measure of information content of an individual finite object, a similarly absolute notion is needed for the relation between an individual data sample and an individual model summarizing the information in the data, for example, a finite set (or ..."
Abstract

Cited by 52 (14 self)
 Add to MetaCart
While Kolmogorov complexity is the accepted absolute measure of information content of an individual finite object, a similarly absolute notion is needed for the relation between an individual data sample and an individual model summarizing the information in the data, for example, a finite set (or probability distribution) where the data sample typically came from. The statistical theory based on such relations between individual objects can be called algorithmic statistics, in contrast to classical statistical theory that deals with relations between probabilistic ensembles. We develop the algorithmic theory of statistic, sufficient statistic, and minimal sufficient statistic. This theory is based on two-part codes consisting of the code for the statistic (the model summarizing the regularity, the meaningful information, in the data) and the model-to-data code. In contrast to the situation in probabilistic statistical theory, the algorithmic relation of (minimal) sufficiency is an absolute relation between the individual model and the individual data sample. We distinguish implicit and explicit descriptions of the models. We give characterizations of algorithmic (Kolmogorov) minimal sufficient statistics for all data samples for both description modes; in the explicit mode under some constraints. We also strengthen and elaborate earlier results on the "Kolmogorov structure function" and "absolutely non-stochastic objects", those rare objects for which the simplest models that summarize their relevant information (minimal sufficient statistics) are at least as complex as the objects themselves. We demonstrate a close relation between the probabilistic notions and the algorithmic ones: (i) in both cases there is an "information non-increase" law; (ii) it is shown that a function is a...
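The two-part code idea can be written out schematically (standard notation; this sketch is not verbatim from the paper): describing x by first coding a finite set S ∋ x and then giving x's index within S costs

    K(S) + log₂ |S|  ≥  K(x) - O(1)

bits, and S is an algorithmic sufficient statistic for x when this two-part description is optimal, i.e. K(S) + log₂ |S| = K(x) + O(1); a minimal sufficient statistic is a sufficient statistic of least complexity K(S).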
Computational mechanics: Pattern and prediction, structure and simplicity
Journal of Statistical Physics, 1999
"... Computational mechanics, an approach to structural complexity, defines a process’s causal states and gives a procedure for finding them. We show that the causalstate representation—an Emachine—is the minimal one consistent with ..."
Abstract

Cited by 43 (8 self)
 Add to MetaCart
Computational mechanics, an approach to structural complexity, defines a process's causal states and gives a procedure for finding them. We show that the causal-state representation (an ε-machine) is the minimal one consistent with ...
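Causal states group histories that predict the same future. As a toy, assumption-laden illustration of that idea (the window length, tolerance, merge rule, and example process are illustrative choices here, not the paper's reconstruction procedure), one can cluster fixed-length pasts of a binary sequence by their empirical next-symbol distribution:

    from collections import defaultdict
    import random

    def estimate_causal_states(seq, L=3, tol=0.05):
        # Count next-symbol occurrences after every length-L past.
        counts = defaultdict(lambda: defaultdict(int))
        for i in range(L, len(seq)):
            counts[seq[i - L:i]][seq[i]] += 1
        # Empirical P(next = '1' | past) for each observed past.
        p_one = {past: c['1'] / (c['0'] + c['1']) for past, c in counts.items()}
        # Greedily merge pasts whose predictive probabilities agree within tol.
        states = []  # each state: [representative probability, set of pasts]
        for past, p in sorted(p_one.items()):
            for state in states:
                if abs(state[0] - p) <= tol:
                    state[1].add(past)
                    break
            else:
                states.append([p, {past}])
        return states

    # Golden-mean process: a 1 is never followed by another 1; otherwise fair.
    random.seed(0)
    bits, last = [], '0'
    for _ in range(100_000):
        last = '0' if last == '1' else random.choice('01')
        bits.append(last)

    for p, pasts in estimate_causal_states(''.join(bits)):
        print(f"P(next=1) ~ {p:.2f}  pasts: {sorted(pasts)}")

For the golden-mean process used above, this recovers the two causal states of its ε-machine: pasts ending in 1 (next symbol forced to 0) and pasts ending in 0 (next symbol fair).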
On initial segment complexity and degrees of randomness
Transactions of the American Mathematical Society
"... Abstract. One approach to understanding the fine structure of initial segment complexity was introduced by Downey, Hirschfeldt and LaForte. They define X ≤K Y to mean that (∀n) K(X ↾ n) ≤ K(Y ↾ n) +O(1). The equivalence classes under this relation are the Kdegrees. We prove that if X ⊕ Y is 1rand ..."
Abstract

Cited by 32 (6 self)
 Add to MetaCart
One approach to understanding the fine structure of initial segment complexity was introduced by Downey, Hirschfeldt and LaForte. They define X ≤_K Y to mean that (∀n) K(X ↾ n) ≤ K(Y ↾ n) + O(1). The equivalence classes under this relation are the K-degrees. We prove that if X ⊕ Y is 1-random, then X and Y have no upper bound in the K-degrees (hence, no join). We also prove that n-randomness is closed upward in the K-degrees. Our main tool is another structure intended to measure the degree of randomness of real numbers: the vL-degrees. Unlike the K-degrees, many basic properties of the vL-degrees are easy to prove. We show that X ≤_K Y implies X ≤_vL Y, so some results can be transferred. The reverse implication is proved to fail. The same analysis is also done for ≤_C, the analogue of ≤_K for plain Kolmogorov complexity. Two other interesting results are included. First, we prove that for any Z ∈ 2^ω, a 1-random real computable from a 1-Z-random real is automatically 1-Z-random. Second, we give a plain Kolmogorov complexity characterization of 1-randomness. This characterization is related to our proof that X ≤_C Y implies X ≤_vL Y.
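Restating the orderings side by side (only what the abstract itself asserts, in TeX-style subscript notation, with K prefix-free and C plain Kolmogorov complexity):

    X ≤_K Y  ⟺  ∃c ∀n  K(X ↾ n) ≤ K(Y ↾ n) + c,
    X ≤_C Y  ⟺  ∃c ∀n  C(X ↾ n) ≤ C(Y ↾ n) + c,

and both orderings imply the vL-ordering (X ≤_K Y ⟹ X ≤_vL Y and X ≤_C Y ⟹ X ≤_vL Y), while the converse of the first is shown to fail.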
Towards a universal theory of artificial intelligence based on algorithmic probability and sequential decisions
Proceedings of the 12th European Conference on Machine Learning (ECML-2001), 2001
"... Abstract. Decision theory formally solves the problem of rational agents in uncertain worlds if the true environmental probability distribution is known. Solomonoff’s theory of universal induction formally solves the problem of sequence prediction for unknown distributions. We unify both theories an ..."
Abstract

Cited by 26 (10 self)
 Add to MetaCart
Decision theory formally solves the problem of rational agents in uncertain worlds if the true environmental probability distribution is known. Solomonoff's theory of universal induction formally solves the problem of sequence prediction for unknown distributions. We unify both theories and give strong arguments that the resulting universal AIξ model behaves optimally in any computable environment. The major drawback of the AIξ model is that it is uncomputable. To overcome this problem, we construct a modified algorithm AIξ^tl, which is still superior to any other time-t and length-l bounded agent. The computation time of AIξ^tl is of the order t · 2^l.
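A hedged sketch of the mixture underlying the model (Solomonoff/Hutter conventions; the weighting shown is the standard one from this literature, not quoted from the excerpt): the universal prior ξ mixes all enumerable (semi)measures ν,

    ξ(x_{1:n}) = Σ_ν 2^{-K(ν)} ν(x_{1:n}),

and the AIξ agent acts as the decision-theoretically optimal agent would, with the true but unknown environment μ replaced by ξ, i.e. it chooses actions maximizing ξ-expected cumulative future reward over its horizon.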
New Error Bounds for Solomonoff Prediction
Journal of Computer and System Sciences, 1999
"... Several new relations between universal Solomonoff sequence prediction and informed prediction and general probabilistic prediction schemes will be proved. Among others, they show that the number of errors in Solomonoff prediction is finite for computable prior probability, if finite in the informed ..."
Abstract

Cited by 23 (16 self)
 Add to MetaCart
Several new relations between universal Solomonoff sequence prediction, informed prediction, and general probabilistic prediction schemes are proved. Among others, they show that the number of errors in Solomonoff prediction is finite for computable prior probability if it is finite in the informed case, where the prior is known. Deterministic variants are also studied. The most interesting result is that the deterministic variant of Solomonoff prediction is optimal compared to any other probabilistic or deterministic prediction scheme, apart from additive square root corrections only. This makes it well suited even for difficult prediction problems, where it is not enough for the number of errors to be minimal merely to within some factor greater than one. Solomonoff's original bound and the ones presented here complement each other in a useful way.
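The shape of the "additive square root corrections" can be sketched as follows (schematic only; the precise constants and conditions are in the paper): writing E_μ for the expected number of errors of the informed predictor and E_ξ for that of the Solomonoff-based predictor,

    E_ξ ≤ E_μ + O(√E_μ),

with the hidden constant depending on the complexity of the true distribution; in particular, E_μ < ∞ immediately gives E_ξ < ∞, which is the finiteness statement quoted above.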