Results 1–10 of 17
Algorithmic Statistics
 IEEE Transactions on Information Theory
, 2001
Cited by 50 (13 self)
While Kolmogorov complexity is the accepted absolute measure of information content of an individual finite object, a similarly absolute notion is needed for the relation between an individual data sample and an individual model summarizing the information in the data, for example, a finite set (or probability distribution) where the data sample typically came from. The statistical theory based on such relations between individual objects can be called algorithmic statistics, in contrast to classical statistical theory, which deals with relations between probabilistic ensembles. We develop the algorithmic theory of statistic, sufficient statistic, and minimal sufficient statistic. This theory is based on two-part codes consisting of the code for the statistic (the model summarizing the regularity, the meaningful information, in the data) and the model-to-data code. In contrast to the situation in probabilistic statistical theory, the algorithmic relation of (minimal) sufficiency is an absolute relation between the individual model and the individual data sample. We distinguish implicit and explicit descriptions of the models. We give characterizations of the algorithmic (Kolmogorov) minimal sufficient statistic for all data samples, for both description modes (in the explicit mode under some constraints). We also strengthen and elaborate earlier results on the "Kolmogorov structure function" and "absolutely nonstochastic objects": those rare objects for which the simplest models that summarize their relevant information (minimal sufficient statistics) are at least as complex as the objects themselves. We demonstrate a close relation between the probabilistic notions and the algorithmic ones: (i) in both cases there is an "information non-increase" law; (ii) it is shown that a function is a...
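Since Kolmogorov complexity is uncomputable, the two-part code idea can only be illustrated, not computed exactly. The sketch below uses the finite-set model class from the abstract: a binary string is described by (i) a model, the set of all strings of its length with its number of ones, and (ii) an index into that set. The crude 2·log2(n+1)-bit code for the pair (n, k) is an assumption made for illustration only.

```python
import math

def two_part_length(bits: str) -> float:
    """Two-part code length (in bits) for a binary string, where the
    model is the set of all length-n strings with k ones, described
    by the pair (n, k), and the model-to-data code is an index into
    that set of length log2 of its cardinality."""
    n, k = len(bits), bits.count("1")
    model_bits = 2 * math.log2(n + 1)          # crude code for (n, k); an assumption
    index_bits = math.log2(math.comb(n, k))    # log-cardinality of the model set
    return model_bits + index_bits

biased = "1" * 90 + "0" * 10   # highly regular: k is far from n/2
print(two_part_length(biased))  # well below the 100-bit literal encoding
```

For a string with k near n/2 the same two-part code exceeds n bits, reflecting that a maximally irregular string has no model simpler than the string itself, up to logarithmic terms.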
Alignment-free sequence comparison – a review
 Bioinformatics
, 2003
Cited by 42 (5 self)
Motivation: Genetic recombination and, in particular, genetic shuffling are at odds with sequence comparison by alignment, which assumes conservation of contiguity between homologous segments. A variety of theoretical foundations are being used to derive alignment-free methods that overcome this limitation. The formulation of alternative metrics for dissimilarity between sequences and their algorithmic implementations are reviewed. Results: The overwhelming majority of work on alignment-free sequence comparison has taken place in the past two decades, with most reports published in the past 5 years. Two main categories of methods have been proposed: methods based on word (oligomer) frequency, and methods that do not require resolving the sequence into fixed word-length segments. The first category is based on the statistics of word frequency, on distances defined in a Cartesian space spanned by the frequency vectors, and on the information content of the frequency distribution. The second category includes the use of Kolmogorov complexity and Chaos Theory. Despite their low visibility, alignment-free metrics are in fact already widely used as pre-selection filters for alignment-based querying of large applications. Recent work is furthering their usage as a scale-independent methodology that is capable of recognizing homology when loss of contiguity is beyond the possibility of alignment. Availability: Most of the alignment-free algorithms reviewed were implemented in MATLAB code and are available
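A minimal sketch of the word-frequency category (not any specific method from the review; the sequences and word length k = 3 are made up for illustration): count overlapping k-mers, normalize to a frequency vector, and compare sequences by distance in the resulting Cartesian space.

```python
from collections import Counter
import math

def kmer_freqs(seq: str, k: int) -> dict:
    """Relative frequencies of the overlapping length-k words of seq."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {w: n / total for w, n in counts.items()}

def euclidean(f: dict, g: dict) -> float:
    """Euclidean distance between two frequency vectors in the
    space spanned by all words observed in either sequence."""
    words = set(f) | set(g)
    return math.sqrt(sum((f.get(w, 0.0) - g.get(w, 0.0)) ** 2 for w in words))

a = kmer_freqs("ACGTACGTACGT", 3)
b = kmer_freqs("TACGTACGTACG", 3)   # a rotated relative: contiguity broken, words kept
c = kmer_freqs("AAAATTTTCCCC", 3)   # unrelated composition
print(euclidean(a, b), euclidean(a, c))  # the first distance is much smaller
```

Because only word content matters, the rotated sequence stays close to the original even though no alignment of the two would preserve contiguity, which is exactly the robustness the abstract attributes to this category.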
Kolmogorov’s structure functions and model selection
 IEEE Trans. Inform. Theory
Cited by 33 (13 self)
approach to statistics and model selection. Let data be finite binary strings and models be finite sets of binary strings. Consider model classes consisting of models of given maximal (Kolmogorov) complexity. The "structure function" of the given data expresses the relation between the complexity-level constraint on a model class and the least log-cardinality of a model in the class containing the data. We show that the structure function determines all stochastic properties of the data: for every constrained model class it determines the individual best-fitting model in the class, irrespective of whether the "true" model is in the model class considered or not. In this setting, this happens with certainty, rather than with high probability as in the classical case. We precisely quantify the goodness-of-fit of an individual model with respect to individual data. We show that, within the obvious constraints, every graph is realized by the structure function of some data. We determine the (un)computability properties of the various functions contemplated and of the "algorithmic minimal sufficient statistic." Index Terms: constrained minimum description length (MDL), constrained maximum likelihood (ML), constrained best-fit model selection, computability, lossy compression, minimal sufficient statistic, nonprobabilistic statistics, Kolmogorov complexity, Kolmogorov structure function, prediction, sufficient statistic
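In symbols (restating the abstract's definition, with K(·) the Kolmogorov complexity of a finite set given some standard encoding), the structure function of a string x is

```latex
h_x(\alpha) \;=\; \min_{S} \bigl\{\, \log |S| \;:\; x \in S,\; K(S) \le \alpha \,\bigr\},
```

the least log-cardinality of a finite set containing x, over all models of complexity at most the constraint level α.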
Kolmogorov’s structure functions with an application to the foundations of model selection
 In Proc. 43rd Symposium on Foundations of Computer Science
, 2002
Cited by 9 (0 self)
We vindicate, for the first time, the rightness of the original "structure function", proposed by Kolmogorov in 1974, by showing that minimizing a two-part code consisting of a model subject to (Kolmogorov) complexity constraints, together with a data-to-model code, produces a model of best fit (for which the data is maximally "typical"). The method thus separates all possible model information from the remaining accidental information. This result gives a foundation for MDL, and related methods, in model selection. Settlement of this long-standing question is the more remarkable since the minimal randomness deficiency function (measuring maximal "typicality") itself cannot be monotonically approximated, but the shortest two-part code can. We furthermore show that both the structure function and the minimal randomness deficiency function can assume all shapes over their full domain (improving an independent unpublished result of Levin on the former function from the early 70s, and extending a partial result of V'yugin on the latter function from the late 80s, as well as recent results on prediction loss measured by "snooping curves"). We give an explicit realization of optimal two-part codes at all levels of model complexity. We determine the (un)computability properties of the various functions and of the "algorithmic sufficient statistic" considered. In our setting the models are finite sets, but the analysis is valid, up to logarithmic additive terms, for the model class of computable probability density functions, or the model class of total recursive functions.
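The two quantities contrasted here can be written out in the standard notation of the algorithmic-statistics literature (K(x | S) is the complexity of x given the set S): the quality of fit is the randomness deficiency, while what can actually be monotonically approximated is the constrained two-part code length:

```latex
\delta(x \mid S) \;=\; \log |S| \;-\; K(x \mid S),
\qquad
\Lambda_x(\alpha) \;=\; \min_{S} \bigl\{\, K(S) + \log |S| \;:\; x \in S,\; K(S) \le \alpha \,\bigr\}.
```

The result announced in the abstract is that a set S witnessing the minimum of the second expression also nearly minimizes the first, so the practically approximable quantity certifies best fit.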
Sophistication Revisited
 Proceedings of the 30th International Colloquium on Automata, Languages and Programming
, 2003
Cited by 5 (4 self)
The Kolmogorov structure function divides the smallest program producing a string into two parts: the useful information present in the string, called sophistication if based on total functions, and the remaining accidental information. We revisit the notion of sophistication due to Koppel, formalize a connection between sophistication and a variation of computational depth (intuitively, the useful or nonrandom information in a string), prove the existence of strings with maximum sophistication, and show that they encode solutions of the halting problem, i.e., they are the deepest of all strings.
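One common way to formalize Koppel's notion (the exact conditions and the role of the slack constant c vary across papers; this is a sketch of the usual form, with U a universal machine): the sophistication of x at significance level c is the length of the shortest total program appearing in a near-shortest two-part description of x,

```latex
\mathrm{soph}_c(x) \;=\; \min \bigl\{\, |p| \;:\; p \text{ total},\; U(p, d) = x,\; |p| + |d| \le K(x) + c \,\bigr\}.
```

The program p carries the useful information, the data d the accidental information, matching the division described in the abstract.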
An Algorithmic Complexity Interpretation of Lin’s Third Law of Information Theory
, 2008
Cited by 4 (2 self)
Abstract: Instead of static entropy, we assert that the Kolmogorov complexity of a static structure such as a solid is the proper measure of disorder (or chaoticity). A static structure in a surrounding perfectly random universe acts as an interfering entity which introduces local disruption in randomness. This is modeled by a selection rule R which selects a subsequence of the random input sequence that hits the structure. Through the inequality that relates stochasticity and chaoticity of random binary sequences, we maintain that Lin's notion of stability corresponds to the stability of the frequency of 1s in the selected subsequence. This explains why more complex static structures are less stable. Lin's third law is represented as the inevitable change that static structures undergo towards conforming to the universe's perfect randomness.
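The frequency-stability claim can be simulated (a toy sketch: the source, the seed, and the particular selection rule below are all assumptions for illustration; the rule is admissible in the von Mises sense because it decides whether to select a bit using previously seen bits only):

```python
import random

random.seed(0)
seq = [random.randint(0, 1) for _ in range(100_000)]  # a 'perfectly random' source

# Selection rule R: select a bit whenever the two preceding bits were both 1.
selected = [seq[i] for i in range(2, len(seq))
            if seq[i - 2] == 1 and seq[i - 1] == 1]

freq = sum(selected) / len(selected)
print(len(selected), freq)  # for a truly random source, freq stays near 1/2
```

For a truly random input, any such admissible rule leaves the frequency of 1s near 1/2; a structured (compressible) input can make the selected subsequence biased, which is the sense in which the structure disrupts the surrounding randomness.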
Complexity Approximation Principle
 Computer Journal
, 1999
Cited by 3 (2 self)
INTRODUCTION The subject of this note is another inductive principle, which can be regarded as a direct generalization of the minimum description length (MDL) and minimum message length (MML) principles. We will describe the work started at the Computer Learning Research Centre (Royal Holloway, University of London) related to this new principle, which we call the complexity approximation principle (CAP). Both the MDL and MML principles can be interpreted as Kolmogorov complexity approximation principles (as explained in Rissanen [1, 2] and Wallace and Freeman [3]; see also [4]). It is shown in [5] and [6] that it is possible to generalize Kolmogorov complexity to describe the optimal performance in different 'games of prediction'. Using this general notion, called predictive complexity, it is straightforward to extend the MDL and MML principles to our more general CAP. In Section 2 we define predictive complexity, in Section 3 several examples are given, and in Section 4
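A minimal sketch of the 'game of prediction' view (the sequence and the candidate predictors are made up; true predictive complexity is the optimal achievable loss, which this does not compute): under log loss, the cumulative loss of a probabilistic predictor equals the ideal code length it assigns to the data, so comparing total losses compares description lengths.

```python
import math

def cumulative_log_loss(bits, p):
    """Total log loss in bits of a fixed Bernoulli(p) predictor;
    under log loss this equals the ideal code length the predictor
    assigns to the whole sequence."""
    return sum(-math.log2(p if b else 1.0 - p) for b in bits)

bits = [1] * 70 + [0] * 30
for p in (0.5, 0.7, 0.9):
    print(p, round(cumulative_log_loss(bits, p), 1))
# the predictor matching the empirical frequency (p = 0.7) incurs the least loss
```

Replacing log loss with another loss function, and the fixed predictors with the best achievable performance, yields the predictive complexity that CAP approximates.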
On the combinatorial representation of information
 The Twelfth Annual International Computing and Combinatorics Conference (COCOON’06), volume LNCS 4112
, 2006
Cited by 2 (2 self)
Abstract. Kolmogorov introduced a combinatorial measure of the information I(x : y) about the unknown value of a variable y conveyed by an input variable x taking a given value x. The paper extends this definition of information to a more general setting where 'x = x' may provide a vaguer description of the possible value of y. As an application, the space P({0,1}^n) of classes of binary functions f : [n] → {0,1}, [n] = {1, ..., n}, is considered, where y represents an unknown function t ∈ {0,1}^[n] and, as input, two extreme cases are considered: x = x_{M_d} and x = x_{M'_d}, which indicate that t is an element of a set G ⊆ {0,1}^n that satisfies a property M_d or M'_d, respectively. Property M_d (or M'_d) means that there exists an E ⊆ [n], |E| = d, such that |tr_E(G)| = 1 (or 2^d), where tr_E(G) denotes the trace of G on E. Estimates of the information values I(x_{M_d} : t) and I(x_{M'_d} : t) are obtained. When d is fixed, it is shown that I(x_{M_d} : t) ≈ d and I(x_{M'_d} : t) ≈ 1 as n → ∞. Key words: information theory, combinatorial complexity, VC-dimension
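Kolmogorov's combinatorial measure underlying this paper can be stated in one line (a restatement of the usual formulation, with D(y) the set of a priori admissible values of y and D(y | x) those still admissible after the value of x is learned):

```latex
I(x : y) \;=\; \log |D(y)| \;-\; \log |D(y \mid x)|,
```

the drop in log-cardinality of the set of possible values of y, so learning that t lies in a set G that the property pins down to a single trace on E conveys about d bits, while the permissive property conveys only about one.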
RANDOM SCATTERING OF BITS BY PREDICTION
, 909
Abstract. We investigate a population of binary mistake sequences that result from learning with parametric models of different order. We obtain estimates of their error, algorithmic complexity, and divergence from a purely random Bernoulli sequence. We study the relationship of these variables to the learner's information density parameter ρ, which is defined as the ratio between the lengths of the compressed and uncompressed files that contain the learner's decision rule. The results indicate that good learners have a low information density ρ while bad learners have a high ρ. Bad learners generate atypically chaotic mistake sequences while good learners generate typically chaotic sequences that divide into two subgroups: the first consists of the typically stochastic sequences (with low divergence), which includes the sequences generated by the Bayes optimal predictor. The second subgroup consists of the atypically stochastic (but still typically chaotic) sequences that deviate from truly random Bernoulli sequences. Based on the static algorithmic interference model of [15], the learner here acts as a static structure which scatters the
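The information density ρ can be estimated directly with an off-the-shelf compressor standing in for the ideal one (the example 'decision rules' below are fabricated byte strings for illustration, not anything from the paper):

```python
import random
import zlib

def information_density(rule: bytes) -> float:
    """rho = compressed length / uncompressed length of the file
    holding a learner's decision rule."""
    return len(zlib.compress(rule)) / len(rule)

simple_rule = b"if x > 0.5: predict 1 else: predict 0\n" * 100  # very regular
random.seed(0)
noisy_rule = bytes(random.getrandbits(8) for _ in range(3800))   # incompressible

print(information_density(simple_rule))  # low rho: the 'good learner' profile
print(information_density(noisy_rule))   # rho near 1: the 'bad learner' profile
```

The repeated rule compresses to a small fraction of its size, while the random bytes do not compress at all, matching the abstract's association of low ρ with good learners.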