Results 1 - 10
of
62
The Similarity Metric
- IEEE TRANSACTIONS ON INFORMATION THEORY
, 2003
"... A new class of distances appropriate for measuring similarity relations between sequences, say one type of similarity per distance, is studied. We propose a new "normalized information distance", based on the noncomputable notion of Kolmogorov complexity, and show that it is in this class and it min ..."
Abstract
-
Cited by 137 (15 self)
- Add to MetaCart
A new class of distances appropriate for measuring similarity relations between sequences, say one type of similarity per distance, is studied. We propose a new "normalized information distance", based on the noncomputable notion of Kolmogorov complexity, and show that it is in this class and it minorizes every computable distance in the class (that is, it is universal in that it discovers all computable similarities). We demonstrate that it is a metric and call it the similarity metric. This theory forms the foundation for a new practical tool. To evidence generality and robustness we give two distinctive applications in widely divergent areas using standard compression programs like gzip and GenCompress. First, we compare whole mitochondrial genomes and infer their evolutionary history. This results in a first completely automatic computed whole mitochondrial phylogeny tree. Secondly, we fully automatically compute the language tree of 52 different languages.
Minimum Description Length Induction, Bayesianism, and Kolmogorov Complexity
- IEEE Transactions on Information Theory
, 1998
"... The relationship between the Bayesian approach and the minimum description length approach is established. We sharpen and clarify the general modeling principles MDL and MML, abstracted as the ideal MDL principle and defined from Bayes's rule by means of Kolmogorov complexity. The basic condition un ..."
Abstract
-
Cited by 60 (7 self)
- Add to MetaCart
The relationship between the Bayesian approach and the minimum description length approach is established. We sharpen and clarify the general modeling principles MDL and MML, abstracted as the ideal MDL principle and defined from Bayes's rule by means of Kolmogorov complexity. The basic condition under which the ideal principle should be applied is encapsulated as the Fundamental Inequality, which in broad terms states that the principle is valid when the data are random, relative to every contemplated hypothesis and also these hypotheses are random relative to the (universal) prior. Basically, the ideal principle states that the prior probability associated with the hypothesis should be given by the algorithmic universal probability, and the sum of the log universal probability of the model plus the log of the probability of the data given the model should be minimized. If we restrict the model class to the finite sets then application of the ideal principle turns into Kolmogorov's mi...
Discovering Neural Nets With Low Kolmogorov Complexity And High Generalization Capability
- Neural Networks
, 1997
"... Many neural net learning algorithms aim at finding "simple" nets to explain training data. The expectation is: the "simpler" the networks, the better the generalization on test data (! Occam's razor). Previous implementations, however, use measures for "simplicity" that lack the power, universali ..."
Abstract
-
Cited by 41 (23 self)
- Add to MetaCart
Many neural net learning algorithms aim at finding "simple" nets to explain training data. The expectation is: the "simpler" the networks, the better the generalization on test data (! Occam's razor). Previous implementations, however, use measures for "simplicity" that lack the power, universality and elegance of those based on Kolmogorov complexity and Solomonoff's algorithmic probability. Likewise, most previous approaches (especially those of the "Bayesian" kind) suffer from the problem of choosing appropriate priors. This paper addresses both issues. It first reviews some basic concepts of algorithmic complexity theory relevant to machine learning, and how the Solomonoff-Levin distribution (or universal prior) deals with the prior problem. The universal prior leads to a probabilistic method for finding "algorithmically simple" problem solutions with high generalization capability. The method is based on Levin complexity (a time-bounded generalization of Kolmogorov comple...
Algorithmic Statistics
- IEEE Transactions on Information Theory
, 2001
"... While Kolmogorov complexity is the accepted absolute measure of information content of an individual finite object, a similarly absolute notion is needed for the relation between an individual data sample and an individual model summarizing the information in the data, for example, a finite set (or ..."
Abstract
-
Cited by 41 (8 self)
- Add to MetaCart
While Kolmogorov complexity is the accepted absolute measure of information content of an individual finite object, a similarly absolute notion is needed for the relation between an individual data sample and an individual model summarizing the information in the data, for example, a finite set (or probability distribution) where the data sample typically came from. The statistical theory based on such relations between individual objects can be called algorithmic statistics, in contrast to classical statistical theory that deals with relations between probabilistic ensembles. We develop the algorithmic theory of statistic, sufficient statistic, and minimal sufficient statistic. This theory is based on two-part codes consisting of the code for the statistic (the model summarizing the regularity, the meaningful information, in the data) and the model-to-data code. In contrast to the situation in probabilistic statistical theory, the algorithmic relation of (minimal) sufficiency is an absolute relation between the individual model and the individual data sample. We distinguish implicit and explicit descriptions of the models. We give characterizations of algorithmic (Kolmogorov) minimal sufficient statistic for all data samples for both description modes in the explicit mode under some constraints. We also strengthen and elaborate earlier results on the "Kolmogorov structure function" and "absolutely non-stochastic objects" those rare objects for which the simplest models that summarize their relevant information (minimal sucient statistics) are at least as complex as the objects themselves. We demonstrate a close relation between the probabilistic notions and the algorithmic ones: (i) in both cases there is an "information non-increase" law; (ii) it is shown that a function is a...
Logical Depth and Physical Complexity
- THE UNIVERSAL TURING MACHINE: A HALF-CENTURY SURVEY
, 1988
"... Some mathematical and natural objects (a random sequence, a sequence of zeros, a perfect crystal, a gas) are intuitively trivial, while others (e.g. the human body, the digits of #) contain internal evidence of a nontrivial causal history. We formalize this ..."
Abstract
-
Cited by 39 (0 self)
- Add to MetaCart
Some mathematical and natural objects (a random sequence, a sequence of zeros, a perfect crystal, a gas) are intuitively trivial, while others (e.g. the human body, the digits of #) contain internal evidence of a nontrivial causal history. We formalize this
Discovering Solutions with Low Kolmogorov Complexity and High Generalization Capability
, 1995
"... Many machine learning algorithms aim at finding "simple" rules to explain training data. The expectation is: the "simpler" the rules, the better the generalization on test data (- Occam's razor). Most practi- cal implementations, however, use measures for "simplicity" that lack the power, univ ..."
Abstract
-
Cited by 34 (22 self)
- Add to MetaCart
Many machine learning algorithms aim at finding "simple" rules to explain training data. The expectation is: the "simpler" the rules, the better the generalization on test data (- Occam's razor). Most practi- cal implementations, however, use measures for "simplicity" that lack the power, universality and elegance of those based on Kolmogorov complexity and Solomonoff's algorithmic probability. Likewise, most pre- vious approaches (especially those of the "Bayesian" kind) suffer from the problem of choosing appropriate priors. This paper ad- dresses both issues. It first reviews some ba- sic concepts of algorithmic complexity theory relevant to machine learning, and how the Solomonoff-Levin distribution (or universal prior) deals with the prior problem. The uni- versal prior leads to a probabilistic method for finding "algorithmically simple" problem solutions with high generalization capability.
Average-case computational complexity theory
- Complexity Theory Retrospective II
, 1997
"... ABSTRACT Being NP-complete has been widely interpreted as being computationally intractable. But NP-completeness is a worst-case concept. Some NP-complete problems are \easy on average", but some may not be. How is one to know whether an NP-complete problem is \di cult on average"? The the ..."
Abstract
-
Cited by 30 (2 self)
- Add to MetaCart
ABSTRACT Being NP-complete has been widely interpreted as being computationally intractable. But NP-completeness is a worst-case concept. Some NP-complete problems are \easy on average", but some may not be. How is one to know whether an NP-complete problem is \di cult on average"? The theory of average-case computational complexity, initiated by Levin about ten years ago, is devoted to studying this problem. This paper is an attempt to provide an overview of the main ideas and results in this important new sub-area of complexity theory. 1
Hierarchies Of Generalized Kolmogorov Complexities And Nonenumerable Universal Measures Computable In The Limit
- INTERNATIONAL JOURNAL OF FOUNDATIONS OF COMPUTER SCIENCE
, 2000
"... The traditional theory of Kolmogorov complexity and algorithmic probability focuses on monotone Turing machines with one-way write-only output tape. This naturally leads to the universal enumerable Solomono-Levin measure. Here we introduce more general, nonenumerable but cumulatively enumerable m ..."
Abstract
-
Cited by 30 (13 self)
- Add to MetaCart
The traditional theory of Kolmogorov complexity and algorithmic probability focuses on monotone Turing machines with one-way write-only output tape. This naturally leads to the universal enumerable Solomono-Levin measure. Here we introduce more general, nonenumerable but cumulatively enumerable measures (CEMs) derived from Turing machines with lexicographically nondecreasing output and random input, and even more general approximable measures and distributions computable in the limit. We obtain a natural hierarchy of generalizations of algorithmic probability and Kolmogorov complexity, suggesting that the "true" information content of some (possibly in nite) bitstring x is the size of the shortest nonhalting program that converges to x and nothing but x on a Turing machine that can edit its previous outputs. Among other things we show that there are objects computable in the limit yet more random than Chaitin's "number of wisdom" Omega, that any approximable measure of x is small for any x lacking a short description, that there is no universal approximable distribution, that there is a universal CEM, and that any nonenumerable CEM of x is small for any x lacking a short enumerating program. We briey mention consequences for universes sampled from such priors.
Uniform test of algorithmic randomness over a general space
- Theoretical Computer Science
"... ABSTRACT. The algorithmic theory of randomness is well developed when the underlying space is the set of finite or infinite sequences and the underlying probability distribution is the uniform distribution or a computable distribution. These restrictions seem artificial. Some progress has been made ..."
Abstract
-
Cited by 28 (4 self)
- Add to MetaCart
ABSTRACT. The algorithmic theory of randomness is well developed when the underlying space is the set of finite or infinite sequences and the underlying probability distribution is the uniform distribution or a computable distribution. These restrictions seem artificial. Some progress has been made to extend the theory to arbitrary Bernoulli distributions (by Martin-Löf), and to arbitrary distributions (by Levin). We recall the main ideas and problems of Levin’s theory, and report further progress in the same framework. The issues are the following: – Allow non-compact spaces (like the space of continuous functions, underlying the Brownian motion). – The uniform test (deficiency of randomness) dP (x) (depending both on the outcome x and the measure P) should be defined in a general and natural way. – See which of the old results survive: existence of universal tests, conservation of randomness, expression of tests in terms of description complexity, existence of a universal measure, expression of mutual information as ”deficiency of independence”. – The negative of the new randomness test is shown to be a generalization of complexity in continuous spaces; we show that the addition theorem survives. The paper’s main contribution is introducing an appropriate framework for studying these questions and related ones (like statistics for a general family of distributions). 1.
Reversibility and adiabatic computation: trading time and space for energy, Proc
- Royal Society of London, Series A
"... Future miniaturization and mobilization of computing devices requires energy parsimonious ‘adiabatic ’ computation. This is contingent on logical reversibility of computation. An example is the idea of quantum computations which are reversible except for the irreversible observation steps. We propos ..."
Abstract
-
Cited by 26 (10 self)
- Add to MetaCart
Future miniaturization and mobilization of computing devices requires energy parsimonious ‘adiabatic ’ computation. This is contingent on logical reversibility of computation. An example is the idea of quantum computations which are reversible except for the irreversible observation steps. We propose to study quantitatively the exchange of computational resources like time and space for irreversibility in computations. Reversible simulations of irreversible computations are memory intensive. Such (polynomial time) simulations are analysed here in terms of ‘reversible ’ pebble games. We show that Bennett’s pebbling strategy uses least additional space for the greatest number of simulated steps. We derive a trade-off for storage space versus irreversible erasure. Next we consider reversible computation itself. An alternative proof is provided for the precise expression of the ultimate irreversibility cost of an otherwise reversible computation without restrictions on time and space use. A time-irreversibility trade-off hierarchy in the exponential time region is exhibited. Finally, extreme timeirreversibility trade-offs for reversible computations in the thoroughly unrealistic range of computable versus noncomputable time-bounds are given.

