Results 1 - 10 of 135
A Theory of Program Size Formally Identical to Information Theory
, 1975
Cited by 274 (16 self)
A new definition of program-size complexity is made. $H(A,B/C,D)$ is defined to be the size in bits of the shortest self-delimiting program for calculating strings A and B if one is given a minimal-size self-delimiting program for calculating strings C and D. This differs from previous definitions: (1) programs are required to be self-delimiting, i.e. no program is a prefix of another, and (2) instead of being given C and D directly, one is given a program for calculating them that is minimal in size. Unlike previous definitions, this one has precisely the formal properties of the entropy concept of information theory. For example, $H(A,B) = H(A) + H(B/A) + O(1)$. Also, if a program of length $k$ is assigned measure $2^{-k}$, then $H(A) = -\log_2(\text{the probability that the standard universal computer will calculate } A) + O(1)$. Key Words and Phrases: computational complexity, entropy, information theory, instantaneous code, Kraft inequality, minimal program, probab...
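The self-delimiting requirement and the $2^{-k}$ measure in the abstract above can be illustrated with a small sketch. The program sets below are invented toy data, not output of any real universal computer, and the helper names (`is_prefix_free`, `kraft_sum`, `neg_log2_measure`) are hypothetical.

```python
from math import log2

def is_prefix_free(programs):
    """True if no bit string in `programs` is a prefix of another."""
    ordered = sorted(programs)
    # After sorting, a prefix is always immediately followed by a string extending it.
    return all(not b.startswith(a) for a, b in zip(ordered, ordered[1:]))

def kraft_sum(programs):
    """Total measure when a program of length k gets weight 2**-k."""
    return sum(2.0 ** -len(p) for p in programs)

def neg_log2_measure(programs_for_a):
    """-log2 of the total measure of the programs that output A."""
    return -log2(kraft_sum(programs_for_a))

# Toy assumption: the full set of valid programs, and the subset that outputs A.
all_programs = {"0", "10", "110", "111"}
programs_for_a = {"10", "110"}

print(is_prefix_free(all_programs))      # prefix-free, so the Kraft sum is at most 1
print(kraft_sum(all_programs))
print(neg_log2_measure(programs_for_a))  # compare with the shortest length, 2
```

In this toy case the shortest program for A has length 2, while the negative log of the measure is about 1.415, which is the kind of bounded gap the $O(1)$ term absorbs.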
Algorithmic information theory
- IBM JOURNAL OF RESEARCH AND DEVELOPMENT
, 1977
Cited by 264 (18 self)
This paper reviews algorithmic information theory, which is an attempt to apply information-theoretic and probabilistic ideas to recursive function theory. Typical concerns in this approach are, for example, the number of bits of information required to specify an algorithm, or the probability that a program whose bits are chosen by coin flipping produces a given output. During the past few years the definitions of algorithmic information theory have been reformulated. The basic features of the new formalism are presented here and certain results of R. M. Solovay are reported.
The induction of dynamical recognizers
- Machine Learning
, 1991
Cited by 197 (15 self)
A higher order recurrent neural network architecture learns to recognize and generate languages after being "trained" on categorized exemplars. Studying these networks from the perspective of dynamical systems yields two interesting discoveries: First, a longitudinal examination of the learning process illustrates a new form of mechanical inference: Induction by phase transition. A small weight adjustment causes a "bifurcation" in the limit behavior of the network. This phase transition corresponds to the onset of the network’s capacity for generalizing to arbitrary-length strings. Second, a study of the automata resulting from the acquisition of previously published training sets indicates that while the architecture is not guaranteed to find a minimal finite automaton consistent with the given exemplars, which is an NP-Hard problem, the architecture does appear capable of generating non-regular languages by exploiting fractal and chaotic dynamics. I end the paper with a hypothesis relating linguistic generative capacity to the behavioral regimes of non-linear dynamical systems.
Almost Everywhere High Nonuniform Complexity
, 1992
Cited by 158 (34 self)
We investigate the distribution of nonuniform complexities in uniform complexity classes. We prove that almost every problem decidable in exponential space has essentially maximum circuit-size and space-bounded Kolmogorov complexity almost everywhere. (The circuit-size lower bound actually exceeds, and thereby strengthens, the Shannon $2^n/n$ lower bound for almost every problem, with no computability constraint.) In exponential time complexity classes, we prove that the strongest relativizable lower bounds hold almost everywhere for almost all problems. Finally, we show that infinite pseudorandom sequences have high nonuniform complexity almost everywhere. The results are unified by a new, more powerful formulation of the underlying measure theory, based on uniform systems of density functions, and by the introduction of a new nonuniform complexity measure, the selective Kolmogorov complexity. This research was supported in part by NSF Grants CCR-8809238 and CCR-9157382 and in ...
The complexity of finite objects and the development of the concepts of information and randomness by means of the theory of algorithms
- Russian Math. Surveys
, 1970
Cited by 138 (1 self)
In 1964 Kolmogorov introduced the concept of the complexity of a finite object (for instance, the words in a certain alphabet). He defined complexity as the minimum number of binary signs containing all the information about a given object that are sufficient for its recovery (decoding). This definition depends essentially on the method of decoding. However, by means of the general theory of algorithms, Kolmogorov was able to give an invariant (universal) definition of complexity. Related concepts were investigated by Solomonoff (U.S.A.) and Markov. Using the concept of complexity, Kolmogorov gave definitions of the quantity of information in finite objects and of the concept of a random sequence (which was then defined more precisely by Martin-Löf). Afterwards, this circle of questions developed rapidly. In particular, the ideas of Markov on applying the concept of complexity to quantitative questions in the theory of algorithms underwent an interesting development. The present article is a survey of the fundamental results connected with the brief remarks above.
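The idea of complexity as the minimum number of binary signs sufficient for recovery can be illustrated, very roughly, with an off-the-shelf compressor. This is a swapped-in proxy and not the construction surveyed in the article: the true complexity is uncomputable, and a zlib-compressed length only gives a crude upper bound relative to one fixed decoding method. The helper name `compressed_bits` is hypothetical.

```python
import os
import zlib

def compressed_bits(data: bytes) -> int:
    """Length in bits of a zlib encoding of `data` (a rough upper-bound proxy)."""
    return 8 * len(zlib.compress(data, 9))

regular = b"ab" * 500        # highly regular 1000-byte string
random_ = os.urandom(1000)   # 1000 bytes from the OS entropy source

print(compressed_bits(regular))  # far below the raw 8000 bits
print(compressed_bits(random_))  # close to, or slightly above, 8000 bits
```

The regular string admits a short description while the random one does not, which mirrors the link between complexity and randomness mentioned in the abstract.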
Computation at the onset of chaos
- The Santa Fe Institute, Westview
, 1988
Cited by 77 (14 self)
Computation at levels beyond storage and transmission of information appears in physical systems at phase transitions. We investigate this phenomenon using minimal computational models of dynamical systems that undergo a transition to chaos as a function of a nonlinearity parameter. For period-doubling and band-merging cascades, we derive expressions for the entropy, the interdependence of ε-machine complexity and entropy, and the latent complexity of the transition to chaos. At the transition, deterministic finite automaton models diverge in size. Although there is no regular or context-free Chomsky grammar in this case, we give finite descriptions at the higher computational level of context-free Lindenmayer systems. We construct a restricted indexed context-free grammar and its associated one-way nondeterministic nested stack automaton for the cascade limit language. This analysis of a family of dynamical systems suggests a complexity theoretic description of phase transitions based on the informational diversity and computational complexity of observed data that is independent of particular system control parameters. The approach gives a much more refined picture of the architecture of critical states than is available via
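A period-doubling cascade of the kind the abstract refers to can be observed numerically in a minimal sketch. The logistic map x -> r x (1 - x) is used here as an assumed example system, since the abstract does not name one, and the function name `attractor_period` and its default parameters are hypothetical.

```python
def attractor_period(r, x0=0.5, transient=10_000, max_period=64, tol=1e-9):
    """Estimate the period of the logistic-map attractor at parameter r.

    Returns the smallest p <= max_period with |x[n+p] - x[n]| < tol after
    discarding a transient, or None if no such period is found.
    """
    x = x0
    for _ in range(transient):           # let the orbit settle onto the attractor
        x = r * x * (1.0 - x)
    orbit = [x]
    for _ in range(max_period):          # record max_period further iterates
        orbit.append(r * orbit[-1] * (1.0 - orbit[-1]))
    for p in range(1, max_period + 1):   # smallest return time within tolerance
        if abs(orbit[p] - orbit[0]) < tol:
            return p
    return None

for r in (2.8, 3.2, 3.5):
    print(r, attractor_period(r))
```

Raising the nonlinearity parameter r through these values takes the attractor from a fixed point to a period-2 and then a period-4 cycle; beyond the accumulation point of the cascade the function typically finds no short period.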
The calculi of emergence: Computation, dynamics, and induction
- Physica D
, 1994
Cited by 65 (13 self)
Defining structure and detecting the emergence of complexity in nature are inherently subjective, though essential, scientific activities. Despite the difficulties, these problems can be analyzed in terms of how model-building observers infer from measurements the computational capabilities embedded in nonlinear processes. An observer’s notion of what is ordered, what is random, and what is complex in its environment depends directly on its computational resources: the amount of raw measurement data, of memory, and of time available for estimation and inference. The discovery of structure in an environment depends more critically and subtly, though, on how those resources are organized. The descriptive power of the observer’s chosen (or implicit) computational model class, for example, can be an overwhelming determinant in finding regularity in data. This paper presents an overview of an inductive framework, hierarchical-machine reconstruction, in which the emergence of complexity is associated with the innovation of new computational model classes. Complexity metrics for detecting structure and quantifying emergence, along with an analysis of the constraints on the dynamics of innovation, are outlined. Illustrative examples are drawn from the onset of unpredictability in nonlinear systems, finitary nondeterministic processes, and
Shifting Inductive Bias with Success-Story Algorithm, Adaptive Levin Search, and Incremental Self-Improvement
- MACHINE LEARNING
, 1997
Cited by 58 (27 self)
We study task sequences that allow for speeding up the learner's average reward intake through appropriate shifts of inductive bias (changes of the learner's policy). To evaluate long-term effects of bias shifts setting the stage for later bias shifts, we use the "success-story algorithm" (SSA). SSA is occasionally called at times that may depend on the policy itself. It uses backtracking to undo those bias shifts that have not been empirically observed to trigger long-term reward accelerations (measured up until the current SSA call). Bias shifts that survive SSA represent a lifelong success history. Until the next SSA call, they are considered useful and build the basis for additional bias shifts. SSA allows for plugging in a wide variety of learning algorithms. We plug in (1) a novel, adaptive extension of Levin search and (2) a method for embedding the learner's policy modification strategy within the policy itself (incremental self-improvement). Our inductive transfer case studies...
Discovering Neural Nets With Low Kolmogorov Complexity And High Generalization Capability
- Neural Networks
, 1997
Cited by 41 (23 self)
Many neural net learning algorithms aim at finding "simple" nets to explain training data. The expectation is: the "simpler" the networks, the better the generalization on test data (→ Occam's razor). Previous implementations, however, use measures for "simplicity" that lack the power, universality and elegance of those based on Kolmogorov complexity and Solomonoff's algorithmic probability. Likewise, most previous approaches (especially those of the "Bayesian" kind) suffer from the problem of choosing appropriate priors. This paper addresses both issues. It first reviews some basic concepts of algorithmic complexity theory relevant to machine learning, and how the Solomonoff-Levin distribution (or universal prior) deals with the prior problem. The universal prior leads to a probabilistic method for finding "algorithmically simple" problem solutions with high generalization capability. The method is based on Levin complexity (a time-bounded generalization of Kolmogorov comple...
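For orientation, the two quantities named in this abstract are commonly written as below. This is the standard textbook notation, assumed here rather than taken from the paper: $U$ is a universal prefix machine, $\ell(p)$ is the length of program $p$ in bits, and $t(p)$ is the number of steps $U$ takes on $p$.

```latex
% Universal prior (Solomonoff-Levin distribution): total measure of programs computing x
P_U(x) = \sum_{p \,:\, U(p) = x} 2^{-\ell(p)}

% Levin complexity: program length plus the logarithm of the running time
Kt(x) = \min_{p} \bigl\{ \ell(p) + \log_2 t(p) \;:\; U(p) = x \text{ within } t(p) \text{ steps} \bigr\}
```

Dropping the $\log_2 t(p)$ term recovers plain (prefix) Kolmogorov complexity, which is the sense in which $Kt$ is a time-bounded generalization.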
A tutorial introduction to the minimum description length principle
- in Advances in Minimum Description Length: Theory and Applications. 2005

