Results 1-10 of 36
Spatiotemporal coding for wireless communications
 IEEE Transactions on Communications
, 1998
Secret-Key Reconciliation by Public Discussion
, 1994
Cited by 124 (3 self)
Assuming that Alice and Bob use a secret noisy channel (modelled by a binary symmetric channel) to send a key, reconciliation is the process of correcting errors between Alice's and Bob's versions of the key. This is done by public discussion, which leaks some information about the secret key to an eavesdropper. We show how to construct protocols that leak a minimum amount of information. However, this construction cannot be implemented efficiently. If Alice and Bob are willing to reveal an arbitrarily small amount of additional information (beyond the minimum), then they can implement polynomial-time protocols. We also present a more efficient protocol, which leaks an amount of information acceptably close to the minimum possible for sufficiently reliable secret channels (those with probability of any symbol being transmitted incorrectly as large as 15%). This work improves on earlier reconciliation approaches [R, BBR, BBBSS]. 1 Introduction Unlike public key cryptosystems, the securi...
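As a rough illustration of why public discussion leaks information, the sketch below locates one differing bit by comparing parities of sub-blocks over the public channel. It is a generic parity bisection, not the protocol constructed in the paper. The names `parity` and `locate_error` are hypothetical, and the sketch assumes the two strings differ in an odd number of positions (otherwise the first parity comparison detects nothing). Each parity Alice announces is counted as one leaked bit.

```python
def parity(bits, lo, hi):
    """Parity (0 or 1) of bits[lo:hi]."""
    return sum(bits[lo:hi]) % 2


def locate_error(alice, bob):
    """Find one position where bob differs from alice by parity bisection.

    Returns (index, leaked), where leaked counts the parities that Alice
    had to announce publicly. index is None if the block parities agree.
    """
    lo, hi = 0, len(alice)
    leaked = 1  # parity of the whole block
    if parity(alice, lo, hi) == parity(bob, lo, hi):
        return None, leaked
    while hi - lo > 1:
        mid = (lo + hi) // 2
        leaked += 1  # parity of the left half
        if parity(alice, lo, mid) != parity(bob, lo, mid):
            hi = mid  # an odd number of errors lies in the left half
        else:
            lo = mid  # otherwise it lies in the right half
    return lo, leaked


alice = [1, 0, 1, 1, 0, 0, 1, 0]
bob = list(alice)
bob[5] ^= 1  # one transmission error
index, leaked = locate_error(alice, bob)
bob[index] ^= 1  # Bob corrects his copy
```

For a block of n bits with one error, this exchange reveals about 1 + log2(n) parities, which is the kind of leakage a reconciliation protocol tries to keep small.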
Formal Theory of Creativity, Fun, and Intrinsic Motivation (1990-2010)
Cited by 59 (15 self)
The simple but general formal theory of fun & intrinsic motivation & creativity (1990) is based on the concept of maximizing intrinsic reward for the active creation or discovery of novel, surprising patterns allowing for improved prediction or data compression. It generalizes the traditional field of active learning, and is related to old but less formal ideas in aesthetics theory and developmental psychology. It has been argued that the theory explains many essential aspects of intelligence including autonomous development, science, art, music, humor. This overview first describes theoretically optimal (but not necessarily practical) ways of implementing the basic computational principles on exploratory, intrinsically motivated agents or robots, encouraging them to provoke event sequences exhibiting previously unknown but learnable algorithmic regularities. Emphasis is put on the importance of limited computational resources for online prediction and compression. Discrete and continuous time formulations are given. Previous practical but nonoptimal implementations (1991, 1995, 1997-2002) are reviewed, as well as several recent variants by others (2005). A simplified typology addresses current confusion concerning the precise nature of intrinsic motivation.
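The idea of rewarding improved compression can be made concrete with a toy calculation. The sketch below is an assumption-laden illustration, not an implementation from the paper: the learner is a one-parameter Bernoulli model of a bit string, its cost is the ideal code length in bits, and the intrinsic reward is the number of bits saved after the model is refitted. The names `code_length`, `fit`, and `compression_progress` are hypothetical.

```python
import math


def code_length(bits, p_one):
    """Ideal code length in bits of the sequence under a Bernoulli(p_one) model."""
    return -sum(math.log2(p_one if b else 1.0 - p_one) for b in bits)


def fit(bits):
    """Laplace-smoothed estimate of the probability of a 1."""
    return (sum(bits) + 1) / (len(bits) + 2)


def compression_progress(bits, p_before):
    """Bits saved on the same history after the model has learned from it."""
    p_after = fit(bits)
    return code_length(bits, p_before) - code_length(bits, p_after)


regular = [1] * 9 + [0]   # a learnable regularity: mostly ones
balanced = [1, 0] * 5     # nothing this simple model can exploit
reward_regular = compression_progress(regular, 0.5)
reward_balanced = compression_progress(balanced, 0.5)
```

Under these assumptions the regular history yields a positive reward, while the balanced one yields none, because refitting leaves the model where it started.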
Discovering Neural Nets With Low Kolmogorov Complexity And High Generalization Capability
 Neural Networks
, 1997
Cited by 52 (30 self)
Many neural net learning algorithms aim at finding "simple" nets to explain training data. The expectation is: the "simpler" the networks, the better the generalization on test data (→ Occam's razor). Previous implementations, however, use measures for "simplicity" that lack the power, universality and elegance of those based on Kolmogorov complexity and Solomonoff's algorithmic probability. Likewise, most previous approaches (especially those of the "Bayesian" kind) suffer from the problem of choosing appropriate priors. This paper addresses both issues. It first reviews some basic concepts of algorithmic complexity theory relevant to machine learning, and how the Solomonoff-Levin distribution (or universal prior) deals with the prior problem. The universal prior leads to a probabilistic method for finding "algorithmically simple" problem solutions with high generalization capability. The method is based on Levin complexity (a time-bounded generalization of Kolmogorov comple...
On Learning How to Learn Learning Strategies
, 1995
Cited by 42 (15 self)
This paper introduces the "incremental self-improvement paradigm". Unlike previous methods, incremental self-improvement encourages a reinforcement learning system to improve the way it learns, and to improve the way it improves the way it learns ..., without significant theoretical limitations: the system is able to "shift its inductive bias" in a universal way. Its major features are: (1) There is no explicit difference between "learning", "meta-learning", and other kinds of information processing. Using a Turing-machine-equivalent programming language, the system itself occasionally executes self-delimiting, initially highly random "self-modification programs" which modify the context-dependent probabilities of future action sequences (including future self-modification programs). (2) The system keeps only those probability modifications computed by "useful" self-modification programs: those which bring about more payoff (reward, reinforcement) per time than all previous self-modi...
A computer scientist’s view of life, the universe, and everything
 Foundations of Computer Science: Potential - Theory - Cognition
, 1997
Cited by 41 (15 self)
Is the universe computable? If so, it may be much cheaper in terms of information requirements to compute all computable universes instead of just ours. I apply basic concepts of Kolmogorov complexity theory to the set of possible universes, and chat about perceived and true randomness, life, generalization, and learning in a given universe. Preliminaries. Assumptions. A long time ago, the Great Programmer wrote a program that runs all possible universes on His Big Computer. “Possible” means “computable”: (1) Each universe evolves on a discrete time scale. (2) Any universe’s state at a given time is describable by a finite number of bits. One of the many universes is ours, despite some who evolved in it and claim it is incomputable. Computable universes. Let TM denote an arbitrary universal Turing machine with unidirectional output tape. TM’s input and output symbols are “0”, “1”, and “,” (comma). TM’s possible input programs can be ordered
Hierarchies Of Generalized Kolmogorov Complexities And Nonenumerable Universal Measures Computable In The Limit
 International Journal of Foundations of Computer Science
, 2000
Cited by 40 (21 self)
The traditional theory of Kolmogorov complexity and algorithmic probability focuses on monotone Turing machines with one-way write-only output tape. This naturally leads to the universal enumerable Solomonoff-Levin measure. Here we introduce more general, nonenumerable but cumulatively enumerable measures (CEMs) derived from Turing machines with lexicographically nondecreasing output and random input, and even more general approximable measures and distributions computable in the limit. We obtain a natural hierarchy of generalizations of algorithmic probability and Kolmogorov complexity, suggesting that the "true" information content of some (possibly infinite) bitstring x is the size of the shortest nonhalting program that converges to x and nothing but x on a Turing machine that can edit its previous outputs. Among other things we show that there are objects computable in the limit yet more random than Chaitin's "number of wisdom" Omega, that any approximable measure of x is small for any x lacking a short description, that there is no universal approximable distribution, that there is a universal CEM, and that any nonenumerable CEM of x is small for any x lacking a short enumerating program. We briefly mention consequences for universes sampled from such priors.
Algorithmic Theories Of Everything
, 2000
Cited by 32 (15 self)
The probability distribution P from which the history of our universe is sampled represents a theory of everything or TOE. We assume P is formally describable. Since most (uncountably many) distributions are not, this imposes a strong inductive bias. We show that P(x) is small for any universe x lacking a short description, and study the spectrum of TOEs spanned by two Ps, one reflecting the most compact constructive descriptions, the other the fastest way of computing everything. The former derives from generalizations of traditional computability, Solomonoff’s algorithmic probability, Kolmogorov complexity, and objects more random than Chaitin’s Omega, the latter from Levin’s universal search and a natural resource-oriented postulate: the cumulative prior probability of all x incomputable within time t by this optimal algorithm should be 1/t. Between both Ps we find a universal cumulatively enumerable measure that dominates traditional enumerable measures; any such CEM must assign low probability to any universe lacking a short enumerating program. We derive P-specific consequences for evolving observers, inductive reasoning, quantum physics, philosophy, and the expected duration of our universe.
Learning Unambiguous Reduced Sequence Descriptions
 Advances in Neural Information Processing Systems 4
, 1992
Cited by 29 (8 self)
You want your neural net algorithm to learn sequences? Do not just use conventional gradient descent (or approximations thereof) in recurrent nets, time-delay nets, etc. Instead, use your sequence learning algorithm to implement the following method: No matter what your final goals are, train a network to predict its next input from the previous ones. Since only unpredictable inputs convey new information, ignore all predictable inputs but let all unexpected inputs (plus information about the time step at which they occurred) become inputs to a higher-level network of the same kind (working on a slower, self-adjusting time scale). Go on building a hierarchy of such networks. This principle reduces the descriptions of event sequences without loss of information, thus easing supervised or reinforcement learning tasks. Experiments show that systems based on this principle can require less computation per time step and many fewer training sequences than conventional training algorithms for ...
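The principle in this abstract (forward only the unexpected inputs, tagged with their time steps) can be sketched at one level of the hierarchy. The sketch below swaps the neural network for a simple lookup-table predictor that guesses the symbol last seen after the previous one; that substitution, and the names `compress` and `decompress`, are assumptions made for illustration. Because the decoder replays the same deterministic predictor, the reduced description loses no information.

```python
def compress(sequence):
    """Return (time step, symbol) pairs the predictor failed to predict."""
    successor = {}   # prediction table: previous symbol -> predicted next symbol
    unexpected = []
    prev = None
    for t, sym in enumerate(sequence):
        if successor.get(prev) != sym:
            unexpected.append((t, sym))  # would be passed to the higher level
            successor[prev] = sym        # the predictor learns from its mistake
        prev = sym
    return unexpected


def decompress(unexpected, length):
    """Rebuild the sequence by replaying the predictor between recorded events."""
    events = dict(unexpected)
    successor = {}
    out = []
    prev = None
    for t in range(length):
        if t in events:
            sym = events[t]
            successor[prev] = sym
        else:
            sym = successor[prev]  # predictable input: regenerate it
        out.append(sym)
        prev = sym
    return out


periodic = list("abcabcabcabc")
reduced = compress(periodic)
restored = decompress(reduced, len(periodic))
```

In this toy setting a periodic sequence collapses to the few events seen before the predictor has learned the cycle, and a higher-level learner would only receive those events.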