Results 1 - 8 of 8
Machine-Learning Applications of Algorithmic Randomness
In Proceedings of the Sixteenth International Conference on Machine Learning, 1999
Cited by 23 (13 self)
Most machine learning algorithms share the following drawback: they only output bare predictions but not the confidence in those predictions. In the 1960s algorithmic information theory supplied universal measures of confidence but these are, unfortunately, noncomputable. In this paper we combine the ideas of algorithmic information theory with the theory of Support Vector machines to obtain practicable approximations to universal measures of confidence. We show that in some standard problems of pattern recognition our approximations work well.
1 INTRODUCTION
Two important differences of most modern methods of machine learning (such as statistical learning theory, see Vapnik [21], 1998, or PAC theory) from classical statistical methods are that:
• machine learning methods produce bare predictions, without estimating confidence in those predictions (unlike, e.g., prediction of future observations in traditional statistics (Guttman [5], 1970));
• many machine learning ...
Algorithmic Complexity and Stochastic Properties of Finite Binary Sequences
1999
Cited by 17 (0 self)
This paper is a survey of concepts and results related to simple Kolmogorov complexity, prefix complexity and resource-bounded complexity. We also consider a new type of complexity, statistical complexity, closely related to mathematical statistics. Unlike other discoverers of algorithmic complexity, A. N. Kolmogorov's leading motive was developing on its basis a mathematical theory more adequately substantiating applications of probability theory, mathematical statistics and information theory. Kolmogorov wanted to deduce properties of a random object from its complexity characteristics without use of the notion of probability. In the first part of this paper we present several results in this direction. Though the subsequent development of algorithmic complexity and randomness was different, algorithmic complexity has successful applications in a traditional probabilistic framework. In the second part of the paper we consider applications to the estimation of parameters and the definition of Bernoulli sequences. All considerations have a finite combinatorial character.
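As a small illustration of the finite combinatorial viewpoint mentioned in this abstract (not taken from the paper itself): a binary sequence of length n with k ones can be specified by its index among all such sequences, which costs about log2 C(n, k) bits, and this quantity never exceeds n times the binary entropy of k/n. The function names below are hypothetical, chosen only for this sketch.

```python
import math


def index_code_length_bits(n: int, k: int) -> float:
    """Bits needed to index one sequence among all length-n sequences with k ones."""
    return math.log2(math.comb(n, k))


def entropy_bound_bits(n: int, k: int) -> float:
    """n * H(k/n), where H is the binary entropy in bits."""
    p = k / n
    if p in (0.0, 1.0):
        return 0.0
    return n * (-p * math.log2(p) - (1 - p) * math.log2(1 - p))


# A sparse sequence needs far fewer bits than its raw length n.
n, k = 100, 10
sparse_bits = index_code_length_bits(n, k)
sparse_bound = entropy_bound_bits(n, k)
```

The gap between the index length and n is what makes a frequency of ones far from 1/2 count as a compressible regularity.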
Kolmogorov Complexity: Sources, Theory and Applications
The Computer Journal, 1999
Cited by 11 (1 self)
ing applications based on different ways of approximating Kolmogorov complexity.
2. BEGINNINGS
As we have already mentioned, the two main originators of the theory of Kolmogorov complexity were Ray Solomonoff (born 1926) and Andrei Nikolaevich Kolmogorov (1903-1987). The motivations behind their work were completely different; Solomonoff was interested in inductive inference and artificial intelligence and Kolmogorov was interested in the foundations of probability theory and, also, of information theory. They arrived, nevertheless, at the same mathematical notion, which is now known as Kolmogorov complexity. In 1964 Solomonoff published his model of inductive inference. He argued that any inference problem can be presented as a problem of extrapolating a very long sequence of binary symbols; 'given a very long sequence, represented by T, what is the probability that it will be followed by a ... sequence A?'. Solomonoff assumed
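One common way of approximating Kolmogorov complexity in practice, offered here only as a hedged sketch and not as the method of the paper above, is to use the output length of a general-purpose compressor as a computable upper bound (up to an additive constant). The example assumes Python's standard `zlib` module; the helper name is hypothetical.

```python
import random
import zlib


def compressed_length_bits(data: bytes) -> int:
    """Length in bits of the zlib-compressed data: a crude upper bound on complexity."""
    return 8 * len(zlib.compress(data, 9))


# A highly regular string versus pseudo-random bytes of the same length.
regular = b"01" * 5000
rng = random.Random(0)  # fixed seed so the example is reproducible
random_like = bytes(rng.getrandbits(8) for _ in range(10000))

regular_bits = compressed_length_bits(regular)
random_bits = compressed_length_bits(random_like)
```

The regular string compresses to a small fraction of its length, while the pseudo-random bytes do not shrink, which mirrors the intuition that random objects have no short description.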
Relations between varieties of Kolmogorov complexity
Mathematical Systems Theory, 1996
Cited by 6 (2 self)
Abstract. There are several sorts of Kolmogorov complexity, or, better to say, several Kolmogorov complexities: decision complexity, simple complexity, prefix complexity, monotonic complexity, a priori complexity. The last three can and the first two cannot be used for defining randomness of an infinite binary sequence. All those five versions of Kolmogorov complexity were considered, from a unified point of view, in a paper by the first author which appeared in Watanabe’s book [23]. Upper and lower bounds for those complexities and also for their differences were announced in that paper without proofs. (Some of those bounds are mentioned in Section 4.4.5 of [16].) The purpose of this paper (which can be read independently of [23]) is to give proofs for the bounds from [23]. The terminology used in this paper is somewhat nonstandard: we call “Kolmogorov entropy” what is usually called “Kolmogorov complexity.” This is a Moscow tradition suggested by Kolmogorov himself. By this tradition the term “complexity” relates to any mode of description and “entropy” is the complexity related to an optimal mode (i.e., to a mode that, roughly speaking, gives the shortest descriptions).
Complexity Approximation Principle
Computer Journal, 1999
Cited by 3 (2 self)
INTRODUCTION
The subject of this note is another inductive principle, which can be regarded as a direct generalization of the minimum description length (MDL) and minimum message length (MML) principles. We will describe the work started at the Computer Learning Research Centre (Royal Holloway, University of London) related to this new principle, which we call the complexity approximation principle (CAP). Both MDL and MML principles can be interpreted as Kolmogorov complexity approximation principles (as explained in Rissanen [1, 2] and Wallace and Freeman [3]; see also [4]). It is shown in [5] and [6] that it is possible to generalize Kolmogorov complexity to describe the optimal performance in different 'games of prediction'. Using this general notion, called predictive complexity, it is straightforward to extend the MDL and MML principles to our more general CAP. In Section 2 we define predictive complexity, in Section 3 several examples are given and in Section 4
All Entropies Agree for an SFT
Cited by 1 (1 self)
In this paper I discuss a number of "entropies" which have definitions which are respectively probabilistic, topological, algebraic, and algorithmic. I shall explain how these entropies are all defined in the setting of shift dynamical systems. The main result of the paper, which should be regarded as part of the folklore, is the fact that for topologically transitive shifts of finite type (the definitions of these terms will be found below) all these entropies agree numerically.
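To make the topological entropy of a shift of finite type concrete, here is a minimal sketch (an illustration, not drawn from the paper) for the golden mean shift, in which the symbol 1 may not follow itself. The topological entropy is the exponential growth rate of the number of admissible words, which for this shift equals the logarithm of the golden ratio, the largest eigenvalue of the adjacency matrix. The function name and the choice of example are assumptions made for this sketch.

```python
import math


def count_words(adj, n):
    """Number of admissible words of length n (n >= 1) for the 0/1 transition matrix adj."""
    k = len(adj)
    counts = [1] * k  # admissible words of length 1, grouped by last symbol
    for _ in range(n - 1):
        # Extend every word by one symbol j allowed after its last symbol i.
        counts = [sum(counts[i] * adj[i][j] for i in range(k)) for j in range(k)]
    return sum(counts)


# Golden mean shift: adj[i][j] == 1 when symbol j may follow symbol i.
golden_mean = [[1, 1], [1, 0]]

n = 200
estimate = math.log(count_words(golden_mean, n)) / n  # growth rate of word counts
exact = math.log((1 + math.sqrt(5)) / 2)              # log of the largest eigenvalue
```

For moderately large n the word-count estimate is already close to the exact value, and it converges to it as n grows.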
What Is Information?
1995
Cited by 1 (1 self)
this paper knows that Shannon did no such thing. It must not be forgotten that Shannon called his theory "a general theory of communication", not a theory of information. The distinction is crucial. As Shannon put it in [15]: The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point. Frequently the messages have meaning; that is, they refer to or are correlated according to some system with certain physical or conceptual entities. These semantic aspects of communication are irrelevant to the engineering problem. The significant aspect is that the actual message is one selected from a set of possible messages. It would be impossible to overstress the fact that all aspects of "information" other than statistical phenomena are completely irrelevant to communication theory.
The Aggregating Algorithm and Predictive Complexity
This thesis is devoted to online learning. An online learning algorithm receives elements of a sequence one by one and tries to predict every element before it arrives. The performance of such an algorithm is measured by the discrepancies between its predictions and the outcomes. Discrepancies over several trials sum up to total cumulative loss. The starting point is the Aggregating Algorithm (AA). This algorithm deals with the problem of prediction with expert advice. In this thesis the existing theory of the AA is surveyed and some further properties are established. The concept of predictive complexity introduced by V. Vovk is a natural development of the theory of prediction with expert advice. Predictive complexity bounds the loss of every algorithm from below. Generally this bound does not correspond to the loss of an algorithm but it is optimal ‘in the limit’. Thus it is an intrinsic measure of ‘learnability’ of a finite sequence. It is similar to Kolmogorov complexity, which is a measure of the descriptive complexity of a string independent of a particular description method. Different approaches to optimality give
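A minimal sketch of prediction with expert advice, assuming the special case of logarithmic loss with binary outcomes, where aggregating the experts with learning rate 1 amounts to a Bayesian mixture. This is an illustration under those assumptions, not the thesis's general algorithm; the function names and toy data are hypothetical. In this setting the learner's cumulative loss exceeds that of the best of N experts by at most ln N.

```python
import math


def log_loss(p: float, y: int) -> float:
    """Logarithmic loss of predicting probability p for outcome 1 when y occurs."""
    return -math.log(p if y == 1 else 1.0 - p)


def aggregate_log_loss(expert_preds, outcomes):
    """Run the mixture learner; return its cumulative loss and each expert's loss."""
    n_experts = len(expert_preds[0])
    weights = [1.0 / n_experts] * n_experts  # uniform prior over experts
    learner_loss = 0.0
    expert_losses = [0.0] * n_experts
    for preds, y in zip(expert_preds, outcomes):
        # Predict with the weighted average of the experts' probabilities.
        p = sum(w * q for w, q in zip(weights, preds))
        learner_loss += log_loss(p, y)
        # Multiply each weight by exp(-loss), then renormalize.
        for i, q in enumerate(preds):
            loss = log_loss(q, y)
            expert_losses[i] += loss
            weights[i] *= math.exp(-loss)
        total = sum(weights)
        weights = [w / total for w in weights]
    return learner_loss, expert_losses


# Toy data: three constant experts predicting the probability of outcome 1.
outcomes = [1, 1, 0, 1, 1, 1, 0, 1]
expert_preds = [[0.2, 0.5, 0.8] for _ in outcomes]
learner_loss, expert_losses = aggregate_log_loss(expert_preds, outcomes)
regret_bound = min(expert_losses) + math.log(3)
```

The learner does not know in advance which expert is best, yet its loss stays within the additive constant ln 3 of the best expert on any outcome sequence.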