How to Use Expert Advice
 JOURNAL OF THE ASSOCIATION FOR COMPUTING MACHINERY
, 1997
Abstract

Cited by 313 (65 self)
We analyze algorithms that predict a binary value by combining the predictions of several prediction strategies, called experts. Our analysis is for worst-case situations, i.e., we make no assumptions about the way the sequence of bits to be predicted is generated. We measure the performance of the algorithm by the difference between the expected number of mistakes it makes on the bit sequence and the expected number of mistakes made by the best expert on this sequence, where the expectation is taken with respect to the randomization in the predictions. We show that the minimum achievable difference is on the order of the square root of the number of mistakes of the best expert, and we give efficient algorithms that achieve this. Our upper and lower bounds have matching leading constants in most cases. We then show how this leads to certain kinds of pattern recognition/learning algorithms with performance bounds that improve on the best results currently known in this context. We also compare our analysis to the case in which log loss is used instead of the expected number of mistakes.
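The idea of combining expert predictions can be illustrated with a generic exponentially weighted (randomized weighted majority) forecaster. This is a minimal sketch of the general technique, not the tuned algorithm analyzed in the paper; the function name `exponential_weights`, the learning rate `eta`, and the toy data are assumptions made for illustration.

```python
import math

def exponential_weights(expert_preds, outcomes, eta=0.5):
    """Randomized weighted-majority forecaster for binary outcomes.

    expert_preds[t][i] is expert i's prediction (0 or 1) at step t;
    outcomes[t] is the true bit. Returns the learner's expected number
    of mistakes and the number of mistakes of the best expert.
    """
    n = len(expert_preds[0])
    weights = [1.0] * n
    expert_mistakes = [0] * n
    expected_mistakes = 0.0
    for preds, y in zip(expert_preds, outcomes):
        total = sum(weights)
        # The learner predicts 1 with probability equal to the weight
        # fraction of experts that predict 1.
        p_one = sum(w for w, p in zip(weights, preds) if p == 1) / total
        expected_mistakes += (1.0 - p_one) if y == 1 else p_one
        # Multiplicatively penalize every expert that erred.
        for i, p in enumerate(preds):
            if p != y:
                weights[i] *= math.exp(-eta)
                expert_mistakes[i] += 1
    return expected_mistakes, min(expert_mistakes)

# Toy run: expert 0 is always right, expert 1 is always wrong.
bits = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
loss, best = exponential_weights([[y, 1 - y] for y in bits], bits)
print(loss, best)
```

For a fixed `eta`, the standard analysis of this scheme bounds the expected number of mistakes by (eta * L + ln n) / (1 - exp(-eta)), where L is the best expert's mistake count and n the number of experts; choosing `eta` as a function of L is what yields regret on the order of the square root of L.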
Learning by Transduction
 In Uncertainty in Artificial Intelligence
, 1998
Abstract

Cited by 72 (8 self)
We describe a method for predicting a classification of an object given classifications of the objects in the training set, assuming that the pairs object/classification are generated by an i.i.d. process from a continuous probability distribution. Our method is a modification of Vapnik's support-vector machine; its main novelty is that it gives not only the prediction itself but also a practicable measure of the evidence found in support of that prediction. We also describe a procedure for assigning degrees of confidence to predictions made by the support vector machine. Some experimental results are presented, and possible extensions of the algorithms are discussed. 1 THE PROBLEM Suppose labeled points $(x_i, y_i)$ ($i = 1, 2, \ldots$), where $x_i \in \mathbb{R}^n$ (our objects are specified by $n$ real-valued attributes) and $y_i \in \{-1, 1\}$, are generated independently from an unknown (but the same for all points) probability distribution. We are given $l$ points $x_i$, $i = 1, \ldots, l$, toge...
Competitive online statistics
 International Statistical Review
, 1999
Abstract

Cited by 63 (10 self)
A radically new approach to statistical modelling, which combines mathematical techniques of Bayesian statistics with the philosophy of the theory of competitive online algorithms, has arisen over the last decade in computer science (to a large degree, under the influence of Dawid’s prequential statistics). In this approach, which we call “competitive online statistics”, it is not assumed that data are generated by some stochastic mechanism; the bounds derived for the performance of competitive online statistical procedures are guaranteed to hold (and not just hold with high probability or on the average). This paper reviews some results in this area; the new material in it includes the proofs for the performance of the Aggregating Algorithm in the problem of linear regression with square loss. Keywords: Bayes’s rule, competitive online algorithms, linear regression, prequential statistics, worst-case analysis.
Machine-Learning Applications of Algorithmic Randomness
 In Proceedings of the Sixteenth International Conference on Machine Learning
, 1999
Abstract

Cited by 23 (13 self)
Most machine learning algorithms share the following drawback: they only output bare predictions but not the confidence in those predictions. In the 1960s algorithmic information theory supplied universal measures of confidence but these are, unfortunately, non-computable. In this paper we combine the ideas of algorithmic information theory with the theory of Support Vector machines to obtain practicable approximations to universal measures of confidence. We show that in some standard problems of pattern recognition our approximations work well. 1 INTRODUCTION Two important differences of most modern methods of machine learning (such as statistical learning theory, see Vapnik [21], 1998, or PAC theory) from classical statistical methods are that: • machine learning methods produce bare predictions, without estimating confidence in those predictions (unlike, e.g., prediction of future observations in traditional statistics (Guttman [5], 1970)); • many machine learning ...
Algorithmic Complexity and Stochastic Properties of Finite Binary Sequences
, 1999
Abstract

Cited by 17 (0 self)
This paper is a survey of concepts and results related to simple Kolmogorov complexity, prefix complexity and resource-bounded complexity. We also consider a new type of complexity, statistical complexity, closely related to mathematical statistics. Unlike other discoverers of algorithmic complexity, A. N. Kolmogorov's leading motive was developing on its basis a mathematical theory more adequately substantiating applications of probability theory, mathematical statistics and information theory. Kolmogorov wanted to deduce properties of a random object from its complexity characteristics without use of the notion of probability. In the first part of this paper we present several results in this direction. Though the subsequent development of algorithmic complexity and randomness was different, algorithmic complexity has successful applications in a traditional probabilistic framework. In the second part of the paper we consider applications to the estimation of parameters and the definition of Bernoulli sequences. All considerations have finite combinatorial character.
Prequential randomness
Abstract

Cited by 4 (2 self)
This paper studies Dawid’s prequential framework from the point of view of the algorithmic theory of randomness. The main result is that two natural notions of randomness coincide. One notion is the prequential version of the standard definition due to Martin-Löf, and the other is the prequential version of the martingale definition of randomness due to Schnorr. This is another manifestation of the close relation between the two main paradigms of randomness, typicalness and unpredictability. The algorithmic theory of randomness can be stripped of the algorithms and still give meaningful results; the typicalness paradigm then corresponds to Kolmogorov’s measure-theoretic probability and the unpredictability paradigm corresponds to game-theoretic probability. It is an open problem whether the main result of this paper continues to hold in the stripped version of the theory.
Mathematical foundations for probability and causality
 In Mathematical Aspects of Artificial Intelligence. Providence, Rhode Island: American Mathematical Society
, 1997
Abstract

Cited by 4 (4 self)
Event trees, and more generally, event spaces, can be used to provide a foundation for mathematical probability that includes a systematic understanding of causality. This foundation justifies the use of statistics in causal investigation and provides a rigorous semantics for causal reasoning. Causal reasoning, always important in applied statistics and increasingly important in artificial intelligence, has never been respectable in mathematical treatments of probability. But, as this article shows, a home can be made for causal reasoning in the very foundations of mathematical probability. The key is to bring the event tree, basic to the thinking of Pascal, Huygens, and other pioneers of probability, back into probability’s foundations. An event tree represents the possibilities for the step-by-step evolution of an observer’s knowledge. If that observer is nature, then the steps in the tree are causes. If we add branching probabilities, we obtain a probability tree, which can express nature’s limited ability to predict the effects of causes. As a foundation for the statistical investigation of causality, event and probability trees provide a language for causal explanation, which gives rigorous meaning to causal claims and clarifies the relevance of different kinds of evidence to those claims. As a foundation for probability theory, they allow an elementary treatment of martingales,
Testing exchangeability online
 Proceedings of the Twentieth International Conference on Machine Learning
, 2003
Abstract

Cited by 3 (1 self)
The practical conclusions of probability theory can be justified as consequences of hypotheses about the limiting complexity, under the given constraints, of the phenomena under study.
Pricing European Options Without Probability
, 1995
Abstract

Cited by 1 (1 self)
It is well known that in the case where the stock price $S_t$ is governed by the equation $dS_t / S_t = \mu\,dt + \sigma\,dW_t$, any European option satisfying weak regularity conditions has a fair price (the Black-Scholes formula and its generalizations). We consider the case where no probabilistic assumptions are made about $S_t$; instead, we assume that the derivative security $D$ which pays a dividend of $(dS_t / S_t)^2$ (the squared relative increase in the price of $S_t$) each instant $dt$ is traded in the market. We prove that the "regular" European options have fair prices provided that both $S_t$ and $D_t$ (the price process of $D$) are continuous and the fractal dimensions of the graphs of $S_t$ and $D_t$ satisfy certain inequalities. Intuitively our assumptions are much weaker than the usual assumption $dS_t / S_t = \mu\,dt + \sigma\,dW_t$. Key Words: Black-Scholes formula, fractal dimension, pathwise stochastic integral, nonstandard analysis The final version of this paper was prepared for the seminar on the f...
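For reference, the classical benchmark mentioned in this abstract, the Black-Scholes fair price under geometric Brownian motion (relative price increments with constant drift and constant volatility sigma), can be sketched as follows. This shows only the standard probabilistic formula for a European call, not the probability-free pricing result of the paper; the function names, the constant risk-free rate `rate`, and the sample parameters are assumptions made for illustration.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_scholes_call(spot, strike, rate, sigma, maturity):
    """Classical Black-Scholes price of a European call option.

    spot: current stock price; strike: exercise price;
    rate: continuously compounded risk-free rate (assumed constant);
    sigma: volatility; maturity: time to expiry in years.
    """
    vol_sqrt_t = sigma * math.sqrt(maturity)
    d1 = (math.log(spot / strike)
          + (rate + 0.5 * sigma ** 2) * maturity) / vol_sqrt_t
    d2 = d1 - vol_sqrt_t
    return spot * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)

# At-the-money call, one year to expiry, 5% rate, 20% volatility.
print(round(black_scholes_call(100.0, 100.0, 0.05, 0.2, 1.0), 4))
```

Note that the drift does not appear in the price; only the volatility does, which is consistent with the abstract's focus on the squared relative increments of the price.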