Competitive on-line statistics
- International Statistical Review
, 1999
Cited by 39 (7 self)
A radically new approach to statistical modelling, which combines mathematical techniques of Bayesian statistics with the philosophy of the theory of competitive on-line algorithms, has arisen over the last decade in computer science (to a large degree, under the influence of Dawid’s prequential statistics). In this approach, which we call “competitive on-line statistics”, it is not assumed that data are generated by some stochastic mechanism; the bounds derived for the performance of competitive on-line statistical procedures are guaranteed to hold (and not just hold with high probability or on the average). This paper reviews some results in this area; the new material in it includes the proofs for the performance of the Aggregating Algorithm in the problem of linear regression with square loss. Keywords: Bayes’s rule, competitive on-line algorithms, linear regression, prequential statistics, worst-case analysis.
Ridge Regression Confidence Machine
- In Proceedings of the Eighteenth International Conference on Machine Learning
, 2001
"... Recently kernel based methods for machine ..."
Inductive Confidence Machines for Regression
- In Tapio Elomaa, Heikki Mannila, and Hannu Toivonen, editors, Proceedings of the Thirteenth European Conference on Machine Learning
, 2002
Cited by 8 (5 self)
The existing methods of predicting with confidence give good accuracy and confidence values, but quite often are computationally inefficient. Some partial
Comparing the Bayes and typicalness frameworks
- In Proceedings of the 12th European Conference on Machine Learning (ECML-2001)
, 2001
Cited by 7 (1 self)
When correct priors are known, Bayesian algorithms give optimal decisions, and accurate confidence values for predictions can be obtained. If the prior is incorrect, however, these confidence values have no theoretical base – even though the algorithms' predictive performance may be good. There also exist many successful learning algorithms which depend only on the iid assumption. Often, however, they produce no confidence values for their predictions. Bayesian frameworks are often applied to these algorithms in order to obtain such values; however, they can rely on unjustified priors. In this paper we outline the typicalness framework, which can be used in conjunction with many other machine learning algorithms. The framework provides confidence information based only on the standard iid assumption and so is much more robust to different underlying data distributions. We show how the framework can be applied to existing algorithms. We also present experimental results which show that the typicalness approach performs close to Bayes when the prior is known to be correct. Unlike Bayes, however, the method still gives accurate confidence values even when different data distributions are considered.
Computationally efficient transductive machines
- ALT'00 Proceedings
, 2000
Cited by 5 (4 self)
In this paper we propose a new algorithm for providing confidence and credibility values for predictions on a multi-class pattern recognition problem which uses Support Vector Machines in its implementation. Previous algorithms proposed to achieve this are very processing-intensive and are only practical for small data sets. We present here a method which overcomes these limitations and can deal with larger data sets (such as the US Postal Service database). The measures of confidence and credibility given by the algorithm are shown empirically to reflect the quality of the predictions obtained by the algorithm, and are comparable to those given by the less computationally efficient method. In addition, the overall performance of the algorithm is shown to be comparable to other techniques (such as standard Support Vector Machines), which simply give flat predictions and do not provide the extra confidence/credibility measures.
Open-set face recognition using transduction
- PAMI
, 2005
Cited by 5 (1 self)
This paper motivates and describes a novel realization of transductive inference that can address the Open Set face recognition task. Open Set operates under the assumption that not all the test probes have mates in the gallery. It either detects the presence of some biometric signature within the gallery and finds its identity or rejects it, i.e., it provides for the “none of the above” answer. The main contribution of the paper is Open Set TCM-kNN (Transduction Confidence Machine – k Nearest Neighbors), which is suitable for multi-class authentication operational scenarios that have to include a rejection option for classes never enrolled in the gallery. Open Set TCM-kNN, driven by the relation between transduction and Kolmogorov complexity, provides a local estimation of the likelihood ratio needed for detection tasks. We provide extensive experimental data to show the feasibility, robustness, and comparative advantages of Open Set TCM-kNN on Open Set identification and watch list (surveillance) tasks using challenging FERET data. Last, we analyze the error structure driven by the fact that most of the errors in identification are due to a relatively small number of face patterns. Open Set TCM-kNN is shown to be suitable for PSEI (pattern specific error inhomogeneities) error analysis in order to identify difficult-to-recognize faces. PSEI analysis improves biometric performance by removing a small number of those difficult-to-recognize faces responsible for much of the original error in performance and/or by using data fusion.
The typicalness framework: a comparison with the Bayesian approach
- Department of Computer Science, Royal Holloway, University of London
, 2001
Cited by 4 (2 self)
When correct priors are known, Bayesian algorithms give optimal decisions, and accurate confidence values for predictions can be obtained. If the prior is incorrect, however, these confidence values have no theoretical base -- even though the algorithms' predictive performance may be good. There also exist many successful learning algorithms which depend only on the iid assumption. Often, however, they produce no confidence values for their predictions. Bayesian frameworks are often applied to these algorithms in order to obtain such values; however, they can rely on unjustified priors. In this paper we outline the typicalness framework, which can be used in conjunction with many other machine learning algorithms. The framework provides confidence information based only on the standard iid assumption and so is much more robust to different underlying data distributions. We show how the framework can be applied to existing algorithms. We also present experimental results which show that the typicalness approach performs close to Bayes when the prior is known to be correct. Unlike Bayes, however, the method still gives accurate confidence values even when different data distributions are considered.
Plant promoter prediction with confidence estimation
- Nucleic Acids Res
, 2005
Cited by 3 (1 self)
Accurate prediction of promoters is fundamental to understanding gene expression patterns, where confidence estimation is one of the main requirements. Using recently developed transductive confidence machine (TCM) techniques, we developed a new program, TSSP-TCM, for the prediction of plant promoters that also provides confidence of the prediction. The program was trained on 132 and 104 sequences and tested on 40 and 25 sequences (containing TATA and TATA-less promoters, respectively) with known transcription start sites (TSSs). As negative training samples for TCM learning we used coding and intron sequences of plant genes annotated in GenBank. In the test set of TATA promoters, the program correctly predicted TSS for 35 out of 40 (87.5%) genes with a median deviation of several base pairs from the true site location. For 25 TATA-less promoters, TSSs were predicted for 21 out of 25 (84%) genes, including 14 cases of 5 bp distance between annotated and predicted TSSs. Using the TSSP-TCM program we annotated promoters in the whole Arabidopsis genome. The predicted promoters were in good agreement with the start position of known Arabidopsis mRNAs. Thus, the TCM technique has produced a plant-oriented promoter prediction tool of high accuracy. TSSP-TCM program and annotated promoters are available at
Testing exchangeability on-line
- Proceedings of the Twentieth International Conference on Machine Learning
, 2003
Cited by 2 (1 self)
The practical conclusions of probability theory can be justified as consequences of hypotheses about the limiting, under given constraints, complexity of the phenomena under study.
Meta-classifier approach to reliable text classification
, 2005
Cited by 2 (0 self)
A problem with automatic classifiers is that there is no way to know whether a particular classification is just a guess or a certain answer. Reliable classification is the task of predicting whether a certain instance is correctly classified or not, i.e., a classification is classified as either reliable or unreliable. When the classification is classified as unreliable, it is like saying “I do not know”, and the instance does not receive a classification. Given a base classifier, the meta-classifier approach is to train a meta-classifier that predicts the correctness of each classification of the base classifier. The classification rule of the meta-classifier approach is to assign a class predicted by the base classifier to an instance if the meta-classifier decides that the base classification is reliable. The meta-classifier approach is applied to text classification tasks provided by the CBS to answer the following problem statement: Does the meta-classifier approach provide a practical solution to
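
The classification rule described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `classify_reliably`, `toy_base` and `toy_meta` are hypothetical, and both toy classifiers are hand-written stand-ins for models that would in practice be trained on data. The sketch only shows the decision rule itself: the base prediction is kept when the meta-classifier judges it reliable, and the instance is otherwise left unclassified.

```python
from typing import Callable, Optional

# Hypothetical type aliases for this sketch (not taken from the source).
BaseClassifier = Callable[[str], str]         # text -> predicted class
MetaClassifier = Callable[[str, str], bool]   # (text, predicted class) -> reliable?


def classify_reliably(
    text: str, base: BaseClassifier, meta: MetaClassifier
) -> Optional[str]:
    """Return the base prediction if the meta-classifier deems it reliable.

    Returns None ("I do not know") when the prediction is judged unreliable,
    so the instance receives no classification.
    """
    predicted = base(text)
    if meta(text, predicted):
        return predicted
    return None


# Toy stand-ins (assumptions for illustration only).
def toy_base(text: str) -> str:
    # Keyword rule standing in for a trained text classifier.
    return "sports" if "match" in text.lower() else "politics"


def toy_meta(text: str, predicted: str) -> bool:
    # Stand-in reliability rule: distrust predictions on very short texts.
    return len(text.split()) >= 4


documents = [
    "The match ended in a late equaliser",
    "Match",
    "Parliament votes on the new budget today",
]
results = [classify_reliably(d, toy_base, toy_meta) for d in documents]
print(results)
```

In this sketch the second document is left unclassified because the stand-in meta-classifier rejects the base prediction. A real meta-classifier would be trained on examples labelled by whether the base classifier was correct.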

