A tutorial on support vector machines for pattern recognition
Data Mining and Knowledge Discovery, 1998
Cited by 1656 (11 self)
The tutorial starts with an overview of the concepts of VC dimension and structural risk minimization. We then describe linear Support Vector Machines (SVMs) for separable and non-separable data, working through a non-trivial example in detail. We describe a mechanical analogy, and discuss when SVM solutions are unique and when they are global. We describe how support vector training can be practically implemented, and discuss in detail the kernel mapping technique which is used to construct SVM solutions which are nonlinear in the data. We show how Support Vector machines can have very large (even infinite) VC dimension by computing the VC dimension for homogeneous polynomial and Gaussian radial basis function kernels. While very high VC dimension would normally bode ill for generalization performance, and while at present there exists no theory which shows that good generalization performance is guaranteed for SVMs, there are several arguments which support the observed high accuracy of SVMs, which we review. Results of some experiments which were inspired by these arguments are also presented. We give numerous examples and proofs of most of the key theorems. There is new material, and I hope that the reader will find that even old material is cast in a fresh light.
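The two kernel families named in the abstract above can be written down in a few lines. This is a minimal sketch, not code from the tutorial: the function names, the default degree `p`, and the bandwidth `sigma` are assumptions chosen for illustration. The explicit feature map `phi_degree2` is included only to show, for 2-D inputs and degree 2, that the kernel equals an inner product in a mapped space.

```python
import math

def homogeneous_poly_kernel(x, y, p=2):
    """Homogeneous polynomial kernel K(x, y) = (x . y)^p."""
    return sum(a * b for a, b in zip(x, y)) ** p

def gaussian_rbf_kernel(x, y, sigma=1.0):
    """Gaussian RBF kernel K(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def phi_degree2(x):
    """Explicit feature map for the degree-2 homogeneous kernel on 2-D inputs."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

if __name__ == "__main__":
    x, y = [1.0, 2.0], [3.0, 4.0]
    k = homogeneous_poly_kernel(x, y, p=2)
    mapped = sum(a * b for a, b in zip(phi_degree2(x), phi_degree2(y)))
    print(k, mapped)                   # the two values agree up to rounding
    print(gaussian_rbf_kernel(x, x))   # identical points give 1.0
```

The kernel is evaluated directly on the inputs, so the mapped space never has to be constructed explicitly.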
Bayesian Methods for Mixtures of Experts
1996
Cited by 56 (1 self)
We present a Bayesian framework for inferring the parameters of a mixture of experts model based on ensemble learning by variational free energy minimisation. The Bayesian approach avoids the over-fitting and noise level under-estimation problems of traditional maximum likelihood inference. We demonstrate these methods on artificial problems and sunspot time series prediction. Introduction: The task of estimating the parameters of adaptive models such as artificial neural networks using Maximum Likelihood (ML) is well documented, e.g. Geman, Bienenstock
Combinations of Weak Classifiers
1997
Cited by 32 (1 self)
To obtain classification systems with both good generalization performance and efficiency in space and time, we propose a learning method based on combinations of weak classifiers, where weak classifiers are linear classifiers (perceptrons) which can do a little better than making random guesses. A randomized algorithm is proposed to find the weak classifiers. They are then combined through a majority vote. As demonstrated through systematic experiments, the method developed is able to obtain combinations of weak classifiers with good generalization performance and a fast training time on a variety of test problems and real applications. Theoretical analysis on one of the test problems investigated in our experiments provides insights on when and why the proposed method works. In particular, when the strength of weak classifiers is properly chosen, combinations of weak classifiers can achieve a good generalization performance with polynomial space- and time-complexity.
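The general idea in this abstract (random linear classifiers that are slightly better than chance, combined by majority vote) can be sketched as follows. This is an illustration under stated assumptions, not the paper's algorithm: the Gaussian sampling of weights, the acceptance threshold `min_acc`, and all function names are hypothetical.

```python
import random

def linear_predict(w, b, x):
    """Perceptron-style decision: +1 if w . x + b >= 0, else -1."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1

def accuracy(w, b, X, y):
    """Fraction of samples in X that (w, b) labels correctly."""
    return sum(linear_predict(w, b, x) == t for x, t in zip(X, y)) / len(y)

def random_weak_classifiers(X, y, n_classifiers, min_acc=0.55, seed=0,
                            max_tries=100_000):
    """Draw random hyperplanes; keep those with training accuracy >= min_acc.

    min_acc is an assumed 'strength' threshold, not a value from the paper.
    """
    rng = random.Random(seed)
    dim = len(X[0])
    kept = []
    for _ in range(max_tries):
        if len(kept) == n_classifiers:
            break
        w = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        b = rng.gauss(0.0, 1.0)
        if accuracy(w, b, X, y) >= min_acc:
            kept.append((w, b))
    return kept

def majority_vote(classifiers, x):
    """Combine the +1/-1 votes; a tie is resolved as +1."""
    total = sum(linear_predict(w, b, x) for w, b in classifiers)
    return 1 if total >= 0 else -1

if __name__ == "__main__":
    X = [[1, 1], [2, 1], [-1, -1], [-2, -1]]
    y = [1, 1, -1, -1]
    ensemble = random_weak_classifiers(X, y, n_classifiers=5)
    print([majority_vote(ensemble, x) for x in X])
```

Using an odd number of classifiers avoids ties in the vote.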
The Global Dimensionality of Face Space
In: Proceedings of the 4th Intl. Conference on Automatic Face and Gesture Recognition, IEEE CS, 2000
Cited by 31 (3 self)
Low-dimensional representations of sensory signals are key to solving many of the computational problems encountered in high-level vision. Principal Component Analysis (PCA) has been used in the past to derive such compact representations for the object class of human faces. Here, with an interpretation of PCA as a probabilistic model, we employ two objective criteria to study its generalization properties in the context of large frontal-pose face databases. We find that the eigenfaces, the eigenspectrum, and the generalization depend strongly on the ensemble composition and size, with statistics for populations as large as 5500 still not stationary. Further, the assumption of mirror symmetry of the ensemble improves the quality of the results substantially in the low-statistics regime, and is also essential in the high-statistics regime. We employ a perceptual criterion and argue that, even with large statistics, the dimensionality of the PCA subspace necessary for adequate represent...
Homo Heuristicus: Why Biased Minds Make Better Inferences
2008
Cited by 22 (3 self)
Heuristics are efficient cognitive processes that ignore information. In contrast to the widely held view that less processing reduces accuracy, the study of heuristics shows that less information, computation, and time can in fact improve accuracy. We review the major progress made so far: (a) the discovery of less-is-more effects; (b) the study of the ecological rationality of heuristics, which examines in which environments a given strategy succeeds or fails, and why; (c) an advancement from vague labels to computational models of heuristics; (d) the development of a systematic theory of heuristics that identifies their building blocks and the evolved capacities they exploit, and views the cognitive system as relying on an “adaptive toolbox”; and (e) the development of an empirical methodology that accounts for individual differences, conducts competitive tests, and has provided evidence for people’s adaptive use of heuristics. Homo heuristicus has a biased mind and ignores part of the available information, yet a biased mind can handle uncertainty more efficiently and robustly than an unbiased mind relying on more resource-intensive and general-purpose processing strategies.
Heterogeneities in Macroparasite Infections: Patterns and Processes
2002
Cited by 8 (1 self)
ome rather complex. Some of the variation in parasite loads we observe is predictable. For example, in mammals and some other taxa, males tend to be more heavily infected than females, perhaps due to differences in immune function (Poulin 1996a, Schalk and Forbes 1997, McCurdy et al. 1998). Parasite loads tend to increase with age and may plateau in older animals, though if acquired immunity is important (or there is parasite-induced host mortality) then they may ultimately decline again, so reducing the degree of parasite aggregation. Genetic differences in susceptibility to infection may also be important, though their extent and direction are much more difficult to predict. Other factors that may contribute to the observed heterogeneities in worm burdens are the condition of the host (which may be a function of parasite load), host behaviour, parasite genetics and seasonality. Comparative studies of aggregation suggest that the infection process and the habitat of the host may make
Financial markets: very noisy information processing
Proceedings of the IEEE, 1998
Cited by 5 (2 self)
We report new results about the impact of noise on information processing with application to financial markets. These results quantify the tradeoff between the amount of data and the noise level in the data. They also provide estimates for the performance of a learning system in terms of the noise level. We use these results to derive a method for detecting the change in market volatility from period to period. We successfully apply these results to the four major foreign exchange (FX) markets. The results hold for linear as well as nonlinear learning models and algorithms and for different noise models.
On the dimensionality of face space
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007
Cited by 3 (1 self)
The dimensionality of face space is measured objectively in a psychophysical study. Within this framework we obtain a measurement of the dimension for the human visual system. Using an eigenface basis, evidence is presented that talented human observers are able to identify familiar faces that lie in a space of roughly 100 dimensions, and the average observer requires a space of between 100 and 200 dimensions. This is below most current estimates. It is further argued that these estimates give an upper bound for face space dimension, and this might be lowered by better constructed "eigenfaces", and by talented observers.
Troika – An Improved Stacking Schema for Classification Tasks
Cited by 3 (3 self)
Stacking is a general ensemble method in which a number of base classifiers are combined using one meta-classifier which learns their outputs. Such an approach provides certain advantages: simplicity; performance that is similar to the best classifier; and the capability of combining classifiers induced by different inducers. The disadvantage of stacking is that on multiclass problems, stacking seems to perform worse than other meta-learning approaches. In this paper we present Troika, a new stacking method for improving ensemble classifiers. The new scheme is built from three layers of combining classifiers. The new method was tested on various datasets and the results indicate the superiority of the proposed method to other legacy ensemble schemes, Stacking and StackingC, especially when the classification task consists of more than two classes.
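Plain stacking, as described in the first sentence of this abstract, can be sketched as below. This is not Troika's three-layer scheme: it is a single-level illustration in which the meta-classifier is a lookup table from base-classifier outputs to the most frequent true label. The function names and the table-based meta-learner are assumptions. In practice the meta-classifier is usually trained on held-out predictions rather than on the data the base classifiers saw.

```python
from collections import Counter, defaultdict

def fit_meta(base_classifiers, X, y):
    """Learn a table: tuple of base outputs -> most common true label."""
    votes = defaultdict(Counter)
    for x, label in zip(X, y):
        key = tuple(clf(x) for clf in base_classifiers)
        votes[key][label] += 1
    table = {key: counts.most_common(1)[0][0] for key, counts in votes.items()}
    # Fallback for output combinations never seen during fitting.
    default = Counter(y).most_common(1)[0][0]
    return table, default

def stacked_predict(base_classifiers, meta, x):
    """Run the base classifiers, then let the meta table decide."""
    table, default = meta
    key = tuple(clf(x) for clf in base_classifiers)
    return table.get(key, default)

if __name__ == "__main__":
    # Two hypothetical base classifiers, each looking at one coordinate.
    base = [lambda x: int(x[0] > 0), lambda x: int(x[1] > 0)]
    # XOR-style labels: neither base classifier solves this alone.
    X = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
    y = [0, 1, 1, 0]
    meta = fit_meta(base, X, y)
    print(stacked_predict(base, meta, (2, -3)))
```

The XOR-style data shows why a learned combiner can help: the meta level captures an interaction between base outputs that a simple vote would miss.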

