Results 1–10 of 34
A tutorial on support vector machines for pattern recognition
 Data Mining and Knowledge Discovery
, 1998
Abstract

Cited by 2497 (11 self)
The tutorial starts with an overview of the concepts of VC dimension and structural risk minimization. We then describe linear Support Vector Machines (SVMs) for separable and nonseparable data, working through a nontrivial example in detail. We describe a mechanical analogy, and discuss when SVM solutions are unique and when they are global. We describe how support vector training can be practically implemented, and discuss in detail the kernel mapping technique which is used to construct SVM solutions which are nonlinear in the data. We show how Support Vector machines can have very large (even infinite) VC dimension by computing the VC dimension for homogeneous polynomial and Gaussian radial basis function kernels. While very high VC dimension would normally bode ill for generalization performance, and while at present there exists no theory which shows that good generalization performance is guaranteed for SVMs, there are several arguments which support the observed high accuracy of SVMs, which we review. Results of some experiments which were inspired by these arguments are also presented. We give numerous examples and proofs of most of the key theorems. There is new material, and I hope that the reader will find that even old material is cast in a fresh light.
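The kernel mapping discussed in this abstract can be illustrated with a minimal sketch (not from the paper): for the homogeneous polynomial kernel of degree 2 on 2-D inputs, the kernel value equals an inner product in an explicit feature space, so an SVM can operate in that space without ever computing the feature map. The names `phi` and `poly_kernel` are illustrative choices, not the paper's notation.

```python
import numpy as np

def phi(x):
    """Explicit degree-2 feature map for a 2-D input:
    phi(x) = (x1^2, sqrt(2)*x1*x2, x2^2)."""
    return np.array([x[0] ** 2, np.sqrt(2) * x[0] * x[1], x[1] ** 2])

def poly_kernel(x, y):
    """Homogeneous polynomial kernel of degree 2: K(x, y) = (x . y)^2."""
    return float(np.dot(x, y)) ** 2

x = np.array([1.0, 2.0])
y = np.array([3.0, 0.5])

explicit = float(np.dot(phi(x), phi(y)))  # inner product in feature space
via_kernel = poly_kernel(x, y)            # same value, no feature map needed
print(explicit, via_kernel)               # both equal (x . y)^2 = 16.0
```

The same identity is what lets SVM training touch the data only through kernel evaluations, which is how nonlinear (even infinite-dimensional) feature spaces become tractable.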
Bayesian Methods for Mixtures of Experts
 In
, 1996
Abstract

Cited by 62 (1 self)
We present a Bayesian framework for inferring the parameters of a mixture of experts model based on ensemble learning by variational free energy minimisation. The Bayesian approach avoids the overfitting and noise level underestimation problems of traditional maximum likelihood inference. We demonstrate these methods on artificial problems and sunspot time series prediction. INTRODUCTION The task of estimating the parameters of adaptive models such as artificial neural networks using Maximum Likelihood (ML) is well documented, e.g. Geman, Bienenstock
Homo Heuristicus: Why Biased Minds Make Better Inferences
, 2009
Abstract

Cited by 48 (6 self)
Heuristics are efficient cognitive processes that ignore information. In contrast to the widely held view that less processing reduces accuracy, the study of heuristics shows that less information, computation, and time can in fact improve accuracy. We review the major progress made so far: (a) the discovery of less-is-more effects; (b) the study of the ecological rationality of heuristics, which examines in which environments a given strategy succeeds or fails, and why; (c) an advancement from vague labels to computational models of heuristics; (d) the development of a systematic theory of heuristics that identifies their building blocks and the evolved capacities they exploit, and views the cognitive system as relying on an “adaptive toolbox;” and (e) the development of an empirical methodology that accounts for individual differences, conducts competitive tests, and has provided evidence for people’s adaptive use of heuristics. Homo heuristicus has a biased mind and ignores part of the available information, yet a biased mind can handle uncertainty more efficiently and robustly than an unbiased mind relying on more resource-intensive and general-purpose processing strategies.
The Global Dimensionality of Face Space
 In: Proceedings of the 4th Intl. Conference on Automatic Face and Gesture Recognition, IEEE CS
, 2000
Abstract

Cited by 40 (4 self)
Low-dimensional representations of sensory signals are key to solving many of the computational problems encountered in high-level vision. Principal Component Analysis (PCA) has been used in the past to derive such compact representations for the object class of human faces. Here, with an interpretation of PCA as a probabilistic model, we employ two objective criteria to study its generalization properties in the context of large frontal-pose face databases. We find that the eigenfaces, the eigenspectrum, and the generalization depend strongly on the ensemble composition and size, with statistics for populations as large as 5,500 still not stationary. Further, the assumption of mirror symmetry of the ensemble improves the quality of the results substantially in the low-statistics regime, and is also essential in the high-statistics regime. We employ a perceptual criterion and argue that, even with large statistics, the dimensionality of the PCA subspace necessary for adequate represent...
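The eigenface computation and the mirror-symmetry augmentation this abstract mentions can be sketched as follows (a toy illustration, not the authors' code; the 8×8 random "images" stand in for a real face ensemble):

```python
import numpy as np

rng = np.random.default_rng(0)
faces = rng.normal(size=(50, 8, 8))      # 50 toy "images", 8x8 pixels each
mirrored = faces[:, :, ::-1]             # horizontally reflected copies
# Imposing mirror symmetry: augment the ensemble with the reflections.
ensemble = np.concatenate([faces, mirrored]).reshape(100, -1)

# PCA via SVD of the mean-centered ensemble.
centered = ensemble - ensemble.mean(axis=0)
_, s, Vt = np.linalg.svd(centered, full_matrices=False)

eigenfaces = Vt.reshape(-1, 8, 8)             # rows of Vt, reshaped to images
eigenspectrum = s ** 2 / (len(ensemble) - 1)  # PCA variances, descending
print(eigenfaces.shape)
```

Doubling the ensemble with reflections costs nothing at acquisition time, which is why it helps most in the low-statistics regime the abstract describes.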
Combinations of Weak Classifiers
, 1997
Abstract

Cited by 37 (1 self)
To obtain classification systems with both good generalization performance and efficiency in space and time, we propose a learning method based on combinations of weak classifiers, where weak classifiers are linear classifiers (perceptrons) which can do a little better than making random guesses. A randomized algorithm is proposed to find the weak classifiers. They are then combined through a majority vote. As demonstrated through systematic experiments, the method developed is able to obtain combinations of weak classifiers with good generalization performance and a fast training time on a variety of test problems and real applications. Theoretical analysis on one of the test problems investigated in our experiments provides insights on when and why the proposed method works. In particular, when the strength of weak classifiers is properly chosen, combinations of weak classifiers can achieve good generalization performance with polynomial space and time complexity.
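The scheme this abstract describes, randomized weak linear classifiers combined by a majority vote, can be sketched under simplifying assumptions (an illustration only, not the paper's algorithm; the class-mean rule and the toy data are invented here):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy separable data: the label is the sign of the first coordinate.
X = rng.normal(size=(200, 2))
X[:, 0] += np.sign(X[:, 0])          # push the two classes apart a little
y = np.sign(X[:, 0])

def random_weak_classifier(X, y):
    """A randomized linear rule: separate the class means of a small
    random subsample (10 points per class). Individually crude, but
    better than random guessing."""
    pos = rng.choice(np.where(y == 1)[0], size=10, replace=False)
    neg = rng.choice(np.where(y == -1)[0], size=10, replace=False)
    w = X[pos].mean(axis=0) - X[neg].mean(axis=0)
    b = -w @ X[np.concatenate([pos, neg])].mean(axis=0)
    return lambda Z: np.sign(Z @ w + b)

# Combine 25 weak classifiers by an unweighted majority vote.
voters = [random_weak_classifier(X, y) for _ in range(25)]
majority = np.sign(np.sum([c(X) for c in voters], axis=0))
accuracy = (majority == y).mean()
print(accuracy)
```

Each voter trains on only 20 points, so training is cheap; the vote averages away the noise of the individual random subsamples.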
Heterogeneities in Macroparasite Infections: Patterns and Processes
, 2002
Abstract

Cited by 18 (1 self)
ome rather complex. Some of the variation in parasite loads we observe is predictable. For example, in mammals and some other taxa, males tend to be more heavily infected than females, perhaps due to differences in immune function (Poulin 1996a, Schalk and Forbes 1997, McCurdy et al. 1998). Parasite loads tend to increase with age and may plateau in older animals, though if acquired immunity is important (or there is parasite-induced host mortality) then they may ultimately decline again, so reducing the degree of parasite aggregation. Genetic differences in susceptibility to infection may also be important, though their extent and direction are much more difficult to predict. Other factors that may contribute to the observed heterogeneities in worm burdens are the condition of the host (which may be a function of parasite load), host behaviour, parasite genetics and seasonality. Comparative studies of aggregation suggest that the infection process and the habitat of the host may make
On the dimensionality of face space
 IEEE Transactions on Pattern Analysis and Machine Intelligence
Abstract

Cited by 7 (1 self)
Abstract—The dimensionality of face space is measured objectively in a psychophysical study. Within this framework, we obtain a measurement of the dimension for the human visual system. Using an eigenface basis, evidence is presented that talented human observers are able to identify familiar faces that lie in a space of roughly 100 dimensions and the average observer requires a space of between 100 and 200 dimensions. This is below most current estimates. It is further argued that these estimates give an upper bound for face space dimension and this might be lowered by better constructed “eigenfaces” and by talented observers. Index Terms—Face and gesture recognition, computational models of vision, psychology, singular value decomposition.
A mixture model for pose clustering
 Pattern Recognition Letters
, 1999
Abstract

Cited by 6 (0 self)
This paper describes a structural method for object alignment by pose clustering. The idea underlying pose clustering is to decompose the objects under consideration into k-tuples of primitive parts. By bringing pairs of k-tuples into correspondence, sets of alignment parameters are estimated. The global alignment corresponds to the set of parameters with maximum votes. The work reported here offers two novel contributions. Firstly, we impose structural constraints on the arrangement of the k-tuples of primitives used for pose clustering. This limits problems of a combinatorial nature and eases the search for consistent pose clusters. Secondly, we use the EM algorithm to estimate maximum likelihood alignment parameters. Here we fit a mixture model to the set of transformation parameter votes. We control the order of the underlying mixture model using a minimum description length criterion. The new alignment method is illustrated
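Fitting a mixture model to transformation-parameter votes with EM, as this abstract describes, can be sketched in one dimension (a toy illustration, not the paper's implementation; the vote data and the fixed two-component setup are assumptions made here):

```python
import numpy as np

rng = np.random.default_rng(2)
# Toy 1-D "votes": 80 votes near the true alignment parameter 0.5,
# plus 20 scattered outlier votes from spurious correspondences.
votes = np.concatenate([rng.normal(0.5, 0.05, 80), rng.normal(0.0, 0.5, 20)])

# Two-component Gaussian mixture fitted by EM.
mu = np.quantile(votes, [0.1, 0.9])   # spread-out initial means
var = np.array([0.1, 0.1])
pi = np.array([0.5, 0.5])
for _ in range(50):
    # E-step: responsibility of each component for each vote.
    dens = (pi * np.exp(-(votes[:, None] - mu) ** 2 / (2 * var))
            / np.sqrt(2 * np.pi * var))
    resp = dens / dens.sum(axis=1, keepdims=True)
    # M-step: re-estimate weights, means, and variances.
    nk = resp.sum(axis=0)
    pi = nk / len(votes)
    mu = (resp * votes[:, None]).sum(axis=0) / nk
    var = (resp * (votes[:, None] - mu) ** 2).sum(axis=0) / nk

# The heaviest component's mean plays the role of the global alignment.
best = mu[np.argmax(pi)]
print(best)
```

In the paper's setting the model order is not fixed at two but chosen by a minimum description length criterion; the sketch hard-codes two components for brevity.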
Financial markets: very noisy information processing
 Proceedings of the IEEE
, 1998
Abstract

Cited by 6 (2 self)
We report new results about the impact of noise on information processing with application to financial markets. These results quantify the tradeoff between the amount of data and the noise level in the data. They also provide estimates for the performance of a learning system in terms of the noise level. We use these results to derive a method for detecting the change in market volatility from period to period. We successfully apply these results to the four major foreign exchange (FX) markets. The results hold for linear as well as nonlinear learning models and algorithms and for different noise models.