The WEKA Data Mining Software: An Update
Abstract

Cited by 704 (11 self)
More than twelve years have elapsed since the first public release of WEKA. In that time, the software has been rewritten entirely from scratch, evolved substantially and now accompanies a text on data mining [35]. These days, WEKA enjoys widespread acceptance in both academia and business, has an active community, and has been downloaded more than 1.4 million times since being placed on SourceForge in April 2000. This paper provides an introduction to the WEKA workbench, reviews the history of the project, and, in light of the recent 3.6 stable release, briefly discusses what has been added since the last stable version (Weka 3.4) released in 2003.
Using string kernels to identify famous performers from their playing style
 In Proceedings of the 15th European Conference on Machine Learning (ECML’2004), 2004
Abstract

Cited by 31 (9 self)
In this chapter we show a novel application of string kernels: recognising famous pianists from their style of playing. The characteristics of performers playing the same piece are obtained from changes in beat-level tempo and beat-level loudness, which over the course of the piece form a performance worm. From such worms, general performance alphabets can be derived, and pianists' performances can then be represented as strings. We show that when using the string kernel on this data, both kernel partial least squares and Support Vector Machines outperform the current best results. Furthermore, we suggest a new method of obtaining feature directions from the Kernel Partial Least Squares algorithm and show that, when used in conjunction with a Support Vector Machine, this can deliver better performance than methods previously used in the literature.
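Kernel partial least squares (KPLS) recurs in nearly every entry on this page. A minimal sketch of NIPALS-style KPLS score extraction, assuming the usual formulation on a centered kernel matrix; the function name and iteration cap are illustrative choices, not code from any of the listed papers:

```python
import numpy as np

def kpls_scores(K, Y, n_components, max_iter=100, tol=1e-10):
    """Extract latent score vectors from a centered kernel matrix K (n x n)
    and a centered response matrix Y (n x m), NIPALS-style."""
    K = K.astype(float).copy()
    Y = Y.astype(float).copy()
    n = K.shape[0]
    T = np.zeros((n, n_components))
    for a in range(n_components):
        u = Y[:, :1].copy()
        for _ in range(max_iter):
            t = K @ u
            t = t / np.linalg.norm(t)        # kernel score vector
            c = Y.T @ t                      # response loadings
            u_new = Y @ c
            u_new = u_new / np.linalg.norm(u_new)
            if np.linalg.norm(u_new - u) < tol:
                u = u_new
                break
            u = u_new
        T[:, a:a + 1] = t
        # deflate: remove the extracted direction from K and Y
        P = np.eye(n) - t @ t.T
        K = P @ K @ P
        Y = P @ Y
    return T
```

With a single-column response the inner loop settles immediately; it is the deflation step that makes successive score vectors mutually orthogonal.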
Gabor-Based Kernel Partial-Least-Squares Discrimination Features for Face Recognition
, 2007
Abstract

Cited by 6 (0 self)
The paper presents a novel method for the extraction of facial features based on the Gabor-wavelet representation of face images and the kernel partial-least-squares discrimination (KPLSD) algorithm. The proposed feature-extraction method, called the Gabor-based kernel partial-least-squares discrimination (GKPLSD), is performed in two consecutive steps. In the first step a set of forty Gabor wavelets is used to extract discriminative and robust facial features, while in the second step the kernel partial-least-squares discrimination technique is used to reduce the dimensionality of the Gabor feature vector and to further enhance its discriminatory power. For optimal performance, the KPLSD-based transformation is implemented using the recently proposed fractional-power-polynomial models. The experimental results based on the XM2VTS and ORL databases show that the GKPLSD approach outperforms feature-extraction methods such as principal component analysis (PCA), linear discriminant analysis (LDA), kernel principal component analysis (KPCA) or generalized discriminant analysis (GDA), as well as combinations of these methods with Gabor representations of the face images. Furthermore, as the KPLSD algorithm is derived from the kernel partial-least-squares regression (KPLSR) model, it does not suffer from the small-sample-size problem, which is regularly encountered in the field of face recognition.
Face authentication using a hybrid approach
 In Journal of Electronic Imaging, 2008
Abstract

Cited by 2 (2 self)
This paper presents a hybrid approach to face-feature extraction based on the trace transform and the novel kernel partial-least-squares discriminant analysis (KPA). The hybrid approach, called trace kernel partial-least-squares discriminant analysis (TKPA), first uses a set of fifteen trace functionals to derive robust and discriminative facial features and then applies the KPA method to reduce their dimensionality. The feasibility of the proposed approach was successfully tested on the XM2VTS database, where a false rejection rate (FRR) of 1.25% and a false acceptance rate (FAR) of 2.11% were achieved in our best-performing face-authentication experiment. The experimental results also show that the proposed approach can outperform kernel methods such as generalized discriminant analysis (GDA), kernel Fisher analysis (KFA) and complete kernel Fisher discriminant analysis (CKFA), as well as combinations of these methods with features extracted using the trace transform.
The influence of the risk functional in data classification with MLPs
 In Proceedings of the International Conference on Artificial Neural Networks, 2008
Abstract

Cited by 1 (1 self)
We investigate the capability of multilayer perceptrons (MLPs) using specific risk functionals to attain the minimum probability of error (optimal performance) achievable by the class of mappings implemented by the MLP. For that purpose we have carried out a large set of experiments using different risk functionals and datasets. The experiments were rigorously controlled so that any performance difference could only be attributed to the different risk functional being used. Statistical analysis was also conducted carefully. Among the several conclusions that can be drawn from our experimental results, it is worth emphasizing that a risk functional based on a specially tuned exponentially weighted distance attained the best performance on a large variety of datasets. As to the issue of attaining the minimum probability of error, we also carried out classification experiments using non-MLP classifiers that implement complex mappings and are known to provide the best results to date. These experiments provided evidence that, at least in many cases, it is possible to reach the optimal performance by using an adequate risk functional.
Random Forests Feature Selection with KPLS: Detecting Ischemia from Magnetocardiograms
, 2006
Abstract

Cited by 1 (1 self)
Random Forests were introduced by Breiman for feature (variable) selection and improved predictions with decision tree models. The resulting model is often superior to AdaBoost and bagging approaches. In this paper the random forests approach is extended to variable selection with other learning models, in this case Partial Least Squares (PLS) and Kernel Partial Least Squares (KPLS), to estimate the importance of variables. This variable selection method is demonstrated on two benchmark datasets (Boston Housing and the South African heart disease data). Finally, the methodology is applied to magnetocardiogram data for the detection of ischemic heart disease.
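The random-forests-style notion of variable importance being transplanted here boils down to permuting one input column at a time and measuring how much the prediction error grows. A minimal model-agnostic sketch, with plain least squares standing in for the paper's PLS/KPLS models; the function names are illustrative, not the authors' code:

```python
import numpy as np

def permutation_importance(fit, predict, X, y, rng):
    """Score each input variable by how much the model's mean squared
    error grows when that column is randomly permuted (scrambled)."""
    model = fit(X, y)
    base = np.mean((predict(model, X) - y) ** 2)
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        Xp = X.copy()
        Xp[:, j] = rng.permutation(Xp[:, j])  # break this column's link to y
        scores[j] = np.mean((predict(model, Xp) - y) ** 2) - base
    return scores
```

Any fit/predict pair can be plugged in, which is exactly what lets the idea extend beyond decision trees to other learners.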
MKPLS: Manifold Kernel Partial Least Squares for Lipreading and Speaker Identification
 In 2013 IEEE Conference on Computer Vision and Pattern Recognition
Abstract
Visual speech recognition is a challenging problem due to confusion between visual speech features. The speaker identification problem is usually coupled with speech recognition. Moreover, speaker identification is important to several applications, such as automatic access control, biometrics, authentication, and personal privacy. In this paper, we propose a novel approach for lipreading and speaker identification. We propose a new approach for manifold parameterization in a low-dimensional latent space, where each manifold is represented as a point in that space. We initially parameterize each instance manifold using a nonlinear mapping from a unified manifold representation. We then factorize the parameter space using Kernel Partial Least Squares (KPLS) to achieve a low-dimensional manifold latent space. We use two-way projections to obtain two manifold latent spaces, one for the speech content and one for the speaker. We apply our approach on two public databases: AVLetters and OuluVS. We show results for three different lipreading settings: speaker independent, speaker dependent, and speaker semi-dependent. Our approach outperforms the baseline by at least 15% in the speaker semi-dependent setting, and is competitive in the other two settings.
PLS Regression and Censored Data (Régression PLS et données censurées)
"... Subject of the thesis: ..."
Sigma Tuning of Gaussian Kernels: Detection of Ischemia from Magnetocardiograms
, 2011
Abstract
This chapter introduces a novel Levenberg-Marquardt-like second-order algorithm for tuning the Parzen window σ in a Radial Basis Function (Gaussian) kernel. In this case each attribute has its own sigma parameter associated with it. The values of the optimized σ are then used as a gauge for variable selection. In this study a Kernel Partial Least Squares (KPLS) model is applied to several benchmark data sets in order to estimate the effectiveness of the second-order sigma tuning procedure for an RBF kernel. The variable subset selection method based on these sigma values is then compared with different feature selection procedures such as random forests and sensitivity analysis. The sigma-tuned RBF kernel model outperforms KPLS and SVM models with a single sigma value. KPLS models also compare favorably with Least Squares Support Vector Machines (LS-SVM), epsilon-insensitive Support Vector Regression and traditional PLS. The sigma tuning and variable selection procedure introduced in this chapter is applied to industrial magnetocardiogram data for the detection of ischemic heart disease from measurements of the magnetic field around the heart.
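The per-attribute sigma idea is easy to state concretely: the Gaussian kernel gets one width per input dimension, so a very large σ_d effectively switches attribute d off, which is what lets the tuned sigmas double as a variable-selection gauge. A sketch of such a kernel, assuming the standard anisotropic form; the function name is mine, and the chapter's Levenberg-Marquardt tuning loop itself is not reproduced:

```python
import numpy as np

def ard_rbf_kernel(X1, X2, sigma):
    """Gaussian kernel with one Parzen width per attribute:
    k(x, z) = exp(-0.5 * sum_d ((x_d - z_d) / sigma_d)**2).
    Small sigma_d -> attribute d dominates the distance; huge
    sigma_d -> attribute d is effectively ignored."""
    Z1 = X1 / sigma          # scale each attribute by its own sigma
    Z2 = X2 / sigma
    sq = ((Z1 ** 2).sum(1)[:, None]
          + (Z2 ** 2).sum(1)[None, :]
          - 2.0 * Z1 @ Z2.T)
    return np.exp(-0.5 * np.maximum(sq, 0.0))  # clip tiny negative round-off
```

Reading the optimized sigmas then amounts to ranking attributes by 1/σ_d: attributes whose widths blow up during tuning are candidates for removal.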
Non-Linear Variable Selection in a Regression Context
Abstract
A Bayesian approach to variable selection in a regression context is presented. It aims to find which of a large number of input variables are the important ones, in the sense that they contribute to the given regression output. Unlike many approaches in the literature, which focus more on features, it explicitly seeks to include the prior belief that many of the input variables contribute no information. The EM methodology presented enables this to be done in a nonlinear regression framework, in particular that of kernel regression. An initial experiment on a biscuit dough problem is presented.