Results 1–10 of 1,027
Novel methods improve prediction of species’ distributions from occurrence data
 Ecography
, 2006
Extremely Randomized Trees
 Machine Learning
, 2003
Abstract

Cited by 232 (44 self)
This paper presents a new learning algorithm based on decision tree ensembles. In opposition to the classical decision tree induction method, the trees of the ensemble are built by selecting the tests during their induction fully at random. This extreme …
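For readers who want to try the method, extremely randomized trees are available in scikit-learn as `ExtraTreesClassifier`. A minimal sketch on synthetic data (the dataset and hyperparameters here are illustrative, not from the paper):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import cross_val_score

# Synthetic data stands in for a real benchmark problem.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# In Extra-Trees, split attributes and cut-points are drawn at random
# during induction rather than optimized over the training set.
clf = ExtraTreesClassifier(n_estimators=100, random_state=0)
scores = cross_val_score(clf, X, y, cv=5)
```

Despite the random splits, the averaged ensemble is typically competitive with optimized-split forests, at lower training cost.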
Metric Learning by Collapsing Classes
Abstract

Cited by 198 (2 self)
We present an algorithm for learning a quadratic Gaussian metric (Mahalanobis distance) for use in classification tasks. Our method relies on the simple geometric intuition that a good metric is one under which points in the same class are simultaneously near each other and far from points in the other classes. We construct a convex optimization problem whose solution generates such a metric by trying to collapse all examples in the same class to a single point and push examples in other classes infinitely far away. We show that when the metric we learn is used in simple classifiers, it yields substantial improvements over standard alternatives on a variety of problems. We also discuss how the learned metric may be used to obtain a compact low dimensional feature representation of the original input space, allowing more efficient classification with very little reduction in performance.
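The objective behind "collapsing classes" can be written down directly: compare the neighbor distribution induced by the metric against the ideal one in which same-class points coincide. A NumPy sketch of that objective only (the paper's convex solver is not shown, and this simplified form is my reading of the abstract, not the authors' code):

```python
import numpy as np
from scipy.special import logsumexp

def mcml_objective(A, X, y):
    """Average negative log-probability that a point's stochastic neighbor
    (under Mahalanobis matrix A) belongs to its own class -- the quantity
    a collapsing-classes metric learner drives down. No solver here."""
    n = len(X)
    diff = X[:, None, :] - X[None, :, :]             # pairwise differences
    d = np.einsum('ijk,kl,ijl->ij', diff, A, diff)   # squared Mahalanobis distances
    np.fill_diagonal(d, np.inf)                      # exclude self-pairs
    logp = -d - logsumexp(-d, axis=1, keepdims=True)  # softmax over neighbors
    same = (y[:, None] == y[None, :]) & ~np.eye(n, dtype=bool)
    return -logp[same].mean()
```

A metric that pulls same-class points together and pushes other classes away yields a lower objective than one that ignores class structure.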
The Entire Regularization Path for the Support Vector Machine
, 2004
Abstract

Cited by 179 (9 self)
In this paper we argue that the choice of the SVM cost parameter can be critical. We then derive an algorithm that can fit the entire path of SVM solutions for every value of the cost parameter, with essentially the same computational cost as fitting one SVM model.
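The baseline the paper improves on is easy to reproduce: refit an SVM at each value of C and watch the solution change. A sketch (scikit-learn has no SVM path implementation, so this brute-force grid is what the paper's path algorithm replaces at roughly the cost of a single fit):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=100, n_features=5, random_state=0)

# Brute force: one full fit per C value. The path algorithm traces the
# same family of solutions continuously in C.
counts = []
for C in np.logspace(-2, 2, 5):
    clf = SVC(kernel='linear', C=C).fit(X, y)
    counts.append(int(clf.n_support_.sum()))  # size of the support set
```

The support set typically shrinks as C grows, which is exactly the kind of structural change along the path that makes the choice of C critical.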
An introduction to boosting and leveraging
 Advanced Lectures on Machine Learning, LNCS
, 2003
Adapting ranking SVM to document retrieval
 In Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
, 2006
Abstract

Cited by 111 (19 self)
The paper is concerned with applying learning to rank to document retrieval. Ranking SVM is a typical method of learning to rank. We point out that there are two factors one must consider when applying Ranking SVM, or in general any “learning to rank” method, to document retrieval. First, correctly ranking documents at the top of the result list is crucial for an Information Retrieval system, so one must conduct training in a way that makes such top-ranked results accurate. Second, the number of relevant documents can vary from query to query, so one must avoid training a model biased toward queries with a large number of relevant documents. Previously, when existing methods including Ranking SVM were applied to document retrieval, neither of the two factors was taken into consideration. We show it is possible to modify conventional Ranking SVM so that it is better suited to document retrieval. Specifically, we modify the “hinge loss” function in Ranking SVM to deal with the problems described above. We employ two methods to optimize the loss function: gradient descent and quadratic programming. Experimental results show that our method, referred to as Ranking SVM for IR, can outperform conventional Ranking SVM and other existing methods for document retrieval on two datasets.
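The second factor above (queries with many relevant documents dominating training) can be illustrated with a query-normalized pairwise hinge loss. A sketch in the spirit of Ranking SVM (equal per-query weighting is a hypothetical simplification, not the paper's exact modified loss):

```python
import numpy as np

def pairwise_hinge_loss(scores, labels, qids):
    """Pairwise hinge loss, averaged within each query and then across
    queries, so that no single query dominates by sheer pair count."""
    total = 0.0
    queries = np.unique(qids)
    for q in queries:
        m = qids == q
        s, l = scores[m], labels[m]
        # All (more-relevant, less-relevant) document pairs in this query.
        i, j = np.where(l[:, None] > l[None, :])
        if len(i) == 0:
            continue
        # Hinge: the more relevant document should score higher by a margin of 1.
        total += np.maximum(0.0, 1.0 - (s[i] - s[j])).mean()
    return total / len(queries)
```

A correctly ordered ranking with sufficient margins incurs zero loss; reversing the order drives the loss up.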
Piecewise linear regularized solution paths
 Ann. Statist
, 2007
Abstract

Cited by 105 (8 self)
We consider the generic regularized optimization problem β̂(λ) = argmin_β L(y, Xβ) + λJ(β). Recently, Efron et al. (2004) have shown that for the Lasso – that is, if L is squared-error loss and J(β) = ‖β‖₁ is the ℓ1 norm of β – the optimal coefficient path is piecewise linear, i.e., ∂β̂(λ)/∂λ is piecewise constant. We derive a general characterization of the properties of (loss L, penalty J) pairs which give piecewise linear coefficient paths. Such pairs allow for efficient generation of the full regularized coefficient paths. We investigate the nature of the efficient path-following algorithms which arise. We use our results to suggest robust versions of the Lasso for regression and classification, and to develop new, efficient algorithms for existing problems in the literature, including Mammen & van de Geer's Locally Adaptive Regression Splines.
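The Lasso case discussed above can be inspected directly: scikit-learn's `lars_path` returns the knots of the piecewise linear coefficient path. A small sketch on synthetic data:

```python
import numpy as np
from sklearn.linear_model import lars_path

rng = np.random.RandomState(0)
X = rng.randn(50, 5)
# True coefficients: two active, three zero.
y = X @ np.array([3.0, -2.0, 0.0, 0.0, 0.0]) + 0.1 * rng.randn(50)

# For squared-error loss with an l1 penalty the path is piecewise linear,
# so it is fully described by its knots: between consecutive alphas the
# coefficients move linearly.
alphas, _, coefs = lars_path(X, y, method='lasso')
```

Because the whole path is just these knots, model selection over λ costs little more than a single fit, which is the efficiency the paper generalizes to other (loss, penalty) pairs.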
Optimizing Spatial Filters for Robust EEG Single-Trial Analysis
 IEEE Signal Proc. Magazine
, 2008
Abstract

Cited by 92 (22 self)
Due to volume conduction, multichannel electroencephalogram (EEG) recordings give a rather blurred image of brain activity. Spatial filters are therefore extremely useful in single-trial analysis to improve the signal-to-noise ratio. There are powerful methods from machine learning and signal processing that permit the optimization of spatio-temporal filters for each subject in a data-dependent fashion, beyond the fixed filters based on the sensor geometry, e.g., Laplacians. Here we elucidate the theoretical background of the Common Spatial Pattern (CSP) algorithm, a popular method in Brain-Computer Interface (BCI) research. Apart from reviewing several variants of the basic algorithm, we reveal tricks of the trade for achieving powerful CSP performance, briefly elaborate on theoretical aspects of CSP, and demonstrate the application of CSP-type preprocessing in our studies of the Berlin Brain-Computer Interface project.
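The core of basic CSP is a generalized eigendecomposition of the two class-conditional covariance matrices. A minimal sketch of that step only (not the paper's full pipeline, and the trace normalization is one common convention, not necessarily the authors'):

```python
import numpy as np
from scipy.linalg import eigh

def csp_filters(X1, X2, n_filters=2):
    """Basic Common Spatial Patterns. X1, X2: lists of trials, each a
    (channels x samples) array, for the two classes. Returns spatial
    filters (rows) maximizing variance for one class while minimizing
    it for the other."""
    C1 = np.mean([x @ x.T / np.trace(x @ x.T) for x in X1], axis=0)
    C2 = np.mean([x @ x.T / np.trace(x @ x.T) for x in X2], axis=0)
    # Generalized eigenproblem: C1 w = lambda (C1 + C2) w.
    evals, evecs = eigh(C1, C1 + C2)
    order = np.argsort(evals)
    # Filters from both ends of the eigenvalue spectrum are the most
    # discriminative (extreme variance ratios between the classes).
    pick = np.r_[order[:n_filters // 2], order[-(n_filters - n_filters // 2):]]
    return evecs[:, pick].T
```

Projecting trials through these filters and taking log-variances is the usual feature extraction step that follows in BCI pipelines.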
Machine learning classifiers and fMRI: a tutorial overview
 NeuroImage
, 2009
Abstract

Cited by 89 (4 self)
Interpreting brain image experiments requires analysis of complex, multivariate data. In recent years, one analysis approach that has grown in popularity is the use of machine learning algorithms to train classifiers to decode stimuli, mental states, behaviors and other variables of interest from fMRI data, and thereby show that the data contain enough information about them. In this tutorial overview we review some of the key choices faced in using this approach, as well as how to derive statistically significant results, illustrating each point from a case study. Furthermore, we show how, in addition to answering the question ‘is there information about a variable of interest?’ (pattern discrimination), classifiers can be used to tackle other classes of question, namely ‘where is the information?’ (pattern localization) and ‘how is that information encoded?’ (pattern characterization).
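The pattern-discrimination question above, including the statistical-significance step, maps directly onto a cross-validated classifier plus a permutation test. A sketch on synthetic stand-in data (no real fMRI here; the 1.0 signal shift and the linear SVM are illustrative choices, not the paper's case study):

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.model_selection import permutation_test_score

# 80 "trials" x 200 "voxels", with weak class information in 10 voxels.
rng = np.random.RandomState(0)
X = rng.randn(80, 200)
y = rng.randint(0, 2, 80)
X[y == 1, :10] += 1.0

# Permutation test: refit under shuffled labels to get a null distribution
# for the cross-validated decoding accuracy.
score, perm_scores, pval = permutation_test_score(
    LinearSVC(dual=False), X, y, cv=5, n_permutations=100, random_state=0)
```

Above-chance accuracy alone is not evidence; the p-value against the label-permutation null is what licenses the claim that the data contain information about the variable.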
Practical selection of SVM parameters and noise estimation for SVM regression
 Neural Networks
, 2004
Abstract

Cited by 89 (0 self)
We investigate practical selection of meta-parameters for SVM regression (that is, the ε-insensitive zone and the regularization parameter C). The proposed methodology advocates analytic parameter selection directly from the training data, rather than the resampling approaches commonly used in SVM applications. Good generalization performance of the proposed parameter selection is demonstrated empirically using several low-dimensional and high-dimensional regression problems. Further, we point out the importance of Vapnik's ε-insensitive loss for regression problems with finite samples. To this end, we compare the generalization performance of SVM regression (with optimally chosen ε) with regression using ‘least-modulus’ loss (ε = 0). These comparisons indicate superior generalization performance of SVM regression in finite-sample settings.
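The analytic prescriptions commonly attributed to this work set C from the spread of the training responses and ε from the noise level and sample size. A sketch (the exact formulas below are as commonly cited for this paper, not verified against it, so check the original before relying on them):

```python
import numpy as np

def analytic_svr_params(y, noise_std):
    """Analytic SVR meta-parameter selection in the spirit of the paper:
    C covers the range of training responses; epsilon shrinks with sample
    size n but grows with the estimated noise level. Formulas are as
    commonly cited -- verify against the paper before use."""
    n = len(y)
    C = max(abs(y.mean() + 3 * y.std()), abs(y.mean() - 3 * y.std()))
    eps = 3 * noise_std * np.sqrt(np.log(n) / n)
    return C, eps
```

Both quantities come straight from the training data, replacing the cross-validation grid search that the abstract argues against.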