Results 1–10 of 538
The Entire Regularization Path for the Support Vector Machine
, 2004
Cited by 148 (9 self)
In this paper we argue that the choice of the SVM cost parameter can be critical. We then derive an algorithm that can fit the entire path of SVM solutions for every value of the cost parameter, with essentially the same computational cost as fitting one SVM model.
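The claim is easiest to appreciate against the brute-force alternative. The sketch below is illustrative only (it is not the paper's path algorithm, and all names are made up): it refits a tiny subgradient linear SVM once per value of C, which is exactly the repeated cost the paper's path-following approach avoids.

```python
import numpy as np

def fit_linear_svm(X, y, C, epochs=200, lr=0.01):
    """Tiny full-batch subgradient solver for the linear SVM primal
        min_w  0.5 * ||w||^2 + C * sum_i max(0, 1 - y_i * w.x_i).
    Brute-force baseline only: the paper's algorithm recovers the
    solution for *all* C at roughly the cost of one such fit."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        margins = y * (X @ w)
        viol = margins < 1                     # margin-violating points
        grad = w - C * (y[viol] @ X[viol])     # subgradient of the objective
        w -= lr * grad
    return w
```

Refitting over a grid, e.g. `[fit_linear_svm(X, y, C) for C in np.logspace(-2, 2, 20)]`, multiplies the cost by the grid size, and still only samples the path at discrete points.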
Extremely Randomized Trees
 MACHINE LEARNING
, 2003
Cited by 130 (34 self)
This paper presents a new learning algorithm based on decision tree ensembles. In contrast to classical decision tree induction methods, the trees of the ensemble are built by selecting the split tests entirely at random during induction. This extreme …
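The core idea of random test selection can be sketched for a single tree node. This is a simplified illustration under assumed details (function and parameter names are made up; the score here is variance reduction, a regression-style criterion):

```python
import numpy as np

def random_split(X, y, k=3, rng=None):
    """Pick the best of k fully random (feature, threshold) tests,
    scored by reduction in label variance. Unlike classical induction,
    the cut-point is drawn uniformly at random, never optimized."""
    rng = np.random.default_rng(rng)
    n, d = X.shape
    best = None
    for feat in rng.choice(d, size=min(k, d), replace=False):
        lo, hi = X[:, feat].min(), X[:, feat].max()
        if lo == hi:
            continue                       # constant feature, no valid split
        thr = rng.uniform(lo, hi)          # random cut-point
        left = X[:, feat] < thr
        if left.all() or (~left).all():
            continue                       # degenerate split
        p = left.mean()
        score = y.var() - (p * y[left].var() + (1 - p) * y[~left].var())
        if best is None or score > best[0]:
            best = (score, feat, thr)
    return best  # (variance reduction, feature index, threshold)
```

An ensemble of trees grown this way averages out the extra variance that the randomized tests introduce at each node.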
Metric Learning by Collapsing Classes
Cited by 130 (2 self)
We present an algorithm for learning a quadratic Gaussian metric (Mahalanobis distance) for use in classification tasks. Our method relies on the simple geometric intuition that a good metric is one under which points in the same class are simultaneously near each other and far from points in the other classes. We construct a convex optimization problem whose solution generates such a metric by trying to collapse all examples in the same class to a single point and push examples in other classes infinitely far away. We show that when the metric we learn is used in simple classifiers, it yields substantial improvements over standard alternatives on a variety of problems. We also discuss how the learned metric may be used to obtain a compact low dimensional feature representation of the original input space, allowing more efficient classification with very little reduction in performance.
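The two ingredients the abstract mentions, a Mahalanobis distance and a low-dimensional projection derived from it, can be sketched as follows. This is not the paper's learning algorithm; the matrix A below is illustrative, standing in for the metric their convex program would produce:

```python
import numpy as np

def mahalanobis(x, y, A):
    """Squared Mahalanobis distance d_A(x, y) = (x - y)^T A (x - y)
    for a PSD matrix A (here supplied by hand, not learned)."""
    diff = np.asarray(x) - np.asarray(y)
    return float(diff @ A @ diff)

def project(X, A, dim):
    """Low-rank feature map: keep the top-`dim` eigendirections of A,
    scaled so that Euclidean distance in the new space approximates
    (and for dim = full rank, equals) d_A in the original space."""
    w, V = np.linalg.eigh(A)               # eigenvalues in ascending order
    W = V[:, -dim:] * np.sqrt(w[-dim:])    # scaled top eigenvectors
    return X @ W
```

Dropping the small-eigenvalue directions gives the compact representation the abstract describes, at the cost of a small distortion of the distances.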
An introduction to boosting and leveraging
 Advanced Lectures on Machine Learning, LNCS
, 2003
Novel methods improve prediction of species’ distributions from occurrence data
 Ecography
, 2006
Piecewise linear regularized solution paths
 Ann. Statist
, 2007
Cited by 83 (8 self)
We consider the generic regularized optimization problem β̂(λ) = arg min_β L(y, Xβ) + λJ(β). Recently, Efron et al. (2004) have shown that for the Lasso – that is, if L is squared-error loss and J(β) = ‖β‖₁ is the ℓ1 norm of β – the optimal coefficient path is piecewise linear, i.e., ∂β̂(λ)/∂λ is piecewise constant. We derive a general characterization of the properties of (loss L, penalty J) pairs which give piecewise linear coefficient paths. Such pairs allow for efficient generation of the full regularized coefficient paths. We investigate the nature of the efficient path-following algorithms that arise. We use our results to suggest robust versions of the Lasso for regression and classification, and to develop new, efficient algorithms for existing problems in the literature, including Mammen & van de Geer's Locally Adaptive Regression Splines.
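The piecewise-linear structure has a closed form in the special case of an orthonormal design, where the Lasso solution is the soft-threshold of the OLS coefficients. A minimal sketch of that special case (the general design requires a path-following algorithm of the kind the paper studies):

```python
import numpy as np

def lasso_path_orthonormal(beta_ols, lambdas):
    """Lasso coefficient path when X has orthonormal columns:
        beta_j(lam) = sign(b_j) * max(|b_j| - lam, 0),
    the soft-threshold of the OLS coefficients b. Each coordinate is
    piecewise linear in lam, so d(beta)/d(lam) is piecewise constant.
    Returns an array of shape (n_coefficients, n_lambdas)."""
    b = np.asarray(beta_ols, float)[:, None]
    lam = np.asarray(lambdas, float)[None, :]
    return np.sign(b) * np.maximum(np.abs(b) - lam, 0.0)
```

Each coefficient shrinks linearly toward zero and then stays there, so knots of the path occur exactly where a coefficient enters or leaves the active set.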
Adapting ranking SVM to document retrieval
 In Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
, 2006
Cited by 83 (19 self)
The paper is concerned with applying learning to rank to document retrieval. Ranking SVM is a typical learning-to-rank method. We point out that there are two factors one must consider when applying Ranking SVM, or any “learning to rank” method, to document retrieval. First, correctly ranking documents at the top of the result list is crucial for an information retrieval system, so training must be conducted in a way that makes such top-ranked results accurate. Second, the number of relevant documents can vary from query to query, so one must avoid training a model biased toward queries with a large number of relevant documents. Previously, when existing methods, including Ranking SVM, were applied to document retrieval, neither of these factors was taken into consideration. We show that it is possible to modify the conventional Ranking SVM so that it is better suited to document retrieval. Specifically, we modify the “Hinge Loss” function in Ranking SVM to address the two problems described above. We employ two methods to optimize the loss function: gradient descent and quadratic programming. Experimental results show that our method, referred to as Ranking SVM for IR, can outperform the conventional Ranking SVM and other existing methods for document retrieval on two datasets.
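The loss being modified is the standard pairwise hinge loss. The sketch below shows that baseline, with a generic per-pair weight as a stand-in for the paper's modification (the exact weighting scheme is the paper's contribution and is not reproduced here; all names are illustrative):

```python
import numpy as np

def pairwise_hinge_loss(scores, relevance, pair_weight=None):
    """Ranking-SVM-style loss for one query: sum over document pairs
    (i, j) with relevance[i] > relevance[j] of
        w_ij * max(0, 1 - (s_i - s_j)).
    pair_weight(i, j) is a stand-in for a modified hinge loss that,
    e.g., up-weights pairs near the top of the list and normalizes
    across queries with many relevant documents."""
    s = np.asarray(scores, float)
    r = np.asarray(relevance, float)
    loss = 0.0
    for i in range(len(s)):
        for j in range(len(s)):
            if r[i] > r[j]:                       # i should rank above j
                w = 1.0 if pair_weight is None else pair_weight(i, j)
                loss += w * max(0.0, 1.0 - (s[i] - s[j]))
    return loss
```

With `pair_weight=None` this reduces to the conventional Ranking SVM objective summed over pairs of a single query.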
Temporal properties of low power wireless links: Modeling and implications on multihop routing
 In ACM MobiHoc
, 2005
Cited by 65 (3 self)
Recently, several studies have analyzed the statistical properties of low power wireless links in real environments, clearly demonstrating the differences between experimentally observed communication properties and widely used simulation models. However, most of these studies have not performed an in-depth analysis of the temporal properties of wireless links, which have a high impact on the performance of routing algorithms. Our first goal is to study the statistical temporal properties of links in low power wireless communications. We study short-term temporal issues, such as the lagged autocorrelation of individual links, the lagged correlation of reverse links, and consecutive same-path links. We also study long-term temporal aspects, gaining insight into how long the channel needs to be measured and how often our models should be updated. Our second objective is to explore how statistical temporal properties impact routing protocols. We studied one-to-one routing schemes and developed new routing algorithms that consider autocorrelation, as well as reverse-link and consecutive same-path-link lagged correlations. We have developed two new routing algorithms for the link-cost model: (i) a generalized Dijkstra algorithm with centralized execution, and (ii) a localized distributed probabilistic algorithm.
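The lagged autocorrelation of an individual link can be computed directly from its packet-reception trace. A minimal sketch (biased sample estimator; function name is illustrative, not from the paper):

```python
import numpy as np

def lagged_autocorrelation(trace, max_lag):
    """Sample autocorrelation of a packet-reception trace
    (1 = received, 0 = lost) at lags 1..max_lag. High values at
    short lags mean the link's quality is bursty and predictable,
    which routing metrics can exploit. Assumes a non-constant trace."""
    x = np.asarray(trace, float)
    x = x - x.mean()                       # center the trace
    denom = float(x @ x)                   # lag-0 (auto)covariance
    return [float(x[:-k] @ x[k:]) / denom for k in range(1, max_lag + 1)]
```

For example, a strictly alternating receive/lose trace has strongly negative lag-1 autocorrelation and positive lag-2 autocorrelation.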
Practical Selection of SVM Parameters and Noise Estimation for SVM Regression
 Neural Networks
, 2004
Cited by 55 (0 self)
We investigate practical selection of meta-parameters for SVM regression (that is, the ε-insensitive zone and the regularization parameter C). The proposed methodology advocates analytic parameter selection directly from the training data, rather than the resampling approaches commonly used in SVM applications. Good generalization performance of the proposed parameter selection is demonstrated empirically on several low-dimensional and high-dimensional regression problems. Further, we point out the importance of Vapnik's ε-insensitive loss for regression problems with finite samples. To this end, we compare the generalization performance of SVM regression (with optimally chosen ε) against regression using the ‘least-modulus’ loss (ε = 0). These comparisons indicate superior generalization performance of SVM regression in finite-sample settings.
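Analytic selection rules of the kind the abstract describes can be sketched as below. The specific formulas are the commonly cited Cherkassky–Ma forms and should be verified against the paper before use; the noise standard deviation must be estimated separately (e.g., from a nearest-neighbor fit), and all names here are illustrative:

```python
import numpy as np

def analytic_svm_params(y, noise_std):
    """Analytic meta-parameter selection for SVM regression
    (commonly cited Cherkassky-Ma forms; verify against the paper):
        C   = max(|mean(y) + 3*std(y)|, |mean(y) - 3*std(y)|)
        eps = 3 * noise_std * sqrt(ln(n) / n)
    where n is the training-set size and noise_std is an estimate of
    the noise standard deviation, obtained separately."""
    y = np.asarray(y, float)
    n = len(y)
    C = max(abs(y.mean() + 3 * y.std()), abs(y.mean() - 3 * y.std()))
    eps = 3.0 * noise_std * np.sqrt(np.log(n) / n)
    return C, eps
```

Both rules come directly from the training data, which is the point of the approach: no cross-validation loop over (C, ε) pairs is needed.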
Optimizing Spatial Filters for Robust EEG Single-Trial Analysis
 IEEE Signal Proc. Magazine
, 2008
Cited by 50 (14 self)
Due to volume conduction, multichannel electroencephalogram (EEG) recordings give a rather blurred image of brain activity. Spatial filters are therefore extremely useful in single-trial analysis for improving the signal-to-noise ratio. Powerful methods from machine learning and signal processing permit the optimization of spatio-temporal filters for each subject in a data-dependent fashion, going beyond fixed filters based on the sensor geometry, e.g., Laplacians. Here we elucidate the theoretical background of the Common Spatial Pattern (CSP) algorithm, a popular method in Brain-Computer Interface (BCI) research. Apart from reviewing several variants of the basic algorithm, we reveal tricks of the trade for achieving powerful CSP performance, briefly elaborate on theoretical aspects of CSP, and demonstrate the application of CSP-type preprocessing in our studies within the Berlin Brain-Computer Interface project.
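The basic CSP computation is a generalized eigendecomposition of the two class-conditional covariance matrices. A minimal numpy sketch under stated assumptions (trial arrays of shape (channels, samples), trace-normalized covariances; function and variable names are illustrative, and the paper's variants and tricks are not included):

```python
import numpy as np

def csp_filters(X1, X2, n_filters=2):
    """Common Spatial Pattern filters from two lists of trials, each
    trial an array of shape (channels, samples). Filters maximize
    variance for class 1 while minimizing it for class 2, by solving
    C1 w = lam (C1 + C2) w via whitening of the composite covariance."""
    def avg_cov(trials):
        covs = [T @ T.T / np.trace(T @ T.T) for T in trials]
        return np.mean(covs, axis=0)
    C1, C2 = avg_cov(X1), avg_cov(X2)
    # Whitening transform P such that P.T @ (C1 + C2) @ P = I.
    w, V = np.linalg.eigh(C1 + C2)
    P = V / np.sqrt(w)
    # Eigenvectors of the whitened C1 give the CSP directions.
    d, U = np.linalg.eigh(P.T @ C1 @ P)
    order = np.argsort(d)[::-1]            # largest class-1 variance first
    W = P @ U[:, order]
    return W[:, :n_filters].T              # rows are spatial filters
```

Filtered trial variances (the features fed to a classifier in CSP-based BCIs) are then `np.var(W_row @ trial)` per filter row, large for one class and small for the other.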