Results 1–10 of 82
An Empirical Comparison of Supervised Learning Algorithms
In Proc. 23rd Intl. Conf. Machine Learning (ICML'06), 2006
Abstract
Cited by 95 (6 self)
A number of supervised learning methods have been introduced in the last decade. Unfortunately, the last comprehensive empirical evaluation of supervised learning was the Statlog Project in the early 90's. We present a large-scale empirical comparison between ten supervised learning methods: SVMs, neural nets, logistic regression, naive Bayes, memory-based learning, random forests, decision trees, bagged trees, boosted trees, and boosted stumps. We also examine the effect that calibrating the models via Platt Scaling and Isotonic Regression has on their performance. An important aspect of our study is the use of a variety of performance criteria to evaluate the learning methods.
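Among the "variety of performance criteria" such a comparison uses are squared error (the Brier score) and cross-entropy (log-loss), alongside accuracy and ROC area. As a minimal illustration (not the authors' code), two of these criteria can be computed in a few lines of Python:

```python
import math

def brier_score(probs, labels):
    """Mean squared error between predicted probabilities and 0/1 labels."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(labels)

def log_loss(probs, labels, eps=1e-12):
    """Cross-entropy, with probabilities clipped away from 0 and 1 for stability."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(labels)

# A well-calibrated model scores low on both criteria.
print(brier_score([0.9, 0.2, 0.7, 0.4], [1, 0, 1, 0]))  # close to 0.075
```

Criteria like these reward calibrated probabilities, which is why the study pairs them with Platt Scaling and Isotonic Regression.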
Transforming Classifier Scores into Accurate Multiclass Probability Estimates
, 2002
Abstract
Cited by 67 (5 self)
Class membership probability estimates are important for many applications of data mining in which classification outputs are combined with other sources of information for decision-making, such as example-dependent misclassification costs, the outputs of other classifiers, or domain knowledge. Previous calibration methods apply only to two-class problems. Here, we show how to obtain accurate probability estimates for multiclass problems by combining calibrated binary probability estimates. We also propose a new method for obtaining calibrated two-class probability estimates that can be applied to any classifier that produces a ranking of examples. Using naive Bayes and support vector machine classifiers, we give experimental results from a variety of two-class and multiclass domains, including direct marketing, text categorization and digit recognition.
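One simple way to turn per-class calibrated binary estimates into a multiclass distribution is to renormalize the one-vs-rest probabilities. The sketch below illustrates only that idea; it is not the paper's coupling method, and the function name is ours:

```python
def normalize_one_vs_rest(binary_probs):
    """Combine per-class calibrated binary probabilities P(class k vs rest)
    into a multiclass distribution by simple renormalization."""
    total = sum(binary_probs)
    if total == 0:
        # Degenerate case: fall back to the uniform distribution.
        return [1.0 / len(binary_probs)] * len(binary_probs)
    return [p / total for p in binary_probs]

# Three calibrated one-vs-rest outputs that happen to already sum to 1.
print(normalize_one_vs_rest([0.5, 0.25, 0.25]))
```

Renormalization guarantees a valid distribution but ignores correlations between the binary problems, which is why more careful coupling schemes exist.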
Nonparametric Tests for Common Values in First-Price Sealed-Bid Auctions
, 2003
Abstract
Cited by 52 (8 self)
We develop tests for common values at first-price sealed-bid auctions. Our tests are nonparametric, require observation only of the bids submitted at each auction, and are based on the fact that the "winner's curse" arises only in common values auctions. The tests build on recently developed methods for using observed bids to estimate each bidder's conditional expectation of the value of winning the auction. Equilibrium behavior implies that in a private values auction these expectations are invariant to the number of opponents each bidder faces, while with common values they are decreasing in the number of opponents. This distinction forms the basis of our tests. We consider both exogenous and endogenous variation in the number of bidders. Monte Carlo experiments show that our tests can perform well in samples of moderate sizes. We apply our tests to two different types of U.S. Forest Service timber auctions. For unit-price ("scaled") sales often argued to fit a private values model, our tests consistently fail to find evidence of common values. For "lump-sum" sales, where a priori arguments for common values appear stronger, our tests yield mixed evidence against the private values hypothesis.
Computing Chernoff’s distribution
 J. Comput. Graph. Statist
Abstract
Cited by 34 (11 self)
A distribution that arises in problems of estimation of monotone functions is that of the location of the maximum of two-sided Brownian motion minus a parabola. Using results from the first author's earlier work, we present algorithms and programs for computation of this distribution and its quantiles. We also present some comparisons with earlier computations and simulations.
Likelihood ratio tests for monotone functions
 Ann. Statist
, 2001
Abstract
Cited by 27 (17 self)
We study the problem of testing for equality at a fixed point in the setting of nonparametric estimation of a monotone function. The likelihood ratio test for this hypothesis is derived in the particular case of interval censoring (or current status data) and its limiting distribution is obtained. The limiting distribution is that of the integral of the difference of the squared slope processes corresponding to a canonical version of the problem involving Brownian motion plus t^2 and the greatest convex minorants thereof.
Sleeping coordination for comprehensive sensing using isotonic regression and domatic partitions
 in Proc. of INFOCOM ’06
Abstract
Cited by 20 (4 self)
We address the problem of energy-efficient sensing by adaptively coordinating the sleep schedules of sensor nodes while guaranteeing that values of sleeping nodes can be recovered from the awake nodes within a user-specified error bound. Our approach has two phases: first, developing models for predicting the measurements of one sensor using data from other sensors; second, creating the maximal number of disjoint subgroups of nodes, each of whose data is sufficient to recover the measurements of the entire sensor network. For prediction of the sensor measurements, we introduce a new optimal nonparametric polynomial-time isotonic regression. Utilizing the prediction models, the sleeping coordination problem is abstracted to a domatic number problem and is optimally solved using an ILP solver. To capture the evolving dynamics of the instrumented environment, we monitor the prediction errors occasionally to trigger adaptation of the models and domatic partitions as needed. Experimental evaluations on traces of a medium-size network with temperature and humidity sensors indicate that the method can extend the lifetime of the network by a factor of 4 or higher even for a strict error target.
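The reduction to a domatic number problem works because each dominating set can cover sensing for the whole network while the other nodes sleep. As a rough illustration of the combinatorial object involved (a greedy heuristic, not the ILP-optimal partition the abstract describes), a partition into disjoint dominating sets might be sketched as:

```python
def greedy_dominating_set(adj, pool):
    """Pick nodes from `pool` until every node in the graph is dominated
    (in the set or adjacent to a member); return None if `pool` cannot."""
    nodes = set(adj)
    chosen, covered = set(), set()
    while covered != nodes:
        best = max((v for v in pool if v not in chosen),
                   key=lambda v: len(({v} | adj[v]) - covered),
                   default=None)
        if best is None or not (({best} | adj[best]) - covered):
            return None  # stuck: remaining pool adds no coverage
        chosen.add(best)
        covered |= {best} | adj[best]
    return chosen

def greedy_domatic_partition(adj):
    """Repeatedly extract disjoint dominating sets from the node pool."""
    pool = set(adj)
    parts = []
    while True:
        ds = greedy_dominating_set(adj, pool)
        if ds is None:
            break
        parts.append(ds)
        pool -= ds
    return parts
```

In the complete graph every singleton dominates, so the greedy partition recovers one dominating set per node; real sensor topologies yield far fewer, which bounds the achievable lifetime gain.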
DifferentiallyPrivate Network Trace Analysis
Abstract
Cited by 19 (0 self)
We consider the potential for network trace analysis while providing the guarantees of "differential privacy." While differential privacy provably obscures the presence or absence of individual records in a dataset, it has two major limitations: analyses must (presently) be expressed in a higher-level declarative language, and the analysis results are randomized before being returned to the analyst. We report on our experiences conducting a diverse set of analyses in a differentially private manner. We are able to express all of our target analyses, though for some of them an approximate expression is required to keep the error level low. By running these analyses on real datasets, we find that the error introduced for the sake of privacy is often (but not always) low, even at high levels of privacy. We factor our learning into a toolkit that will likely be useful for other analyses. Overall, we conclude that differential privacy shows promise for a broad class of network analyses.
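The randomization differential privacy requires is typically achieved by adding noise scaled to a query's sensitivity. As a generic illustration (not the toolkit the abstract describes), a count query has sensitivity 1, so adding Laplace noise of scale 1/ε gives an ε-differentially private answer:

```python
import math
import random

def private_count(records, predicate, epsilon):
    """Differentially private count: true count plus Laplace(1/epsilon) noise.
    A count changes by at most 1 when one record is added or removed, so
    the sensitivity is 1 and the noise scale is 1/epsilon."""
    true_count = sum(1 for r in records if predicate(r))
    # Inverse-CDF sampling of a zero-mean Laplace variate.
    u = random.random() - 0.5
    noise = -(1.0 / epsilon) * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))
    return true_count + noise

# Smaller epsilon = stronger privacy = noisier answer.
answer = private_count(range(100), lambda r: r % 2 == 0, epsilon=0.5)
```

Each released answer consumes privacy budget, which is one reason the analyses in the paper trade some accuracy for their guarantees.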
A Fast Scaling Algorithm for Minimizing Separable Convex Functions Subject to Chain Constraints
 OPERATIONS RESEARCH
, 2001
Abstract
Cited by 18 (2 self)
In this paper, we consider the problem of minimizing ∑_{j∈N} C_j(x_j), subject to the chain constraints l ≤ x_1 ≤ x_2 ≤ x_3 ≤ … ≤ x_n ≤ u, where C_j(x_j) is a convex function of x_j for each j ∈ N = {1, 2, …, n}. This problem is a generalization of the isotonic regression problem with complete order, an important class of problems in regression analysis that has been examined extensively in the literature. We refer to this problem as the generalized isotonic regression problem. In this paper, we focus on developing a fast scaling algorithm to obtain an integer solution of the generalized isotonic regression problem. Let U denote the difference between an upper bound on an optimal value of x_n and a lower bound on an optimal value of x_1. Under the assumption that evaluating any function C_j(x_j) takes O(1) time, we show that the generalized isotonic regression problem can be solved in O(n log U) time. This improves by a factor of n the previous best running time of O(n^2 log U) for the same problem. In addition, when our algorithm is specialized to the isotonic median regression problem (where C_j(x_j) = c_j |x_j − a_j|) for specified values of the c_j's and a_j's, the algorithm obtains a real-valued optimal solution in O(n log n) time. This time bound matches the best available time bound for the isotonic median regression problem, but our algorithm uses simpler data structures and may be easier to implement.
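For the classical special case (unweighted least-squares isotonic regression under a complete order, C_j(x_j) = (x_j − y_j)^2 with no bounds l, u), the Pool Adjacent Violators algorithm gives an amortized linear-time solution. This sketch is illustrative only and is not the paper's O(n log U) scaling algorithm:

```python
def isotonic_regression(y):
    """Pool Adjacent Violators: least-squares fit of a nondecreasing
    sequence to y. Blocks store (sum, count); adjacent blocks whose
    means violate monotonicity are merged and replaced by their mean."""
    blocks = []  # each block: [total, count]
    for v in y:
        blocks.append([float(v), 1])
        # Merge while the previous block's mean exceeds the last block's.
        # Compare total_prev * count_last > total_last * count_prev to
        # avoid division.
        while len(blocks) > 1 and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]:
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    out = []
    for total, count in blocks:
        out.extend([total / count] * count)
    return out

print(isotonic_regression([1, 3, 2]))  # → [1.0, 2.5, 2.5]
```

Each element is pushed and popped at most once, giving the amortized O(n) bound; the generalized problem with arbitrary convex C_j is what requires the paper's scaling machinery.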
Obtaining calibrated probabilities from boosting
In Proc. 21st Conference on Uncertainty in Artificial Intelligence (UAI'05), 2005
Abstract
Cited by 17 (1 self)
Boosted decision trees typically yield good accuracy, precision, and ROC area. However, because the outputs from boosting are not well-calibrated posterior probabilities, boosting yields poor squared error and cross-entropy. We empirically demonstrate why AdaBoost predicts distorted probabilities and examine three calibration methods for correcting this distortion: Platt Scaling, Isotonic Regression, and Logistic Correction. We also experiment with boosting using log-loss instead of the usual exponential loss. Experiments show that Logistic Correction and boosting with log-loss work well when boosting weak models such as decision stumps, but yield poor performance when boosting more complex models such as full decision trees. Platt Scaling and Isotonic Regression, however, significantly improve the probabilities predicted by both boosted stumps and boosted trees. After calibration, boosted full decision trees predict better probabilities than other learning methods such as SVMs, neural nets, bagged decision trees, and KNNs, even after these methods are calibrated.
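Platt Scaling maps a raw score s to a probability through a fitted sigmoid p = 1 / (1 + exp(A·s + B)), with A and B chosen to minimize log-loss on held-out data. The sketch below uses plain gradient descent for clarity; Platt's original procedure uses a Newton-style optimizer with smoothed targets, so treat this as an illustration only:

```python
import math

def platt_scale(scores, labels, lr=0.01, iters=5000):
    """Fit Platt's sigmoid p = 1 / (1 + exp(A*s + B)) to raw classifier
    scores by gradient descent on the log-loss."""
    A, B = 0.0, 0.0
    for _ in range(iters):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            p = 1.0 / (1.0 + math.exp(A * s + B))
            # d(log-loss)/d(A*s + B) works out to (y - p).
            grad_a += (y - p) * s
            grad_b += (y - p)
        A -= lr * grad_a
        B -= lr * grad_b
    return A, B

def apply_platt(score, A, B):
    """Map a raw score to a calibrated probability."""
    return 1.0 / (1.0 + math.exp(A * score + B))

A, B = platt_scale([2.0, 1.5, -1.0, -2.0], [1, 1, 0, 0])
```

The fitted sigmoid is monotone in the score, so Platt Scaling preserves ranking (and thus ROC area) while repairing the distorted probabilities; Isotonic Regression is its nonparametric counterpart.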