Results 1 - 6 of 6
Logistic Regression, AdaBoost and Bregman Distances
, 2000
"... We give a unified account of boosting and logistic regression in which each learning problem is cast in terms of optimization of Bregman distances. The striking similarity of the two problems in this framework allows us to design and analyze algorithms for both simultaneously, and to easily adapt al ..."
Abstract

Cited by 213 (43 self)
We give a unified account of boosting and logistic regression in which each learning problem is cast in terms of optimization of Bregman distances. The striking similarity of the two problems in this framework allows us to design and analyze algorithms for both simultaneously, and to easily adapt algorithms designed for one problem to the other. For both problems, we give new algorithms and explain their potential advantages over existing methods. These algorithms can be divided into two types based on whether the parameters are iteratively updated sequentially (one at a time) or in parallel (all at once). We also describe a parameterized family of algorithms which interpolates smoothly between these two extremes. For all of the algorithms, we give convergence proofs using a general formalization of the auxiliary-function proof technique. As one of our sequential-update algorithms is equivalent to AdaBoost, this provides the first general proof of convergence for AdaBoost. We show that all of our algorithms generalize easily to the multiclass case, and we contrast the new algorithms with iterative scaling. We conclude with a few experimental results with synthetic data that highlight the behavior of the old and newly proposed algorithms in different settings.
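For concreteness (this is our own notational gloss, not part of the abstract), the two objectives being unified are the exponential loss minimized by AdaBoost and the log loss of logistic regression; here (x_i, y_i) are training examples with y_i in {-1, +1}, and f_lambda(x) = sum_j lambda_j h_j(x) is the weighted combination of base hypotheses whose parameters the sequential and parallel updates adjust.

```latex
% Sketch in our own notation, for illustration only:
% examples (x_i, y_i) with y_i in {-1,+1}, combined hypothesis
% f_lambda(x) = sum_j lambda_j * h_j(x).
\begin{align*}
  \text{AdaBoost (exponential loss):}\quad
    & \sum_{i=1}^{m} \exp\bigl(-y_i\, f_\lambda(x_i)\bigr) \\
  \text{Logistic regression (log loss):}\quad
    & \sum_{i=1}^{m} \ln\bigl(1 + \exp(-y_i\, f_\lambda(x_i))\bigr)
\end{align*}
```

A sequential update changes one coordinate of lambda per iteration (as AdaBoost does), a parallel update changes all coordinates at once; the Bregman-distance view is what lets both losses be handled by the same auxiliary-function convergence machinery.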
MadaBoost: A Modification of AdaBoost
, 2000
"... . In the last decade, one of the research topics that has received a great deal of attention from the machine learning and computational learning communities has been the so called boosting techniques. In this paper, we further explore this topic by proposing a new boosting algorithm that mends some ..."
Abstract

Cited by 23 (3 self)
In the last decade, one of the research topics that has received a great deal of attention from the machine learning and computational learning communities has been the so-called boosting techniques. In this paper, we further explore this topic by proposing a new boosting algorithm that mends some of the problems detected in AdaBoost, so far the most successful boosting algorithm, due to Freund and Schapire [FS97]. These problems are: (1) AdaBoost cannot be used in the boosting by filtering framework, and (2) AdaBoost does not seem to be noise resistant. In order to solve them, we propose a new boosting algorithm, MadaBoost, obtained by modifying the weighting system of AdaBoost. We first prove that one version of MadaBoost is in fact a boosting algorithm. Second, we show how our algorithm can be used and analyze its performance in detail. Finally, we show that our new boosting algorithm can be cast in the statistical query learning model [Kea93] and thus, it is robust to ra...
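The abstract does not spell out the weighting change, so the following is only a hedged illustration: MadaBoost is commonly described as capping each example's weight at its initial value instead of letting it grow without bound as in AdaBoost, which is what keeps heavily misclassified (possibly noisy) examples from dominating the distribution. All names below are ours.

```python
import numpy as np

def boosting_weights(margins, capped=True):
    """Per-example weights from cumulative margins y_i * f(x_i).

    AdaBoost:   w_i = exp(-margin_i), which can grow without bound
                for badly misclassified (possibly noisy) examples.
    MadaBoost (as commonly described): the same quantity capped at
                the initial weight 1, i.e. w_i = min(1, exp(-margin_i)).
    """
    w = np.exp(-np.asarray(margins, dtype=float))
    if capped:
        w = np.minimum(w, 1.0)   # cap at the initial weight
    return w / w.sum()           # normalise to a distribution

# A heavily misclassified example (margin -3) dominates the AdaBoost
# distribution but not the capped one.
margins = [2.0, 1.0, 0.5, -3.0]
print(boosting_weights(margins, capped=False))  # AdaBoost-style
print(boosting_weights(margins, capped=True))   # MadaBoost-style
```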
Scaling up a Boosting-Based Learner via Adaptive Sampling
 In Proceedings of the Fourth Pacific-Asia Conference on Knowledge Discovery and Data Mining
, 2000
"... In this paper we present a experimental evaluation of a boosting based learning system and show that can be run efficiently over a large dataset. The system uses as base learner decision stumps, single atribute decision trees with only two terminal nodes. To select the best decision stump at each it ..."
Abstract

Cited by 12 (5 self)
In this paper we present an experimental evaluation of a boosting-based learning system and show that it can be run efficiently over a large dataset. The system uses decision stumps as its base learner: single-attribute decision trees with only two terminal nodes. To select the best decision stump at each iteration we use an adaptive sampling method. As a boosting algorithm, we use a modification of AdaBoost that is suitable for combination with a base learner that does not use all the dataset. We provide experimental evidence that our method is as accurate as the equivalent algorithm that uses all the dataset, but much faster.
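The base learner is described concretely here (one attribute, one threshold, two leaves), so a minimal weighted decision stump can be sketched as below; the adaptive sampling rule and the AdaBoost modification themselves are not specified in the abstract, and the helper names are ours.

```python
import numpy as np

def fit_stump(X, y, w):
    """Fit a decision stump (single attribute, one threshold, two leaves)
    minimising weighted 0/1 error. Labels y are in {-1, +1}; w are
    non-negative example weights summing to 1."""
    n, d = X.shape
    best = (np.inf, None)                       # (error, (attr, threshold, sign))
    for j in range(d):
        for thr in np.unique(X[:, j]):
            for sign in (+1, -1):
                pred = np.where(X[:, j] <= thr, sign, -sign)
                err = np.sum(w[pred != y])
                if err < best[0]:
                    best = (err, (j, thr, sign))
    return best

def predict_stump(stump, X):
    j, thr, sign = stump
    return np.where(X[:, j] <= thr, sign, -sign)

# Toy usage on a small (sub)sample; in the paper's setting the stump would
# be selected from an adaptively drawn sample rather than the full dataset.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = np.where(X[:, 1] > 0.2, 1, -1)
w = np.full(len(y), 1.0 / len(y))
err, stump = fit_stump(X, y, w)
print(err, stump)
```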
Faster Near-Optimal Reinforcement Learning: Adding Adaptiveness to the E³ Algorithm
 In Algorithmic Learning Theory, 10th International Conference, ALT ’99
, 1999
"... Recently, Kearns and Singh presented the first probably efficient and nearoptimal algorithm for reinforcement learning in general Markov decision processes. One of the key contributions of the algorithm is its explicit treatment of the explorationexploitation trade off. In this paper, we show how ..."
Abstract

Cited by 3 (0 self)
Recently, Kearns and Singh presented the first provably efficient and near-optimal algorithm for reinforcement learning in general Markov decision processes. One of the key contributions of the algorithm is its explicit treatment of the exploration-exploitation trade-off. In this paper, we show how the algorithm can be improved by replacing the exploration phase, which builds a model of the underlying Markov decision process by estimating the transition probabilities, with an adaptive sampling method more suitable for the problem. Our improvement is twofold. First, our theoretical bound on the worst-case time needed to converge to an almost-optimal policy is significantly smaller. Second, due to the adaptiveness of the sampling method we use, we discuss how our algorithm might perform better in practice than the previous one.
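The abstract does not state the sampling rule, so the sketch below is only a generic stand-in for the idea of adaptive model estimation: keep drawing transitions for a state-action pair until a data-dependent confidence interval (an empirical-Bernstein-style bound, our choice) on the estimated next-state probabilities is tight enough, instead of always drawing a fixed worst-case number of samples. Every name and constant here is a hypothetical illustration, not the paper's modification of E³.

```python
import math
import random

def estimate_transition(sample_next_state, n_states, eps=0.05, delta=0.05,
                        max_samples=100_000):
    """Adaptively estimate P(s' | s, a) for one state-action pair.

    Stops once an empirical-Bernstein-style interval (data dependent, so
    low-variance distributions need fewer samples) is narrower than `eps`
    for every estimated probability, with overall confidence 1 - delta.
    """
    counts = [0] * n_states
    n = 0
    log_term = math.log(3 * n_states / delta)   # union bound over states
    while n < max_samples:
        counts[sample_next_state()] += 1
        n += 1
        tight = True
        for c in counts:
            p_hat = c / n
            half_width = (math.sqrt(2 * p_hat * (1 - p_hat) * log_term / n)
                          + 3 * log_term / n)
            if half_width > eps:
                tight = False
                break
        if tight:
            break
    return [c / n for c in counts], n

# Toy usage: a 3-state next-state distribution (0.7, 0.2, 0.1).
draw = lambda: random.choices([0, 1, 2], weights=[0.7, 0.2, 0.1])[0]
probs, used = estimate_transition(draw, n_states=3)
print(probs, "samples used:", used)
```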
A modification of AdaBoost: A preliminary report
, 1999
"... . In this note, we discuss the boosting algorithm AdaBoost and identify two of its main drawbacks: it cannot be used in the boosting by ltering framework and it is not noise resistant. In order to solve them, we propose a modication of the weighting system of AdaBoost. We prove that the new algorith ..."
Abstract
In this note, we discuss the boosting algorithm AdaBoost and identify two of its main drawbacks: it cannot be used in the boosting by filtering framework and it is not noise resistant. In order to solve them, we propose a modification of the weighting system of AdaBoost. We prove that the new algorithm is in fact a boosting algorithm under the condition that the sequence of advantages generated by the weak learning algorithm with respect to the modified distributions is monotonically decreasing. Second, we show how our algorithm can be used in the boosting by filtering framework, unlike AdaBoost, which can only be used in the boosting by subsampling framework. Finally, we also show that our boosting algorithm is a statistical query algorithm and thus, it is robust to noise in a certain sense, a property that AdaBoost does not seem to have.
Logistic Regression, AdaBoost and Bregman Distances
 Learning Theory
, 2000
"... Abstract. We give a unified account of boosting and logistic regression in which each learning problem is cast in terms of optimization of Bregman distances. The striking similarity of the two problems in this framework allows us to design and analyze algorithms for both simultaneously, and to easi ..."
Abstract
We give a unified account of boosting and logistic regression in which each learning problem is cast in terms of optimization of Bregman distances. The striking similarity of the two problems in this framework allows us to design and analyze algorithms for both simultaneously, and to easily adapt algorithms designed for one problem to the other. For both problems, we give new algorithms and explain their potential advantages over existing methods. These algorithms can be divided into two types based on whether the parameters are iteratively updated sequentially (one at a time) or in parallel (all at once). We also describe a parameterized family of algorithms which interpolates smoothly between these two extremes. For all of the algorithms, we give convergence proofs using a general formalization of the auxiliary-function proof technique. As one of our sequential-update algorithms is equivalent to AdaBoost, this provides the first general proof of convergence for AdaBoost. We show that all of our algorithms generalize easily to the multiclass case, and we contrast the new algorithms with iterative scaling. We conclude with preliminary experimental results.