Results 1 - 4 of 4
Fast Effective Rule Induction, 1995
Cited by 800 (19 self)
Many existing rule learning systems are computationally expensive on large noisy datasets. In this paper we evaluate the recently-proposed rule learning algorithm IREP on a large and diverse collection of benchmark problems. We show that while IREP is extremely efficient, it frequently gives error rates higher than those of C4.5 and C4.5rules. We then propose a number of modifications resulting in an algorithm RIPPERk that is very competitive with C4.5rules with respect to error rates, but much more efficient on large samples. RIPPERk obtains error rates lower than or equivalent to C4.5rules on 22 of 37 benchmark problems, scales nearly linearly with the number of training examples, and can efficiently process noisy datasets containing hundreds of thousands of examples.
Bagging, Boosting, and C4.5
In Proceedings of the Thirteenth National Conference on Artificial Intelligence, 1996
Cited by 251 (1 self)
Breiman's bagging and Freund and Schapire's boosting are recent methods for improving the predictive power of classifier learning systems. Both form a set of classifiers that are combined by voting, bagging by generating replicated bootstrap samples of the data, and boosting by adjusting the weights of training instances. This paper reports results of applying both techniques to a system that learns decision trees and testing on a representative collection of datasets. While both approaches substantially improve predictive accuracy, boosting shows the greater benefit. On the other hand, boosting also produces severe degradation on some datasets. A small change to the way that boosting combines the votes of learned classifiers reduces this downside and also leads to slightly better results on most of the datasets considered.
Introduction: Designers of empirical machine learning systems are concerned with such issues as the computational cost of the learning method and the accuracy and ...
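The two ensemble schemes named in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the paper applies both techniques to a decision-tree learner, whereas this sketch assumes a one-feature threshold "stump" as the base learner, labels in {+1, -1}, and the standard AdaBoost weight update. The function names (`train_stump`, `bagging`, `boosting`) are hypothetical, and the paper's modified vote-combination rule is not reproduced.

```python
import math
import random


def train_stump(data, weights):
    """Return the threshold rule with the lowest weighted error.

    data is a list of (x, y) pairs with y in {+1, -1}. The stump is an
    assumed stand-in for the decision-tree learner used in the paper.
    """
    best = None
    for threshold in sorted({x for x, _ in data}):
        for polarity in (1, -1):
            err = sum(w for (x, y), w in zip(data, weights)
                      if (polarity if x >= threshold else -polarity) != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    _, threshold, polarity = best
    return lambda x: polarity if x >= threshold else -polarity


def bagging(data, rounds, rng):
    """Train one model per bootstrap replicate; combine by unweighted vote."""
    models = []
    for _ in range(rounds):
        # A bootstrap replicate: sample with replacement, same size as data.
        sample = [rng.choice(data) for _ in data]
        models.append(train_stump(sample, [1.0] * len(sample)))
    return lambda x: 1 if sum(m(x) for m in models) >= 0 else -1


def boosting(data, rounds):
    """AdaBoost-style loop: reweight instances, combine by weighted vote."""
    n = len(data)
    weights = [1.0 / n] * n
    models = []
    for _ in range(rounds):
        h = train_stump(data, weights)
        err = sum(w for (x, y), w in zip(data, weights) if h(x) != y)
        if err == 0:
            models.append((1.0, h))  # perfect model: keep it and stop
            break
        if err >= 0.5:
            break  # no better than chance: stop adding models
        alpha = 0.5 * math.log((1 - err) / err)
        models.append((alpha, h))
        # Raise the weight of misclassified instances, lower the rest.
        weights = [w * math.exp(-alpha * y * h(x))
                   for (x, y), w in zip(data, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return lambda x: 1 if sum(a * m(x) for a, m in models) >= 0 else -1


# Toy, noise-free data: the label is +1 exactly when x >= 5.
data = [(x, 1 if x >= 5 else -1) for x in range(10)]
bagged = bagging(data, 11, random.Random(0))
boosted = boosting(data, 10)
print(bagged(9), bagged(0), boosted(7), boosted(2))
```

On this toy data the boosting loop stops after one round because the first stump is already perfect; the reweighting step only comes into play on noisy data, which is also where the abstract reports that boosting can degrade.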
Likelihood-based Data Squashing: A Modeling Approach to Instance Construction, 2002
Cited by 14 (1 self)
Squashing is a lossy data compression technique that preserves statistical information. Specifically, squashing compresses a massive dataset to a much smaller one so that outputs from statistical analyses carried out on the smaller (squashed) dataset reproduce outputs from the same statistical analyses carried out on the original dataset. Likelihood-based data squashing (LDS) differs from a previously published squashing algorithm insofar as it uses a statistical model to squash the data. The results show that LDS provides excellent squashing performance even when the target statistical analysis departs from the model used to squash the data.
Distributed Learning on Very Large Data Sets
In Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2000
Cited by 10 (4 self)
One approach to learning from intractably large data sets is to utilize all the training data by learning models on tractably sized subsets of the data. The subsets of data may be disjoint or partially overlapping. The individual learned models may be combined into a single model or a voting approach may be used to combine the classifications of a set of models. An approach to learning models in parallel from arbitrarily large training data sets and combining them into a classifier is described. The training sets are disjoint in the work described here. A parallel implementation on the DOE's ASCI Red parallel supercomputer is described. Results with data sets small enough to be handled by a single processor show that data sets can be divided into a moderate number of distinct subsets without degrading classifier accuracy. Speedup results are shown for a parallel implementation on the ASCI Red with data sets too large to be handled on a single processor. Training sets of size 3 to 50 millio...
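The disjoint-subset strategy described in this abstract can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's system: the learner here is a toy nearest-centroid classifier on a single numeric feature, the models are trained sequentially rather than on separate processors, and the names `partition_disjoint`, `train_centroid`, and `vote` are hypothetical.

```python
import random
from collections import Counter, defaultdict


def partition_disjoint(data, k, rng):
    """Shuffle a copy of the data and deal it into k disjoint subsets."""
    shuffled = data[:]
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def train_centroid(subset):
    """Toy learner (an assumption): predict the class with the nearest mean."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x, label in subset:
        sums[label] += x
        counts[label] += 1
    centroids = {label: sums[label] / counts[label] for label in counts}
    return lambda x: min(centroids, key=lambda label: abs(x - centroids[label]))


def vote(models, x):
    """Combine the per-subset models by simple majority vote."""
    return Counter(m(x) for m in models).most_common(1)[0][0]


# Two well-separated classes on one feature.
data = [(x, "low") for x in range(30)] + [(x, "high") for x in range(70, 100)]
subsets = partition_disjoint(data, 3, random.Random(0))

# Each subset could be handed to a separate processor; here they are
# trained one after another for simplicity.
models = [train_centroid(s) for s in subsets]
print(vote(models, 5), vote(models, 95))
```

Because the subsets are disjoint and together cover the data, every training example is used exactly once, which matches the setting the abstract describes; overlapping subsets would only change `partition_disjoint`.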

