Results 1–10 of 33
The strength of weak learnability
 Machine Learning
, 1990
Cited by 667 (23 self)
Abstract. This paper addresses the problem of improving the accuracy of a hypothesis output by a learning algorithm in the distribution-free (PAC) learning model. A concept class is learnable (or strongly learnable) if, given access to a source of examples of the unknown concept, the learner with high probability is able to output a hypothesis that is correct on all but an arbitrarily small fraction of the instances. The concept class is weakly learnable if the learner can produce a hypothesis that performs only slightly better than random guessing. In this paper, it is shown that these two notions of learnability are equivalent. A method is described for converting a weak learning algorithm into one that achieves arbitrarily high accuracy. This construction may have practical applications as a tool for efficiently converting a mediocre learning algorithm into one that performs extremely well. In addition, the construction has some interesting theoretical consequences, including a set of general upper bounds on the complexity of any strong learning algorithm as a function of the allowed error ε.
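The construction in this paper is boosting by majority; a later and simpler variant of the same weak-to-strong idea (AdaBoost-style reweighting) can be sketched compactly. Everything below — the stump weak learner, the toy one-dimensional data, the function names — is illustrative, not the paper's actual construction.

```python
import math

def boost(examples, labels, weak_learner, rounds=10):
    """Reweighting sketch of boosting: a weak learner (slightly better
    than random) is repeatedly forced to focus on past mistakes; the
    weighted majority vote of its hypotheses is a strong hypothesis."""
    n = len(examples)
    w = [1.0 / n] * n                      # uniform initial distribution
    ensemble = []                          # (weight, hypothesis) pairs
    for _ in range(rounds):
        h = weak_learner(examples, labels, w)
        err = sum(wi for wi, x, y in zip(w, examples, labels) if h(x) != y)
        err = min(max(err, 1e-10), 1 - 1e-10)    # guard log endpoints
        alpha = 0.5 * math.log((1 - err) / err)  # vote weight for h
        ensemble.append((alpha, h))
        # up-weight misclassified examples, down-weight correct ones
        w = [wi * math.exp(-alpha * y * h(x)) for wi, x, y in zip(w, examples, labels)]
        s = sum(w)
        w = [wi / s for wi in w]
    return lambda x: 1 if sum(a * h(x) for a, h in ensemble) >= 0 else -1

def stump_learner(examples, labels, w):
    """Hypothetical weak learner: best threshold stump on 1-D inputs."""
    best, best_err = None, float("inf")
    for t in examples:
        for sign in (1, -1):
            h = lambda x, t=t, s=sign: s if x >= t else -s
            err = sum(wi for wi, x, y in zip(w, examples, labels) if h(x) != y)
            if err < best_err:
                best, best_err = h, err
    return best

X = [0.1, 0.3, 0.45, 0.6, 0.8, 0.9]
Y = [-1, -1, -1, 1, 1, 1]
H = boost(X, Y, stump_learner, rounds=5)
print([H(x) for x in X])  # matches Y on this toy set
```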
Optimal Prefetching via Data Compression
, 1995
Cited by 236 (11 self)
Caching and prefetching are important mechanisms for speeding up access time to data on secondary storage. Recent work in competitive online algorithms has uncovered several promising new algorithms for caching. In this paper we apply a form of the competitive philosophy for the first time to the problem of prefetching, to develop an optimal universal prefetcher in terms of fault ratio, with particular applications to large-scale databases and hypertext systems. Our prediction algorithms for prefetching are novel in that they are based on data compression techniques that are both theoretically optimal and good in practice. Intuitively, in order to compress data effectively, you have to be able to predict future data well, and thus good data compressors should be able to predict well for purposes of prefetching. We show for powerful models such as Markov sources and nth-order Markov sources that the page fault rates incurred by our prefetching algorithms are optimal in the limit for almost all sequences of page requests.
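The paper derives its prefetchers from data compressors in the Lempel–Ziv family; the idea that "a model that compresses well predicts well" can be illustrated with a much weaker first-order Markov counter. The class name and the page trace below are hypothetical, and this sketch omits the LZ-style context growth the paper actually relies on.

```python
from collections import defaultdict, Counter

class MarkovPrefetcher:
    """First-order Markov sketch of compression-based prefetching: the
    same successor statistics a compressor would use to assign short
    codes to likely next pages are used to pick the page to prefetch."""
    def __init__(self):
        self.counts = defaultdict(Counter)  # page -> successor frequencies
        self.prev = None

    def access(self, page):
        """Record an access; return the page to prefetch next (or None)."""
        if self.prev is not None:
            self.counts[self.prev][page] += 1  # observe transition prev -> page
        self.prev = page
        succ = self.counts[page]
        return succ.most_common(1)[0][0] if succ else None

pf = MarkovPrefetcher()
trace = ["A", "B", "C", "A", "B", "C", "A", "B"]
predictions = [pf.access(p) for p in trace]
print(predictions)  # once the cycle has been seen, "A" predicts "B", etc.
```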
The minimum consistent DFA problem cannot be approximated within any polynomial
 Journal of the Association for Computing Machinery
, 1993
Cited by 82 (4 self)
Abstract. The minimum consistent DFA problem is that of finding a DFA with as few states as possible that is consistent with a given sample (a finite collection of words, each labeled as to whether the DFA found should accept or reject). Assuming that P ≠ NP, it is shown that for any constant k, no polynomial-time algorithm can be guaranteed to find a consistent DFA with fewer than opt^k states, where opt is the number of states in the minimum-state DFA consistent with the sample. This result holds even if the alphabet is of constant size two, and if the algorithm is allowed to produce an NFA, a regular expression, or a regular grammar that is consistent with the sample. A similar nonapproximability result is presented for the problem of finding small consistent linear grammars. For the case of finding minimum consistent DFAs when the alphabet is not of constant size but instead is allowed to vary with the problem specification, the slightly ...
Learning in the Presence of Finitely or Infinitely Many Irrelevant Attributes
, 1995
Cited by 52 (8 self)
This paper addresses the problem of learning Boolean functions in query and mistake-bound ...
Compression, Significance and Accuracy
, 1992
Cited by 43 (5 self)
Inductive Logic Programming (ILP) involves learning relational concepts from examples and background knowledge. To date, all ILP learning systems make use of tests inherited from propositional and decision-tree learning for evaluating the significance of hypotheses. None of these significance tests takes account of the relevance or utility of the background knowledge. In this paper we describe a method, called HP-compression, of evaluating the significance of a hypothesis based on the degree to which it allows compression of the observed data with respect to the background knowledge. This can be measured by comparing the lengths of the input and output tapes of a reference Turing machine which will generate the examples from the hypothesis and a set of derivational proofs. The model extends an earlier approach of Muggleton by allowing for noise. The truth values of noisy instances are switched by making use of correction codes. The utility of compression as a significance measure is evaluated empirically in three independent domains. In particular, the results show that the existence of positive compression distinguishes a larger number of significant clauses than other significance tests. The method is also shown to reliably distinguish artificially introduced noise as incompressible data.
Mobility-Based Predictive Call Admission Control and Bandwidth Reservation in Wireless Cellular Networks
 IEEE INFOCOM
, 2001
Cited by 41 (3 self)
This paper presents call admission control and bandwidth reservation schemes in wireless cellular networks that have been developed based on assumptions more realistic than existing proposals. In order to guarantee the handoff dropping probability, we propose to statistically predict user mobility based on the mobility history of users. Our mobility prediction scheme is motivated by computational learning theory, which has shown that prediction is synonymous with data compression. We derive our mobility prediction scheme from data compression techniques that are both theoretically optimal and good in practice. In order to utilize resources more efficiently, we predict not only the cell to which the mobile will hand off but also when the handoff will occur. Based on the mobility prediction, bandwidth is reserved to guarantee some target handoff dropping probability. We also adaptively control the admission threshold to achieve a better balance between guaranteeing the handoff dropping probability and maximizing resource utilization. Simulation results show that the proposed schemes meet our design goals and outperform the static-reservation and cell-reservation schemes. Paper submitted to Computer Networks; this paper is based on a paper presented at IEEE Infocom 2001, Anchorage, Alaska, April 2001. This work was supported by a grant from Motorola Canada Ltd., and by the Canadian Natural Sciences and Engineering Research Council under grant CRDPJ 223095.
Statistical Queries and Faulty PAC Oracles
 In Proceedings of the Sixth Annual ACM Workshop on Computational Learning Theory
, 1993
Cited by 40 (6 self)
In this paper we study learning in the PAC model of Valiant [18] in which the example oracle used for learning may be faulty in one of two ways: either by misclassifying the example or by distorting the distribution of examples. We first consider models in which examples are misclassified. Kearns [12] recently showed that efficient learning in a new model using statistical queries is a sufficient condition for PAC learning with classification noise. We show that efficient learning with statistical queries is sufficient for learning in the PAC model with malicious error rate proportional to the required statistical query accuracy. One application of this result is a new lower bound for tolerable malicious error in learning monomials of k literals. This is the first such bound which is independent of the number of irrelevant attributes n. We also use the statistical query model to give sufficient conditions for using distribution specific algorithms on distributions outside their prescr...
A Formal Definition of Intelligence Based on an Intensional Variant of Algorithmic Complexity
 In Proceedings of the International Symposium of Engineering of Intelligent Systems (EIS'98)
, 1998
Cited by 30 (17 self)
Machine. Due to the current technology of the computers we can use, we have chosen an extremely abridged emulation of the machine that will effectively run the programs, instead of more proper languages, like λ-calculus (or LISP). We have adapted the "toy RISC" machine of [Hernández & Hernández 1993], with two remarkable features inherited from its object-oriented coding in C++: it is easily tunable for our needs, and it is efficient. We have made it even more reduced, removing any operand in the instruction set, even for the loop operations. We have only three registers, which are AX (the accumulator), BX and CX. The operations we have used for our experiment are in Table 1: LOOPTOP decrements CX; if it is not equal to zero, it jumps to the program top.
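An operand-free register machine of the kind the abstract describes might be emulated roughly as follows. Only LOOPTOP comes from the text (decrement CX, jump to the program top while it is nonzero); the registers AX/BX/CX are named in the abstract, but the other opcodes here (INCAX, ADDBX, SWAP) are made up for illustration and are not the paper's instruction set.

```python
def run(program, ax=0, bx=0, cx=0, max_steps=1000):
    """Tiny interpreter for an operand-free machine with registers
    AX, BX, CX, loosely modelled on the abstract's description."""
    pc, steps = 0, 0
    while pc < len(program) and steps < max_steps:
        op = program[pc]
        steps += 1
        if op == "INCAX":      # AX <- AX + 1          (hypothetical opcode)
            ax += 1
        elif op == "ADDBX":    # AX <- AX + BX         (hypothetical opcode)
            ax += bx
        elif op == "SWAP":     # exchange AX and BX    (hypothetical opcode)
            ax, bx = bx, ax
        elif op == "LOOPTOP":  # decrement CX; jump to top while CX != 0
            cx -= 1
            if cx != 0:
                pc = 0
                continue
        pc += 1
    return ax, bx, cx

# multiply by repeated addition: AX += BX, executed CX times
print(run(["ADDBX", "LOOPTOP"], ax=0, bx=3, cx=4))  # -> (12, 3, 0)
```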
A Markovian extension of Valiant's learning model (Extended Abstract)
 IN PROCEEDINGS OF THE THIRTY-FIRST SYMPOSIUM ON FOUNDATIONS OF COMPUTER SCIENCE
, 1990
Cited by 29 (0 self)
Formalizing the process of natural induction and justifying its predictive value is not only basic to the phi ...
Relevant examples and relevant features: Thoughts from computational learning theory
 In AAAI-94 Fall Symposium, Workshop on Relevance, 1994
, 1993
Cited by 25 (1 self)
It can be said that nearly all results in machine learning, whether experimental or theoretical, deal with problems of separating relevant from irrelevant information in some way. In this paper I will attempt to ...