Results 1 - 10 of 53
The strength of weak learnability
Machine Learning, 1990
Cited by 667 (23 self)
Abstract. This paper addresses the problem of improving the accuracy of an hypothesis output by a learning algorithm in the distribution-free (PAC) learning model. A concept class is learnable (or strongly learnable) if, given access to a source of examples of the unknown concept, the learner with high probability is able to output an hypothesis that is correct on all but an arbitrarily small fraction of the instances. The concept class is weakly learnable if the learner can produce an hypothesis that performs only slightly better than random guessing. In this paper, it is shown that these two notions of learnability are equivalent. A method is described for converting a weak learning algorithm into one that achieves arbitrarily high accuracy. This construction may have practical applications as a tool for efficiently converting a mediocre learning algorithm into one that performs extremely well. In addition, the construction has some interesting theoretical consequences, including a set of general upper bounds on the complexity of any strong learning algorithm as a function of the allowed error ε.
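A rough numerical illustration of why combining hypotheses can reduce error. This is not the paper's construction. It only assumes, as a simplification, that three hypotheses each err with probability a and err independently, in which case their majority vote errs with probability 3a^2 - 2a^3, which is below a whenever 0 < a < 1/2. The function names are hypothetical.

```python
def majority_of_three_error(a):
    """Probability that at least two of three independent hypotheses,
    each wrong with probability a, are wrong on the same instance."""
    # Exactly two wrong: 3 * a^2 * (1 - a); all three wrong: a^3.
    return 3 * a**2 * (1 - a) + a**3  # simplifies to 3a^2 - 2a^3


def error_after_rounds(a, rounds):
    """Apply the majority-of-three step repeatedly (an idealized recursion
    that assumes independence holds at every level)."""
    for _ in range(rounds):
        a = majority_of_three_error(a)
    return a


if __name__ == "__main__":
    # Starting just below 1/2, the idealized error shrinks with each round.
    for rounds in range(6):
        print(rounds, error_after_rounds(0.4, rounds))
```

Under this idealization, an error rate only slightly better than random guessing shrinks toward zero as the voting step is nested.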
Learning Decision Trees using the Fourier Spectrum
1991
Cited by 182 (10 self)
This work gives a polynomial time algorithm for learning decision trees with respect to the uniform distribution. (This algorithm uses membership queries.) The decision tree model that is considered is an extension of the traditional boolean decision tree model that allows linear operations in each node (i.e., summation of a subset of the input variables over GF(2)). This paper shows how to learn in polynomial time any function that can be approximated (in the L2 norm) by a polynomially sparse function (i.e., a function with only polynomially many nonzero Fourier coefficients). The authors demonstrate that any function f whose L1 norm (i.e., the sum of the absolute values of the Fourier coefficients) is polynomial can be approximated by a polynomially sparse function, and prove that boolean decision trees with linear operations are a subset of this class of functions. Moreover, it is shown that the functions with polynomial L1 norm can be learned deterministically. The algorithm can a...
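To make the terms "Fourier coefficient" and "L1 norm" concrete, here is a minimal brute-force sketch. It is not the paper's learning algorithm, and it takes time exponential in n. It assumes the common convention that f maps {0,1}^n to {+1, -1} and that the coefficient for a subset S is the average of f(x) times the parity character (-1)^(sum of x_i for i in S). The function names are hypothetical.

```python
from itertools import product


def fourier_coefficients(f, n):
    """Return {subset_mask: coefficient} for f: {0,1}^n -> {+1, -1}.
    A subset S is encoded as a 0/1 tuple of length n."""
    points = list(product((0, 1), repeat=n))
    coeffs = {}
    for subset in product((0, 1), repeat=n):
        total = 0
        for x in points:
            # Parity of the selected input bits, i.e. their sum over GF(2).
            parity = sum(s * xi for s, xi in zip(subset, x)) % 2
            total += f(x) * (-1 if parity else 1)
        coeffs[subset] = total / len(points)
    return coeffs


def spectral_l1_norm(coeffs):
    """Sum of the absolute values of the Fourier coefficients."""
    return sum(abs(c) for c in coeffs.values())


if __name__ == "__main__":
    # A two-bit AND, written with +1 for "false" and -1 for "true".
    and2 = lambda x: -1 if (x[0] and x[1]) else 1
    coeffs = fourier_coefficients(and2, 2)
    print(coeffs, spectral_l1_norm(coeffs))
```

A single parity of input bits has exactly one nonzero coefficient under this convention, which is why a node that sums variables over GF(2) is cheap in the Fourier basis.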
Learning conjunctions of Horn clauses
In Proceedings of the 31st Annual Symposium on Foundations of Computer Science, 1990
Cited by 112 (16 self)
Abstract. An algorithm is presented for learning the class of Boolean formulas that are expressible as conjunctions of Horn clauses. (A Horn clause is a disjunction of literals, all but at most one of which is a negated variable.) The algorithm uses equivalence queries and membership queries to produce a formula that is logically equivalent to the unknown formula to be learned. The amount of time used by the algorithm is polynomial in the number of variables and the number of clauses in the unknown formula.
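The parenthetical definition of a Horn clause can be checked mechanically. The sketch below only tests whether a formula has the Horn form. It is not the query-based learning algorithm the abstract describes. The representation (a clause as a list of (variable, negated) pairs) and the function names are assumptions made for illustration.

```python
def is_horn_clause(clause):
    """clause: iterable of (variable, negated) pairs read as a disjunction.
    A Horn clause has at most one literal that is not negated."""
    positives = sum(1 for _var, negated in clause if not negated)
    return positives <= 1


def is_horn_formula(clauses):
    """A conjunction of clauses is Horn if every clause is a Horn clause."""
    return all(is_horn_clause(clause) for clause in clauses)


if __name__ == "__main__":
    # (not a or not b or c) is Horn; (a or b) is not.
    print(is_horn_clause([("a", True), ("b", True), ("c", False)]))
    print(is_horn_clause([("a", False), ("b", False)]))
```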
Extracting Comprehensible Models from Trained Neural Networks
1996
Cited by 69 (4 self)
To Mom, Dad, and Susan, for their support and encouragement.
Dynamic Power Management Using Adaptive Learning Tree
IN PROC. INTERNATIONAL CONFERENCE ON COMPUTER-AIDED DESIGN, 1999
Cited by 54 (3 self)
Dynamic Power Management (DPM) is a technique to reduce power consumption of electronic systems by selectively shutting down idle components. The quality of the shutdown control algorithm (power management policy) mostly depends on the knowledge of user behavior, which in many cases is initially unknown or nonstationary. For this reason, DPM policies should be capable of adapting to changes in user behavior. In this paper, we present a novel DPM scheme based on idle period clustering and adaptive learning trees. We also provide a design guide for applying our technique to components with multiple sleep states. Experimental results show that our technique outperforms other advanced DPM schemes as well as simple timeout policies. The proposed approach shows little deviation of efficiency for various workloads having different characteristics, while other policies show that their efficiency changes drastically depending on the trace data characteristics. Furthermore, experimental evidence indicates that our workload learning algorithm is stable and has fast convergence.
Exact Learning Boolean Functions via the Monotone Theory
INFORMATION AND COMPUTATION, 1995
Cited by 51 (3 self)
We study the learnability of boolean functions from membership and equivalence queries. We develop the Monotone Theory that proves 1) Any boolean function is learnable in polynomial time in its minimal DNF size, its minimal CNF size and the number of variables n. In particular, 2) Decision trees are learnable. Our algorithms are in the model of exact learning with membership queries and unrestricted equivalence queries. The hypotheses to the equivalence queries and the output hypotheses are depth 3 formulas.
Learnability Beyond AC^0
Cited by 35 (15 self)
We give an algorithm to learn constant-depth polynomial-size circuits augmented with majority gates under the uniform distribution using random examples only. For circuits which contain a polylogarithmic number of majority gates the algorithm runs in quasipolynomial time. This is the first algorithm for learning a more expressive circuit class than the class AC^0 of constant-depth polynomial-size circuits, a class which was shown to be learnable in quasipolynomial time by Linial, Mansour and Nisan in 1989. Our approach combines an extension of some of the Fourier analysis from Linial et al. with hypothesis boosting. We also show that under a standard cryptographic assumption our algorithm is essentially optimal with respect to both running time and expressiveness (number of majority gates) of the circuits being learned.
Learning DNF from Random Walks
IN PROCEEDINGS OF FOCS, 2003
Cited by 17 (3 self)
We consider a model of learning Boolean functions from examples generated by a uniform random walk on {0, 1}^n. We give a polynomial time algorithm for learning decision trees and DNF formulas in this model. This is the first efficient algorithm for learning these classes in a natural passive learning model where the learner has no influence over the choice of examples used for learning.
Noise-Tolerant Distribution-Free Learning of General Geometric Concepts
1996
Cited by 16 (3 self)
this paper. First, we give an algorithm to learn C
Agnostically learning decision trees
In Proceedings of the 40th Annual ACM Symposium on Theory of Computing
Cited by 16 (7 self)
We give a query algorithm for agnostically learning decision trees with respect to the uniform distribution on inputs. Given black-box access to an arbitrary binary function f on the n-dimensional hypercube, our algorithm finds a function that agrees with f on almost (within an ɛ fraction) as many inputs as the best size-t decision tree, in time poly(n, t, 1/ɛ). This is the first polynomial-time algorithm for learning decision trees in a harsh noise model. We also give a proper agnostic learning algorithm for juntas, a subclass of decision trees, again using membership queries. Conceptually, the present paper parallels recent work towards agnostic learning of halfspaces [13]; algorithmically, it is more challenging. The core of our learning algorithm is a procedure to implicitly solve a convex optimization problem over the L1 ball in 2^n dimensions using an approximate gradient projection method.