Results 1–10 of 20
Learning Intersections and Thresholds of Halfspaces
Abstract

Cited by 68 (24 self)
We give the first polynomial-time algorithm to learn any function of a constant number of halfspaces under the uniform distribution to within any constant error parameter. We also give the first quasipolynomial-time algorithm for learning any function of a polylog number of polynomial-weight halfspaces under any distribution. As special cases of these results we obtain algorithms for learning intersections and thresholds of halfspaces. Our uniform-distribution learning algorithms involve a novel non-geometric approach to learning halfspaces; we use Fourier techniques together with a careful analysis of the noise sensitivity of functions of halfspaces. Our algorithms for learning under any distribution use techniques from real approximation theory to construct low-degree polynomial threshold functions.
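The noise sensitivity mentioned in this abstract has a simple operational meaning that is easy to estimate empirically. The sketch below is illustrative only (not from the paper; the `halfspace` target and all parameters are my own choices): NS_ε(f) is the probability that f disagrees on a uniform random point x and a copy y in which each coordinate of x is flipped independently with probability ε.

```python
import random

def halfspace(x):
    # A simple halfspace over {-1,+1}^n: the majority function sign(sum x_i).
    # Use odd n so the sum is never zero.
    return 1 if sum(x) > 0 else -1

def noise_sensitivity(f, n, eps, trials=20000, seed=0):
    """Monte Carlo estimate of NS_eps(f) = Pr[f(x) != f(y)], where x is
    uniform on {-1,+1}^n and y flips each coordinate of x independently
    with probability eps."""
    rng = random.Random(seed)
    disagree = 0
    for _ in range(trials):
        x = [rng.choice((-1, 1)) for _ in range(n)]
        y = [-xi if rng.random() < eps else xi for xi in x]
        if f(x) != f(y):
            disagree += 1
    return disagree / trials
```

For the majority halfspace the estimate grows roughly like √ε as ε increases, which is the kind of low noise sensitivity the abstract's Fourier-based approach exploits.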
Software Abstractions
, 2006
Abstract

Cited by 32 (2 self)
We give an algorithm that with high probability properly learns random monotone t(n)-term DNF under the uniform distribution on the Boolean cube {0, 1}^n. For any polynomially bounded function t(n) ≤ poly(n) the algorithm runs in time poly(n, 1/ε) and with high probability outputs an ε-accurate monotone DNF hypothesis. This is the first algorithm that can learn monotone DNF of arbitrary polynomial size in a reasonable average-case model of learning from random examples only.
Learning Monotone Decision Trees in Polynomial Time
, 2005
Abstract

Cited by 23 (5 self)
We give an algorithm that learns any monotone Boolean function f : {−1, 1}^n → {−1, 1} to any constant accuracy, under the uniform distribution, in time polynomial in n and in the decision tree size of f. This is the first algorithm that can learn arbitrary monotone Boolean functions to high accuracy, using random examples only, in time polynomial in a reasonable measure of the complexity of f. A key ingredient of the result is a new bound showing that the average sensitivity of any monotone function computed by a decision tree of size s must be at most √(log s). This bound has already proved to be of independent utility in the study of decision tree complexity [27]. We generalize the basic inequality and learning result described above in various ways; specifically, to partition size (a stronger complexity measure than decision tree size), to p-biased measures over the Boolean cube (rather than just the uniform distribution), and to real-valued (rather than Boolean-valued) functions.
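As an illustrative aside (not from the paper), average sensitivity is itself easy to estimate from random examples. The sketch below does so for the majority function, a monotone function chosen here purely as a convenient test case:

```python
import random

def majority(x):
    # Monotone Boolean function over {-1,+1}^n (n odd): sign of the sum
    return 1 if sum(x) > 0 else -1

def average_sensitivity(f, n, trials=4000, seed=0):
    """Monte Carlo estimate of the average sensitivity (total influence)
    of f: the expected number of coordinates i whose flip changes f(x),
    for x drawn uniformly from {-1,+1}^n."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        x = [rng.choice((-1, 1)) for _ in range(n)]
        fx = f(x)
        for i in range(n):
            x[i] = -x[i]
            if f(x) != fx:
                total += 1
            x[i] = -x[i]  # restore the bit before trying the next coordinate
    return total / trials
```

For majority on 9 variables the exact value is 9·C(8,4)/2^8 ≈ 2.46, so the estimate should land close to that.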
Learning DNF from Random Walks
 IN PROCEEDINGS OF FOCS
, 2003
Abstract

Cited by 18 (3 self)
We consider a model of learning Boolean functions from examples generated by a uniform random walk on {0, 1}^n. We give a polynomial-time algorithm for learning decision trees and DNF formulas in this model. This is the first efficient algorithm for learning these classes in a natural passive learning model, where the learner has no influence over the choice of examples used for learning.
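To make the example model concrete, here is a minimal sketch of generating labeled examples along such a walk. This is illustrative only: the paper's walk may differ in details (for instance, whether a step can leave the point unchanged), and the `parity` target is just a placeholder.

```python
import random

def parity(x):
    # Placeholder target function; any Boolean function of the bits works here
    return sum(x) % 2

def random_walk_examples(f, n, steps, seed=0):
    """Labeled examples (x, f(x)) along a uniform random walk on {0,1}^n:
    start at a uniform point, then flip one uniformly chosen coordinate
    per step.  (One natural variant of the random-walk model.)"""
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n)]
    examples = [(tuple(x), f(x))]
    for _ in range(steps):
        x[rng.randrange(n)] ^= 1  # move to a uniformly random neighbor
        examples.append((tuple(x), f(x)))
    return examples
```

Unlike i.i.d. uniform examples, consecutive points here differ in exactly one coordinate, which is precisely the extra structure the paper's algorithm exploits.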
Learning a circuit by injecting values
, 2008
Abstract

Cited by 11 (5 self)
We propose a new model for exact learning of acyclic circuits using experiments in which chosen values may be assigned to an arbitrary subset of wires internal to the circuit, but only the value of the circuit’s single output wire may be observed. We give polynomial-time algorithms to learn (1) arbitrary circuits with logarithmic depth and constant fan-in and (2) Boolean circuits of constant depth and unbounded fan-in over AND, OR, and NOT gates. Thus, both AC0 and NC1 circuits are learnable in polynomial time in this model. Negative results show that some restrictions on depth, fan-in, and gate types are necessary: exponentially many experiments are required to learn AND/OR circuits of unbounded depth and fan-in; it is NP-hard to learn AND/OR circuits of unbounded depth and fan-in 2; and it is NP-hard to learn circuits of constant depth and unbounded fan-in over AND, OR, and threshold gates, even when the target circuit is known to contain at most one threshold gate and that threshold gate has threshold 2. We also consider the effect of adding an oracle for behavioral equivalence. In this case there are polynomial-time algorithms to learn arbitrary circuits of constant fan-in and unbounded depth, and to learn Boolean circuits with arbitrary fan-in and unbounded depth over AND, OR, and NOT gates. A corollary is that these two classes are PAC-learnable if experiments are available. Finally, we consider an extension of the model called the synchronous model. We show that an even more general class of circuits is learnable in this model. In particular, we are able to learn circuits with cycles.
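A minimal sketch of what a value injection experiment looks like, on a hypothetical toy circuit (the circuit representation and wire names below are my own, not the paper's):

```python
def run_experiment(circuit, inputs, injected):
    """Evaluate a topologically ordered acyclic circuit; wires named in
    `injected` are clamped to the given value regardless of their gate.
    Only the last wire's value is returned, mirroring the model in which
    just the single output wire is observable."""
    values = {}
    for name, op, args in circuit:
        if name in injected:
            values[name] = injected[name]
        elif op == "INPUT":
            values[name] = inputs[name]
        elif op == "AND":
            values[name] = all(values[a] for a in args)
        elif op == "OR":
            values[name] = any(values[a] for a in args)
        elif op == "NOT":
            values[name] = not values[args[0]]
    return values[circuit[-1][0]]

# Hypothetical toy circuit: out = OR(AND(x, y), NOT(y))
CIRCUIT = [
    ("x", "INPUT", ()),
    ("y", "INPUT", ()),
    ("g", "AND", ("x", "y")),
    ("h", "NOT", ("y",)),
    ("out", "OR", ("g", "h")),
]
```

Clamping an internal wire such as `g` lets a learner probe how the rest of the circuit depends on it, even though `g`'s own value is never directly observed.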
The Complexity of Properly Learning Simple Concept Classes
, 2007
Abstract

Cited by 9 (1 self)
We consider the complexity of properly learning concept classes, i.e., when the learner must output a hypothesis of the same form as the unknown concept. We present the following new upper and lower bounds on well-known concept classes:
• We show that unless NP = RP, there is no polynomial-time PAC learning algorithm for DNF formulas where the hypothesis is an OR-of-thresholds. Note that as special cases, we show that neither DNF nor OR-of-thresholds are properly learnable unless NP = RP. Previous hardness results have required strong restrictions on the size of the output DNF formula. We also prove that it is NP-hard to learn the intersection of ℓ ≥ 2 halfspaces by the intersection of k halfspaces for any constant k ≥ 0. Previous work held for the case when k = ℓ.
• Assuming that NP ⊄ DTIME(2^{n^ε}) for a certain constant ε < 1, we show that it is not possible to learn size-s decision trees by size-s^k decision trees for any k ≥ 0. Previous hardness results for learning decision trees held for k ≤ 2.
• We present the first nontrivial upper bounds on properly learning DNF formulas. More specifically, we show how to learn size-s DNF by DNF in time 2^{Õ(√n log s)}. The hardness results for DNF formulas and intersections of halfspaces are obtained via specialized …
Learning large-alphabet and analog circuits with value injection queries
 In the 20th Annual Conference on Learning Theory
, 2007
Abstract

Cited by 8 (6 self)
We consider the problem of learning an acyclic discrete circuit with n wires, fan-in bounded by k, and alphabet size s using value injection queries. For the class of transitively reduced circuits, we develop the Distinguishing Paths Algorithm, which learns such a circuit using (ns)^{O(k)} value injection queries and time polynomial in the number of queries. We describe a generalization of the algorithm to the class of circuits with shortcut width bounded by b that uses (ns)^{O(k+b)} value injection queries. Both algorithms use value injection queries that fix only O(kd) wires, where d is the depth of the target circuit. We give a reduction showing that without such restrictions on the topology of the circuit, the learning problem may be computationally intractable when s = n^{Θ(1)}, even for circuits of depth O(log n). We then apply our large-alphabet learning algorithms to the problem of approximate learning of analog circuits whose gate functions satisfy a Lipschitz condition. Finally, we consider models in which behavioral equivalence queries are also available, and extend and improve the learning algorithms of [5] to handle general classes of gate functions that are polynomial-time learnable from counterexamples.
Learning restricted models of arithmetic circuits
 Theory of computing
Abstract

Cited by 5 (3 self)
We present a polynomial-time algorithm for learning a large class of algebraic models of computation. We show that any arithmetic circuit whose partial derivatives induce a low-dimensional vector space is exactly learnable from membership and equivalence queries. As a consequence, we obtain polynomial-time algorithms for learning restricted algebraic branching programs as well as non-commutative set-multilinear arithmetic formulae. In addition, we observe that the algorithms of Bergadano et al. (1996) and Beimel et al. (2000) can be used to learn depth-3 set-multilinear arithmetic circuits. Previously only versions of depth-2 arithmetic circuits were known to be learnable in polynomial time. Our learning algorithms can be viewed as solving a generalization of the well-known polynomial interpolation problem, where the unknown polynomial has a succinct representation. We can learn representations of polynomials encoding exponentially many monomials. Our techniques combine a careful algebraic analysis of the partial derivatives of arithmetic circuits with “multiplicity automata” learning algorithms due to Bergadano et al. (1997) and Beimel et al. (2000).
On learning random DNF formulas under the uniform distribution
In Proc. 9th Internat. Workshop on Randomization and Computation (RANDOM’05)
, 2005
Abstract

Cited by 5 (0 self)
We study the average-case learnability of DNF formulas in the model of learning from uniformly distributed random examples. We define a natural model of random monotone DNF formulas and give an efficient algorithm which with high probability can learn, for any fixed constant γ > 0, a random t-term monotone DNF for any t = O(n^{2−γ}). We also define a model of random non-monotone DNF and give an efficient algorithm which with high probability can learn a random t-term DNF for any t = O(n^{3/2−γ}). These are the first known algorithms that can learn a broad class of polynomial-size DNF in a reasonable average-case model of learning from random examples. ACM Classification: I.2.6, F.2.2, G.1.2, G.3
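For concreteness, here is one simple way to instantiate a "random t-term monotone DNF" (an illustrative choice, not the paper's definition; in particular, the term length k is a free parameter here, whereas the paper's model fixes its own distribution):

```python
import random

def random_monotone_dnf(n, t, k, seed=0):
    """Draw a random t-term monotone DNF over n variables: each term is a
    conjunction of k distinct variables chosen uniformly at random."""
    rng = random.Random(seed)
    return [rng.sample(range(n), k) for _ in range(t)]

def eval_monotone_dnf(terms, x):
    # x is a 0/1 assignment; a monotone DNF is satisfied if some term
    # has all of its (unnegated) variables set to 1
    return any(all(x[i] for i in term) for term in terms)
```

Because no variable is negated, raising any bit of x from 0 to 1 can only turn the value from False to True, which is the monotonicity the learning algorithms rely on.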