Learning juntas
In Proc. 35th Ann. ACM Symp. on the Theory of Computing, 2003
Cited by 28 (2 self)
We consider a fundamental problem in computational learning theory: learning an arbitrary Boolean function which depends on an unknown set of k out of n Boolean variables. We give an algorithm for learning such functions from uniform random examples which runs in time roughly $(n^k)^{\omega/(\omega+1)}$, where $\omega < 2.376$ is the matrix multiplication exponent. We thus obtain the first polynomial factor improvement on the naive $n^k$ time bound which can be achieved via exhaustive search. Our algorithm and analysis exploit new structural properties of Boolean functions.
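The exhaustive-search baseline mentioned in the abstract above can be illustrated with a minimal sketch. This is not the paper's algorithm: it only tries every size-k subset of the n variables and keeps the first one on which the labelled examples never contradict each other. The function name `learn_junta_exhaustive`, the demo target (x[1] XOR x[4]), and the sample size are assumptions chosen for illustration.

```python
from itertools import combinations
import random


def learn_junta_exhaustive(examples, n, k):
    """Naive search over all C(n, k) variable subsets.

    examples: list of (x, y) with x a tuple of n bits and y a bit.
    Returns (subset, table), where table maps the bits of x restricted
    to subset onto the observed label, or None if no size-k subset is
    consistent with every example.
    """
    for subset in combinations(range(n), k):
        table = {}
        consistent = True
        for x, y in examples:
            key = tuple(x[i] for i in subset)
            # setdefault records the first label seen for this key;
            # a later, different label rules the subset out.
            if table.setdefault(key, y) != y:
                consistent = False
                break
        if consistent:
            return subset, table
    return None


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, illustrative only
    n, k = 6, 2
    samples = [tuple(rng.randint(0, 1) for _ in range(n)) for _ in range(200)]
    examples = [(x, x[1] ^ x[4]) for x in samples]  # hypothetical target
    print(learn_junta_exhaustive(examples, n, k))
```

The outer loop visits on the order of $n^k$ subsets, which is the cost the abstract describes as the naive bound.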
Learning active classifiers
Proceedings of the Thirteenth International Conference on Machine Learning (ICML-96), 1996
Cited by 18 (5 self)
Most classification algorithms are "passive", in that they assign a class label to each instance based only on the description given, even if that description is incomplete. By contrast, an active classifier can, at some cost, obtain the values of missing attributes before deciding upon a class label. This can be useful when considering, for example, whether to extract some information from the web for a critical decision or whether to gather information for a medical test or experiment. The expected utility of using an active classifier depends on both the cost required to obtain the additional attribute values and the penalty incurred if the classifier outputs the wrong classification. This paper analyzes the problem of learning optimal active classifiers, using a variant of the probably-approximately-correct (PAC) model. After defining the framework, we show that this task can be achieved efficiently when the active classifier is allowed to perform only (at most) a constant number of tests. We then show that, in more general environments, the task is often intractable.
On the Fourier spectrum of symmetric Boolean functions with applications to learning symmetric juntas
In Proceedings of 20th IEEE Conference on Computational Complexity, 2005
Cited by 14 (3 self)
We study the following question: What is the smallest t such that every symmetric boolean function on k variables (which is not a constant or a parity function) has a nonzero Fourier coefficient of order at least 1 and at most t? We exclude the constant functions, for which there is no such t, and the parity functions, for which t has to be k. Let τ(k) be the smallest such t. The main contribution of this paper is a proof of the following self-similar nature of this question: If τ(l) ≤ s, then for any ɛ > 0 and for k ≥ k_0(l, ɛ), $\tau(k) \le k\left(\frac{s+1}{l+1} + \epsilon\right)$. Coupling this result with a computer-based search which establishes τ(30) = 2, one obtains that for large enough k, τ(k) ≤ 3k/31. The motivation for our work is to understand the complexity of learning symmetric juntas. A k-junta is a boolean function of n variables that depends only on an unknown subset of k variables. If f is symmetric in the variables it depends on, it is called a symmetric k-junta. Our results imply an algorithm to learn the class of symmetric k-juntas, in the uniform PAC learning model, in time approximately $n^{3k/31}$. This improves on a result of Mossel, O'Donnell and Servedio in [11], who show that symmetric k-juntas can be ...
(∗ Research supported by NSF grants CCR-0002299 and CCF-0431023.)
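To make the quantity in this abstract concrete, here is a minimal sketch that finds, for one given symmetric function, the smallest order j ≥ 1 at which it has a nonzero Fourier coefficient. It assumes the function is described by a list `g` with `g[w]` the value on inputs of Hamming weight w, and uses the standard coefficient $\hat f(S) = 2^{-k}\sum_x f(x)(-1)^{\sum_{i\in S} x_i}$, which for symmetric f depends only on |S|. The names `fourier_sum` and `smallest_nonzero_order` are hypothetical, and this is not the computer search used in the paper.

```python
from math import comb


def fourier_sum(g, k, j):
    """Return 2**k times the order-j Fourier coefficient of the
    symmetric function on k bits with f(x) = g[weight(x)].

    Inputs are grouped by a = ones among the j coordinates in S and
    b = ones among the other k - j coordinates, so the sum is exact
    integer arithmetic.
    """
    return sum(
        comb(j, a) * comb(k - j, b) * (-1) ** a * g[a + b]
        for a in range(j + 1)
        for b in range(k - j + 1)
    )


def smallest_nonzero_order(g, k):
    """Smallest j in 1..k with a nonzero order-j coefficient, or None
    when every such coefficient vanishes (constant functions)."""
    for j in range(1, k + 1):
        if fourier_sum(g, k, j) != 0:
            return j
    return None


if __name__ == "__main__":
    print(smallest_nonzero_order([0, 0, 1, 1], 3))  # majority on 3 bits
    print(smallest_nonzero_order([0, 1, 0, 1], 3))  # parity on 3 bits
```

Whether a coefficient of order at least 1 vanishes does not depend on whether outputs are encoded as {0, 1} or {−1, 1}, so either encoding of `g` gives the same answer.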
Public Key Cryptography from Different Assumptions
2008
Cited by 10 (2 self)
We construct a new public key encryption based on two assumptions: 1. One can obtain a pseudorandom generator with small locality by connecting the outputs to the inputs using any sufficiently good unbalanced expander. 2. It is hard to distinguish between a random graph that is such an expander and a random graph where a (planted) random logarithmic-sized subset S of the outputs is connected to fewer than |S| inputs. The validity and strength of the assumptions raise interesting new algorithmic and pseudorandomness questions, and we explore their relation to the current state of the art.
Improved bounds for testing juntas
In Proc. 12th Workshop RANDOM, 2008
Cited by 8 (4 self)
We consider the problem of testing functions for the property of being a k-junta (i.e., of depending on at most k variables). Fischer, Kindler, Ron, Safra, and Samorodnitsky (J. Comput. Sys. Sci., 2004) showed that $\tilde{O}(k^2)/\epsilon$ queries are sufficient to test k-juntas, and conjectured that this bound is optimal for non-adaptive testing algorithms. Our main result is a non-adaptive algorithm for testing k-juntas with $\tilde{O}(k^{3/2})/\epsilon$ queries. This algorithm disproves the conjecture of Fischer et al. We also show that the query complexity of non-adaptive algorithms for testing juntas has a lower bound of $\min(\tilde{\Omega}(k/\epsilon),\ 2 \ldots$
On Agnostic Learning of Parities, Monomials and Halfspaces
2006
Cited by 6 (1 self)
We study the learnability of several fundamental concept classes in the agnostic learning framework of Haussler [Hau92] and Kearns et al. [KSS94]. We show that under the uniform distribution, agnostically learning parities reduces to learning parities with random classification noise, commonly referred to as the noisy parity problem. Together with the parity learning algorithm of Blum et al. [BKW03], this gives the first nontrivial algorithm for agnostic learning of parities. We use similar techniques to reduce learning of two other fundamental concept classes under the uniform distribution to learning of noisy parities. Namely, we show that learning of DNF expressions reduces to learning noisy parities of just logarithmic number of variables and learning of k-juntas reduces to learning noisy parities of k variables. We give essentially optimal hardness results for agnostic learning of monomials over $\{0,1\}^n$ and halfspaces over $\mathbb{Q}^n$. We show that for any constant ɛ, finding a monomial (halfspace) that agrees with an unknown function on a 1/2 + ɛ fraction of examples is NP-hard even when there exists a monomial (halfspace) that agrees with the unknown function on a 1 − ɛ fraction of examples. This resolves an open question due to Blum and significantly improves on a number of previous hardness results for these problems. We extend these results to $\epsilon = 2^{-\log^{1-\lambda} n}$ ($\epsilon = 2^{-\sqrt{\log n}}$ in the case of halfspaces) for any constant λ > 0 under stronger complexity assumptions.
Sharper bounds for the hardness of prototype and feature selection
Proc. of the 11th International Conference on Algorithmic Learning Theory, in: Lecture Notes in Artificial Intelligence, 2000
Cited by 4 (2 self)
As pointed out by Blum [Blu94], "nearly all results in Machine Learning [...] deal with problems of separating relevant from irrelevant information in some way". This paper is concerned with structural complexity issues regarding the selection of relevant Prototypes or Features. We give the first results proving that both problems can be much harder than expected in the literature for various notions of relevance. In particular, the worst-case bounds achievable by any efficient algorithm are proven to be very large, most of the time not so far from trivial bounds. We think these results give a theoretical justification for the numerous heuristic approaches found in the literature to cope with these problems.
Learning Convex Concepts from Gaussian Distributions with PCA
Cited by 3 (1 self)
We present a new algorithm for learning a convex set in n-dimensional space given labeled examples drawn from any Gaussian distribution. The complexity of the algorithm is bounded by a fixed polynomial in n times a function of k and ɛ, where k is the dimension of the normal subspace (the span of normal vectors to supporting hyperplanes of the convex set), and the output is a hypothesis that correctly classifies at least 1 − ɛ of the unknown Gaussian distribution. For the important case when the convex set is the intersection of k halfspaces, the complexity is $\mathrm{poly}(n, k, 1/\epsilon) + n \cdot \min k^{O(\log k/\epsilon^4)} \ldots$
Learning functions of k hidden variables
Cited by 2 (0 self)
We consider a fundamental problem in computational learning theory: learning an arbitrary Boolean function which depends on an unknown set of k out of n Boolean variables. We give an algorithm for learning such functions under the uniform distribution which runs in time roughly $(n^k)^{\omega/(\omega+1)}$, where $\omega < 2.376$ is the matrix multiplication exponent. We thus obtain the first polynomial factor improvement on a naive $n^k$ time bound which can be achieved via exhaustive search. Our algorithm and analysis exploit new structural properties of Boolean functions.