Results 1 – 8 of 8
A Complete Characterization of Statistical Query Learning with Applications to Evolvability
, 2009
Abstract

Cited by 26 (14 self)
The statistical query (SQ) learning model of Kearns is a natural restriction of the PAC learning model in which a learning algorithm is allowed to obtain estimates of statistical properties of the examples but cannot see the examples themselves [18]. We describe a new and simple characterization of the query complexity of learning in the SQ learning model. Unlike the previously known bounds on SQ learning [7, 9, 33, 3, 28], our characterization preserves the accuracy and the efficiency of learning. The preservation of accuracy implies that our characterization gives the first characterization of SQ learning in the agnostic learning framework of Haussler and of Kearns, Schapire and Sellie [15, 20]. The preservation of efficiency allows us to derive a new technique for the design of evolutionary algorithms in Valiant’s model of evolvability [32]. We use this technique to demonstrate the existence of a large class of monotone evolutionary learning algorithms based on square loss fitness estimation. These results differ significantly from the few known evolutionary algorithms and give evidence that evolvability in Valiant’s model is a more versatile phenomenon than there had been previous reason to suspect.
Interactive Submodular Set Cover
Abstract

Cited by 15 (4 self)
We introduce a natural generalization of submodular set cover and exact active learning with a finite hypothesis class (query learning). We call this new problem interactive submodular set cover. Applications include advertising in social networks with hidden information. We give an approximation guarantee for a novel greedy algorithm and give a hardness of approximation result which matches up to constant factors. We also discuss negative results for simpler approaches and present encouraging early experimental results.
Simultaneous learning and covering with adversarial noise
In ICML, 2011
Abstract

Cited by 12 (3 self)
We study simultaneous learning and covering problems: submodular set cover problems that depend on the solution to an active (query) learning problem. The goal is to jointly minimize the cost of both learning and covering. We extend recent work in this setting to allow for a limited amount of adversarial noise. Certain noisy query learning problems are a special case of our problem. Crucial to our analysis is a lemma showing the logical OR of two submodular cover constraints can be reduced to a single submodular set cover constraint. Combined with known results, this new lemma allows for arbitrary monotone circuits of submodular cover constraints to be reduced to a single constraint. As an example practical application, we present a movie recommendation website that minimizes the total cost of learning what the user wants to watch and recommending a set of movies.

1. Background. Consider a movie recommendation problem where we want to recommend to a user a small set of movies to watch. Assume first that we already have some model of the user’s taste in movies (for example, learned from the user’s ratings history or stated genre preferences). In this case, we can pose the recommendation problem as an optimization problem: using the model, we can design an objective function F(S) which measures the quality of a set of movie recommendations S ⊆ V. Our goal is then to maximize F(S) subject to a constraint on the size or cost of S (e.g. |S| ≤ k). Alternatively
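The optimization setup described in this abstract, maximizing a set function F(S) subject to |S| ≤ k, is the classic cardinality-constrained submodular maximization problem, for which the standard greedy algorithm is a natural baseline. Below is a minimal sketch of that greedy; the genre-coverage objective standing in for a learned user model is a hypothetical illustration, not code from the paper.

```python
# Sketch: greedy maximization of a monotone set function F(S) subject to
# |S| <= k. For monotone submodular F this greedy is the classic
# (1 - 1/e)-approximation.

def greedy_max(F, V, k):
    """Pick k elements from V greedily by marginal gain."""
    S = set()
    for _ in range(k):
        best = max((v for v in V if v not in S),
                   key=lambda v: F(S | {v}) - F(S))
        S.add(best)
    return S

# Hypothetical user model: each "movie" covers a set of genres the user
# likes; F(S) counts the distinct genres covered by the chosen set.
genres = {"m1": {"drama", "crime"}, "m2": {"drama"},
          "m3": {"comedy"}, "m4": {"crime", "comedy"}}

def F(S):
    covered = set()
    for m in S:
        covered |= genres[m]
    return len(covered)

picked = greedy_max(F, set(genres), k=2)  # covers all 3 genres
```

Coverage-style objectives like this one are monotone and submodular, which is exactly the structure the greedy guarantee needs.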
Query Learning and Certificates in Lattices
Abstract

Cited by 2 (2 self)
We provide an abstract version, in terms of lattices, of the Horn query learning algorithm of Angluin, Frazier, and Pitt. To validate it, we develop a proof that is independent of the propositional Horn logic structure. We also construct a certificate set for the class of lattices that generalizes and improves an earlier certificate construction and that relates very clearly to the new proof.
Canonical Horn Representations and Query Learning
Abstract

Cited by 1 (1 self)
We describe an alternative construction of an existing canonical representation for definite Horn theories, the Guigues-Duquenne basis (or GD basis), which minimizes a natural notion of implicational size. We extend the canonical representation to general Horn by providing a reduction from definite to general Horn CNF. We show how this representation relates to two topics in query learning theory: first, we show that a well-known algorithm by Angluin, Frazier and Pitt that learns Horn CNF always outputs the GD basis, independently of the counterexamples it receives; second, we build strong polynomial certificates for Horn CNF directly from the GD basis.
Computational Bounds on Statistical Query Learning
JMLR: Workshop and Conference Proceedings vol (2012) 1–23
Abstract
We study the complexity of learning in Kearns’ well-known statistical query (SQ) learning model (Kearns, 1993). A number of previous works have addressed the definition and estimation of information-theoretic bounds on the SQ learning complexity, in other words, bounds on the query complexity. Here we give the first strictly computational upper and lower bounds on the complexity of several types of learning in the SQ model. As was already observed, the known characterization of distribution-specific SQ learning (Blum et al., 1994) implies that for weak learning over a fixed distribution, the query complexity and computational complexity are essentially the same. In contrast, we show that for both distribution-specific and distribution-independent (strong) learning there exists a concept class of polynomial query complexity that is not efficiently learnable unless RP = NP. We then prove that our distribution-specific lower bound is essentially tight by showing that for every concept class C of polynomial query complexity there exists a polynomial-time algorithm that, given access to random points from any distribution D and an NP oracle, can SQ learn C over D. We also consider a restriction of the SQ model, the correlational statistical query (CSQ) model (Bshouty and Feldman, 2001; Feldman, 2008), which is closely related to Valiant’s model of evolvability (Valiant, 2007). We show a similar separation result for distribution-independent CSQ learning under a stronger assumption: there exists a concept class of polynomial CSQ query complexity which is not efficiently learnable unless every problem in W[P] has a randomized fixed-parameter tractable algorithm.
Active Learning and Submodular Functions
, 2012
Abstract
Active learning is a machine learning setting where the learning algorithm decides what data is labeled. Submodular functions are a class of set functions for which many optimization problems have efficient exact or approximate algorithms. We examine their connections.
• We propose a new class of interactive submodular optimization problems which connect and generalize submodular optimization and active learning over a finite query set. We derive greedy algorithms with approximately optimal worst-case cost. These analyses apply to exact learning, approximate learning, learning in the presence of adversarial noise, and applications that mix learning and covering.
• We consider active learning in a batch, transductive setting where the learning algorithm selects a set of examples to be labeled at once. In this setting we derive new error bounds which use symmetric submodular functions for regularization, and we give algorithms which approximately minimize these bounds.
• We consider a repeated active learning setting where the learning algorithm solves a sequence of related learning problems. We propose an approach to this problem based on a new online prediction version of submodular set cover. A common
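The connection this abstract draws between active learning and submodular set cover can be made concrete with the classic greedy for submodular set cover: add elements of best marginal gain per unit cost until a monotone objective F reaches its cover target Q. The version-space instance below (queries ruling out wrong hypotheses) is a hypothetical illustration of how query learning fits this template, not code from the thesis.

```python
# Sketch: greedy submodular set cover. F is monotone; we cover once F(S) >= Q,
# each step maximizing marginal gain per unit cost.

def greedy_cover(F, Q, V, cost):
    S = set()
    while F(S) < Q:
        best = max((v for v in V - S),
                   key=lambda v: (F(S | {v}) - F(S)) / cost[v])
        if F(S | {best}) == F(S):  # no query makes progress; give up
            break
        S.add(best)
    return S

# Toy query-learning instance: 4 hypotheses, each answering 3 binary queries;
# F(S) = number of wrong hypotheses ruled out by the queries in S, assuming
# the true hypothesis is h0. Covering (F(S) = 3) identifies h0 exactly.
hyps = {"h0": (0, 0, 0), "h1": (1, 0, 0), "h2": (0, 1, 0), "h3": (1, 1, 1)}

def F(S):
    return sum(1 for h, ans in hyps.items()
               if h != "h0" and any(ans[q] != hyps["h0"][q] for q in S))

queries = greedy_cover(F, Q=3, V={0, 1, 2}, cost={0: 1, 1: 1, 2: 1})
```

Here two queries suffice: each of the first two queries rules out two hypotheses, so the greedy reaches the cover target without asking the third.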