Results 1–10 of 63
Simulating Threshold Circuits by Majority Circuits
 SIAM Journal on Computing
, 1994
"... We prove that a single threshold gate with arbitrary weights can be simulated by an explicit polynomialsize depth 2 majority circuit. In general we show that a depth d threshold circuit can be simulated uniformly by a majority circuit of depth d + 1. Goldmann, Hastad, and Razborov showed in [10 ..."
Abstract

Cited by 37 (0 self)
 Add to MetaCart
(Show Context)
We prove that a single threshold gate with arbitrary weights can be simulated by an explicit polynomial-size depth-2 majority circuit. In general we show that a depth-d threshold circuit can be simulated uniformly by a majority circuit of depth d + 1. Goldmann, Håstad, and Razborov showed in [10] that a nonuniform simulation exists. Our construction answers two open questions posed in [10]: we give an explicit construction whereas [10] uses a randomized existence argument, and we show that such a simulation is possible even if the depth d grows with the number of variables n (the simulation in [10] gives polynomial-size circuits only when d is constant). A preliminary version of this paper appeared in Proc. 25th ACM STOC (1993), pp. 551–560. Laboratory for Computer Science, MIT, Cambridge MA 02139, Email: migo@theory.lcs.mit.edu. This author's work was done at the Royal Institute of Technology in Stockholm, and while visiting the University of Bonn. Department of Com...
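To make the two gate types concrete, here is a minimal sketch (the function names and the 3-bit example are our own illustration, not from the paper): a threshold gate outputs the sign of an arbitrarily weighted sum, and a majority gate is the special case in which every weight equals 1.

```python
def threshold_gate(weights, theta, x):
    """Threshold gate: +1 if the weighted sum reaches theta, else -1."""
    s = sum(w * xi for w, xi in zip(weights, x))
    return 1 if s >= theta else -1

def majority_gate(x):
    """Majority gate: a threshold gate with all weights equal to 1."""
    return threshold_gate([1] * len(x), 0, x)

# A gate whose weights grow like binary place values: the high-order
# input outweighs all the other inputs combined.
print(threshold_gate([4, 2, 1], 0, [1, -1, -1]))  # prints 1
print(majority_gate([1, 1, -1]))                  # prints 1
```

The point of the paper is that gates of the first kind, even with exponentially large weights, can be replaced by small circuits built only from gates of the second kind.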
Bounded Independence Fools Halfspaces
 In Proc. 50th Annual Symposium on Foundations of Computer Science (FOCS), 2009
"... We show that any distribution on {−1, +1} n that is kwise independent fools any halfspace (a.k.a. linear threshold function) h: {−1, +1} n → {−1, +1}, i.e., any function of the form h(x) = sign ( ∑n i=1 wixi − θ) where the w1,..., wn, θ are arbitrary real numbers, with error ɛ for k = O(ɛ−2 log 2 ..."
Abstract

Cited by 27 (13 self)
 Add to MetaCart
(Show Context)
We show that any distribution on {−1, +1}^n that is k-wise independent fools any halfspace (a.k.a. linear threshold function) h: {−1, +1}^n → {−1, +1}, i.e., any function of the form h(x) = sign(∑_{i=1}^n w_i x_i − θ) where w_1, ..., w_n, θ are arbitrary real numbers, with error ɛ for k = O(ɛ^{−2} log^2(1/ɛ)). Our result is tight up to log(1/ɛ) factors. Using standard constructions of k-wise independent distributions, we obtain the first explicit pseudorandom generators G: {−1, +1}^s → {−1, +1}^n that fool halfspaces. Specifically, we fool halfspaces with error ɛ and seed length s = k · log n = O(log n · ɛ^{−2} log^2(1/ɛ)). Our approach combines classical tools from real approximation theory with structural results on halfspaces by Servedio (Comput. Complexity 2007).
Halfspace matrices
 In Proc. of the 22nd Conference on Computational Complexity (CCC
, 2007
"... A halfspace matrix is a Boolean matrix A with rows indexed by linear threshold functions f, columns indexed by inputs x ∈ {−1,1} n, and the entries given by A f,x = f (x). We demonstrate the potential of halfspace matrices as tools to answer nontrivial open questions. 1. (Communication complexity) W ..."
Abstract

Cited by 24 (8 self)
 Add to MetaCart
(Show Context)
A halfspace matrix is a Boolean matrix A with rows indexed by linear threshold functions f, columns indexed by inputs x ∈ {−1,1}^n, and the entries given by A_{f,x} = f(x). We demonstrate the potential of halfspace matrices as tools to answer nontrivial open questions.

1. (Communication complexity) We exhibit a Boolean function f with discrepancy Ω(1/n^4) under every product distribution but O(√n / 2^{n/4}) under a certain non-product distribution. This partially solves an open problem of Kushilevitz and Nisan [25].

2. (Complexity of sign matrices) We construct a matrix A ∈ {−1,1}^{N × N log N} with dimension complexity log N but margin complexity Ω(N^{1/4} / √(log N)). This gap is an exponential improvement over previous work. As an application to circuit complexity, we prove an Ω(2^{n/4} / (d√n)) circuit lower bound for computing halfspaces by a majority of an arbitrary set of d gates. This complements a result of Goldmann, Håstad, and Razborov [15]. In addition, we prove new results on the complexity measures of sign matrices, complementing recent work by Linial et al. [27–29].

3. (Learning theory) We give a short and simple proof that the statistical-query (SQ) dimension of halfspaces in n dimensions is less than 2(n + 1)^2 under all distributions (with n + 1 being a trivial lower bound). This improves on the n^{O(1)} estimate from the fundamental paper of Blum et al. [5]. Finally, we motivate our learning-theoretic result for the complexity community by showing that SQ dimension estimates for natural classes of Boolean functions can resolve major open problems in complexity theory. Specifically, we show that an exp(2^{(log n)^{o(1)}}) upper bound on the SQ dimension of AC^0 would imply an explicit language in PSPACE^cc \ PH^cc.
Computational Complexity Of Neural Networks: A Survey
, 1994
"... . We survey some of the central results in the complexity theory of discrete neural networks, with pointers to the literature. Our main emphasis is on the computational power of various acyclic and cyclic network models, but we also discuss briefly the complexity aspects of synthesizing networks fr ..."
Abstract

Cited by 23 (6 self)
 Add to MetaCart
We survey some of the central results in the complexity theory of discrete neural networks, with pointers to the literature. Our main emphasis is on the computational power of various acyclic and cyclic network models, but we also discuss briefly the complexity aspects of synthesizing networks from examples of their behavior.

CR Classification: F.1.1 [Computation by Abstract Devices]: Models of Computation - neural networks, circuits; F.1.3 [Computation by Abstract Devices]: Complexity Classes - complexity hierarchies

Key words: Neural networks, computational complexity, threshold circuits, associative memory

1. Introduction

The currently again very active field of computation by "neural" networks has opened up a wealth of fascinating research topics in the computational complexity analysis of the models considered. While much of the general appeal of the field stems not so much from new computational possibilities as from the possibility of "learning", or synthesizing networks...
On PAC Learning using Winnow, Perceptron, and a Perceptron-Like Algorithm
"... In this paper we analyze the PAC learning abilities of several simple iterative algorithms for learning linear threshold functions, obtaining both positive and negative results. We show that Littlestone’s Winnow algorithm is not an efficient PAC learning algorithm for the class of positive linear th ..."
Abstract

Cited by 21 (10 self)
 Add to MetaCart
(Show Context)
In this paper we analyze the PAC learning abilities of several simple iterative algorithms for learning linear threshold functions, obtaining both positive and negative results. We show that Littlestone's Winnow algorithm is not an efficient PAC learning algorithm for the class of positive linear threshold functions. We also prove that the Perceptron algorithm cannot efficiently learn the unrestricted class of linear threshold functions even under the uniform distribution on Boolean examples. However, we show that the Perceptron algorithm can efficiently PAC learn the class of nested functions (a concept class known to be hard for Perceptron under arbitrary distributions) under the uniform distribution on Boolean examples. Finally, we give a very simple Perceptron-like algorithm for learning origin-centered halfspaces under the uniform distribution on the unit sphere in R^n. Unlike the Perceptron algorithm, which cannot learn in the presence of classification noise, the new algorithm can learn in the presence of monotonic noise (a generalization of classification noise). The new algorithm is significantly faster than previous algorithms in both the classification and monotonic noise settings.
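For reference, a minimal sketch of the classical mistake-driven Perceptron analyzed above (this is the textbook algorithm, not the paper's modified Perceptron-like algorithm; the AND example is our own illustration):

```python
def perceptron(examples, passes=100):
    """Train on (x, y) pairs with x a +1/-1 tuple and y in {+1, -1}."""
    n = len(examples[0][0])
    w, theta = [0.0] * n, 0.0
    for _ in range(passes):
        mistakes = 0
        for x, y in examples:
            s = sum(wi * xi for wi, xi in zip(w, x)) - theta
            if (1 if s >= 0 else -1) != y:
                # On a mistake, move toward the example: w += y*x, theta -= y.
                w = [wi + y * xi for wi, xi in zip(w, x)]
                theta -= y
                mistakes += 1
        if mistakes == 0:  # separating hyperplane found
            break
    return w, theta

# AND of two +1/-1 bits is a positive linear threshold function,
# so the Perceptron convergence theorem guarantees a consistent output.
data = [((1, 1), 1), ((1, -1), -1), ((-1, 1), -1), ((-1, -1), -1)]
w, theta = perceptron(data)
preds = [1 if sum(wi * xi for wi, xi in zip(w, x)) - theta >= 0 else -1
         for x, _ in data]
print(preds)  # prints [1, -1, -1, -1]
```

The paper's negative results concern the number of examples such mistake-driven updates need on hard distributions, not correctness on separable data like this.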
Every linear threshold function has a low-weight approximator
 In Proceedings of the 21st Conference on Computational Complexity (CCC
, 2006
"... Given any linear threshold function f on n Boolean variables, we construct a linear threshold function g which disagrees with f on at most an ɛ fraction of inputs and has integer weights each of magnitude at most √ n · 2 Õ(1/ɛ2). We show that the construction is optimal in terms of its dependence on ..."
Abstract

Cited by 19 (7 self)
 Add to MetaCart
Given any linear threshold function f on n Boolean variables, we construct a linear threshold function g which disagrees with f on at most an ɛ fraction of inputs and has integer weights each of magnitude at most √n · 2^{Õ(1/ɛ^2)}. We show that the construction is optimal in terms of its dependence on n by proving a lower bound of Ω(√n) on the weights required to approximate a particular linear threshold function. We give two applications. The first is a deterministic algorithm for approximately counting the fraction of satisfying assignments to an instance of the zero-one knapsack problem to within an additive ±ɛ. The algorithm runs in time polynomial in n (but exponential in 1/ɛ^2). In our second application, we show that any linear threshold function f is specified to within error ɛ by estimates of its Chow parameters (degree-0 and -1 Fourier coefficients) which are accurate to within an additive ±1/(n · 2^{Õ(1/ɛ^2)}). This is the first such accuracy bound which is inverse polynomial in n (previous work of Goldberg [12] gave a 1/quasipoly(n) bound), and gives the first polynomial bound (in terms of n) on the number of examples required for learning linear threshold functions in the "restricted focus of attention" framework.
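The Chow parameters mentioned above are just the degree-0 and degree-1 Fourier coefficients E[f(x)] and E[f(x)·x_i]. A brute-force sketch for small n (the majority-of-3 example is our own illustration, not from the paper):

```python
from itertools import product

def chow_parameters(f, n):
    """Exact Chow parameters of f: {-1,+1}^n -> {-1,+1} by enumeration."""
    pts = list(product([-1, 1], repeat=n))
    c0 = sum(f(x) for x in pts) / len(pts)                 # E[f(x)]
    cs = [sum(f(x) * x[i] for x in pts) / len(pts)         # E[f(x) * x_i]
          for i in range(n)]
    return c0, cs

# Majority on 3 bits: balanced, with equal degree-1 coefficients by symmetry.
maj3 = lambda x: 1 if sum(x) >= 0 else -1
c0, cs = chow_parameters(maj3, 3)
print(c0, cs)  # prints 0.0 [0.5, 0.5, 0.5]
```

The paper's point is that these n + 1 numbers determine a threshold function up to ɛ-error, and with the stated accuracy they can be estimated from uniform random examples.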
Lectures on 0/1-polytopes
 Polytopes — combinatorics and computation (Oberwolfach, 1997), volume 29 of DMV Seminar
, 2000
"... These lectures on the combinatorics and geometry of 0/1polytopes are meant as an introduction and invitation. Rather than heading for an extensive survey on 0/1polytopes I present some interesting aspects of these objects; all of them are related to some quite recent work and progress. 0/1polytope ..."
Abstract

Cited by 18 (1 self)
 Add to MetaCart
(Show Context)
These lectures on the combinatorics and geometry of 0/1-polytopes are meant as an introduction and invitation. Rather than heading for an extensive survey on 0/1-polytopes I present some interesting aspects of these objects; all of them are related to some quite recent work and progress. 0/1-polytopes have a very simple definition and explicit descriptions; we can enumerate and analyze small examples explicitly in the computer (e.g. using polymake). However, any intuition that is derived from the analysis of examples in "low dimensions" will miss the true complexity of 0/1-polytopes. Thus, in the following we will study several aspects of the complexity of higher-dimensional 0/1-polytopes: the doubly-exponential number of combinatorial types, the number of facets which can be huge, and the coefficients of defining inequalities which sometimes turn out to be extremely large. Some of the effects and results will be backed by proofs in the course of these lectures; we will also be able to verify some of them on explicit examples, which are accessible as a polymake database.
Anti-Hadamard matrices, coin weighing, threshold gates, and indecomposable hypergraphs
 Journal of Combinatorial Theory
, 1997
"... Let χ1(n) denote the maximum possible absolute value of an entry of the inverse of 1 ( an n by n invertible matrix with 0, 1 entries. It is proved that χ1(n) = n 2 +o(1))n. This solves a problem of Graham and Sloane. Let m(n) denote the maximum possible number m such that given a set of m coins out ..."
Abstract

Cited by 18 (1 self)
 Add to MetaCart
(Show Context)
Let χ1(n) denote the maximum possible absolute value of an entry of the inverse of an n by n invertible matrix with 0, 1 entries. It is proved that χ1(n) = n^{(1/2 + o(1))n}. This solves a problem of Graham and Sloane. Let m(n) denote the maximum possible number m such that given a set of m coins out of a collection of coins of two unknown distinct weights, one can decide if all the coins have the same weight or not using n weighings in a regular balance beam. It is shown that m(n) = n^{(1/2 + o(1))n}. This settles a problem of Kozlov and Vũ. Let D(n) denote the maximum possible degree of a regular multihypergraph on n vertices that contains no proper regular nonempty subhypergraph. It is shown that D(n) = n^{(1/2 + o(1))n}. This improves estimates of Shapley, van Lint and Pollak. All these results and several related ones are proved by a similar technique whose main ingredient is an extension of a construction of Håstad of threshold gates that require large weights.
Neural Networks and Complexity Theory
 In Proc. 17th International Symposium on Mathematical Foundations of Computer Science
, 1992
"... . We survey some of the central results in the complexity theory of discrete neural networks, with pointers to the literature. 1 Introduction The recently revived field of computation by "neural" networks provides the complexity theorist with a wealth of fascinating research topics. Whi ..."
Abstract

Cited by 16 (4 self)
 Add to MetaCart
(Show Context)
We survey some of the central results in the complexity theory of discrete neural networks, with pointers to the literature.

1 Introduction

The recently revived field of computation by "neural" networks provides the complexity theorist with a wealth of fascinating research topics. While much of the general appeal of the field stems not so much from new computational possibilities as from the possibility of "learning", or synthesizing networks directly from examples of their desired input-output behavior, it is nevertheless important to pay attention also to the complexity issues: firstly, what kinds of functions are computable by networks of a given type and size, and secondly, what is the complexity of the synthesis problems considered. In fact, inattention to these issues was a significant factor in the demise of the first stage of neural networks research in the late 60's, under the criticism of Minsky and Papert [51]. The intent of this paper is to survey some of the centra...
Improved approximation of linear threshold functions
 In Proc. 24th Annual IEEE Conference on Computational Complexity (CCC
, 2009
"... We prove two main results on how arbitrary linear threshold functions f(x) = sign(w · x − θ) over the ndimensional Boolean hypercube can be approximated by simple threshold functions. Our first result shows that every nvariable threshold function f is ɛclose to a threshold function depending only ..."
Abstract

Cited by 15 (9 self)
 Add to MetaCart
We prove two main results on how arbitrary linear threshold functions f(x) = sign(w · x − θ) over the n-dimensional Boolean hypercube can be approximated by simple threshold functions. Our first result shows that every n-variable threshold function f is ɛ-close to a threshold function depending only on Inf(f)^2 · poly(1/ɛ) many variables, where Inf(f) denotes the total influence or average sensitivity of f. This is an exponential sharpening of Friedgut's well-known theorem [Fri98], which states that every Boolean function f is ɛ-close to a function depending only on 2^{O(Inf(f)/ɛ)} many variables, for the case of threshold functions. We complement this upper bound by showing that Ω(Inf(f)^2 + 1/ɛ^2) many variables are required for ɛ-approximating threshold functions. Our second result is a proof that every n-variable threshold function is ɛ-close to a threshold function with integer weights at most poly(n) · 2^{Õ(1/ɛ^{2/3})}. This is an improvement, in the dependence on the error parameter ɛ, on an earlier result of [Ser07] which gave a poly(n) · 2^{Õ(1/ɛ^2)} bound. Our improvement is obtained via a new proof technique that uses strong anti-concentration bounds from probability theory. The new technique also gives a simple and modular proof of the original [Ser07] result, and extends to give low-weight approximators for threshold functions under a range of probability distributions other than the uniform distribution.
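The quantity Inf(f), the total influence or average sensitivity in the first result above, is the expected number of coordinates whose flip changes f's value. A brute-force sketch for small n (our own illustration, not code from the paper):

```python
from itertools import product

def total_influence(f, n):
    """Average over x of the number of coordinates i with f(x) != f(x^i)."""
    pts = list(product([-1, 1], repeat=n))
    total = 0
    for x in pts:
        for i in range(n):
            y = list(x)
            y[i] = -y[i]  # flip the i-th coordinate
            if f(x) != f(tuple(y)):
                total += 1
    return total / len(pts)

# For 3-bit majority each variable is pivotal exactly when the other two
# disagree (probability 1/2), so the total influence is 3 * 1/2.
maj3 = lambda x: 1 if sum(x) >= 0 else -1
print(total_influence(maj3, 3))  # prints 1.5
```

Majority has the largest total influence, Θ(√n), among threshold functions, which is why the paper's Inf(f)^2 · poly(1/ɛ) junta bound is meaningful for low-influence threshold functions.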