Results 1-10 of 92
Property Testing and its connection to Learning and Approximation
Cited by 421 (57 self)
We study the question of determining whether an unknown function has a particular property or is ε-far from any function with that property. A property testing algorithm is given a sample of the value of the function on instances drawn according to some distribution, and possibly may query the function on instances of its choice. First, we establish some connections between property testing and problems in learning theory. Next, we focus on testing graph properties, and devise algorithms to test whether a graph has properties such as being k-colorable or having a ρ-clique (clique of density ρ w.r.t. the vertex set). Our graph property testing algorithms are probabilistic and make assertions which are correct with high probability, utilizing only poly(1/ε) edge-queries into the graph, where ε is the distance parameter. Moreover, the property testing algorithms can be used to efficiently (i.e., in time linear in the number of vertices) construct partitions of the graph which corre...
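As a rough illustration of the query model described in this abstract, the sketch below tests 2-colorability (bipartiteness) of a graph that is accessed only through an edge-query oracle: it samples a small vertex set and checks whether the induced subgraph is bipartite. The oracle `has_edge`, the function names, and the sample size `ceil(c / eps**2)` are assumptions made for illustration; they are not the constants or the exact procedure from the paper.

```python
import math
import random
from collections import deque


def induced_is_bipartite(vertices, has_edge):
    """2-color the subgraph induced by `vertices` with BFS, using edge queries only."""
    color = {}
    for start in vertices:
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in vertices:
                if v == u or not has_edge(u, v):
                    continue
                if v not in color:
                    color[v] = 1 - color[u]
                    queue.append(v)
                elif color[v] == color[u]:
                    return False  # odd cycle inside the sample
    return True


def bipartite_tester(n, has_edge, eps, c=4.0, rng=None):
    """Accept iff a random vertex sample induces a bipartite subgraph.

    The sample size ceil(c / eps^2) is a hypothetical choice, not the paper's bound.
    """
    rng = rng or random.Random()
    m = min(n, math.ceil(c / eps ** 2))
    sample = rng.sample(range(n), m)
    return induced_is_bipartite(sample, has_edge)
```

A bipartite graph is always accepted by this sketch, because every induced subgraph of a bipartite graph is bipartite; only graphs that are far from bipartite are expected to be rejected with high probability.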
The art of uninformed decisions: A primer to property testing
Science, 2001
Cited by 128 (20 self)
Property testing is a new field in computational theory that deals with the information that can be deduced from the input when the number of allowable queries (reads from the input) is significantly smaller than its size.
Robust PCPs of Proximity, Shorter PCPs and Applications to Coding
In Proc. 36th ACM Symp. on Theory of Computing, 2004
Cited by 80 (25 self)
We continue the study of the tradeoff between the length of PCPs and their query complexity, establishing the following main results (which refer to proofs of satisfiability of circuits of size n): 1. We present PCPs of length exp(Õ(log log n)) · n that can be verified by making o(log log n) Boolean queries.
Testing that distributions are close
In IEEE Symposium on Foundations of Computer Science, 2000
Cited by 77 (16 self)
Given two distributions over an n element set, we wish to check whether these distributions are statistically close by only sampling. We give a sublinear algorithm which uses O(n^{2/3} ε^{-4} log n) independent samples from each distribution, runs in time linear in the sample size, makes no assumptions about the structure of the distributions, and distinguishes the cases when the distance between the distributions is small (less than max(ε^2 / (32 n^{1/3}), ε / (4 √n))) or large (more than ε) in L1-distance. We also give an Ω(n^{2/3} ε^{-2/3}) lower bound. Our algorithm has applications to the problem of checking whether a given Markov process is rapidly mixing. We develop sublinear algorithms for this problem as well.
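One common ingredient in sample-only closeness testing is collision counting, which estimates the squared L2 distance between two distributions without knowing either of them. The sketch below shows that estimator only; the function names are hypothetical, and it is not claimed to reproduce the paper's full algorithm or its sample bounds.

```python
from collections import Counter


def self_collision_rate(samples):
    """Fraction of unordered sample pairs that coincide.

    For i.i.d. samples from p this is an unbiased estimate of sum_i p_i^2.
    Requires at least two samples.
    """
    m = len(samples)
    counts = Counter(samples)
    pairs = sum(c * (c - 1) // 2 for c in counts.values())
    return pairs / (m * (m - 1) / 2)


def cross_collision_rate(samples_p, samples_q):
    """Fraction of (p-sample, q-sample) pairs that coincide; estimates sum_i p_i * q_i."""
    cp, cq = Counter(samples_p), Counter(samples_q)
    hits = sum(cp[x] * cq[x] for x in cp)
    return hits / (len(samples_p) * len(samples_q))


def l2_distance_sq_estimate(samples_p, samples_q):
    """Estimate ||p - q||_2^2 = sum p_i^2 + sum q_i^2 - 2 * sum p_i q_i from samples."""
    return (self_collision_rate(samples_p)
            + self_collision_rate(samples_q)
            - 2 * cross_collision_rate(samples_p, samples_q))
```

Because the estimator is unbiased rather than clamped, it can return a small negative value when the two sample sets look alike; a tester built on it would compare the estimate against a threshold rather than interpret it as an exact distance.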
Property Testing
Handbook of Randomized Computing, Vol. II, 2000
Cited by 76 (10 self)
this technical aspect (as in the bounded-degree model the closest graph having the property must have at most dN edges and degree bound d as well).
Testing of Clustering
In Proc. 41st Annu. IEEE Sympos. Found. Comput. Sci., 2000
Cited by 60 (13 self)
A set X of points in ℝ^d is (k, b)-clusterable if X can be partitioned into k subsets (clusters) so that the diameter (alternatively, the radius) of each cluster is at most b. We present algorithms that, by sampling from a set X, distinguish between the case that X is (k, b)-clusterable and the case that X is ε-far from being (k, b′)-clusterable for any given 0 < ε ≤ 1 and for b′ ≥ b. By ε-far from being (k, b′)-clusterable we mean that more than ε · |X| points should be removed from X so that it becomes (k, b′)-clusterable. We give algorithms for a variety of cost measures that use a sample of size independent of |X|, and polynomial in k and 1/ε. Our algorithms can also be used to find approximately good clusterings. Namely, these are clusterings of all but an ε-fraction of the points in X that have optimal (or close to optimal) cost. The benefit of our algorithms is that they construct an implicit representation of such clusterings in time independ...
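To make the sampling idea concrete, here is a minimal sketch for the simplest case only: k = 1 with the diameter cost and Euclidean distance. It draws a small sample and accepts iff the sample itself has diameter at most b. The function names and the sample size `ceil(c / eps)` are assumptions for illustration; the paper's algorithms for general k, for other cost measures, and its actual sample bounds are not reproduced here.

```python
import math
import random
from itertools import combinations


def diameter_at_most(points, b):
    """True iff every pair of points is within Euclidean distance b."""
    return all(math.dist(p, q) <= b for p, q in combinations(points, 2))


def one_cluster_tester(points, b, eps, c=2.0, rng=None):
    """Accept iff a random sample of the points fits in one cluster of diameter b.

    The sample size max(2, ceil(c / eps)) is a hypothetical choice.
    """
    rng = rng or random.Random()
    m = min(len(points), max(2, math.ceil(c / eps)))
    sample = rng.sample(points, m)
    return diameter_at_most(sample, b)
```

A point set whose diameter is at most b is always accepted, since every sample inherits that bound; the running time depends on the sample size rather than on the number of points.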
Some 3CNF properties are hard to test
In Proc. 35th ACM Symp. on Theory of Computing, 2003
Cited by 58 (11 self)
For a Boolean formula ϕ on n variables, the associated property Pϕ is the collection of n-bit strings that satisfy ϕ. We study the query complexity of tests that distinguish (with high probability) between strings in Pϕ and strings that are far from Pϕ in Hamming distance. We prove that there are 3CNF formulae (with O(n) clauses) such that testing for the associated property requires Ω(n) queries, even with adaptive tests. This contrasts with 2CNF formulae, whose associated properties are always testable with O(√n) queries [E. Fischer et al., Monotonicity testing over general poset domains, in Proceedings of the 34th Annual ACM Symposium on Theory of Computing, ACM, New York, 2002, pp. 474–483]. Notice that for every negative instance (i.e., an assignment that does not satisfy ϕ) there are three bit queries that witness this fact. Nevertheless, finding such a short witness requires reading a constant fraction of the input, even when the input is very far from satisfying the formula that is associated with the property. A property is linear if its elements form a linear space. We provide sufficient conditions for linear properties to be hard to test, and in the course of the proof include the following observations which are of independent interest: 1. In the context of testing for linear properties, adaptive two-sided error tests have no more power than non-adaptive one-sided error tests. Moreover, without loss of generality, any test for a linear property is a linear test. A linear test verifies that a portion of the input satisfies a set of linear constraints, which define the property, and rejects if and only if it finds a falsified constraint. A linear test is by definition non-adaptive and, when applied to linear properties, has a one-sided error. 2. Random low density parity check codes (which are known to have linear distance and constant rate) are not locally testable. In fact, testing such a code of length n requires Ω(n) queries.
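The notion of a linear test in this abstract can be sketched as follows, assuming the property is given as a list of parity constraints over GF(2), each a tuple of bit positions that must XOR to 0. The representation, the function names, and the parameter `num_checks` are assumptions for illustration; the sketch shows the shape of a linear test and says nothing about how many queries a particular property needs.

```python
import random


def constraint_holds(word, constraint):
    """A constraint is a tuple of positions whose bits must XOR to 0."""
    parity = 0
    for i in constraint:
        parity ^= word[i]
    return parity == 0


def linear_tester(word, constraints, num_checks, rng=None):
    """Check randomly chosen constraints; reject iff one of them is falsified.

    The constraints are chosen before any bit of `word` is read, so the test
    is non-adaptive. A word satisfying every constraint is never rejected,
    which is the one-sided error mentioned in the abstract.
    """
    rng = rng or random.Random()
    chosen = [rng.choice(constraints) for _ in range(num_checks)]
    return all(constraint_holds(word, c) for c in chosen)
```

For example, the constraints `(0, 1)`, `(1, 2)`, `(2, 3)` define the two words of length 4 whose bits are all equal; `[1, 1, 1, 1]` passes every check, while `[1, 0, 1, 0]` falsifies each constraint and is rejected by any single check.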
Testing Monotonicity
1999
Cited by 57 (12 self)
We present a (randomized) test for monotonicity of Boolean functions. Namely, given the ability to query an unknown function f : {0,1}^n → {0,1} at arguments of its choice, the test always accepts a monotone f, and rejects f with high probability if it is ε-far from being monotone (i.e., every monotone function differs from f on more than an ε fraction of the domain).
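A natural test of this kind samples random hypercube edges, that is, pairs of inputs differing in one coordinate, and rejects if the function decreases along any sampled edge. The sketch below follows that idea; the function name and the parameter `trials` are assumptions, and the number of trials needed for a given ε is not taken from the abstract.

```python
import random


def monotonicity_edge_tester(f, n, trials, rng=None):
    """Sample random hypercube edges and reject iff f decreases along one.

    f maps a tuple of n bits to 0 or 1. A monotone f is always accepted,
    because it cannot decrease when a single bit is raised from 0 to 1.
    """
    rng = rng or random.Random()
    for _ in range(trials):
        x = [rng.randint(0, 1) for _ in range(n)]
        i = rng.randrange(n)
        lo = tuple(x[:i] + [0] + x[i + 1:])  # coordinate i set to 0
        hi = tuple(x[:i] + [1] + x[i + 1:])  # coordinate i set to 1
        if f(lo) > f(hi):
            return False  # violated edge: witness of non-monotonicity
    return True
```

Each trial costs two queries to f, so the total query count is `2 * trials` regardless of the 2^n domain size.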
Sampling Algorithms: Lower Bounds and Applications (Extended Abstract)
2001
Cited by 52 (2 self)
Ziv Bar-Yossef (Computer Science Division, U. C. Berkeley, Berkeley, CA 94720, zivi@cs.berkeley.edu), Ravi Kumar (IBM Almaden, 650 Harry Road, San Jose, CA 95120, ravi@almaden.ibm.com), D. Sivakumar (IBM Almaden, 650 Harry Road, San Jose, CA 95120, siva@almaden.ibm.com)
We develop a framework to study probabilistic sampling algorithms that approximate general functions of the form f : A^n → B, where A and B are arbitrary sets. Our goal is to obtain lower bounds on the query complexity of functions, namely the number of input variables x_i that any sampling algorithm needs to query to approximate f(x_1, ..., x_n). We define two quantitative properties of functions, the block sensitivity and the minimum Hellinger distance, that give us techniques to prove lower bounds on the query complexity. These techniques are quite general, easy to use, yet powerful enough to yield tight results. Our applications include the mean and higher statistical moments, the median and other selection functions, and the frequency moments, where we obtain lower bounds that are close to the corresponding upper bounds. We also point out some connections between sampling and streaming algorithms and lossy compression schemes.
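For context on what a sampling algorithm and its query complexity look like, the sketch below approximates the mean of n values in [0, 1] by querying randomly chosen positions. The sample size comes from the standard Hoeffding bound, which is an upper bound assumed here for illustration; the abstract itself concerns lower bounds, and the function names are hypothetical.

```python
import math
import random


def hoeffding_sample_size(eps, delta):
    """Samples sufficient for additive error eps with probability >= 1 - delta,
    for values in [0, 1], by Hoeffding's inequality: m >= ln(2/delta) / (2 eps^2)."""
    return math.ceil(math.log(2 / delta) / (2 * eps ** 2))


def approx_mean(query, n, eps, delta, rng=None):
    """Estimate the mean of x_0, ..., x_{n-1} using only `query(i)` lookups.

    The number of queries depends on eps and delta but not on n.
    """
    rng = rng or random.Random()
    m = hoeffding_sample_size(eps, delta)
    total = sum(query(rng.randrange(n)) for _ in range(m))
    return total / m
```

With `eps = 0.1` and `delta = 0.05` the sketch makes 185 queries whether the input has a thousand entries or a billion, which is the kind of input-size-independent query count that lower bounds in this setting are measured against.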
Monotonicity testing over general poset domains (Extended Abstract)
STOC'02, 2002
Cited by 48 (22 self)
The field of property testing studies algorithms that distinguish, using a small number of queries, between inputs which satisfy a given property, and those that are 'far' from satisfying the property. Testing properties that are defined in terms of monotonicity has been extensively investigated, primarily in the context of the monotonicity of a sequence of integers, or the monotonicity of a function over the n-dimensional hypercube {1, ..., m}^n. These works resulted in monotonicity testers whose query complexity is at most polylogarithmic in the size of the domain. We show that in its most general setting, testing that Boolean functions are close to monotone is equivalent, with respect to the number of required queries, to several other testing problems in logic and graph theory. These problems include: testing that a Boolean assignment of variables is close to an assignment that satisfies a specific 2CNF formula, testing that a set of vertices is close to one that is a vertex cover of a specific graph, and testing that a set of vertices is close to a clique. We then investigate the query complexity of monotonicity testing of both Boolean and integer functions over general partial orders. We give algorithms and lower bounds for the general problem, as well as for some interesting special cases. In proving a general lower bound, we construct graphs with combinatorial properties that may be of independent interest.