Results 1-10 of 46
Testing of Clustering
 In Proc. 41st Annu. IEEE Sympos. Found. Comput. Sci.
, 2000
Cited by 60 (13 self)
A set X of points in ℝ^d is (k, b)-clusterable if X can be partitioned into k subsets (clusters) so that the diameter (alternatively, the radius) of each cluster is at most b. We present algorithms that, by sampling from a set X, distinguish between the case that X is (k, b)-clusterable and the case that X is ε-far from being (k, b')-clusterable, for any given 0 < ε ≤ 1 and for b' ≥ b. By ε-far from being (k, b')-clusterable we mean that more than ε · |X| points must be removed from X for it to become (k, b')-clusterable. We give algorithms for a variety of cost measures that use a sample of size independent of |X| and polynomial in k and 1/ε. Our algorithms can also be used to find approximately good clusterings, namely clusterings of all but an ε-fraction of the points in X that have optimal (or close to optimal) cost. The benefit of our algorithms is that they construct an implicit representation of such clusterings in time independ...
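The sampling idea in this abstract can be illustrated with a minimal sketch for the one-dimensional, diameter-cost case with b' = b. This is not the paper's algorithm: the function names and the `sample_size` parameter are hypothetical, and the sample size is left to the caller rather than derived from k and ε. The sketch draws a uniform sample and accepts exactly when the sample itself is (k, b)-clusterable. Because every subset of a (k, b)-clusterable set is also (k, b)-clusterable, a clusterable input is never rejected.

```python
import random


def is_clusterable_1d(points, k, b):
    """Return True if the 1-D points fit into at most k clusters of diameter <= b.

    Greedy sweep over the sorted points: open a new cluster whenever the
    current point lies more than b away from the leftmost point of the
    cluster that is currently open.
    """
    clusters = 0
    start = None  # leftmost point of the currently open cluster
    for p in sorted(points):
        if start is None or p - start > b:
            clusters += 1
            start = p
            if clusters > k:
                return False
    return True


def sample_test(points, k, b, sample_size, rng=None):
    """Accept iff a uniform random sample of the points is (k, b)-clusterable.

    sample_size is a hypothetical parameter chosen by the caller; it is
    not the bound from the paper.
    """
    rng = rng or random.Random()
    m = min(sample_size, len(points))
    sample = rng.sample(points, m)
    return is_clusterable_1d(sample, k, b)
```

For example, `sample_test([0.0, 0.1, 0.2, 5.0, 5.1], k=2, b=0.5, sample_size=3)` always accepts, since the full set already splits into two clusters of diameter at most 0.5.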
Monotonicity testing over general poset domains (Extended Abstract)
 STOC'02
, 2002
Cited by 48 (22 self)
The field of property testing studies algorithms that distinguish, using a small number of queries, between inputs which satisfy a given property, and those that are ‘far’ from satisfying the property. Testing properties that are defined in terms of monotonicity has been extensively investigated, primarily in the context of the monotonicity of a sequence of integers, or the monotonicity of a function over the d-dimensional hypercube {1, ..., m}^d. These works resulted in monotonicity testers whose query complexity is at most polylogarithmic in the size of the domain. We show that in its most general setting, testing that Boolean functions are close to monotone is equivalent, with respect to the number of required queries, to several other testing problems in logic and graph theory. These problems include: testing that a Boolean assignment of variables is close to an assignment that satisfies a specific 2CNF formula, testing that a set of vertices is close to one that is a vertex cover of a specific graph, and testing that a set of vertices is close to a clique. We then investigate the query complexity of monotonicity testing of both Boolean and integer functions over general partial orders. We give algorithms and lower bounds for the general problem, as well as for some interesting special cases. In proving a general lower bound, we construct graphs with combinatorial properties that may be of independent interest.
Improved Testing Algorithms for Monotonicity
, 1999
Cited by 37 (8 self)
We present improved algorithms for testing monotonicity of functions. Namely, given the ability to query an unknown function f: Σ^n → Ξ, where Σ and Ξ are finite ordered sets, the test always accepts a monotone f, and rejects f with high probability if it is ε-far from being monotone (i.e., every monotone function differs from f on more than an ε fraction of the domain). For any ε > 0, the query and time complexities of the test are O((n/ε) · log|Σ| · log|Ξ|). The previous best known bound was ~
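The one-sided behaviour described above (a monotone f is always accepted) can be seen in a minimal sketch of the simplest special case, Σ = Ξ = {0, 1}. This is the classic edge-sampling idea, not the improved algorithm of the paper. The function name and the `num_pairs` parameter are hypothetical, and no claim is made here about how many pairs are needed for a given ε.

```python
import random


def edge_tester(f, n, num_pairs, rng=None):
    """Sample random hypercube edges and reject on any monotonicity violation.

    f maps a tuple in {0, 1}^n to 0 or 1. Each round picks a random point
    and a random coordinate i, then compares f on the two points that
    differ only in coordinate i. A monotone f can never be rejected.
    """
    rng = rng or random.Random()
    for _ in range(num_pairs):
        x = [rng.randint(0, 1) for _ in range(n)]
        i = rng.randrange(n)
        lo = tuple(x[:i] + [0] + x[i + 1:])  # coordinate i set to 0
        hi = tuple(x[:i] + [1] + x[i + 1:])  # coordinate i set to 1
        if f(lo) > f(hi):
            return False  # found a violated edge: f is not monotone
    return True
```

A monotone function such as `lambda x: int(sum(x) >= 2)` passes for every choice of random bits, whereas a function that decreases along some edges is rejected as soon as one of those edges is sampled.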
Algorithmic and Analysis Techniques in Property Testing
Cited by 24 (3 self)
Property testing algorithms are “ultra”-efficient algorithms that decide whether a given object (e.g., a graph) has a certain property (e.g., bipartiteness), or is significantly different from any object that has the property. To this end, property testing algorithms are given the ability to perform (local) queries to the input, though the decision they need to make usually concerns properties of a global nature. In the last two decades, property testing algorithms have been designed for many types of objects and properties, among them graph properties, algebraic properties, geometric properties, and more. In this article we survey results in property testing, where our emphasis is on common analysis and algorithmic techniques. Among the techniques surveyed are the following:
• The self-correcting approach, which was mainly applied in the study of property testing of algebraic properties;
• The enforce-and-test approach, which was applied quite extensively in the analysis of algorithms for testing graph properties (in the dense-graphs model), as well as in other contexts;
Sublinear time algorithms
 SIGACT News
, 2003
Cited by 22 (2 self)
Sublinear time algorithms represent a new paradigm in computing, where an algorithm must give some sort of an answer after inspecting only a very small portion of the input. We discuss the sorts of answers that one might be able to achieve in this new setting.
1 Introduction
The goal of algorithmic research is to design efficient algorithms, where efficiency is typically measured as a function of the length of the input. For instance, the elementary school algorithm for multiplying two n-digit integers takes roughly n^2 steps, while more sophisticated algorithms have been devised which run in less than n log^2 n steps. It is still not known whether a linear time algorithm is achievable for integer multiplication. Obviously any algorithm for this task, as for any other nontrivial task, would need to take at least linear time in n, since this is what it would take to read the entire input and write the output. Thus, showing the existence of a linear time algorithm for a problem was traditionally considered to be the gold standard of achievement.
Nevertheless, due to the recent tremendous increase in computational power that is inundating us with a multitude of data, we are now encountering a paradigm shift from traditional computational models. The scale of these data sets, coupled with the typical situation in which there is very little time to perform our computations, raises the issue of whether there is time to consider any more than a minuscule fraction of the data in our computations. Analogous to the reasoning that we used for multiplication, for most natural problems, an algorithm which runs in sublinear time must necessarily use randomization and must give an answer which is in some sense imprecise. Nevertheless, there are many situations in which a fast approximate solution is more useful than a slower exact solution.
Information theory in property testing and monotonicity testing in higher dimension
 Information and Computation
, 2006
Cited by 20 (2 self)
In general property testing, we are given oracle access to a function f, and we wish to randomly test if the function satisfies a given property P, or is ε-far from having that property. In a more general setting, the domain on which the function is defined is equipped with a probability distribution, which assigns different weight to different elements in the distance function. This paper relates the complexity of testing the monotonicity of a function over the d-dimensional cube to the Shannon entropy of the underlying distribution. We provide an improved upper bound on the property tester query complexity and we fine-tune the exponential dependence on the dimension d.
Sublinear algorithms for testing monotone and unimodal distributions
 Proceedings of the 36th STOC
, 2004
Cited by 19 (7 self)
The complexity of testing properties of monotone and unimodal distributions, when given access only to samples of the distribution, is investigated. Two kinds of sublinear-time algorithms (those for testing monotonicity and those that take advantage of monotonicity) are provided. The first algorithm tests if a given distribution on [n] is monotone or far away from any monotone distribution in L1-norm; this algorithm uses Õ(√n) samples and is shown to be nearly optimal. The next algorithm, given a joint distribution on [n]×[n], tests if it is monotone or is far away from any monotone distribution in L1-norm; this algorithm uses Õ(n^{3/2}) samples. The problems of testing if two monotone distributions are close in L1-norm and if two random variables with a monotone joint distribution are close to being independent in L1-norm are also considered. Algorithms for these problems that use only poly(log n) samples are presented. The closeness and independence testing algorithms for monotone distributions are significantly more efficient than the corresponding algorithms as well as the lower bounds for arbitrary distributions. Some of the above results are also extended to unimodal distributions.
Testing halfspaces
 In Proc. 20th Annual Symposium on Discrete Algorithms (SODA)
, 2009
Cited by 19 (9 self)
This paper addresses the problem of testing whether a Boolean-valued function f is a halfspace, i.e. a function of the form f(x) = sgn(w · x − θ). We consider halfspaces over the continuous domain ℝ^n (endowed with the standard multivariate Gaussian distribution) as well as halfspaces over the Boolean cube {−1, 1}^n (endowed with the uniform distribution). In both cases we give an algorithm that distinguishes halfspaces from functions that are ε-far from any halfspace using only poly(1/ε) queries, independent of the dimension n. Two simple structural results about halfspaces are at the heart of our approach for the Gaussian distribution: the first gives an exact relationship between the expected value of a halfspace f and the sum of the squares of f’s degree-1 Hermite coefficients, and the second shows that any function that approximately satisfies this relationship is close to a halfspace. We prove analogous results for the Boolean cube {−1, 1}^n (with Fourier coefficients in place of Hermite coefficients) for balanced halfspaces in which all degree-1 Fourier coefficients are small. Dealing with general halfspaces over {−1, 1}^n poses significant additional complications and requires other ingredients. These include “cross-consistency” versions of the results mentioned above for pairs of halfspaces with the same weights but different thresholds; new structural results relating the largest degree-1 Fourier coefficient and the largest weight in unbalanced halfspaces; and algorithmic techniques from recent work on testing juntas [FKR+02].
Estimating the distance to a monotone function
 Proc. 8th RANDOM
, 2004
Cited by 19 (4 self)
In standard property testing, the task is to distinguish between objects that have a property P and those that are ε-far from P, for some ε > 0. In this setting, it is perfectly acceptable for the tester to provide a negative answer for every input object that does not satisfy P. This implies that property testing in and of itself cannot be expected to yield any information whatsoever about the distance from the object to the property. We address this problem in this paper, restricting our attention to monotonicity testing. A function f: {1, ..., n} → ℝ is at distance ε_f from being monotone if it can (and must) be modified at ε_f n places to become monotone. For any fixed δ > 0, we compute, with probability at least 2/3, an interval [(1/2 − δ)ε, ε] that encloses ε_f. The running time of our algorithm is O(ε_f^{-1} log log ε_f^{-1} log n), which is optimal within a factor of log log ε_f^{-1} and represents a substantial improvement over previous work. We give a second algorithm with an expected running time of O(ε_f^{-1} log n log log log n). Finally, we extend our results to multivariate functions.
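To make the quantity ε_f concrete, here is a minimal sketch that computes it exactly for a function given as a full list of values, taking "monotone" to mean non-decreasing. This is a plain O(n log n) baseline that reads the whole input, not the sublinear estimator of the paper, and the function name is hypothetical. It relies on a standard fact: the positions left unchanged must form a non-decreasing subsequence, and every other position can be overwritten with the value of its nearest kept predecessor, so the minimum number of modifications is n minus the length of a longest non-decreasing subsequence.

```python
from bisect import bisect_right


def distance_to_monotone(values):
    """Return ε_f: the minimum fraction of positions to change so that
    the sequence becomes non-decreasing.

    tails[j] holds the smallest possible last value of a non-decreasing
    subsequence of length j + 1 seen so far (patience-sorting style).
    """
    tails = []
    for v in values:
        pos = bisect_right(tails, v)  # bisect_right lets equal values extend a run
        if pos == len(tails):
            tails.append(v)
        else:
            tails[pos] = v
    n = len(values)
    return (n - len(tails)) / n if n else 0.0
```

For the sequence `[1, 5, 2, 3, 4]` only the value 5 has to change, so `distance_to_monotone([1, 5, 2, 3, 4])` returns 0.2.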
Estimating the sortedness of a data stream
 In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms
, 2007
Cited by 17 (2 self)
The distance to monotonicity of a sequence is the minimum number of edit operations required to transform the sequence into an increasing order; this measure is complementary to the length of the longest increasing subsequence (LIS). We address the question of estimating these quantities in the one-pass data stream model and present the first sublinear space algorithms for both problems. We first present O(√n)-space deterministic algorithms that approximate the distance to monotonicity and the LIS to within a factor that is arbitrarily close to 1. We also show a lower bound of Ω(n) on the space required by any randomized algorithm to compute the LIS (or alternatively the distance from monotonicity) exactly, demonstrating that approximation is necessary for sublinear space computation; this bound improves upon the existing lower bound of Ω(√n) [LNVZ06]. Our main result is a randomized algorithm that uses only O(log^2 n) space and approximates the distance to monotonicity to within a factor that is arbitrarily close to 4. In contrast, we believe that any significant reduction in the space complexity for approximating the length of the LIS is considerably hard. We conjecture that any deterministic (1 + ε)-approximation algorithm for LIS requires Ω(√n) space, and as a step towards this conjecture, prove a space lower bound of Ω(√n) for a restricted yet natural class of deterministic algorithms.