Results 1-10 of 26
Three Theorems regarding Testing Graph Properties
2001
Cited by 73 (10 self)
Property testing is a relaxation of decision problems in which it is required to distinguish yes-instances (i.e., objects having a predetermined property) from instances that are far from any yes-instance. We present three theorems regarding testing graph properties in the adjacency matrix representation. More specifically, these theorems relate to the project of characterizing graph properties according to the complexity of testing them (in the adjacency matrix representation). The first theorem is that there exist monotone graph properties in NP for which testing is very hard (i.e., requires examining a constant fraction of the entries in the matrix). The second theorem is that every graph property that can be tested making a number of queries that is independent of the size of the graph can be so tested by uniformly selecting a set of vertices and accepting iff the induced subgraph has some fixed graph property (which is not necessarily the same as the one being tested). The third theorem refers to the framework of graph partition problems, and is a characterization of the subclass of properties that can be tested using a one-sided error tester making a number of queries that is independent of the size of the graph.
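The shape of the tester described in the second theorem (sample vertices uniformly, then decide from the induced subgraph alone) can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the graph is a 0/1 adjacency matrix, and bipartiteness is used as the example subgraph property because it is a standard testable property, not because the abstract singles it out.

```python
import random
from collections import deque


def induced_subgraph(adj, vertices):
    """Adjacency matrix of the subgraph induced by the given vertices."""
    return [[adj[u][v] for v in vertices] for u in vertices]


def is_bipartite(matrix):
    """2-colour the graph by BFS; fail on any edge inside a colour class."""
    color = {}
    for start in range(len(matrix)):
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v, edge in enumerate(matrix[u]):
                if not edge or v == u:
                    continue
                if v not in color:
                    color[v] = 1 - color[u]
                    queue.append(v)
                elif color[v] == color[u]:
                    return False
    return True


def canonical_test(adj, sample_size, subgraph_property, rng=None):
    """Sample vertices uniformly and accept iff the induced subgraph
    satisfies the fixed property `subgraph_property` (hypothetical name)."""
    rng = rng or random.Random()
    sample = rng.sample(range(len(adj)), min(sample_size, len(adj)))
    return subgraph_property(induced_subgraph(adj, sample))
```

Because every induced subgraph of a bipartite graph is bipartite, this particular instance never rejects a yes-instance; the sample size needed to reject far instances with good probability is not derived here.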
Testing random variables for independence and identity
 Proceedings of the 41st Annual Symposium on Foundations of Computer Science
2000
Cited by 54 (18 self)
Given access to independent samples of a distribution A over [n] × [m], we show how to test whether the distributions formed by projecting A to each coordinate are independent, i.e., whether A is ε-close in the L1 norm to the product distribution A1 × A2 for some distributions A1 over [n] and A2 over [m]. The sample complexity of our test is Õ(n^{2/3} m^{1/3} poly(ε^{-1})), assuming without loss of generality that m ≤ n. We also give a matching lower bound, up to poly(log n, ε^{-1}) factors. Furthermore, given access to samples of a distribution X over [n], we show how to test if X is ε-close in L1 norm to an explicitly specified distribution Y. Our test uses Õ(n^{1/2} poly(ε^{-1})) samples, which nearly matches the known tight bounds for the case when Y is uniform.
Testing Juntas
2002
Cited by 48 (9 self)
We show that a Boolean function over n Boolean variables can be tested for the property of depending on only k of them, using a number of queries that depends only on k and the approximation parameter ε. We present two tests, both non-adaptive, that require a number of queries that is polynomial in k and linear in 1/ε. The first test is stronger in that it has a 1-sided error, while the second test has a more compact analysis. We also present an adaptive version and a 2-sided error version of the first test, which have a somewhat better query complexity than the other algorithms...
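A basic building block for tests of this kind is a check of whether a function depends on any variable in a given set of coordinates. The sketch below illustrates that idea only; it is not the paper's full junta test, and the function name and parameters are hypothetical. It draws a random input, re-randomizes the chosen coordinates, and reports a dependence if the function value changes.

```python
import random


def finds_dependence(f, n, coords, trials, rng=None):
    """Return True if some sampled pair (x, y), differing only inside
    `coords`, has f(x) != f(y). A True answer is a proof of dependence;
    False only means no witness was found in `trials` attempts."""
    rng = rng or random.Random()
    coords = list(coords)
    for _ in range(trials):
        x = [rng.randint(0, 1) for _ in range(n)]
        y = list(x)
        for i in coords:
            # Re-randomize only the coordinates under examination.
            y[i] = rng.randint(0, 1)
        if f(x) != f(y):
            return True
    return False
```

The check has one-sided error in the sense that it never reports a dependence on a set of variables the function ignores.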
Approximating the Minimum Spanning Tree Weight in Sublinear Time
 In Proceedings of the 28th Annual International Colloquium on Automata, Languages and Programming (ICALP)
2001
Cited by 42 (6 self)
We present a probabilistic algorithm that, given a connected graph G (represented by adjacency lists) of average degree d, with edge weights in the set {1,...,w}, and given a parameter 0 < ε < 1/2, estimates in time O(dwε^{-2} log(dw/ε)) the weight of the minimum spanning tree of G with a relative error of at most ε. Note that the running time does not depend on the number of vertices in G. We also prove a nearly matching lower bound of Ω(dwε^{-2}) on the probe and time complexity of any approximation algorithm for MST weight. The essential component of our algorithm is a procedure for estimating in time O(dε^{-2} log(d/ε)) the number of connected components of an unweighted graph to within an additive error of εn. (This becomes O(ε^{-2} log(1/ε)) for d = O(1).) The time bound is shown to be tight up to within the log(d/ε) factor. Our connected-components algorithm picks O(1/ε^2) vertices in the graph and then grows "local spanning trees" whose sizes are specified by a stochastic process. From the local information collected in this way, the algorithm is able to infer, with high confidence, an estimate of the number of connected components. We then show how estimates on the number of components in various subgraphs of G can be used to estimate the weight of its MST.
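The idea behind the connected-components estimate can be sketched in a simplified form. The number of components equals the sum over all vertices u of 1/(size of u's component), so sampling vertices and averaging that quantity gives an estimate. The sketch below is an assumption-laden simplification: it replaces the paper's stochastic growth process with a fixed cutoff of about 2/ε vertices per local search, and all names are hypothetical.

```python
import random
from collections import deque


def truncated_component_size(adj_lists, start, cap):
    """BFS from `start`; return the component size, or None once more
    than `cap` vertices have been reached."""
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj_lists[u]:
            if v not in seen:
                seen.add(v)
                if len(seen) > cap:
                    return None
                queue.append(v)
    return len(seen)


def estimate_components(adj_lists, eps, num_samples, rng=None):
    """Estimate the number of connected components from local searches.
    Components larger than the cutoff contribute 0; there are fewer than
    eps*n/2 of them, which bounds the bias introduced by the cutoff."""
    rng = rng or random.Random()
    n = len(adj_lists)
    cap = max(1, int(2 / eps))
    total = 0.0
    for _ in range(num_samples):
        u = rng.randrange(n)
        size = truncated_component_size(adj_lists, u, cap)
        if size is not None:
            total += 1.0 / size
    return n * total / num_samples
```

The number of samples needed for a given confidence is left as a parameter here rather than derived.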
Distributed Approaches to Triangulation and Embedding
 In Proceedings 16th ACM-SIAM Symposium on Discrete Algorithms (SODA)
2005
Cited by 31 (6 self)
A number of recent papers in the networking community study the distance matrix defined by the node-to-node latencies in the Internet and, in particular, provide a number of quite successful distributed approaches that embed this distance into a low-dimensional Euclidean space. In such algorithms it is feasible to measure distances among only a linear or near-linear number of node pairs; the rest of the distances are simply not available. Moreover, for applications it is desirable to spread the load evenly among the participating nodes. Indeed, several recent studies use this 'fully distributed' approach and achieve, empirically, a low distortion for all but a small fraction of node pairs. This is concurrent with the large body of theoretical work on metric embeddings, but there is a fundamental distinction: in the theoretical approaches to metric embeddings, full and centralized access to the distance matrix is assumed and heavily used. In this paper we present the first fully distributed embedding algorithm with provable distortion guarantees for doubling metrics (which have been proposed as a reasonable abstraction of Internet latencies), thus providing some insight into the empirical success of the recent Vivaldi algorithm [7]. The main ingredient of our embedding algorithm is an improved fully distributed algorithm for a more basic problem of triangulation, where the triangle inequality is used to infer the distances that have not been measured; this problem has received considerable attention in the networking community, and has also been studied theoretically in [19]. We use our techniques to extend ε-relaxed embeddings and triangulations to infinite metrics and arbitrary measures, and to improve on the approximate distance labeling scheme of Talwar [36].
Sublinear time algorithms
 SIGACT News
2003
Cited by 24 (3 self)
Sublinear time algorithms represent a new paradigm in computing, where an algorithm must give some sort of an answer after inspecting only a very small portion of the input. We discuss the sorts of answers that one might be able to achieve in this new setting.
1 Introduction
The goal of algorithmic research is to design efficient algorithms, where efficiency is typically measured as a function of the length of the input. For instance, the elementary school algorithm for multiplying two n-digit integers takes roughly n^2 steps, while more sophisticated algorithms have been devised which run in less than n log^2 n steps. It is still not known whether a linear time algorithm is achievable for integer multiplication. Obviously any algorithm for this task, as for any other nontrivial task, would need to take at least linear time in n, since this is what it would take to read the entire input and write the output. Thus, showing the existence of a linear time algorithm for a problem was traditionally considered to be the gold standard of achievement. Nevertheless, due to the recent tremendous increase in computational power that is inundating us with a multitude of data, we are now encountering a paradigm shift from traditional computational models. The scale of these data sets, coupled with the typical situation in which there is very little time to perform our computations, raises the issue of whether there is time to consider any more than a minuscule fraction of the data in our computations. Analogous to the reasoning that we used for multiplication, for most natural problems, an algorithm which runs in sublinear time must necessarily use randomization and must give an answer which is in some sense imprecise. Nevertheless, there are many situations in which a fast approximate solution is more useful than a slower exact solution.
Quantum testers for hidden group properties
 FUNDAMENTA INFORMATICAE 1–16
2008
Cited by 9 (1 self)
We construct efficient or query-efficient quantum property testers for two existential group properties which have exponential query complexity both for their decision problem in the quantum model and for their testing problem in the classical model of computing. These are periodicity in groups and the common coset range property of two functions having identical ranges within each coset of some normal subgroup. Our periodicity tester is efficient in Abelian groups and generalizes, in several aspects, previous periodicity testers. This is achieved by introducing a technique refining the majority correction process widely used for proving robustness of algebraic properties. The periodicity tester in non-Abelian groups and the common coset range tester are query efficient.
Testing st-connectivity
 11th International Workshop on Randomization and Computation (RANDOM 2007)
2007
Cited by 6 (5 self)
We continue the study, started in [9], of property testing of graphs in the orientation model. A major question which was left open in [9] is whether the property of st-connectivity can be tested with a constant number of queries. Here we answer this question in the affirmative. To this end we construct a nontrivial reduction of the st-connectivity problem to the problem of testing languages that are decidable by branching programs, which was solved in [11]. The reduction combines combinatorial arguments with a concentration-type lemma that is proven for this purpose. Unlike many other property testing results, here the resulting testing algorithm is itself highly nontrivial, and not only its analysis.
Distribution-Free Testing Lower Bounds for Basic Boolean Functions
Cited by 5 (1 self)
In the distribution-free property testing model, the distance between functions is measured with respect to an arbitrary and unknown probability distribution D over the input domain. We consider distribution-free testing of several basic Boolean function classes over {0,1}^n, namely monotone conjunctions, general conjunctions, decision lists, and linear threshold functions. We prove that for each of these function classes, Ω((n / log n)^{1/5}) oracle calls are required for any distribution-free testing algorithm. Since each of these function classes is known to be distribution-free properly learnable (and hence testable) using Θ(n) oracle calls, our lower bounds are within a polynomial factor of the best possible.
1 Introduction
The field of property testing deals with algorithms that decide whether an input object has a certain property or is far from having the property after reading only a small fraction of the object. Property testing was introduced in [21] and has evolved into a rich field of study (see [3, 7, 10, 19, 20] for some surveys). A standard approach in property testing is to view the input to the testing algorithm as a function over some finite domain; the testing algorithm is required to distinguish functions that have a certain property P from functions that are ε-far from having property P. In the most commonly considered property testing scenario, a function
Detecting and Exploiting Near-Sortedness for Efficient Relational Query Evaluation
Cited by 2 (0 self)
Many relational operations are best performed when the relations are stored sorted over the relevant attributes (e.g. the common attributes in a natural join operation). However, relations are generally not stored sorted because it is expensive to maintain them this way (and impossible whenever there is more than one relevant sort key). Still, relations often turn out to be nearly sorted, where most tuples are close to their place in the order. This state can result from "leftover sortedness", where originally sorted relations were updated, or were combined into interim results when evaluating a complex query. It can also result from weak correlations between attribute values. Currently, nearly sorted relations are treated the same as unsorted relations, and when relational operations are evaluated for them, a generic algorithm is used. Yet, many operations can be computed more efficiently by an algorithm that exploits this near-ordering. However, to consistently benefit from using such algorithms the system should also refrain from using the wrong algorithm for relations which happen not to be sorted at all. Thus, an efficient test is required, i.e., a very fast approximation algorithm for establishing whether a given relation is sufficiently nearly sorted. In this paper, we provide the theoretical foundations for improving query evaluation over possibly nearly sorted relations. First we formally define what it means for a relation to be nearly sorted, and show how operations over such relations, such as natural join, set operations and sorting, can be executed significantly more efficiently using an algorithm that we provide. If a relation is sufficiently nearly sorted, then it can be sorted using two sequential reads of the relation, and writing no intermediate data to disk. We then construct efficient probabilistic tests for approximating the degree of the near-sortedness of a relation without having to read an entire file. The role of our algorithms in a database manage