Results 1 - 10 of 44
Streaming and sublinear approximation of entropy and information distances
 In ACM-SIAM Symposium on Discrete Algorithms
, 2006
Cited by 54 (12 self)
In most algorithmic applications which compare two distributions, information-theoretic distances are more natural than standard ℓp norms. In this paper we design streaming and sublinear-time property testing algorithms for entropy and various information-theoretic distances. Batu et al. posed the problem of property testing with respect to the Jensen-Shannon distance. We present optimal algorithms for estimating bounded, symmetric f-divergences (including the Jensen-Shannon divergence and the Hellinger distance) between distributions in various property testing frameworks. Along the way, we close a (log n)/H gap between the upper and lower bounds for estimating entropy H, yielding an optimal algorithm over all values of the entropy. In a data stream setting (sublinear space), we give the first algorithm for estimating the entropy of a distribution. Our algorithm runs in polylogarithmic space and yields an asymptotic constant-factor approximation scheme. An integral part of the algorithm is an interesting use of an F0 (the number of distinct elements in a set) estimation algorithm; we also provide other results along the space/time/approximation tradeoff curve. Our results have interesting structural implications that connect sublinear-time and space-constrained algorithms. The mediating model is the random-order streaming model, which assumes the input is a random permutation of a multiset and was first considered by Munro and Paterson in 1980. We show that any property testing algorithm in the combined oracle model for calculating a permutation-invariant function can be simulated in the random-order model in a single pass. This addresses a question raised by Feigenbaum et al. regarding the relationship between property testing and stream algorithms. Further, we give a polylog-space PTAS for estimating the entropy of a one-pass random-order stream. This bound cannot be achieved in the combined oracle (generalized property testing) model.
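For reference, the quantities named in this abstract can be written down directly for explicitly given distributions. The sketch below is not the paper's streaming or sublinear algorithm: it computes exact values from full probability vectors. Base-2 logarithms, the Hellinger normalization that keeps the distance in [0, 1], and all function names are assumptions made for illustration.

```python
import math


def entropy(p):
    """Shannon entropy, in bits, of a probability vector p (zero entries are skipped)."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def jensen_shannon(p, q):
    """Jensen-Shannon divergence in bits: H(m) - (H(p) + H(q)) / 2 with m = (p + q) / 2."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return entropy(m) - (entropy(p) + entropy(q)) / 2


def hellinger(p, q):
    """Hellinger distance, normalized (by assumption) so that it lies in [0, 1]."""
    total = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(total / 2)
```

Both distances are symmetric and bounded: they are 0 for identical distributions and reach their maximum when the two supports are disjoint.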
A lower bound for testing 3-colorability in bounded-degree graphs
 In Proc. 43rd IEEE Symp. on Foundations of Comp. Science
, 2002
Cited by 30 (0 self)
We consider the problem of testing 3-colorability in the bounded-degree model. We show that, for small enough ε, every tester for 3-colorability must have query complexity Ω(n). This is the first linear lower bound for testing a natural graph property in the bounded-degree model. An Ω(√n) lower bound was previously known. For one-sided error testers, we also show an Ω(n) lower bound for testers that distinguish 3-colorable graphs from graphs that are (1/3 − α)-far from 3-colorable, for arbitrarily small α. In contrast, a polynomial-time algorithm by Frieze and Jerrum distinguishes 3-colorable graphs from graphs that are 1/5-far from 3-colorable. As a byproduct of our techniques, we obtain tight unconditional lower bounds on the approximation ratios achievable by sublinear-time algorithms for Max E3SAT, Max E3LIN2 and other problems.
Algorithmic and Analysis Techniques in Property Testing
Cited by 27 (4 self)
Property testing algorithms are “ultra”-efficient algorithms that decide whether a given object (e.g., a graph) has a certain property (e.g., bipartiteness), or is significantly different from any object that has the property. To this end, property testing algorithms are given the ability to perform (local) queries to the input, though the decisions they need to make usually concern properties of a global nature. In the last two decades, property testing algorithms have been designed for many types of objects and properties, among them graph properties, algebraic properties, geometric properties, and more. In this article we survey results in property testing, where our emphasis is on common analysis and algorithmic techniques. Among the techniques surveyed are the following:
• The self-correcting approach, which was mainly applied in the study of property testing of algebraic properties;
• The enforce-and-test approach, which was applied quite extensively in the analysis of algorithms for testing graph properties (in the dense-graphs model), as well as in other contexts;
Property Testing: A Learning Theory Perspective
Cited by 25 (5 self)
Property testing deals with tasks where the goal is to distinguish between the case that an object (e.g., function or graph) has a prespecified property (e.g., the function is linear or the graph is bipartite) and the case that it differs significantly from any such object. The task should be performed by observing only a very small part of the object, in particular by querying the object, and the algorithm is allowed a small failure probability. One view of property testing is as a relaxation of learning the object (obtaining an approximate representation of the object). Thus property testing algorithms can serve as a preliminary step to learning. That is, they can be applied in order to select, very efficiently, what hypothesis class to use for learning. This survey takes the learning-theory point of view and focuses on results for testing properties of functions that are of interest to the learning theory community. In particular, we cover results for testing algebraic properties of functions such as linearity, testing properties defined by concise representations, such as having a small DNF representation, and more.
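To make "testing linearity by querying only a small part of the object" concrete, here is a minimal sketch of the classical three-query linearity test of Blum, Luby and Rubinfeld for functions from n-bit strings to a single bit. It is an illustration, not the survey's own presentation. Encoding inputs as Python integers, the names `blr_linearity_test` and `parity_on`, and the default of 100 trials are all assumptions.

```python
import random


def blr_linearity_test(f, n, trials=100, rng=None):
    """Accept (return True) if f passes `trials` random checks of f(x) XOR f(y) == f(x XOR y).

    f maps an n-bit integer to 0 or 1. A linear function (a parity of a fixed
    subset of bits) passes every check; a function far from linear is likely
    to fail at least one.
    """
    rng = rng or random.Random()
    for _ in range(trials):
        x = rng.getrandbits(n)
        y = rng.getrandbits(n)
        if f(x) ^ f(y) != f(x ^ y):
            return False  # found a witness that f is not linear
    return True


def parity_on(mask):
    """Hypothetical helper: the linear function that XORs the bits of x selected by mask."""
    return lambda x: bin(x & mask).count("1") % 2
```

The test has one-sided error: a linear function is never rejected, and the number of queries (three per trial) does not depend on the size of the domain.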
Sublinear time algorithms
 SIGACT News
, 2003
Cited by 24 (3 self)
Sublinear time algorithms represent a new paradigm in computing, where an algorithm must give some sort of an answer after inspecting only a very small portion of the input. We discuss the sorts of answers that one might be able to achieve in this new setting.
1 Introduction
The goal of algorithmic research is to design efficient algorithms, where efficiency is typically measured as a function of the length of the input. For instance, the elementary school algorithm for multiplying two n-digit integers takes roughly n^2 steps, while more sophisticated algorithms have been devised which run in less than n log^2 n steps. It is still not known whether a linear time algorithm is achievable for integer multiplication. Obviously any algorithm for this task, as for any other nontrivial task, would need to take at least linear time in n, since this is what it would take to read the entire input and write the output. Thus, showing the existence of a linear time algorithm for a problem was traditionally considered to be the gold standard of achievement.
Nevertheless, due to the recent tremendous increase in computational power that is inundating us with a multitude of data, we are now encountering a paradigm shift from traditional computational models. The scale of these data sets, coupled with the typical situation in which there is very little time to perform our computations, raises the issue of whether there is time to consider any more than a minuscule fraction of the data in our computations. Analogous to the reasoning that we used for multiplication, for most natural problems, an algorithm which runs in sublinear time must necessarily use randomization and must give an answer which is in some sense imprecise. Nevertheless, there are many situations in which a fast approximate solution is more useful than a slower exact solution.
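A small example of the kind of randomized, approximate answer described here: estimating the fraction of items in a large list that satisfy some predicate by inspecting only a random sample. This sketch is not taken from the article. The function name, the parameters `eps` and `delta`, and the choice of a Hoeffding-style sample size are assumptions made for illustration.

```python
import math
import random


def estimate_fraction(data, predicate, eps=0.05, delta=0.01, rng=None):
    """Estimate the fraction of items in `data` satisfying `predicate`.

    Samples m = ceil(ln(2/delta) / (2 * eps^2)) positions uniformly at random
    with replacement. By Hoeffding's inequality the estimate is within eps of
    the true fraction with probability at least 1 - delta, and m does not
    depend on len(data).
    """
    rng = rng or random.Random()
    m = math.ceil(math.log(2 / delta) / (2 * eps ** 2))
    hits = sum(1 for _ in range(m) if predicate(data[rng.randrange(len(data))]))
    return hits / m
```

With the default parameters the estimator reads 1060 positions whether the input has a thousand items or a billion, which is the trade-off the passage describes: a fast answer that is slightly imprecise and correct only with high probability.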
Estimating the weight of metric minimum spanning trees in sublinear-time
 in Proceedings of the 36th Annual ACM Symposium on Theory of Computing (STOC
Cited by 18 (5 self)
In this paper we present a sublinear-time (1 + ε)-approximation randomized algorithm to estimate the weight of the minimum spanning tree of an n-point metric space. The running time of the algorithm is Õ(n/ε^O(1)). Since the full description of an n-point metric space is of size Θ(n^2), the complexity of our algorithm is sublinear with respect to the input size. Our algorithm is almost optimal as it is not possible to approximate in o(n) time the weight of the minimum spanning tree to within any factor. Furthermore, it has been previously shown that no o(n^2) algorithm exists that returns a spanning tree whose weight is within a constant times the optimum.
Abstract combinatorial programs and efficient property testers
, 2005
Cited by 15 (6 self)
Property testing is a relaxation of classical decision problems which aims at distinguishing between functions having a predetermined property and functions being far from any function having the property. In this paper we present a novel framework for analyzing property testing algorithms. Our framework is based on a connection between property testing and a new class of problems which we call abstract combinatorial programs. We show that if the problem of testing a property can be reduced to an abstract combinatorial program of small dimension, then the property has an efficient tester. We apply our framework to a variety of problems. We present efficient property testing algorithms for geometric clustering problems, for the reversal distance problem, and for graph and hypergraph coloring problems. We also prove that, informally, any hereditary graph property can be efficiently tested if and only if it can be reduced to an abstract combinatorial program of small size. Our framework allows us to analyze all our testers in a unified way, and the obtained complexity bounds either match or improve the previously known bounds. Furthermore, even if the asymptotic complexity of the testers is not improved, the obtained proofs are significantly simpler than the previous ones. We believe that our framework will help to understand the structure of efficiently testable properties.
Sublinear-Time Approximation of Euclidean Minimum Spanning Tree
, 2003
Cited by 13 (6 self)
We consider the problem of estimating the weight of a Euclidean minimum spanning tree for a set of n points in R^d. We focus on the situation when the input point set is supported by certain basic (and commonly used) geometric data structures that can provide efficient access to the input in a structured way. Our main result is that if we assume the access to the input is supported by a minimal bounding cube of the input, by orthogonal range queries, and by cone approximate nearest neighbors queries, then it is possible to estimate the weight of a Euclidean minimum spanning tree of P to within 1 + ε using only Õ(√n poly(1/ε)) queries for constant d.
Distance approximation in bounded-degree and general sparse graphs
 In Proceedings of the Tenth International Workshop on Randomization and Computation (RANDOM
, 2006
Cited by 12 (4 self)
We address the problem of approximating the distance of bounded-degree and general sparse graphs from having some predetermined graph property P. Namely, we are interested in sublinear algorithms for estimating the fraction of edges that should be added to / removed from a graph so that it obtains P. This fraction is taken with respect to a given upper bound m on the number of edges. In particular, for graphs with degree bound d over n vertices, m = dn. To perform such an approximation the algorithm may ask for the degree of any vertex of its choice, and may ask for the neighbors of any vertex. The problem of estimating the distance to having a property was first explicitly addressed by Parnas et al. (ECCC 2004). In the context of graphs this problem was studied by Fischer and Newman (FOCS 2005) in the dense-graphs model. In this model the fraction of edge modifications is taken with respect to n^2, and the algorithm may ask for the existence of an edge between any pair of vertices of its choice. Fischer and Newman showed that every graph property that has a testing algorithm in this model with query complexity that is independent of the size of the graph also has a distance-approximation algorithm with query complexity that is independent of the size of the graph. In this work we focus on bounded-degree and general sparse graphs, and give algorithms for all properties that were shown to have efficient testing algorithms by Goldreich and Ron (Algorithmica, 2002). Specifically, these properties are k-edge connectivity, subgraph-freeness (for constant-size subgraphs), being a Eulerian graph, and cycle-freeness. A variant of our subgraph-freeness algorithm approximates the size of a minimum vertex cover of a graph in sublinear time. This approximation improves on a recent result of Parnas and Ron (ECCC 2005).
Towards Privacy for Social Networks: A Zero-Knowledge Based Definition of Privacy
Cited by 11 (1 self)
We put forward a zero-knowledge based definition of privacy. Our notion is strictly stronger than the notion of differential privacy and is particularly attractive when modeling privacy in social networks. We furthermore demonstrate that it can be meaningfully achieved for tasks such as computing averages, fractions, histograms, and a variety of graph parameters and properties, such as average degree and distance to connectivity. Our results are obtained by establishing a connection between zero-knowledge privacy and sample complexity, and by leveraging recent sublinear-time algorithms.