Results 1-10 of 189
Correlation Clustering
Machine Learning, 2002
Abstract

Cited by 331 (4 self)
We consider the following clustering problem: we have a complete graph on $n$ vertices (items), where each edge $(u, v)$ is labeled either $+$ or $-$ depending on whether $u$ and $v$ have been deemed to be similar or different. The goal is to produce a partition of the vertices (a clustering) that agrees as much as possible with the edge labels. That is, we want a clustering that maximizes the number of $+$ edges within clusters, plus the number of $-$ edges between clusters (equivalently, minimizes the number of disagreements: the number of $-$ edges inside clusters plus the number of $+$ edges between clusters). This formulation is motivated from a document clustering problem in which one has a pairwise similarity function $f$ learned from past data, and the goal is to partition the current set of documents in a way that correlates with $f$ as much as possible; it can also be viewed as a kind of "agnostic learning" problem. An interesting
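The disagreement objective described in this abstract can be illustrated with a minimal Python sketch. The function name `disagreements`, the dictionary-based input format, and the four-item toy instance are assumptions made for illustration only; they do not come from the paper, and the sketch only evaluates a given clustering rather than finding a good one.

```python
from itertools import combinations


def disagreements(labels, clustering):
    """Count edges whose label disagrees with the clustering.

    labels: dict mapping frozenset({u, v}) -> '+' (similar) or '-' (different),
            with one entry for every pair of vertices (complete graph).
    clustering: dict mapping each vertex -> cluster id.
    """
    count = 0
    for u, v in combinations(sorted(clustering), 2):
        same_cluster = clustering[u] == clustering[v]
        label = labels[frozenset((u, v))]
        # A '-' edge inside a cluster or a '+' edge between clusters disagrees.
        if (label == "-" and same_cluster) or (label == "+" and not same_cluster):
            count += 1
    return count


# Hypothetical toy instance on four items.
labels = {
    frozenset(("a", "b")): "+",
    frozenset(("a", "c")): "-",
    frozenset(("a", "d")): "-",
    frozenset(("b", "c")): "+",
    frozenset(("b", "d")): "-",
    frozenset(("c", "d")): "+",
}
two_clusters = {"a": 0, "b": 0, "c": 1, "d": 1}
one_cluster = {"a": 0, "b": 0, "c": 0, "d": 0}

print(disagreements(labels, two_clusters))
print(disagreements(labels, one_cluster))
```

On this toy instance the two-cluster partition disagrees only with the `+` edge between `b` and `c`, while putting everything in one cluster disagrees with all three `-` edges.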
The art of uninformed decisions: A primer to property testing
Science, 2001
Abstract

Cited by 159 (25 self)
Property testing is a new field in computational theory that deals with the information that can be deduced from the input when the number of allowable queries (reads from the input) is significantly smaller than its size.
A characterization of the (natural) graph properties testable with one-sided error
Proc. of FOCS 2005, 2005
Abstract

Cited by 112 (19 self)
The problem of characterizing all the testable graph properties is considered by many to be the most important open problem in the area of property-testing. Our main result in this paper is a solution of an important special case of this general problem. Call a property tester oblivious if its decisions are independent of the size of the input graph. We show that a graph property P has an oblivious one-sided error tester if and only if P is (almost) hereditary. We stress that any “natural” property that can be tested (either with one-sided or with two-sided error) can be tested by an oblivious tester. In particular, all the testers studied thus far in the literature were oblivious. Our main result can thus be considered as a precise characterization of the “natural” graph properties which are testable with one-sided error. One of the main technical contributions of this paper is in showing that any hereditary graph property can be tested with one-sided error. This general result contains as a special case all the previous results about testing graph properties with one-sided error. These include the results of [20] and [5] about testing k-colorability, the characterization of [21] of the graph-partitioning problems that are testable with one-sided error, the induced vertex colorability properties of [3], the induced edge colorability properties of [14], a transformation from two-sided to one-sided error testing [21], as well as a recent result about testing monotone graph properties [10]. More importantly, as a special case of our main result, we infer that some of the most well studied graph properties, both in graph theory and computer science, are testable with one-sided error. Some of these properties are the well known graph properties of being Perfect, Chordal, Interval, Comparability, Permutation and more. None of these properties was previously known to be testable.
Convergent Sequences of Dense Graphs I: Subgraph Frequencies, Metric Properties and Testing
2006
Abstract

Cited by 107 (5 self)
We consider sequences of graphs $(G_n)$ and define various notions of convergence related to these sequences: “left convergence” defined in terms of the densities of homomorphisms from small graphs into $G_n$; “right convergence” defined in terms of the densities of homomorphisms from $G_n$ into small graphs; and convergence in a suitably defined metric. In Part I of this series, we show that left convergence is equivalent to convergence in metric, both for simple graphs $G_n$, and for graphs $G_n$ with nodeweights and edgeweights. One of the main steps here is the introduction of a cut-distance comparing graphs, not necessarily of the same size. We also show how these notions of convergence provide natural
Testing that distributions are close
In IEEE Symposium on Foundations of Computer Science, 2000
Abstract

Cited by 101 (18 self)
Given two distributions over an $n$-element set, we wish to check whether these distributions are statistically close by only sampling. We give a sublinear algorithm which uses $O(n^{2/3} \epsilon^{-4} \log n)$ independent samples from each distribution, runs in time linear in the sample size, makes no assumptions about the structure of the distributions, and distinguishes the cases when the distance between the distributions is small (less than $\max\left(\frac{\epsilon^2}{32\sqrt[3]{n}}, \frac{\epsilon}{4\sqrt{n}}\right)$) or large (more than $\epsilon$) in $L_1$-distance. We also give an $\Omega(n^{2/3} \epsilon^{-2/3})$ lower bound. Our algorithm has applications to the problem of checking whether a given Markov process is rapidly mixing. We develop sublinear algorithms for this problem as well.
Regular Languages are Testable with a Constant Number of Queries
SIAM Journal on Computing, 1999
Abstract

Cited by 90 (19 self)
We continue the study of combinatorial property testing, initiated by Goldreich, Goldwasser and Ron in [7]. The subject of this paper is testing regular languages. Our main result is as follows. For a regular language $L \subseteq \{0,1\}^*$ and an integer $n$ there exists a randomized algorithm which always accepts a word $w$ of length $n$ if $w \in L$, and rejects it with high probability if $w$ has to be modified in at least $\epsilon n$ positions to create a word in $L$. The algorithm queries $\tilde{O}(1/\epsilon)$ bits of $w$. This query complexity is shown to be optimal up to a factor polylogarithmic in $1/\epsilon$. We also discuss testability of more complex languages and show, in particular, that the query complexity required for testing context-free languages cannot be bounded by any function of $\epsilon$. The problem of testing regular languages can be viewed as a part of a very general approach, seeking to probe testability of properties defined by logical means.
A combinatorial characterization of the testable graph properties: it’s all about regularity
Proc. of STOC 2006, 2006
Abstract

Cited by 87 (15 self)
A common thread in all the recent results concerning testing dense graphs is the use of Szemerédi’s regularity lemma. In this paper we show that in some sense this is not a coincidence. Our first result is that the property defined by having any given Szemerédi-partition is testable with a constant number of queries. Our second and main result is a purely combinatorial characterization of the graph properties that are testable with a constant number of queries. This characterization (roughly) says that a graph property P can be tested with a constant number of queries if and only if testing P can be reduced to testing the property of satisfying one of finitely many Szemerédi-partitions. This means that in some sense, testing for Szemerédi-partitions is as hard as testing any testable graph property. We thus resolve one of the main open problems in the area of property-testing, which was first raised in the 1996 paper of Goldreich, Goldwasser and Ron [24] that initiated the study of graph property-testing. This characterization also gives an intuitive explanation as to what makes a graph property testable.
Three Theorems regarding Testing Graph Properties
2001
Abstract

Cited by 86 (13 self)
Property testing is a relaxation of decision problems in which it is required to distinguish yes-instances (i.e., objects having a predetermined property) from instances that are far from any yes-instance. We present three theorems regarding testing graph properties in the adjacency matrix representation. More specifically, these theorems relate to the project of characterizing graph properties according to the complexity of testing them (in the adjacency matrix representation). The first theorem is that there exist monotone graph properties in NP for which testing is very hard (i.e., requires examining a constant fraction of the entries in the matrix). The second theorem is that every graph property that can be tested making a number of queries that is independent of the size of the graph can be so tested by uniformly selecting a set of vertices and accepting iff the induced subgraph has some fixed graph property (which is not necessarily the same as the one being tested). The third theorem refers to the framework of graph partition problems, and is a characterization of the subclass of properties that can be tested using a one-sided error tester making a number of queries that is independent of the size of the graph.
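The tester shape described by the second theorem (sample a vertex set uniformly, then accept iff the induced subgraph has some fixed property) might look roughly like the Python sketch below. The names `canonical_test` and `triangle_free`, the adjacency-matrix-as-list-of-lists input, and the sample size are all hypothetical choices for illustration; triangle-freeness is used only as an example of a fixed property on the sample, and no sample-size bound from the paper is implied.

```python
import random
from itertools import combinations


def canonical_test(adj, sample_size, induced_property, rng=None):
    """Sample vertices uniformly and decide based only on the induced subgraph.

    adj: symmetric adjacency matrix as a list of lists of booleans.
    induced_property: callable(vertices, edges) -> bool (assumed helper).
    """
    rng = rng or random.Random()
    n = len(adj)
    sample = rng.sample(range(n), min(sample_size, n))
    # Query only the adjacency-matrix entries between sampled vertices.
    edges = {(u, v) for u, v in combinations(sample, 2) if adj[u][v]}
    return induced_property(sample, edges)


def triangle_free(vertices, edges):
    """Example fixed property: the sampled subgraph contains no triangle."""
    def adjacent(u, v):
        return (u, v) in edges or (v, u) in edges

    return not any(
        adjacent(a, b) and adjacent(b, c) and adjacent(a, c)
        for a, b, c in combinations(vertices, 3)
    )


n = 8
complete = [[i != j for j in range(n)] for i in range(n)]
empty = [[False] * n for _ in range(n)]

print(canonical_test(empty, 3, triangle_free, random.Random(1)))
print(canonical_test(complete, 3, triangle_free, random.Random(1)))
```

The empty graph is always accepted, and any three sampled vertices of the complete graph form a triangle, so that call rejects regardless of the seed.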
Testing Monotonicity
1999
Abstract

Cited by 79 (16 self)
We present a (randomized) test for monotonicity of Boolean functions. Namely, given the ability to query an unknown function $f : \{0,1\}^n \to \{0,1\}$ at arguments of its choice, the test always accepts a monotone $f$, and rejects $f$ with high probability if it is $\epsilon$-far from being monotone (i.e., every monotone function differs from $f$ on more than an $\epsilon$ fraction of the domain).
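One natural query-based test with the one-sided behavior described here compares the function on random pairs of inputs that differ in a single coordinate. The sketch below is an illustration under assumptions: the name `edge_test_monotone`, the tuple-of-bits input format, and the `trials` parameter are hypothetical, and the number of trials is not the query bound from the paper.

```python
import random


def edge_test_monotone(f, n, trials, rng=None):
    """Reject only if a violated pair is found, so monotone f is always accepted.

    f: callable taking a tuple of n bits and returning 0 or 1.
    trials: number of random pairs to check (hypothetical parameter).
    """
    rng = rng or random.Random()
    for _ in range(trials):
        x = [rng.randint(0, 1) for _ in range(n)]
        i = rng.randrange(n)
        # Two inputs that differ only in coordinate i.
        lower = tuple(x[:i] + [0] + x[i + 1:])
        upper = tuple(x[:i] + [1] + x[i + 1:])
        if f(lower) > f(upper):
            return False  # witness that f is not monotone
    return True


def logical_or(x):
    return int(any(x))


def anti_dictator(x):
    return 1 - x[0]


print(edge_test_monotone(logical_or, 4, 200, random.Random(0)))
print(edge_test_monotone(anti_dictator, 4, 200, random.Random(0)))
```

Because the test rejects only on an explicit violation, a monotone function such as `logical_or` can never be rejected; for `anti_dictator`, every pair differing in coordinate 0 is a violation, so 200 trials find one with overwhelming probability.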
Property Testing
Handbook of Randomized Computing, Vol. II, 2000
Abstract

Cited by 76 (11 self)
this technical aspect (as in the bounded-degree model the closest graph having the property must have at most dN edges and degree bound d as well).