Results 1-10 of 672
Some optimal inapproximability results
, 2002
Cited by 660 (12 self)
We prove optimal, up to an arbitrary $\epsilon > 0$, inapproximability results for Max-E$k$-Sat for $k \geq 3$, for maximizing the number of satisfied linear equations in an overdetermined system of linear equations modulo a prime $p$, and for Set Splitting. As a consequence of these results we get improved lower bounds for the efficient approximability of many optimization problems studied previously, in particular for Max-E2-Sat, Max-Cut, Max-di-Cut, and Vertex Cover. Warning: Essentially this paper has been published in JACM and is subject to copyright restrictions. In particular it is for personal use only.
Automatic Subspace Clustering of High Dimensional Data
 Data Mining and Knowledge Discovery
, 2005
Cited by 600 (12 self)
Data mining applications place special requirements on clustering algorithms including: the ability to find clusters embedded in subspaces of high dimensional data, scalability, end-user comprehensibility of the results, non-presumption of any canonical data distribution, and insensitivity to the order of input records. We present CLIQUE, a clustering algorithm that satisfies each of these requirements. CLIQUE identifies dense clusters in subspaces of maximum dimensionality. It generates cluster descriptions in the form of DNF expressions that are minimized for ease of comprehension. It produces identical results irrespective of the order in which input records are presented and does not presume any specific mathematical form for data distribution. Through experiments, we show that CLIQUE efficiently finds accurate clusters in large high dimensional datasets.
On the power of unique 2-prover 1-round games
 In Proceedings of the 34th Annual ACM Symposium on Theory of Computing
, 2002
Cited by 237 (19 self)
A 2-prover game is called unique if the answer of one prover uniquely determines the answer of the second prover and vice versa (we implicitly assume games to be one-round games). The value of a 2-prover game is the maximum acceptance probability of the verifier over all the prover strategies. We make the following conjecture regarding the power of unique 2-prover games, which we call the Unique Games Conjecture: For arbitrarily small constants $\zeta, \delta > 0$, there exists a constant $k = k(\zeta, \delta)$ such that it is NP-hard to determine whether a unique 2-prover game with answers from a domain of size $k$ has value at least $1 - \zeta$ or at most $\delta$. We show that a positive resolution of this conjecture would imply the following hardness results:
Selection of Views to Materialize in a Data Warehouse
, 1997
Cited by 216 (5 self)
A data warehouse stores materialized views of data from one or more sources, with the purpose of efficiently implementing decision-support or OLAP queries. One of the most important decisions in designing a data warehouse is the selection of materialized views to be maintained at the warehouse. The goal is to select an appropriate set of views that minimizes total query response time and the cost of maintaining the selected views, given a limited amount of resource, e.g., materialization time, storage space, etc. In this article, we develop a theoretical framework for the general problem of selection of views in a data warehouse. We present competitive polynomial-time heuristics for selection of views to optimize total query response time, for some important special cases of the general data warehouse scenario, viz.: (i) an AND view graph, where each query/view has a unique evaluation, and (ii) an OR view graph, in which any view can be computed from any one of its related views, e.g.,...
Beyond Independent Relevance: Methods and Evaluation Metrics for Subtopic Retrieval
 In Proceedings of SIGIR
, 2003
Cited by 160 (5 self)
We present a non-traditional retrieval problem we call subtopic retrieval. The subtopic retrieval problem is concerned with finding documents that cover many different subtopics of a query topic. This means that the utility of a document in a ranking is dependent on other documents in the ranking, violating the assumption of independent relevance made by most traditional retrieval methods. Subtopic retrieval poses challenges for evaluating performance, as well as for developing effective algorithms. We propose a framework for evaluating subtopic retrieval which generalizes the traditional precision and recall metrics by accounting for intrinsic topic difficulty as well as redundancy in documents. We propose and systematically evaluate several methods for performing subtopic retrieval using statistical language models and a maximal marginal relevance (MMR) ranking strategy. A mixture model combined with query likelihood relevance ranking is shown to modestly outperform a baseline relevance ranking on a data set used in the TREC interactive track.
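The MMR ranking strategy mentioned in this abstract can be illustrated with a minimal sketch of a greedy re-ranking loop. Everything named here (`mmr_rank`, `jaccard`, the trade-off parameter `lam`, and the toy documents) is a hypothetical illustration rather than the paper's implementation; in particular, the Jaccard similarity on term sets stands in for the statistical language models the abstract refers to.

```python
from typing import Callable, Hashable, List, Optional, Sequence, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two term sets (assumed stand-in for a real similarity)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def mmr_rank(
    candidates: Sequence[Hashable],
    relevance: Callable[[Hashable], float],
    similarity: Callable[[Hashable, Hashable], float],
    lam: float = 0.7,
    k: Optional[int] = None,
) -> List[Hashable]:
    """Greedy maximal-marginal-relevance ordering.

    At each step, pick the candidate maximizing
    lam * relevance(d) - (1 - lam) * max similarity to already selected items.
    """
    remaining = list(candidates)
    selected: List[Hashable] = []
    limit = len(remaining) if k is None else min(k, len(remaining))

    while len(selected) < limit:
        def score(d: Hashable) -> float:
            # Redundancy is 0 while nothing has been selected yet.
            redundancy = max((similarity(d, s) for s in selected), default=0.0)
            return lam * relevance(d) - (1.0 - lam) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (hypothetical): "b" duplicates "a"; "c" covers a different subtopic.
docs = {"a": {"x", "y"}, "b": {"x", "y"}, "c": {"z"}}
rel = {"a": 1.0, "b": 0.9, "c": 0.5}

ranking = mmr_rank(
    list(docs),
    relevance=lambda d: rel[d],
    similarity=lambda p, q: jaccard(docs[p], docs[q]),
    lam=0.5,
)
print(ranking)
```

With `lam=1.0` the loop reduces to plain relevance ranking, which is the independent-relevance baseline; lowering `lam` penalizes documents that repeat what is already ranked, which is the dependence between documents that subtopic retrieval relies on.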
A polylogarithmic approximation algorithm for the group Steiner tree problem
 Journal of Algorithms
, 2000
Cited by 134 (9 self)
The group Steiner tree problem is a generalization of the Steiner tree problem where we are given several subsets (groups) of vertices in a weighted graph, and the goal is to find a minimum-weight connected subgraph containing at least one vertex from each group. The problem was introduced by Reich and Widmayer and finds applications in VLSI design. The group Steiner tree problem generalizes the set covering problem, and is therefore at least as hard. We give a randomized $O(\log^3 n \log k)$-approximation algorithm for the group Steiner tree problem on an $n$-node graph, where $k$ is the number of groups. The best previous performance guarantee was $(1 + \frac{\ln k}{2})\sqrt{k}$ (Bateman, Helvig, Robins and Zelikovsky).
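To make the problem statement in this abstract concrete, the following is a minimal sketch of a feasibility check for a candidate solution: it tests whether a set of weighted edges forms a connected subgraph that touches every group, and reports its weight. It is not the randomized approximation algorithm from the paper. The function names and the edge-list representation are assumptions chosen for illustration, and single-vertex solutions with no edges are not handled.

```python
from typing import Dict, Hashable, Iterable, List, Set, Tuple

Edge = Tuple[Hashable, Hashable, float]


def is_group_steiner_solution(edges: Iterable[Edge], groups: List[Set[Hashable]]) -> bool:
    """Return True if the edges form a connected subgraph hitting every group."""
    # Build an undirected adjacency map from the chosen edges.
    adj: Dict[Hashable, Set[Hashable]] = {}
    for u, v, _ in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    if not adj:
        return False  # assumption: an empty edge set is not treated as a solution

    # Depth-first search from an arbitrary vertex to test connectivity.
    start = next(iter(adj))
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    if len(seen) != len(adj):
        return False

    # Every group must contribute at least one vertex to the subgraph.
    return all(seen & set(group) for group in groups)


def total_weight(edges: Iterable[Edge]) -> float:
    """Sum of edge weights, the objective the problem asks to minimize."""
    return sum(w for _, _, w in edges)
```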
Topology Control and Routing in Ad hoc Networks: A Survey
 SIGACT News
, 2002
Cited by 129 (0 self)
In this article, we review some of the characteristic features of ad hoc networks, formulate problems and survey research work done in the area. We focus on two basic problem domains: topology control, the problem of computing and maintaining a connected topology among the network nodes, and routing. This article is not intended to be a comprehensive survey on ad hoc networking. The choice of the problems discussed in this article is somewhat biased by the research interests of the author.
THE PRIMAL-DUAL METHOD FOR APPROXIMATION ALGORITHMS AND ITS APPLICATION TO NETWORK DESIGN PROBLEMS
Cited by 124 (7 self)
The primal-dual method is a standard tool in the design of algorithms for combinatorial optimization problems. This chapter shows how the primal-dual method can be modified to provide good approximation algorithms for a wide variety of NP-hard problems. We concentrate on results from recent research applying the primal-dual method to problems in network design.
Constant-Time Distributed Dominating Set Approximation
 In Proc. of the 22nd ACM Symposium on the Principles of Distributed Computing (PODC)
, 2003
Cited by 122 (25 self)
Finding a small dominating set is one of the most fundamental problems of traditional graph theory. In this paper, we present a new fully distributed approximation algorithm based on LP relaxation techniques. For an arbitrary parameter $k$ and maximum degree $\Delta$, our algorithm computes a dominating set of expected size $O(k \Delta^{2/k} \log \Delta \cdot |DS_{OPT}|)$ in $O(k^2)$ rounds where each node has to send $O(k^2 \Delta)$ messages of size $O(\log \Delta)$. This is the first algorithm which achieves a non-trivial approximation ratio in a constant number of rounds.
A new greedy approach for facility location problems
Cited by 121 (9 self)
We present a simple and natural greedy algorithm for the metric uncapacitated facility location problem achieving an approximation guarantee of 1.61, whereas the best previously known was 1.73. Furthermore, we will show that our algorithm has a property which allows us to apply the technique of Lagrangian relaxation. Using this property, we can find better approximation algorithms for many variants of the facility location problem, such as the capacitated facility location problem with soft capacities and a common generalization of the k-median and facility location problems. We will also prove a lower bound on the approximability of the k-median problem.