A Tight Bound on Approximating Arbitrary Metrics by Tree Metrics
In Proceedings of the 35th Annual ACM Symposium on Theory of Computing, 2003
Cited by 216 (3 self)
In this paper, we show that any n-point metric space can be embedded into a distribution over dominating tree metrics such that the expected stretch of any edge is O(log n). This improves upon the result of Bartal, who gave a bound of O(log n log log n). Moreover, our result is existentially tight; there exist metric spaces where any tree embedding must have Ω(log n) distortion. This problem lies at the heart of numerous approximation and online algorithms, including ones for group Steiner tree, metric labeling, buy-at-bulk network design, and metrical task system. Our result improves the performance guarantees for all of these problems.
A survey of web caching schemes for the internet
ACM Computer Communication Review, 1999
Cited by 200 (1 self)
The World Wide Web can be considered a large distributed information system that provides access to shared data objects. As one of the most popular applications currently running on the Internet, the World Wide Web is growing exponentially in size, which results in network congestion and server overloading. Web caching has been recognized as one of the effective schemes to alleviate the service bottleneck and reduce network traffic, thereby minimizing user access latency. In this paper, we first describe the elements of a Web caching system and its desirable properties. Then, we survey the state-of-the-art techniques that have been used in Web caching systems. Finally, we discuss the research frontier
The Online Median Problem
In Proceedings of the 41st Annual IEEE Symposium on Foundations of Computer Science, 2000
Cited by 69 (2 self)
We introduce a natural variant of the (metric uncapacitated) k-median problem that we call the online median problem. Whereas the k-median problem involves optimizing the simultaneous placement of k facilities, the online median problem imposes the following additional constraints: the facilities are placed one at a time; a facility cannot be moved once it is placed, and the total number of facilities to be placed, k, is not known in advance. The objective of an online median algorithm is to minimize the competitive ratio, that is, the worst-case ratio of the cost of an online placement to that of an optimal offline placement. Our main result is a linear-time constant-competitive algorithm for the online median problem. In addition, we present a related, though substantially simpler, linear-time constant-factor approximation algorithm for the (metric uncapacitated) facility location problem. The latter algorithm is similar in spirit to the recent primal-dual-based facility location algorithm of Jain and Vazirani, but our approach is more elementary and yields an improved running time.
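The definition of the competitive ratio above can be made concrete on a single small instance. The following is a minimal sketch, not the paper's algorithm: it takes a fixed placement order, compares every prefix of it with a brute-force optimal k-median solution, and reports the worst ratio. It assumes unit demand at every point and facilities restricted to the input points; the names `median_cost`, `optimal_cost`, and `instance_ratio` are hypothetical. The true competitive ratio is the worst case over all instances, which this does not compute.

```python
from itertools import combinations


def median_cost(points, facilities, dist):
    """Sum over all points of the distance to the nearest open facility."""
    return sum(min(dist(p, f) for f in facilities) for p in points)


def optimal_cost(points, k, dist):
    """Brute-force optimal k-median cost (exponential; small inputs only)."""
    return min(median_cost(points, chosen, dist)
               for chosen in combinations(points, k))


def instance_ratio(points, order, dist):
    """Worst ratio, over every prefix length k, of the online cost to the
    offline optimum on this one instance."""
    worst = 1.0
    for k in range(1, len(order) + 1):
        online = median_cost(points, order[:k], dist)
        offline = optimal_cost(points, k, dist)
        if offline == 0:
            if online > 0:
                return float("inf")
            continue  # both costs are zero, so this k imposes no constraint
        worst = max(worst, online / offline)
    return worst
```

Because facilities cannot be moved once placed, a poor early choice is paid for at every later value of k, which is why the ratio is taken over all prefixes rather than only the final placement.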
Coordinated Placement and Replacement for Large-Scale Distributed Caches
IEEE Transactions on Knowledge and Data Engineering, 1998
Cited by 64 (7 self)
In a large-scale information system such as a digital library or the web, a set of distributed caches can improve their effectiveness by coordinating their data placement decisions. In this paper, we examine the design space for cooperative placement and replacement algorithms. Our main focus is on the placement algorithms, which attempt to solve the following problem: given a set of caches, the network distances between caches, and predictions of the access rates from each cache to a set of objects, determine where to place each object in order to minimize the average access cost. Replacement algorithms also attempt to minimize access cost, but they work by selecting which objects to evict when a cache miss occurs. Using simulation, we examine three practical cooperative placement algorithms including one that is provably close to optimal, and we compare these algorithms to the optimal placement algorithm and several cooperative and non-cooperative replacement algorithms. We draw fiv...
Approximation Algorithms for Data Placement in Arbitrary Networks
In Proceedings of the 12th Annual ACM-SIAM Symposium on Discrete Algorithms, 2001
Cited by 47 (1 self)
We develop approximation algorithms for the problem of placing replicated data in arbitrary networks, where the nodes may both issue requests for data objects and have capacity for storing data objects, so as to minimize the average data-access cost. We introduce the data placement problem to model this problem. We have a set of caches $F$, a set of clients $D$, and a set of data objects $O$. Each cache $i$ can store at most $u_i$ data objects. Each client $j \in D$ has demand $d_j$ for a specific data object $o(j) \in O$ and has to be assigned to a cache that stores that object. Storing an object $o$ in cache $i$ incurs a storage cost of $f_i^o$, and assigning client $j$ to cache $i$ incurs an access cost of $d_j c_{ij}$. The goal is to find a placement of the data objects to caches respecting the capacity constraints, and an assignment of clients
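The cost model in the abstract above can be sketched as a small feasibility-and-cost check. This is an illustrative sketch only: the function name `placement_cost` and its dictionary-based inputs are hypothetical, and it assumes the objective is the sum of all storage costs and all access costs, which the truncated abstract does not state in full.

```python
def placement_cost(placement, assignment, capacity, demand, wanted,
                   storage_cost, access_cost):
    """Total cost of a placement and assignment, or ValueError if infeasible.

    placement:    cache -> set of objects stored there
    assignment:   client -> cache serving that client
    capacity:     cache -> maximum number of objects it can hold
    demand:       client -> demand for its object
    wanted:       client -> the object that client requests
    storage_cost: (cache, object) -> cost of storing the object at the cache
    access_cost:  (client, cache) -> per-unit-demand access cost
    """
    # Capacity constraint: no cache stores more objects than it can hold.
    for cache, objects in placement.items():
        if len(objects) > capacity[cache]:
            raise ValueError(f"cache {cache!r} exceeds its capacity")

    # Storage cost: one term per (cache, object) pair in the placement.
    total = sum(storage_cost[(cache, obj)]
                for cache, objects in placement.items()
                for obj in objects)

    # Access cost: each client must be served by a cache holding its object.
    for client, cache in assignment.items():
        if wanted[client] not in placement.get(cache, set()):
            raise ValueError(f"cache {cache!r} lacks the object for {client!r}")
        total += demand[client] * access_cost[(client, cache)]
    return total
```

A checker like this is useful for comparing candidate placements produced by different heuristics on the same instance; it says nothing about how to find a good placement.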
Choosing Replica Placement Heuristics for Wide-Area Systems
In ICDCS ’04: Proceedings of the 24th International Conference on Distributed Computing Systems (ICDCS’04), 2004
Cited by 36 (0 self)
Data replication is used extensively in wide-area distributed systems to achieve low data-access latency. A large number of heuristics have been proposed to perform replica placement. Practical experience indicates that the choice of heuristic makes a big difference in terms of the cost of required infrastructure (e.g., storage capacity and network bandwidth), depending on system topology, workload and performance goals.
A Framework for Evaluating Replica Placement Algorithms
2002
Cited by 34 (1 self)
This paper introduces a framework for evaluating replica placement algorithms (RPA) for content delivery networks (CDN) as well as RPAs from other fields that might be applicable to current or future CDNs. First, the framework classifies and qualitatively compares RPAs using a generic set of primitives that capture problem definitions and heuristics. Second, it provides estimates for the decision times of RPAs using an analytic model. To achieve accuracy, the model takes into account disk accesses and message sizes, in addition to computational complexity and message numbers that have been considered traditionally. Third, it uses the "goodness" of produced placements to compare RPAs even when they have different problem definitions. Based on these evaluations, we identify open issues and potential areas for future research.
Do We Need Replica Placement Algorithms in Content Delivery Networks
In Proceedings of the International Workshop on Web Content Caching and Distribution (WCW), 2002
Cited by 30 (3 self)
Numerous replica placement algorithms have been proposed in the literature for use in content delivery networks. However, little has been done to compare the various placement algorithms against each other and against caching. This paper debates whether we need replica placement algorithms in content delivery networks or not.
CDN: Content Distribution Network
2003
Cited by 17 (0 self)
The Internet evolves and operates largely without central coordination, the lack of which was and is critically important to its rapid growth and evolution. However, this lack of management in turn makes it very difficult to guarantee proper performance and to deal systematically with performance problems. Meanwhile, the available network bandwidth and server capacity continue to be overwhelmed by skyrocketing Internet utilization and the accelerating growth of bandwidth-intensive content. As a result, Internet service quality as perceived by customers is largely unpredictable and unsatisfactory. A Content Distribution Network (CDN) is an effective approach to improving Internet service quality. A CDN replicates content from its place of origin to replica servers scattered over the Internet and serves each request from a replica server close to where the request originates. In this paper, we first give an overview of CDNs. We then present the critical issues involved in designing and implementing an effective CDN and survey the approaches proposed in the literature to address these problems. An example CDN is described to show how a real commercial CDN operates. After this, we present a scheme that provides fast service location for peer-to-peer systems, a special type of CDN with no infrastructure support. We conclude with a brief projection about CDNs.
Designing overlay multicast networks for streaming
In Proceedings of ACM Symposium on Parallel Algorithms and Architectures, 2003
Cited by 13 (2 self)
In this paper we present a polynomial time approximation algorithm for designing a multicast overlay network. The algorithm finds a solution that satisfies capacity and reliability constraints to within a constant factor of optimal, and cost to within a logarithmic factor. The class of networks that our algorithm applies to includes the one used by Akamai Technologies to deliver live media streams over the Internet. In particular, we analyze networks consisting of three stages of nodes. The nodes in the first stage are the sources where live streams originate. A source forwards each of its streams to one or more nodes in the second stage, which are called reflectors. A reflector can split an incoming stream into multiple identical outgoing streams, which are then sent on to nodes in the third and final stage, which are called the sinks. As the packets in a stream travel from one stage to the next, some of them may be lost. The job of a sink is to combine the packets from multiple instances of the same stream (by reordering packets and discarding duplicates) to form a single instance of the stream with minimal loss. We assume that the loss rate between any pair of nodes in the network is known, and that losses between different pairs are independent, but discuss extensions in which some losses may be correlated.
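The sink's job and the independence assumption in the abstract above can be illustrated with a short sketch. It is not the paper's algorithm: the names `combine_streams` and `combined_loss` are hypothetical, packets are assumed to carry a sequence number, and each copy of a stream is assumed to reach the sink through a distinct reflector, so that the two-hop paths fail independently.

```python
from math import prod


def combine_streams(instances):
    """Merge several copies of one stream into a single ordered stream.

    Each instance is an iterable of (sequence_number, payload) pairs.
    Duplicates are discarded (the first copy seen wins) and the result is
    returned in sequence order.
    """
    merged = {}
    for instance in instances:
        for seq, payload in instance:
            merged.setdefault(seq, payload)
    return [(seq, merged[seq]) for seq in sorted(merged)]


def combined_loss(paths):
    """Probability that a packet reaches the sink on none of the paths.

    Each path is a pair (source_to_reflector_loss, reflector_to_sink_loss).
    A packet survives a path only if it survives both hops, and it is lost
    overall only if every path loses it (losses assumed independent).
    """
    return prod(1.0 - (1.0 - p_sr) * (1.0 - p_rk) for p_sr, p_rk in paths)
```

Under these assumptions, adding a second reflector path multiplies the residual loss by that path's own loss rate, which is why redundancy through several reflectors can reduce loss sharply. The correlated-loss extensions mentioned in the abstract would break the product form used here.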

