Replication for web hosting systems
- ACM Computing Surveys, 2004
Cited by 40 (9 self)
Replication is a well-known technique to improve the accessibility of Web sites. It generally offers reduced client latencies and increases a site’s availability. However, applying replication techniques is not trivial, and various Content Delivery Networks (CDNs) have been created to facilitate replication for digital content providers. The
Do We Need Replica Placement Algorithms in Content Delivery Networks
- In Proceedings of the International Workshop on Web Content Caching and Distribution (WCW), 2002
Cited by 30 (3 self)
Numerous replica placement algorithms have been proposed in the literature for use in content delivery networks. However, little has been done to compare the various placement algorithms against each other and against caching. This paper debates whether we need replica placement algorithms in content delivery networks or not.
Latency-driven replica placement
- Proceedings of the International Symposium on Applications and the Internet, 2005
Cited by 18 (6 self)
This paper presents HotZone, an algorithm to place replicas in a wide-area network such that the client-to-replica latency is minimized. Similar to the previously proposed HotSpot algorithm, HotZone places replicas on nodes that, along with their neighboring nodes, generate the highest load. In contrast to HotSpot, however, HotZone provides nearly optimal results by considering overlapping neighborhoods. HotZone relies on a geometric model of Internet latencies, which effectively reduces the cost of placing K replicas among N potential replica locations from O(N^2) to O(N · max(log N, K)).
Replica Placement in Adaptive Content Distribution Networks
- ACM SAC, 2004
Cited by 16 (0 self)
Adaptive content networking is a promising new approach aimed at scalable delivery of content to a pervasive client population. In adaptive content delivery networks (A-CDNs), content is adapted, replicated, and delivered to clients in a cost-quality-optimized fashion. The integration of content adaptation into CDNs minimizes the interference of adaptation with replication effectiveness.
Continuous replica placement schemes in distributed systems
- In Proceedings of the 19th ACM International Conference on Supercomputing (ACM ICS), 2005
Cited by 11 (2 self)
The Replica Placement Problem (RPP) aims at creating a set of duplicated data objects across the nodes of a distributed system in order to optimize certain criteria. Typically, RPP formulations fall into two categories: static and dynamic. The first assumes that access statistics are estimated in advance and remain static, and, therefore, a one-time replica distribution is sufficient (1RPP). In contrast, dynamic methods change the replicas in the network potentially upon every request. This paper proposes an alternative technique, named Continuous Replica Placement Problem (CRPP), which falls between the two extreme approaches. CRPP can be defined as: Given an already implemented replication scheme and estimated access statistics for the next time period, define a new replication scheme, subject to optimization criteria and constraints. As we show in the problem formulation, CRPP is different in that the existing heuristics in the literature cannot be used either statically or dynamically to solve the problem. In fact, even with the most careful design, their performance will be inferior since CRPP embeds a scheduling problem to facilitate the proposed mechanism. We provide insight on the intricacies of CRPP and propose various heuristics.
Replica Placement and Access Policies in Tree Networks
Cited by 5 (3 self)
In this paper, we discuss and compare several policies to place replicas in tree networks, subject to server capacity and Quality-of-Service (QoS) constraints. The client requests are known beforehand, while the number and location of the servers are to be determined. The standard approach in the literature is to enforce that all requests of a client be served by the closest server in the tree. We introduce and study two new policies. In the first policy, all requests from a given client are still processed by the same server, but this server can be located anywhere on the path from the client to the root. In the second policy, the requests of a given client can be processed by multiple servers. One major contribution of this paper is to assess the impact of these new policies on the total replication cost. Another important goal is to assess the impact of server heterogeneity, both from a theoretical and a practical perspective. In this paper, we establish several new complexity results and provide several efficient polynomial heuristics for NP-complete instances of the problem. These heuristics are compared with one another, and their absolute performance is assessed by comparison with the optimal solution provided by an integer linear program.
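The difference between the standard "closest" policy and the first new policy (one server anywhere on the client-to-root path) can be illustrated with a small feasibility check. This is only a sketch under simplifying assumptions that are not taken from the paper: clients are leaves, every server has the same capacity, QoS constraints are ignored, and the function names and the toy tree are hypothetical.

```python
from typing import Dict, List, Optional, Set

def ancestors(node: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Return the nodes on the path from `node` (exclusive) up to the root."""
    path = []
    current = parent[node]
    while current is not None:
        path.append(current)
        current = parent[current]
    return path

def closest_server(client: str, parent: Dict[str, Optional[str]],
                   replicas: Set[str]) -> Optional[str]:
    """First replica met when walking from the client toward the root."""
    return next((a for a in ancestors(client, parent) if a in replicas), None)

def is_valid_closest(parent: Dict[str, Optional[str]], requests: Dict[str, int],
                     replicas: Set[str], capacity: int) -> bool:
    """Closest policy: every client is forced onto its nearest replica."""
    load = {r: 0 for r in replicas}
    for client, demand in requests.items():
        server = closest_server(client, parent, replicas)
        if server is None:
            return False  # no replica on the path to the root
        load[server] += demand
    return all(v <= capacity for v in load.values())

def is_valid_assignment(parent: Dict[str, Optional[str]], requests: Dict[str, int],
                        assignment: Dict[str, str], capacity: int) -> bool:
    """Relaxed policy: one server per client, anywhere on its path to the root."""
    load: Dict[str, int] = {}
    for client, demand in requests.items():
        server = assignment[client]
        if server not in ancestors(client, parent):
            return False
        load[server] = load.get(server, 0) + demand
    return all(v <= capacity for v in load.values())

# Hypothetical toy tree: root -> n1 -> {c1, c2}, root -> c3
parent = {"root": None, "n1": "root", "c1": "n1", "c2": "n1", "c3": "root"}
requests = {"c1": 4, "c2": 4, "c3": 3}
```

In this toy instance with capacity 7, no replica set is feasible under the closest policy (a replica on `n1` must absorb 8 requests, and without it the root must absorb 11), whereas sending `c1` to `n1` and `c2`, `c3` to `root` is feasible under the relaxed policy.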
Optimizing Network Performance In Replicated Hosting
- In the Tenth International Workshop on Web Caching and Content Distribution (WCW), 2005
Cited by 4 (2 self)
Most important commercial Web sites maintain multiple replicas of their server infrastructure to increase both reliability and performance. In this paper, we study how many replicas should be used and where they should be placed in order to improve client network performance, including both the latency (e.g., round-trip time) between clients and the replicas, and the bandwidth performance between them. This study is based on a large-scale measurement study from an 18-node infrastructure, which reveals for the first time the distribution of today's Internet end-user access bandwidth. For example, we find that 50% of end users have access bandwidth less than 4.2 Mbps. Using a greedy algorithm, we show that the first five replicas dominate latency optimization in our measurement infrastructure, while the first two replicas dominate bandwidth optimization. We also found that geographic diversity does not help as much for bandwidth optimization as it does for latency. To determine the proper trade-off between latency and bandwidth, we use a simplified TCP model to show that, when content size is less than 10 KB, the deployment should focus on optimizing latency, while for content sizes larger than 1 MB, the deployment should focus on optimizing bandwidth.
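A greedy placement of the kind mentioned in this abstract can be sketched as follows. The abstract does not specify the algorithm, so this is a generic greedy heuristic under stated assumptions: each client uses its lowest-latency replica, the objective is mean client latency, and the function name, the `latency` table layout, and the sample values are all hypothetical.

```python
from typing import Dict, List

def greedy_placement(latency: Dict[str, Dict[str, float]], k: int) -> List[str]:
    """Pick up to k sites, each time adding the site that most lowers mean latency.

    latency[site][client] is the round-trip time (ms) from client to site.
    """
    sites = sorted(latency)
    clients = sorted(next(iter(latency.values())))
    chosen: List[str] = []
    # Best latency each client sees with the replicas chosen so far.
    best = {c: float("inf") for c in clients}

    def mean_if_added(site: str) -> float:
        return sum(min(best[c], latency[site][c]) for c in clients) / len(clients)

    for _ in range(min(k, len(sites))):
        candidate = min((s for s in sites if s not in chosen), key=mean_if_added)
        chosen.append(candidate)
        for c in clients:
            best[c] = min(best[c], latency[candidate][c])
    return chosen

# Made-up round-trip times in milliseconds, for illustration only.
latency = {
    "A": {"c1": 10, "c2": 100, "c3": 100},
    "B": {"c1": 100, "c2": 20, "c3": 30},
    "C": {"c1": 60, "c2": 60, "c3": 60},
}
print(greedy_placement(latency, 2))  # prints ['B', 'A']
```

Recording the objective after each added site would show the kind of diminishing return the abstract reports, where the first few replicas account for most of the improvement.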
Impact of QoS on Replica Placement in Tree Networks
- 2006
Cited by 4 (2 self)
This paper discusses and compares several policies to place replicas in tree networks, subject to server capacity and QoS constraints. The client requests are known beforehand, while the number and location of the servers are to be determined. We study three strategies. The first two strategies assign each client to a unique server, while the third allows requests of a client to be processed by multiple servers. The main contribution of this paper is to assess the impact of QoS constraints on the total replication cost. In this paper, we establish the NP-completeness of the problem on homogeneous networks when the requests of a given client can be processed by multiple servers. We provide several efficient polynomial heuristic algorithms for NP-complete instances of the problem. These heuristics are compared to the optimal solution obtained by formulating the problem as an integer linear program.
Document Replication and Distribution in Extensible Geographically Distributed Web Server
- J. of Parallel and Distributed Computing, 2002
Cited by 3 (0 self)
A geographically distributed web server (GDWS) system, consisting of multiple server nodes interconnected by a MAN or a WAN, can achieve better efficiency in handling the ever-increasing web requests than centralized web servers because of the proximity of server nodes to clients. It is also more scalable since the throughput will not be limited by available bandwidth connecting to a central server. The key research issue in the design of GDWS is how to replicate and distribute the documents of a web site among the server nodes. This paper proposes a density-based replication scheme and applies it to our proposed Extensible GDWS (EGDWS) architecture. Its document distribution scheme supports partial replication targeting only hot objects among the documents. To distribute the replicas generated via the density-based replication scheme, we propose four different document distribution algorithms: Greedy-cost, Maximal-density, Greedy-penalty, and Proximity-aware. A proximity-based routing mechanism is designed to incorporate these algorithms for achieving better web server performance in a WAN environment. Simulation results show that our document replication and distribution algorithms achieve better response times and load balancing than existing dynamic schemes. To further reduce users' response time, we propose two document grouping algorithms that can cut down on the request redirection overheads.
Bounded-Latency Content Distribution: Feasibility and Evaluation
- IEEE Transactions on Computers, 2005
Cited by 3 (2 self)
This paper investigates the performance of a content distribution network designed to provide bounded content access latency. Content can be divided into multiple classes with different configurable per-class delay bounds. The network uses a simple distributed algorithm to dynamically select subsets of its proxy servers for different classes such that a global per-class delay bound is achieved on content access. The content distribution algorithm is implemented and tested on PlanetLab [25], a world-wide distributed Internet testbed. Evaluation results demonstrate that, despite Internet delay variability, subsecond delay bounds (of 200-500 ms) can be guaranteed with a very high probability at only a moderate content replication cost. The distribution algorithm achieves a four- to fivefold reduction in the number of response-time violations compared to prior content distribution approaches that attempt to minimize average latency. To the authors' knowledge, this paper presents the first wide-area performance evaluation of an algorithm designed to bound maximum content access latency, as opposed to optimizing an average performance metric. Index Terms: Content distribution networks, distributed systems, performance evaluation.

