GlobeDB: Autonomic data replication for web applications
In Proc. Intl. WWW Conf, 2005
Cited by 40 (8 self)
We present GlobeDB, a system for hosting Web applications that performs autonomic replication of application data. GlobeDB offers data-intensive Web applications the benefits of low access latencies and reduced update traffic. The major distinction in our system compared to existing edge computing infrastructures is that the process of distribution and replication of application data is handled by the system automatically with very little manual administration. We show that significant performance gains can be obtained this way. Performance evaluations with the TPC-W benchmark over an emulated wide-area network show that GlobeDB reduces latencies by a factor of 4 compared to non-replicated systems and reduces update traffic by a factor of 6 compared to fully replicated systems.
Analysis of caching and replication strategies for Web applications
IEEE Internet Computing, 2007
Cited by 18 (3 self)
Replication and caching mechanisms are often employed to enhance the performance of Web applications. In this article, we present a qualitative and quantitative analysis of state-of-the-art replication and caching techniques used to host Web applications. Our analysis shows that the selection of the best mechanism is heavily dependent on the data workload and requires careful analysis of the application characteristics. To this end, we propose a technique that will enable Web practitioners to compare the performance of different caching/replication mechanisms.
GlobeCBC: Content-blind Result Caching for Dynamic Web Applications
2006
Cited by 16 (5 self)
In this paper, we present GlobeCBC, a content-blind query caching middleware for hosting Web applications in an edge computing infrastructure. Unlike existing data caching middleware systems, GlobeCBC stores the query results independently and does not merge different query results. We study the potential performance of this approach using extensive experiments on our prototype implementation and compare it with other systems over an emulated wide-area network. Our evaluations show that content-blind caching performs well in terms of client latency for applications that exhibit high locality. It allows the system to sustain higher throughput by offloading the origin server database. We also present the design and evaluation of different online cache replacement algorithms for edge servers that have limited resource capabilities. In our evaluations, we find that the best heuristic must exploit temporal locality and take into ...
Edge service architectures have become the most widespread platform for distributing Web content over the Internet. Commercial Content Delivery Networks (CDNs) like ...
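The idea of storing query results independently, without merging them, can be illustrated with a minimal sketch. This is not GlobeCBC's implementation: the class name `ContentBlindCache`, the `conflicts` table, and the template-based invalidation rule are all assumptions made for illustration, and cache replacement is omitted.

```python
from typing import Callable, Dict, Hashable, Set, Tuple

Key = Tuple[str, Hashable]


class ContentBlindCache:
    """Hypothetical sketch: each query result is stored on its own,
    keyed by (template id, parameters); results are never merged."""

    def __init__(self, conflicts: Dict[str, Set[str]]) -> None:
        # Assumption: maps a write template id to the read template ids
        # whose cached results it may make stale.
        self.conflicts = conflicts
        self.results: Dict[Key, object] = {}

    def read(self, template: str, params: Hashable,
             run_query: Callable[[str, Hashable], object]) -> object:
        key = (template, params)
        if key not in self.results:
            # Cache miss: forward the query (e.g. to the origin database).
            self.results[key] = run_query(template, params)
        return self.results[key]

    def write(self, template: str) -> None:
        # Drop every cached result of a conflicting read template,
        # without inspecting the content of the results.
        stale = self.conflicts.get(template, set())
        for key in [k for k in self.results if k[0] in stale]:
            del self.results[key]
```

A repeated read with the same template and parameters is served from the cache; a write with a conflicting template forces the next read to go back to the origin.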
GlobeTP: Template-based database replication for scalable web applications
In Proc. WWW, 2007
Cited by 11 (0 self)
Generic database replication algorithms do not scale linearly in throughput as all update, deletion and insertion (UDI) queries must be applied to every database replica. The throughput is therefore limited to the point where the number of UDI queries alone is sufficient to overload one server. In such scenarios, partial replication of a database can help, as UDI queries are executed only by a subset of all servers. In this paper we propose GlobeTP, a system that employs partial replication to improve database throughput. GlobeTP exploits the fact that a Web application’s query workload is composed of a small set of read and write templates. Using knowledge of these templates and their respective execution costs, GlobeTP provides database table placements that produce significant improvements in database throughput. We demonstrate the efficiency of this technique using two different industry standard benchmarks. In our experiments, GlobeTP increases the throughput by 57% to 150% compared to full replication, while using an identical hardware configuration. Furthermore, adding a single query cache improves the throughput by another 30% to 60%.
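The observation that UDI queries load only the servers hosting the affected tables can be sketched as follows. This is an illustrative calculation, not GlobeTP's placement algorithm: the function `udi_load`, the table names, and the cost figures are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def udi_load(placement: Dict[str, Set[str]],
             write_templates: List[Tuple[Set[str], float]]) -> Dict[str, float]:
    """Return the UDI cost each server must absorb.

    placement: server name -> set of tables it hosts (assumed input).
    write_templates: (tables touched, total execution cost) per template.
    """
    load = {server: 0.0 for server in placement}
    for tables, cost in write_templates:
        for server, hosted in placement.items():
            # A server executes the write only if it hosts an affected table.
            if hosted & tables:
                load[server] += cost
    return load


# Hypothetical workload: two write templates, each touching one table.
writes = [({"A"}, 10.0), ({"B"}, 10.0)]
full = {"s1": {"A", "B"}, "s2": {"A", "B"}}
partial = {"s1": {"A"}, "s2": {"B"}}
print(udi_load(full, writes))
print(udi_load(partial, writes))
```

Under full replication every server pays for every write; in this toy partial placement each server pays only for the writes to its own table. Read templates, which need all their tables on one server, are left out of the sketch.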
Service-Oriented Data Denormalization for Scalable Web Applications
2008
Cited by 6 (3 self)
Many techniques have been proposed to scale web applications. However, the data interdependencies between the database queries and transactions issued by the applications limit their efficiency. We claim that major scalability improvements can be gained by restructuring the web application data into multiple independent data services with exclusive access to their private data store. While this restructuring does not provide performance gains by itself, the implied simplification of each database workload allows a much more efficient use of classical techniques. We illustrate the data denormalization process on three benchmark applications: TPC-W, RUBiS and RUBBoS. We deploy the resulting service-oriented implementation of TPC-W across an 85-node cluster and show that restructuring its data can provide at least an order of magnitude improvement in the maximum sustainable throughput compared to master-slave database replication, while preserving strong consistency and transactional properties.
High Availability and Scalability Support for Web Applications
Cited by 3 (0 self)
A database query caching technique, GlobeCBC, can be used to improve the scalability of Web applications. This paper addresses the availability issues in GlobeCBC. Even though high availability is achieved by adding more resources, proper algorithms must be designed to ensure that the clients receive consistent responses amidst failures of the edge and origin servers. We present lightweight algorithms to detect and correct server failures while providing read-your-writes consistency. They exploit the fact that the query workload of Web applications is based on a fixed set of read and write templates. We show that these algorithms incur very low overhead using several microbenchmarks and a complete Web application benchmark.
Autonomic Data Placement Strategies for Update-intensive Web applications
Cited by 2 (1 self)
Edge computing infrastructures have become the leading platform for hosting Web applications. One of the key challenges in these infrastructures is the replication of application data. In our earlier research, we presented GlobeDB, a middleware for edge computing infrastructures that performs autonomic replication of application data. In this paper, we study the problem of data unit placement for update-intensive Web applications in the context of GlobeDB. Our hypothesis is that there exists a continuous spectrum of placement choices between complete partitioning of sets of data units across edge servers and full replication of data units to all servers. We propose and evaluate different families of heuristics for this problem of replica placement. As we show in our experiments, a heuristic that takes into account both the individual characteristics of data units and the overall system load performs best.
Towards autonomic hosting of multi-tier internet applications
In Proceedings of the USENIX/IEEE HotAC-I Workshop, 2006
Cited by 2 (2 self)
A vast number of caching and replication solutions have been proposed in the literature to improve the performance of multi-tiered Web applications (which we call Internet services). These solutions aim to alleviate the scalability bottleneck of only a single tier, and different techniques are suitable for services of different nature. However, from the viewpoint of an administrator who wants to host a service scalably, it is not easy to determine the right set of techniques to apply. This leads to either gross over-provisioning of resources or poor performance. We believe that the decision process of choosing the right techniques for a service can be automated. To strengthen our position, we propose the design of an autonomic hosting system that uses a combination of multi-queue models and online simulations to achieve our goals. Even though our work is very much in progress, we believe the techniques used in our system can provide a good start in taming the complex problem of scalable hosting of services.
Towards Autonomic Computing: Service Discovery and Web Hotspot Rescue
2006
Autonomic computing is a vision that addresses the growing complexity of computing systems by enabling them to manage themselves without direct human intervention. This thesis studies two related problems, service discovery and web hotspot rescue, which can serve as a building block and a prototype for autonomic networking and distributed systems, respectively. Service discovery allows end systems to discover desired services on networks automatically, eliminating administrative configuration. We made four enhancements to the Service Location Protocol (SLP): mesh enhancement, remote service discovery, preference filters, and global attributes. These enhancements improve SLP efficiency and scalability, and enable SLP to better support new and advanced discovery scenarios. The SLP mesh enhancement (mSLP), remote service discovery, and preference filters are now experimental RFCs (Request for Comments). We expect that similar techniques can be applied to other service discovery systems. During the development of mSLP, we designed selective anti-entropy, a generic mechanism for high-availability partial replication. Traditional anti-entropy only

