## Caching and Scheduling for Broadcast Disk Systems (1998)

Venue: Proceedings of the 2nd Workshop on Algorithm Engineering and Experiments (ALENEX)

Citations: 10 (3 self)

### BibTeX

@INPROCEEDINGS{Liberatore98cachingand,

author = {Vincenzo Liberatore},

title = {Caching and Scheduling for Broadcast Disk Systems},

booktitle = {Proceedings of the 2nd Workshop on Algorithm Engineering and Experiments (ALENEX)},

year = {1998}

}

### Abstract

Unicast connections lead to performance and scalability problems when a large client population attempts to access the same data. Broadcast push and broadcast disk technology address the problem by broadcasting data items from a server to a large number of clients. Broadcast disk performance depends mainly on caching strategies at the client site and on how the broadcast is scheduled at the server site. An on-line broadcast disk paging strategy makes caching decisions without knowing access probabilities. In this paper, we subject on-line paging algorithms to extensive empirical investigation. The Gray algorithm [25] always outperformed other on-line strategies on both synthetic and Web traces. Moreover, caching limited the skewness needed from a broadcast schedule, and led to favor efficient caching algorithms over refined scheduling strategies when the cache was not small. Prior to this paper, no work had empirically investigated on-line paging algorithms and their relation with serv...

### Citations

8985 | Introduction to algorithms
- Cormen, Leiserson, et al.
- 2001
Citation Context ...age. When LRU faults, it removes the top of the heap and inserts the newly requested page. Thus, LRU takes O(log CacheSize) time per fault. CF's implementation maintains the cache as a red-black tree [16] ordered by transmission times. When CF faults on page p, CF inserts p in the red-black tree. Then, CF tries to find p's successor in the tree, that is, the smallest tree element q that is larger tha... |

2324 | The Art of Computer Programming
- Knuth
- 1973
Citation Context ...e defined in the existing literature on broadcast disks [4, 5]. The Zipf distribution is often used to model skewed access patterns because it gives some pages a higher probability of being requested [28]. The synthetic trace is generated as follows. At the very beginning, a set of AccessRange ! ServerDBSize pages is extracted uniformly at random from the server database. The synthetic trace will cont... |

955 | Disconnected Operation in the Coda File System
- Kistler, Satyanarayanan
- 1992
Citation Context ...ent paper. No previous paper has given efficient implementation of broadcast disk paging algorithms. Prefetching in broadcast disks is somewhat related to other techniques used in mobile environments [13, 27, 32]. The difference is that broadcast disks prefetching aims at improving performance, whereas other works focus on increasing availability or avoiding accesses to stale data. Broadcast disk paging po... |

929 | Modern operating systems
- Tanenbaum
- 1992
Citation Context ...ithms. LRU and CF are antithetic in the following sense. LRU evicts pages independently of the waiting time needed to reload them. The gist of LRU is that past accesses should predict future accesses [33], and so LRU should incur few page faults. To the contrary, CF does not base evictions on previous history, but only on waiting times. However, if CF makes an eviction mistake and if CF immediately de... |

831 | Generating Representative Web Workloads for Network and Server Performance Evaluation
- Barford, Crovella
- 1998
Citation Context ...e chosen. For the two different values of bcThresh, the value of n is very different, whereas the trace Length is much closer. Therefore, there are many documents that are accessed few times (see also [Barford and Crovella 1998]). Since n is different in the two traces, experiments were performed for different sets of k values. CF was severely outmatched by the other strategies, and so its performance is not reported here. Fi... |

396 | Broadcast disks: Data Management for Asymmetric Communication Environments
- Acharya, Alonso, et al.
- 1995
Citation Context ...ts are helped by local caching because they avoid waiting on the network if they can find data items in their own cache. Cyclical schedules help caching strategies to weigh different eviction choices [4, 25]. Moreover, cyclical schedules lead to scalable and widely supported multicast techniques over the Internet [8] and are necessary in a mobile environment where clients need to know when to tune in to ... |

219 | Sleepers and Workaholics: Caching strategies in mobile environments
- Barbara, Imielinski
- 1994
Citation Context ...ent paper. No previous paper has given efficient implementation of broadcast disk paging algorithms. Prefetching in broadcast disks is somewhat related to other techniques used in mobile environments [13, 27, 32]. The difference is that broadcast disks prefetching aims at improving performance, whereas other works focus on increasing availability or avoiding accesses to stale data. Broadcast disk paging po... |

139 | The art of computer programming, volume 3
- Knuth
- 1998
Citation Context ...on broadcast disks [Acharya et al. 1995; Acharya et al. 1996]. The Zipf distribution is often used to model skewed access patterns because it gives some pages a higher probability of being requested [Knuth 1973]. We generate a synthetic trace as follows. At the very beginning, we extract a set of AccessRangespages uniformly at random from the server database. The synthetic trace will contain on... |

130 | The Design of Dynamic Data Structures
- Overmars
- 1983
Citation Context ...se, the gray tree is removed and the black tree becomes gray. The removal of the gray tree takes O(k) time per phase, which becomes O(1) time for each phase request if we utilize Overmars' technique [Overmars 1983]. On the whole, gray's implementation takes O(log k) time per request. PT. The last replacement strategy considered here is PT [Acharya et al. 1996]. PT maintains two values for each page i in the se... |

113 | Beyond hierarchies: design considerations for distributed caching on the Internet
- Tewari, Dahlin, et al.
- 1998
Citation Context ...esses to stale data. Broadcast disk paging poses a trade-off between the number of page faults and the cost per fault. Similar trade-offs exist in a variety of contexts, as for example, Web caching [7, 35] and hierarchical paging [15]. The problem of finding an optimal cyclic schedule is NP-hard [12], but it can be solved in polynomial time if ServerDBSize = 2 [11]. The square-root law has been propose... |

112 | Broadcast scheduling for information distribution
- Su, Tassiulas
- 1997
Citation Context ... experiments. Scheduling is the problem of establishing a broadcast schedule. Ideally, scheduling should depend on client access patterns. For example, typical schedules broadcast hot pages more often [12, 31]. Scheduling is strongly interrelated with caching because client access patterns are modified by caching. Conversely, caching strategies are affected by scheduling because a schedule determines the cos... |

100 | Scheduling on-demand Broadcasts: New Metrics and Algorithms
- Acharya, Muthukrishnan
- 1998
Citation Context ...s along the server broadcast and perform scheduling in order to reduce both response and tuning time [26]. Several authors have studied the problem of broadcast scheduling when pull is also supported [6, 3, 30]. The crux of scheduling is the estimation of page popularities. Stathatos et al. use a pull backchannel for data communication and, indirectly, to estimate page popularity and its dynamic over time [... |

100 | RxW: A scheduling approach for large-scale on-demand data broadcast
- Aksoy, Franklin
- 1999
Citation Context ...s along the server broadcast and perform scheduling in order to reduce both response and tuning time [26]. Several authors have studied the problem of broadcast scheduling when pull is also supported [6, 3, 30]. The crux of scheduling is the estimation of page popularities. Stathatos et al. use a pull backchannel for data communication and, indirectly, to estimate page popularity and its dynamic over time [... |

94 | Prefetching From a Broadcast Disk
- Acharya, Franklin, et al.
- 1996
Citation Context ... tax our algorithms more than any other stochastic workload. Our second workload assumes a stationary Zipf distribution and is similar to the one defined in the existing literature on broadcast disks [4, 5]. The Zipf distribution is often used to model skewed access patterns because it gives some pages a higher probability of being requested [28]. The synthetic trace is generated as follows. At the very... |

93 | Minimizing service and operation costs of periodic scheduling
- Bar-Noy, Bhatia, et al.
- 1997
Citation Context ... experiments. Scheduling is the problem of establishing a broadcast schedule. Ideally, scheduling should depend on client access patterns. For example, typical schedules broadcast hot pages more often [12, 31]. Scheduling is strongly interrelated with caching because client access patterns are modified by caching. Conversely, caching strategies are affected by scheduling because a schedule determines the cos... |

88 | Adaptive data broadcast in hybrid networks
- Stathatos, Roussopoulos, et al.
- 1997
Citation Context ... . Pages have all the same size and it ... Footnote 1: The ServerDBSize pages need not be the whole server database, but they can simply be the database portion that the server has assigned for broadcast (see e.g. [8, 30]). The important assumption is that the broadcast data set changes so slowly ... Figure 1: Example of a flat broadcast program. Pages are numbered from 0 to ServerDBSize... |

87 | Quickly generating billion-record synthetic databases - Gray, Sundaresan, et al. - 1994 |

73 | Server-initiated document dissemination for the WWW
- Bestavros, Cunha
- 1996
Citation Context ...of research, and the position of broadcast disks among other push and push/pull data dissemination architectures [20]. Information dissemination on the Internet has been considered by various authors [14, 17, 37] and systems [1, 2]. Cyclic multicast over the Internet is discussed in [8]. Ammar gives a prefetching strategy that loads pages on the basis of links embedded in previously loaded pages [9]. Other ap... |

69 | The design of Teletext broadcast cycles
- Ammar, Wong
- 1985
Citation Context ...ing [15]. The problem of finding an optimal cyclic schedule is NP-hard [12], but it can be solved in polynomial time if ServerDBSize = 2 [11]. The square-root law has been proposed by several authors [10, 9, 21, 31]. The golden ratio algorithm instantiates the rule and gives a 1.125-approximation for all ServerDBSizes [12]. A simpler approximation of the rule is the MAD algorithm [31, 12]. Scheduling with non-uni... |

64 | A Framework for Scalable Dissemination-based systems
- Franklin, Zdonik
- 1997
Citation Context ...porated in several commercial systems. For example, Hughes Network System [1] delivers Web pages via satellite links, and Hybrid Networks Inc. [2] will broadcast data via cable lines. Broadcast Disks [20] attempt to improve broadcast push performance by the combination of two methods: they establish client caching and fix a cyclical broadcast schedule over long periods of time. Clients are helped by l... |

53 | Intelligent file hoarding for mobile computers
- Tait, Lei, et al.
- 1995
Citation Context ...ent paper. No previous paper has given efficient implementation of broadcast disk paging algorithms. Prefetching in broadcast disks is somewhat related to other techniques used in mobile environments [13, 27, 32]. The difference is that broadcast disks prefetching aims at improving performance, whereas other works focus on increasing availability or avoiding accesses to stale data. Broadcast disk paging po... |

46 | Scalable delivery of web pages using cyclic best-effort multicast
- Almeroth, Ammar, et al.
- 1998
Citation Context ...ache. Cyclical schedules help caching strategies to weigh different eviction choices [4, 25]. Moreover, cyclical schedules lead to scalable and widely supported multicast techniques over the Internet [8] and are necessary in a mobile environment where clients need to know when to tune in to receive data [26]. Our main contributions are the first empirical study of on-line algorithms for broadcast dis... |

40 | The Data Broadcast Problem with Non-Uniform Transmission Times
- Kenyon, Schabanel
- 1999
Citation Context ...gives a 1.125-approximation for all ServerDBSizes [12]. A simpler approximation of the rule is the MAD algorithm [31, 12]. Scheduling with non-uniform transmission times has been investigated as well [24, 36]. In a mobile environment, the objective of scheduling is to minimize a combination of response time and tuning time. Khanna et al. present an algorithm that inserts index pages along the server broad... |

26 | Inversive and linear congruential pseudorandom number generators in empirical tests - Leeb, Wegenkittl - 1997 |

25 | The scheduling of maintenance service
- Anily, Glass, et al.
- 1998
Citation Context ...ntext, as for example, Web caching [7, 35] and hierarchical paging [15]. The problem of finding an optimal cyclic schedule is NP-hard [12], but it can be solved in polynomial time if ServerDBSize = 2 [11]. The square-root law has been proposed by several authors [10, 9, 21, 31]. The golden ratio algorithm instantiates the rule and gives a 1.125-approximation for all ServerDBSizes [12]. A simpler approx... |

25 | On indexed data broadcast
- Khanna, Zhou
- 1998
Citation Context ... strategies (e.g. Gray) hard to implement efficiently. In mobile computing, non-flat schedules require that a complicated indexing structure be maintained in order to keep track of the schedule itself [26]. As a consequence, complicated algorithms are needed at the server and client sites to build and look up the index, and part of the bandwidth is now employed to broadcast the index rather than data pag... |

25 | The design of teletext broadcast cycles, Performance Evaluation 5
- Ammar, Wong
- 1985
Citation Context ...al time if the server broadcasts only two pages [Anily et al. 1998]. A scheduling principle is expressed by the square-root law, which has been proposed independently by several authors [Ammar 1987; Ammar and Wong 1985; Gecsei 1983; Su and Tassiulas 1997]: broadcast pages with frequency proportional to the square root of the probability those pages will be requested. The golden ratio algorithm approximates the rule... |

24 | Response time in a Teletext system: An individual user's perspective
- Ammar
- 1987
Citation Context ... [14, 17, 37] and systems [1, 2]. Cyclic multicast over the Internet is discussed in [8]. Ammar gives a prefetching strategy that loads pages on the basis of links embedded in previously loaded pages [9]. Other approaches execute prefetching without using hints. Acharya et al. considered prefetching on broadcast disks, and propose the PT algorithm, which we described in §3 [5]. Subsequently, Tassiula... |

24 | Online computation
- Irani, Karlin
- 1995
Citation Context ...gorithm unmarks all marked pages and starts another phase afresh. We remark that 1-bit LRU evicts unmarked pages in any arbitrary order (the 1-bit LRU algorithm is also known as the marking algorithm [23]). A theoretical result establishes that, in terms of number of page faults, the worst-case performance ratio of 1-bit LRU is exactly the same as LRU's [23]. In other words, one bit per page achieves ... |

24 | Log time algorithms for scheduling single and multiple channel data broadcast
- Vaidya, Hameed
- 1997
Citation Context ...gives a 1.125-approximation for all ServerDBSizes [12]. A simpler approximation of the rule is the MAD algorithm [31, 12]. Scheduling with non-uniform transmission times has been investigated as well [24, 36]. In a mobile environment, the objective of scheduling is to minimize a combination of response time and tuning time. Khanna et al. present an algorithm that inserts index pages along the server broad... |

20 | Optimal memory management strategies for a mobile user in a broadcast data delivery system
- Tassiulas, Su
- 1997
Citation Context ... for servers and clients [5]. Previous broadcast disk paging algorithms assumed that clients requested data items with given probabilities and that those probabilities were known to paging strategies [4, 5, 34]. In practice, probabilistic assumptions could be difficult to find and to validate, and so it is critical to have efficient paging strategies that operate without probabilistic parameters. A breakthr... |

18 | On Broadcast Disk Paging
- Khanna, Liberatore
Citation Context ...oadcast disk paging strategy makes caching decisions without knowing access probabilities. In this paper, we subject on-line paging algorithms to extensive empirical investigation. The Gray algorithm [25] always outperformed other on-line strategies on both synthetic and Web traces. Moreover, caching limited the skewness needed from a broadcast schedule, and led to favor efficient caching algorithms o... |

11 | Information dissemination in hybrid satellite/terrestrial networks
- Dao, Perry
- 1996
Citation Context ...of research, and the position of broadcast disks among other push and push/pull data dissemination architectures [20]. Information dissemination on the Internet has been considered by various authors [14, 17, 37] and systems [1, 2]. Cyclic multicast over the Internet is discussed in [8]. Ammar gives a prefetching strategy that loads pages on the basis of links embedded in previously loaded pages [9]. Other ap... |

7 | Competitive algorithms for multilevel caching and relaxed list update
- Chrobak, Noga
Citation Context ...t disk paging poses a trade-off between the number of page faults and the cost per fault. Similar trade-offs exist in a variety of contexts, as for example, Web caching [7, 35] and hierarchical paging [15]. The problem of finding an optimal cyclic schedule is NP-hard [12], but it can be solved in polynomial time if ServerDBSize = 2 [11]. The square-root law has been proposed by several authors [10, 9, ... |

6 | Page replacement for general caching problems
- Albers, Arora, Khanna
- 1999
Citation Context ...esses to stale data. Broadcast disk paging poses a trade-off between the number of page faults and the cost per fault. Similar trade-offs exist in a variety of contexts, as for example, Web caching [7, 35] and hierarchical paging [15]. The problem of finding an optimal cyclic schedule is NP-hard [12], but it can be solved in polynomial time if ServerDBSize = 2 [11]. The square-root law has been propose... |

4 | Empirical investigation of the Markov Reference Model - Liberatore - 1999 |

2 | The architecture of videotext systems
- Gecsei
- 1983
Citation Context ... Background Let $p_i$ be the probability that page $i$ is requested and $\sigma_i = \sqrt{p_i} / \sum_j \sqrt{p_j}$. The square-root law suggests that page $i$ should be broadcast with frequency $\sigma_i$ (and not with probability $p_i$) [12, 21, 31]. The Mean Aggregate Delay (MAD) algorithm is a scheduling algorithm that approximates the square-root law [12, 31]. The MAD algorithm maintains a value $s_i$ associated with each page $i$. The quantity s... |

1 | Inversive congruential pseudorandom numbers: a tutorial - Eichenauer-Herrmann - 1992 |

1 | Pseudorandom number generation by nonlinear methods - Eichenauer-Herrmann - 1995 |

1 | Efficient dissemination of information on the Internet. Data Engineering
- Yan, Garcia-Molina
- 1996
Citation Context ...of research, and the position of broadcast disks among other push and push/pull data dissemination architectures [20]. Information dissemination on the Internet has been considered by various authors [14, 17, 37] and systems [1, 2]. Cyclic multicast over the Internet is discussed in [8]. Ammar gives a prefetching strategy that loads pages on the basis of links embedded in previously loaded pages [9]. Other ap... |
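
Several of the citation contexts above refer to the square-root law: broadcast each page with frequency proportional to the square root of the probability that it is requested. The snippet below is a minimal sketch of that normalization only, not code from the paper; the function name `square_root_frequencies` and the sample probabilities are assumptions chosen for illustration.

```python
import math


def square_root_frequencies(probs):
    """Return broadcast frequencies proportional to sqrt(p_i), normalized to sum to 1.

    `probs` is a hypothetical list of page request probabilities (illustrative input).
    """
    roots = [math.sqrt(p) for p in probs]
    total = sum(roots)
    return [r / total for r in roots]


# Illustrative access probabilities for four pages (assumed values).
probs = [0.64, 0.16, 0.16, 0.04]
freqs = square_root_frequencies(probs)
for p, f in zip(probs, freqs):
    print(f"request probability {p:.2f} -> broadcast frequency {f:.3f}")
```

In this sketch the hottest page is requested 16 times as often as the coldest one but is broadcast only 4 times as often, which is the dampening effect the law prescribes.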