Exploiting semantic proximity in peer-to-peer content searching
- In 10th International Workshop on Future Trends in Distributed Computing Systems (FTDCS 2004), Suzhou
, 2004
Cited by 36 (10 self)
A lot of recent work has dealt with improving performance of content searching in peer-to-peer file sharing systems. In this paper we attack this problem by modifying the overlay topology describing the peer relations in the system. More precisely, we create a semantic overlay, linking nodes that are “semantically close”, by which we mean that they are interested in similar documents. This semantic overlay provides the primary search mechanism, while the initial peer-to-peer system provides the fail-over search mechanism. We focus on implicit approaches for discovering semantic proximity. We evaluate and compare three candidate methods, and review open questions.
Self-Organizing Linear Search
- ACM Computing Surveys
, 1985
Cited by 28 (3 self)
this article. Two examples of simple permutation algorithms are move-to-front, which moves the accessed record to the front of the list, shifting all records previously ahead of it back one position; and transpose, which merely exchanges the accessed record with the one immediately ahead of it in the list. These will be described in more detail later. Knuth [1973] describes several search methods that are usually more efficient than linear search. Bentley and McGeoch [1985] justify the use of self-organizing linear search in the following three contexts:
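The two permutation rules described in this excerpt can be sketched in a few lines. This is a minimal illustration, assuming the list is a plain Python list and the cost of an access is the 1-based position of the record; the function names are hypothetical and not taken from the survey.

```python
def access_move_to_front(records, key):
    """Find key by linear search, then move it to the front of the list.

    Returns the search cost (1-based position before the move).
    Records previously ahead of key shift back by one position.
    """
    i = records.index(key)
    records.insert(0, records.pop(i))
    return i + 1


def access_transpose(records, key):
    """Find key by linear search, then swap it with the record just ahead of it.

    Returns the search cost (1-based position before the swap).
    """
    i = records.index(key)
    if i > 0:
        records[i - 1], records[i] = records[i], records[i - 1]
    return i + 1


mtf_list = ["a", "b", "c", "d"]
tr_list = ["a", "b", "c", "d"]
mtf_cost = access_move_to_front(mtf_list, "c")
tr_cost = access_transpose(tr_list, "c")
```

Both rules pay the same cost for the access itself; they differ only in how far the accessed record advances afterwards.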
Average Case Analyses of List Update Algorithms, with Applications to Data Compression
- Algorithmica
, 1998
Cited by 19 (4 self)
We study the performance of the Timestamp (0) (TS(0)) algorithm for self-organizing sequential search on discrete memoryless sources. We demonstrate that TS(0) is better than Move-to-front on such sources, and determine performance ratios for TS(0) against the optimal off-line and static adversaries in this situation. Previous work on such sources compared on-line algorithms only with static adversaries. One practical motivation for our work is the use of the Move-to-front heuristic in various compression algorithms. Our theoretical results suggest that in many cases using TS(0) in place of Move-to-front in schemes that use the latter should improve compression. Tests using implementations on a standard corpus of test documents demonstrate that TS(0) leads to improved compression.
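The abstract does not restate the TS(0) rule, so the sketch below relies on the usual statement of it as an assumption: on a request to x, insert x in front of the first item y ahead of it that has been requested at most once since the last request to x, and leave x in place on its first request or when no such y exists. The function name `ts0_serve` and the bookkeeping are illustrative only.

```python
def ts0_serve(initial, requests):
    """Serve requests with the TS(0) rule (as assumed above).

    Returns the final list order and the total search cost, where each
    access costs the 1-based position of the requested item.
    """
    lst = list(initial)
    # For each item, keep the times of its (up to) two most recent requests.
    recent = {x: [] for x in lst}
    total_cost = 0
    for t, x in enumerate(requests):
        pos = lst.index(x)
        total_cost += pos + 1
        if recent[x]:  # not the first request to x
            t_prev = recent[x][-1]
            for i in range(pos):
                times = recent[lst[i]]
                # y was requested at most once since t_prev exactly when its
                # second most recent request (if any) came before t_prev.
                if len(times) < 2 or times[-2] < t_prev:
                    lst.insert(i, lst.pop(pos))
                    break
        recent[x] = (recent[x] + [t])[-2:]
    return lst, total_cost


order, cost = ts0_serve(["a", "b", "c"], ["c", "a", "a", "c"])
```

In this run the last request to c moves it ahead of b but not ahead of a, because a was requested twice since the previous request to c. Move-to-front would have placed c at the head of the list.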
Asymptotic Approximation of the Move-To-Front Search Cost Distribution and Least-Recently-Used Caching Fault Probabilities
, 1999
Cited by 19 (7 self)
Consider a finite list of items $n = 1, 2, \ldots, N$ that are requested according to an i.i.d. process. Each time an item is requested it is moved to the front of the list. The associated search cost $C_N$ for accessing an item is equal to its position before being moved. If the request distribution converges to a proper distribution as $N \to \infty$, then the stationary search cost $C_N$ converges in distribution to a limiting search cost $C$. We show that, when the (limiting) request distribution has a heavy tail (e.g., generalized Zipf's law) $P[R = n] \sim c/n^{\alpha}$ as $n \to \infty$, $\alpha > 1$, then the limiting stationary search cost distribution $P[C > n]$, or, equivalently, the Least-Recently-Used (LRU) caching fault probability, satisfies $\lim_{n \to \infty} \frac{P[C > n]}{P[R > n]} = \left(1 - \frac{1}{\alpha}\right) \left[\Gamma\left(1 - \frac{1}{\alpha}\right)\right]^{\alpha} \nearrow e^{\gamma}$ as $\alpha \to \infty$, where $\Gamma$ is the Gamma function and $\gamma$ (= 0.5772...) is Euler's constant. When the request distribution has a light tail $P[R = n] \sim c\,e^{-\lambda n^{\beta}}$ as...
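The equivalence between the move-to-front search cost and the LRU fault probability can be checked on a sample path: a request misses in an LRU cache of size k exactly when its move-to-front search cost exceeds k. The sketch below is only an illustration under stated assumptions: the cache is seeded with the first k items of the initial list (front item most recent), requests are drawn i.i.d. with weights proportional to 1/n^alpha, and all function names are hypothetical.

```python
import random
from collections import OrderedDict


def mtf_search_costs(initial, requests):
    """Search cost (1-based position before the move) of each request under move-to-front."""
    lst = list(initial)
    costs = []
    for r in requests:
        i = lst.index(r)
        costs.append(i + 1)
        lst.insert(0, lst.pop(i))
    return costs


def lru_faults(initial, requests, k):
    """Per-request miss flags for an LRU cache of size k.

    Assumption: the cache starts with the first k list items, front item most recent.
    """
    cache = OrderedDict()  # first key = least recently used, last key = most recent
    for item in reversed(initial[:k]):
        cache[item] = None
    faults = []
    for r in requests:
        if r in cache:
            cache.move_to_end(r)
            faults.append(False)
        else:
            faults.append(True)
            cache[r] = None
            cache.popitem(last=False)  # evict the least recently used item
    return faults


def zipf_requests(n_items, alpha, length, seed=0):
    """Draw i.i.d. requests over items 1..n_items with P[R = n] proportional to 1/n**alpha."""
    rng = random.Random(seed)
    items = list(range(1, n_items + 1))
    weights = [1.0 / n ** alpha for n in items]
    return items, rng.choices(items, weights=weights, k=length)


items, reqs = zipf_requests(n_items=50, alpha=1.5, length=2000)
costs = mtf_search_costs(items, reqs)
# Empirical estimate of P[C > k], which equals the LRU miss ratio for cache size k.
miss_ratio = {k: sum(c > k for c in costs) / len(costs) for k in (1, 5, 20)}
```

The empirical ratios only approximate the stationary quantities in the abstract; the sample-path identity between cost and fault, however, holds exactly for every k.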
Self-Organizing Data Structures
- In
, 1998
Cited by 16 (0 self)
We survey results on self-organizing data structures for the search problem and concentrate on two very popular structures: the unsorted linear list, and the binary search tree. For the problem of maintaining unsorted lists, also known as the list update problem, we present results on the competitiveness achieved by deterministic and randomized on-line algorithms. For binary search trees, we present results for both on-line and off-line algorithms. Self-organizing data structures can be used to build very effective data compression schemes. We summarize theoretical and experimental results. 1 Introduction This paper surveys results in the design and analysis of self-organizing data structures for the search problem. The general search problem in pointer data structures can be phrased as follows. The elements of a set are stored in a collection of nodes. Each node also contains O(1) pointers to other nodes and additional state data which can be used for navigation and self-organizati...
Two New Families of List Update Algorithms
- In ISAAC'98, LNCS 1533
, 1998
Cited by 7 (0 self)
We consider the online list accessing problem and present a new family of competitive-optimal deterministic list update algorithms, which is the largest class of such algorithms known to date. This family, called Sort-by-Rank (sbr), is parametrized with a real $0 \le \alpha \le 1$, where sbr(0) is the Move-to-Front algorithm and sbr(1) is equivalent to the Timestamp algorithm. The behaviour of sbr($\alpha$) mediates between the eager strategy of Move-to-Front and the more conservative behaviour of Timestamp. We also present a family of algorithms Sort-by-Delay (sbd) which is parametrized by the positive integers, where sbd(1) is Move-to-Front and sbd(2) is equivalent to Timestamp. In general, sbd(k) is k-competitive for $k \ge 2$. This is the first class of algorithms that is asymptotically optimal for independent, identically distributed requests while each algorithm is constant-competitive. Empirical studies with both generated and real-world data are also included. 1 Introduction Co...
On-line Algorithms: Competitive Analysis and Beyond
- in Algorithms and Theory of Computation
, 1999
Cited by 3 (0 self)
this article, but rather a deep principle of on-line analysis known as Yao's minimax theorem [Yao, 1980]. This theorem is actually an adaptation of the famous minimax theorem of game theory [von Neumann and Morgenstern, 1947]. It states that the best ratio achievable by a deterministic algorithm against any distribution is exactly the same as the best ratio achievable by a randomized algorithm against a worst-case adversary. More formally, for a given on-line problem let $F_n$ denote the family of input instances of size at most $n$. Let $D_n$ denote the set of all probability distributions over the instances in $F_n$. Let $A_n$
Self-organizing data structures with dependent accesses
- ICALP'96, LNCS 1099
, 1995
Cited by 3 (1 self)
We consider self-organizing data structures in the case where the sequence of accesses can be modeled by a first-order Markov chain. For the simple-k- and batched-k-move-to-front schemes, explicit formulae for the expected search costs are derived and compared. We use a new approach that employs the technique of expanding a Markov chain. This approach generalizes the results of Gonnet/Munro/Suwanda. In order to analyze arbitrary memory-free move-forward heuristics for linear lists, we restrict our attention to a special access sequence, thereby reducing the state space of the chain governing the behaviour of the data structure. In the case of accesses with locality (inert transition behaviour), we find that the hierarchies of self-organizing data structures with respect to the expected search time are reversed, compared with independent accesses. Finally we look at self-organizing binary trees with the move-to-root rule and compare the expected search cost with the entropy of the Markov chain of accesses.
Inner product spaces for minsum coordination mechanisms
- In STOC
, 2011
Cited by 2 (0 self)
We study policies aiming to minimize the weighted sum of completion times of jobs in the context of coordination mechanisms for selfish scheduling problems. Our goal is to design local policies that achieve a good price of anarchy in the resulting equilibria for unrelated machine scheduling. To obtain the approximation bounds, we introduce a new technique that, while conceptually simple, seems to be quite powerful. The method entails mapping strategy vectors into a carefully chosen inner product space; costs are shown to correspond to the norm in this space, and the Nash condition also has a simple description. With this structure in place, we are able to prove a number of results, as follows. First, we consider Smith’s Rule, which orders the jobs on a machine in ascending order of processing-time-to-weight ratio, and show that it achieves an approximation ratio of 4. We also demonstrate that this is the best possible for deterministic non-preemptive strongly local policies. Since Smith’s Rule is always optimal for a given fixed assignment, this may seem unsurprising, but we then show that better approximation ratios can be obtained if either preemption or randomization is allowed.
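Smith's Rule, as described in this abstract, is easy to illustrate for a single machine with a fixed set of jobs. The sketch below is not the paper's coordination mechanism; it only shows the ordering rule and the weighted sum of completion times it produces. The job tuples `(name, processing_time, weight)` and the function names are hypothetical, and weights are assumed positive.

```python
from itertools import permutations


def weighted_completion(order):
    """Weighted sum of completion times when jobs run back to back in the given order."""
    elapsed = 0
    total = 0
    for _name, processing_time, weight in order:
        elapsed += processing_time
        total += weight * elapsed
    return total


def smiths_rule(jobs):
    """Order jobs by ascending processing-time-to-weight ratio (weights assumed positive)."""
    return sorted(jobs, key=lambda job: job[1] / job[2])


jobs = [("a", 3, 1), ("b", 1, 2), ("c", 2, 2)]
schedule = smiths_rule(jobs)
cost = weighted_completion(schedule)
# Brute-force check over all orders of this small instance.
best = min(weighted_completion(p) for p in permutations(jobs))
```

On this instance the rule schedules b, c, a, and the brute-force minimum over all six orders matches its cost, which is consistent with the abstract's remark that the rule is optimal for a fixed assignment.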
§1. The Potential Framework (Lecture VI)
Many algorithms amount to a sequence of operations on a data structure. For instance, the well-known heapsort algorithm is a sequence of inserts into an initially empty priority queue, followed by a sequence of deleteMins from the queue until it is empty. Thus if $c_i$ is the cost of the $i$th operation, the algorithm’s running time is $\sum_{i=1}^{2n} c_i$, since there are 2n operations for sorting n elements. In worst-case analysis, we ensure that each operation is efficient, say $c_i = O(\log n)$, leading to the conclusion that the overall algorithm is O(n log n). The idea of amortization exploits the fact that we may be able to obtain the same bound $\sum_{i=1}^{2n} c_i = O(n \log n)$ without ensuring that each $c_i$ is logarithmic. We then say that the amortized cost of each operation is logarithmic. Thus “amortized complexity” is a kind of average complexity although it has nothing to do with probability. Tarjan [9] gives the first systematic account of this topic. Why amortize? Even in problems where we could have ensured each operation is logarithmic time, it may be advantageous to achieve only logarithmic behavior in the amortized sense. This is because the extra flexibility of amortized bounds may lead to simpler or more practical algorithms. In fact, many “amortized” data structures are relatively easy to implement. To give a concrete example, consider any balanced binary search tree scheme. The algorithms for such trees must perform considerable book-keeping to maintain their balanced shape. In contrast, when we study splay trees, we will see an amortization scheme for binary search trees which is considerably simpler and “lax” about balancing. The operative word in such amortized data structures is laziness: try to defer the book-keeping work to the future, when it might be more convenient to do this work. This remark will be illustrated by splay trees below. This chapter is in 3 parts: we begin by introducing the potential function framework for doing amortization analysis. Then we introduce two data structures, splay trees and Fibonacci heaps, which can be analyzed using this framework. We give a non-trivial application of each data structure: splay trees are used to maintain the convex hull of a set of points in the plane, and Fibonacci heaps are used to implement Prim’s algorithm for minimum spanning trees.
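The heapsort example from this excerpt can be sketched as a sequence of priority-queue operations. This is a minimal illustration that uses Python's `heapq` binary heap as a stand-in for the priority queue; the operation counter and the function name are added here for illustration and are not part of the lecture notes.

```python
import heapq


def heapsort_with_op_count(items):
    """Sort by n inserts into an empty priority queue followed by n deleteMins.

    Returns the sorted list and the number of queue operations performed,
    which is 2n for n input elements.
    """
    queue = []
    operations = 0
    for x in items:
        heapq.heappush(queue, x)  # insert
        operations += 1
    result = []
    while queue:
        result.append(heapq.heappop(queue))  # deleteMin
        operations += 1
    return result, operations


sorted_items, op_count = heapsort_with_op_count([5, 2, 9, 1])
```

With a binary heap each of the 2n operations is logarithmic in the worst case, so the O(n log n) bound follows directly. The amortized view would allow individual operations to cost more as long as the total stays within the same bound.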

