Results 1-10 of 120
Novel Architectures for P2P Applications: the Continuous-Discrete Approach
 ACM TRANSACTIONS ON ALGORITHMS
, 2007
Cited by 142 (8 self)
We propose a new approach for constructing P2P networks based on a dynamic decomposition of a continuous space into cells corresponding to processors. We demonstrate the power of these design rules by suggesting two new architectures, one for DHT (Distributed Hash Table) and the other for dynamic expander networks. The DHT network, which we call Distance Halving, allows logarithmic routing and load, while preserving constant degrees. Our second construction builds a network that is guaranteed to be an expander. The resulting topologies are simple to maintain and implement. Their simplicity makes it easy to modify and add protocols. We show it is possible to reduce the dilation and the load of the DHT with a small increase of the degree. We present a provably good protocol for relieving hot spots and a construction with high fault tolerance. Finally, we show that, using our approach, it is possible to construct any family of constant-degree graphs in a dynamic environment, though with worse parameters. We therefore expect that more distributed data structures could be designed and implemented in a dynamic environment.
Transport Layer Identification of P2P Traffic
, 2004
Cited by 87 (1 self)
Since the emergence of peer-to-peer (P2P) networking in the late '90s, P2P applications have multiplied, evolved and established themselves as the leading 'growth app' of Internet traffic workload. In contrast to first-generation P2P networks which used well-defined port numbers, current P2P applications have the ability to disguise their existence through the use of arbitrary ports. As a result, reliable estimates of P2P traffic require examination of packet payload, a methodological landmine from legal, privacy, technical, logistic, and fiscal perspectives. Indeed, access to user payload is often rendered impossible by one of these factors, inhibiting trustworthy estimation of P2P traffic growth and dynamics. In this paper, we develop a systematic methodology to identify P2P flows at the transport layer, i.e., based on connection patterns of P2P networks, and without relying on packet payload. We believe our approach is the first method for characterizing P2P traffic using only knowledge of network dynamics rather than any user payload. To evaluate our methodology, we also develop a payload technique for P2P traffic identification, by reverse engineering and analyzing the nine most popular P2P protocols, and demonstrate its efficacy with the discovery of P2P protocols in our traces that were previously unknown to us. Finally, our results indicate that P2P traffic continues to grow unabatedly, contrary to reports in the popular media.
Hybrid search schemes for unstructured peer-to-peer networks
 In Proceedings of IEEE INFOCOM
, 2005
Cited by 72 (1 self)
We study hybrid search schemes for unstructured peer-to-peer networks. We quantify performance in terms of number of hits, network overhead, and response time. Our schemes combine flooding and random walks, lookahead, and replication. We consider both regular topologies and topologies with supernodes. We introduce a general search scheme, of which flooding and random walks are special instances, and show how to use locally maintained network information to improve the performance of searching. Our main findings are: (a) A small number of supernodes in an otherwise regular topology can offer sharp savings in the performance of search, both in the case of search by flooding and search by random walk, particularly when combined with 1-step replication. We quantify, analytically and experimentally, that the reason for these savings is that the search is biased towards nodes that yield more information. (b) There is a generalization of search, of which flooding and random walk are special instances, which may take further advantage of locally maintained network information, and yield better performance than both flooding and random walk in clustered topologies. The method determines edge criticality and is reminiscent of fundamental heuristics from the area of approximation algorithms.
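The generalization the abstract mentions, of which flooding and random walk are special instances, can be sketched as a k-out forwarding rule: each visited node forwards the query to k randomly chosen neighbors. Setting k to the node degree recovers flooding, and k = 1 recovers a random walk. This is a minimal illustrative sketch, not the paper's actual protocol; all names and parameters below are assumptions.

```python
import random
from collections import deque

def hybrid_search(adj, start, has_object, k, ttl, rng):
    """k-out query forwarding on an unstructured overlay (illustrative sketch).

    adj        -- dict mapping node -> list of neighbor nodes
    has_object -- predicate telling whether a node holds the queried object
    k          -- fan-out per hop (k = degree: flooding; k = 1: random walk)
    ttl        -- time-to-live hop budget per query copy
    Returns (node that answered or None, number of messages sent).
    """
    visited = set()
    frontier = deque([(start, ttl)])
    messages = 0
    while frontier:
        node, t = frontier.popleft()
        if node in visited:
            continue  # drop duplicate query copies
        visited.add(node)
        if has_object(node):
            return node, messages
        if t == 0:
            continue  # hop budget exhausted for this copy
        neighbors = adj[node]
        for nxt in rng.sample(neighbors, min(k, len(neighbors))):
            frontier.append((nxt, t - 1))
            messages += 1
    return None, messages
```

The overhead/response-time trade-off the paper studies shows up directly here: larger k finds objects in fewer hops but sends more messages.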
Walking in Facebook: A Case Study of Unbiased Sampling of OSNs
 in Proc. IEEE INFOCOM
, 2010
Cited by 48 (11 self)
With more than 250 million active users [1], Facebook (FB) is currently one of the most important online social networks. Our goal in this paper is to obtain a representative (unbiased) sample of Facebook users by crawling its social graph. In this quest, we consider and implement several candidate techniques. Two approaches that are found to perform well are the Metropolis-Hastings random walk (MHRW) and a re-weighted random walk (RWRW). Both have pros and cons, which we demonstrate through a comparison to each other as well as to the "ground truth" (UNI, obtained through true uniform sampling of FB userIDs). In contrast, the traditional Breadth-First Search (BFS) and Random Walk (RW) perform quite poorly, producing substantially biased results. In addition to offline performance assessment, we introduce formal online convergence diagnostics to assess sample quality during the data collection process. We show how these can be used to effectively determine when a random walk sample is of adequate size and quality for subsequent use (i.e., when it is safe to cease sampling). Using these methods, we collect the first, to the best of our knowledge, unbiased sample of Facebook. Finally, we use one of our representative datasets, collected through MHRW, to characterize several key properties of Facebook. Index Terms—Measurements, online social networks, Facebook, graph sampling, crawling, bias.
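The MHRW technique the abstract favors is the standard Metropolis-Hastings correction applied to a simple random walk: a uniform neighbor is proposed and accepted with probability min(1, deg(current)/deg(proposed)), which makes the walk's stationary distribution uniform over nodes rather than proportional to degree. A minimal sketch (not the authors' crawler; the function and variable names are assumptions):

```python
import random

def mhrw(adj, start, steps, rng):
    """Metropolis-Hastings random walk with uniform stationary distribution.

    adj -- dict mapping node -> list of neighbors (undirected graph)
    Proposes a uniform neighbor w of the current node v and accepts the move
    with probability min(1, deg(v)/deg(w)); on rejection the walk stays at v.
    Staying put is what cancels the usual bias towards high-degree nodes.
    """
    v = start
    samples = []
    for _ in range(steps):
        w = rng.choice(adj[v])
        if rng.random() <= len(adj[v]) / len(adj[w]):
            v = w  # accept: moving to a lower-degree node is always accepted
        samples.append(v)
    return samples
```

On a star graph, for example, a plain random walk visits the hub half the time, while MHRW visits each of the five nodes about equally often, matching the bias correction the paper relies on.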
On Unbiased Sampling for Unstructured Peer-to-Peer Networks
 in Proc. ACM IMC
, 2006
Cited by 47 (6 self)
This paper addresses the difficult problem of selecting representative samples of peer properties (e.g., degree, link bandwidth, number of files shared) in unstructured peer-to-peer systems. Due to the large size and dynamic nature of these systems, measuring the quantities of interest on every peer is often prohibitively expensive, while sampling provides a natural means for estimating system-wide behavior efficiently. However, commonly-used sampling techniques for measuring peer-to-peer systems tend to introduce considerable bias for two reasons. First, the dynamic nature of peers can bias results towards short-lived peers, much as naively sampling flows in a router can lead to bias towards short-lived flows. Second, the heterogeneous nature of the overlay topology can lead to bias towards high-degree peers. We present a detailed examination of the ways that the behavior of peer-to-peer systems can introduce bias and suggest the Metropolized Random Walk with Backtracking (MRWB) as a viable and promising technique for collecting nearly unbiased samples. We conduct an extensive simulation study to demonstrate that the proposed technique works well for a wide variety of common peer-to-peer network conditions. Using the Gnutella network, we empirically show that our implementation of the MRWB technique yields more accurate samples than relying on commonly-used sampling techniques. Furthermore, we provide insights into the causes of the observed differences. The tool we have developed, ionsampler, selects peer addresses uniformly at random using the MRWB technique. These addresses may then be used as input to another measurement tool to collect data on a particular property.
Optimal and scalable distribution of content updates over a mobile social network
 In Proc. IEEE INFOCOM
, 2009
"... Number: CRPRL2008080001 ..."
Distributed Approaches to Triangulation and Embedding
 In Proceedings of the 16th ACM-SIAM Symposium on Discrete Algorithms (SODA)
, 2005
Cited by 30 (6 self)
A number of recent papers in the networking community study the distance matrix defined by the node-to-node latencies in the Internet and, in particular, provide a number of quite successful distributed approaches that embed this distance into a low-dimensional Euclidean space. In such algorithms it is feasible to measure distances among only a linear or near-linear number of node pairs; the rest of the distances are simply not available. Moreover, for applications it is desirable to spread the load evenly among the participating nodes. Indeed, several recent studies use this 'fully distributed' approach and achieve, empirically, a low distortion for all but a small fraction of node pairs. This is concurrent with the large body of theoretical work on metric embeddings, but there is a fundamental distinction: in the theoretical approaches to metric embeddings, full and centralized access to the distance matrix is assumed and heavily used. In this paper we present the first fully distributed embedding algorithm with provable distortion guarantees for doubling metrics (which have been proposed as a reasonable abstraction of Internet latencies), thus providing some insight into the empirical success of the recent Vivaldi algorithm [7]. The main ingredient of our embedding algorithm is an improved fully distributed algorithm for a more basic problem of triangulation, where the triangle inequality is used to infer the distances that have not been measured; this problem received considerable attention in the networking community, and has also been studied theoretically in [19]. We use our techniques to extend ε-relaxed embeddings and triangulations to infinite metrics and arbitrary measures, and to improve on the approximate distance labeling scheme of Talwar [36].
Brahms: Byzantine Resilient Random Membership Sampling
, 2008
Cited by 27 (2 self)
We present Brahms, an algorithm for sampling random nodes in a large dynamic system prone to malicious behavior. Brahms stores small membership views at each node, and yet overcomes Byzantine attacks by a linear portion of the system. Brahms is composed of two components. The first one is a resilient gossip-based membership protocol. The second one uses a novel memory-efficient approach for uniform sampling from a possibly biased stream of ids that traverse the node. We evaluate Brahms using rigorous analysis, backed by simulations, which show that our theoretical model captures the protocol's essentials. We study two representative attacks, and show that with high probability, an attacker cannot create a partition between correct nodes. We further prove that each node's sample converges to a uniform one over time. To our knowledge, no such properties were proven for gossip protocols in the past.
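The memory-efficient sampling component mentioned above can be illustrated with a min-wise (min-hash) sampler: each id is assigned a pseudorandom priority by a hash function, and the node retains only the id with the smallest priority seen so far. Since the priority ignores how often an id repeats, the retained id is uniform over the distinct ids in the stream, however biased the stream is. This sketch is an illustration of the min-wise idea under our own naming assumptions, not Brahms's actual code:

```python
import hashlib

class MinHashSampler:
    """Constant-memory uniform sampler over the distinct ids of a stream.

    Each id gets a fixed pseudorandom priority via a salted hash; only the
    id with the smallest priority is kept. An adversary repeating its own
    id many times does not change that id's (single) priority, so repetition
    cannot bias the sample.
    """
    def __init__(self, salt=b"seed-1"):
        self.salt = salt           # per-node salt: different nodes draw different samples
        self.best_id = None
        self.best_priority = None

    def _priority(self, node_id):
        return hashlib.sha256(self.salt + node_id.encode()).digest()

    def observe(self, node_id):
        p = self._priority(node_id)
        if self.best_priority is None or p < self.best_priority:
            self.best_id, self.best_priority = node_id, p

    def sample(self):
        return self.best_id
```

The key property, which the test below checks, is that a heavily skewed stream and its de-duplicated version yield the same sample.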
RaWMS - Random Walk based Lightweight Membership Service for Wireless Ad Hoc Networks
, 2008
Cited by 27 (7 self)
This paper presents RaWMS, a novel lightweight random membership service for ad hoc networks. The service provides each node with a partial, uniformly chosen view of network nodes. Such a membership service is useful, e.g., in data dissemination algorithms, lookup and discovery services, peer sampling services, and complete membership construction. The design of RaWMS is based on a novel reverse random walk (RW) sampling technique. The paper includes a formal analysis of both the reverse RW sampling technique and RaWMS and verifies it through a detailed simulation study. In addition, RaWMS is compared both analytically and by simulations with a number of other known methods such as flooding and gossip-based techniques.
On the cover time and mixing time of random geometric graphs
 Theor. Comput. Sci
, 2007
Cited by 26 (2 self)
The cover time and mixing time of graphs have much relevance to algorithmic applications and have been extensively investigated. Recently, with the advent of ad-hoc and sensor networks, an interesting class of random graphs, namely random geometric graphs, has gained new relevance and its properties have been the subject of much study. A random geometric graph G(n, r) is obtained by placing n points uniformly at random on the unit square and connecting two points iff their Euclidean distance is at most r. The phase transition behavior with respect to the radius r of such graphs has been of special interest. We show that there exists a critical radius r_opt such that for any r ≥ r_opt, G(n, r) has optimal cover time of Θ(n log n) with high probability, and, importantly, r_opt = Θ(r_con), where r_con denotes the critical radius guaranteeing asymptotic connectivity. Moreover, since a disconnected graph has infinite cover time, there is a phase transition and the corresponding threshold width is O(r_con). On the other hand, the radius required for rapid mixing is r_rapid = ω(r_con), and, in particular, r_rapid = Θ(1/poly(log n)). We obtain our results by giving a tight bound on the electrical resistance and conductance of G(n, r) via certain constructed flows.
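The G(n, r) model is defined concretely in the abstract, so its construction is easy to sketch: n points uniform in the unit square, an edge iff the Euclidean distance is at most r. A minimal generator (function names are ours, not from the paper):

```python
import math
import random

def random_geometric_graph(n, r, rng):
    """Sample G(n, r): n points uniform on the unit square, edges between
    pairs at Euclidean distance <= r. Returns (points, adjacency dict)."""
    points = [(rng.random(), rng.random()) for _ in range(n)]
    adj = {i: [] for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) <= r:
                adj[i].append(j)
                adj[j].append(i)
    return points, adj
```

The two extremes make the radius-driven phase transition tangible: r ≥ √2 always yields the complete graph, while r = 0 yields an empty one; the interesting thresholds r_con, r_opt, and r_rapid sit in between.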