Results 1 - 10 of 226
Novel Architectures for P2P Applications: the Continuous-Discrete Approach
- ACM TRANSACTIONS ON ALGORITHMS
, 2007
"... We propose a new approach for constructing P2P networks based on a dynamic decomposition of a continuous space into cells corresponding to processors. We demonstrate the power of these design rules by suggesting two new architectures, one for DHT (Distributed Hash Table) and the other for dynamic ex ..."
Abstract
-
Cited by 166 (8 self)
- Add to MetaCart
(Show Context)
We propose a new approach for constructing P2P networks based on a dynamic decomposition of a continuous space into cells corresponding to processors. We demonstrate the power of these design rules by suggesting two new architectures, one for DHT (Distributed Hash Table) and the other for dynamic expander networks. The DHT network, which we call Distance Halving, allows logarithmic routing and load while preserving constant degree. Our second construction builds a network that is guaranteed to be an expander. The resulting topologies are simple to maintain and implement. Their simplicity makes it easy to modify and add protocols. We show it is possible to reduce the dilation and the load of the DHT with a small increase of the degree. We present a provably good protocol for relieving hot spots and a construction with high fault tolerance. Finally, we show that, using our approach, it is possible to construct any family of constant-degree graphs in a dynamic environment, though with worse parameters. Therefore we expect that more distributed data structures could be designed and implemented in a dynamic environment.
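The Distance Halving construction lends itself to a compact illustration. Below is a minimal Python sketch of the continuous-space dynamics behind it, assuming the standard description of the graph (edges l(x) = x/2 and r(x) = (x+1)/2 on [0, 1)); the function names are illustrative, not the paper's notation.

```python
import random

# Continuous-space sketch of the Distance Halving idea: every point x in
# [0, 1) has a "left" edge to x/2 and a "right" edge to (x + 1)/2.
# Applying the SAME map to two points halves the distance between them,
# which is what yields logarithmic routing once the space is partitioned
# into cells owned by processors.

def left(x: float) -> float:
    return x / 2.0

def right(x: float) -> float:
    return (x + 1.0) / 2.0

def route(x: float, y: float, cell_size: float = 1e-3) -> int:
    """Count halving steps until the images of x and y fall within one
    cell; each step contracts |x - y| by exactly 1/2, so the count is
    about log2(|x - y| / cell_size)."""
    steps = 0
    while abs(x - y) > cell_size:
        f = random.choice((left, right))  # any bit sequence contracts
        x, y = f(x), f(y)
        steps += 1
    return steps

print(route(0.1, 0.9))  # ~ log2(0.8 / 1e-3), i.e., about 10 steps
```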
Transport Layer Identification of P2P Traffic
, 2004
"... Since the emergence of peer-to-peer (P2P) networking in the late '90s, P2P applications have multiplied, evolved and established themselves as the leading `growth app' of Internet traffic workload. In contrast to first-generation P2P networks which used well-defined port numbers, current P ..."
Abstract
-
Cited by 124 (2 self)
- Add to MetaCart
Since the emergence of peer-to-peer (P2P) networking in the late '90s, P2P applications have multiplied, evolved, and established themselves as the leading 'growth app' of Internet traffic workload. In contrast to first-generation P2P networks, which used well-defined port numbers, current P2P applications can disguise their existence through the use of arbitrary ports. As a result, reliable estimates of P2P traffic require examination of packet payload, a methodological landmine from legal, privacy, technical, logistic, and fiscal perspectives. Indeed, access to user payload is often rendered impossible by one of these factors, inhibiting trustworthy estimation of P2P traffic growth and dynamics. In this paper, we develop a systematic methodology to identify P2P flows at the transport layer, i.e., based on connection patterns of P2P networks and without relying on packet payload. We believe our approach is the first method for characterizing P2P traffic using only knowledge of network dynamics rather than any user payload. To evaluate our methodology, we also develop a payload technique for P2P traffic identification by reverse engineering and analyzing the nine most popular P2P protocols, and we demonstrate its efficacy with the discovery of P2P protocols in our traces that were previously unknown to us. Finally, our results indicate that P2P traffic continues to grow unabated, contrary to reports in the popular media.
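Two connection-pattern heuristics commonly attributed to this methodology are easy to sketch: host pairs that communicate over both TCP and UDP, and listening {IP, port} pairs for which the number of distinct connecting IPs roughly equals the number of distinct connecting ports. The flow-record format and the numeric thresholds below are assumptions of this sketch, not the paper's exact rules.

```python
from collections import defaultdict

def flag_p2p_hosts(flows):
    """flows: iterable of (src_ip, src_port, dst_ip, dst_port, proto).

    Heuristic 1: an IP pair using both TCP and UDP suggests P2P
    (concurrent control and data channels).
    Heuristic 2: for a listening {IP, port}, #distinct peer IPs close to
    #distinct peer ports is a P2P signature; client-server services
    typically see many more source ports than source IPs.
    """
    protos = defaultdict(set)                    # (src, dst) -> protocols
    peers = defaultdict(lambda: (set(), set()))  # (dst, dport) -> (ips, ports)
    for sip, sport, dip, dport, proto in flows:
        protos[(sip, dip)].add(proto)
        ips, ports = peers[(dip, dport)]
        ips.add(sip)
        ports.add(sport)

    suspects = set()
    for (sip, dip), seen in protos.items():
        if {"tcp", "udp"} <= seen:
            suspects |= {sip, dip}
    for (dip, _), (ips, ports) in peers.items():
        if len(ips) >= 10 and abs(len(ips) - len(ports)) <= len(ips) // 10:
            suspects.add(dip)
    return suspects
```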
Walking in Facebook: A Case Study of Unbiased Sampling of OSNs
- IN PROC. IEEE INFOCOM
, 2010
"... With more than 250 million active users [1], Facebook (FB) is currently one of the most important online social networks. Our goal in this paper is to obtain a representative (unbiased) sample of Facebook users by crawling its social graph. In this quest, we consider and implement several candidate ..."
Abstract
-
Cited by 114 (13 self)
- Add to MetaCart
With more than 250 million active users [1], Facebook (FB) is currently one of the most important online social networks. Our goal in this paper is to obtain a representative (unbiased) sample of Facebook users by crawling its social graph. In this quest, we consider and implement several candidate techniques. Two approaches that are found to perform well are the Metropolis-Hastings random walk (MHRW) and a re-weighted random walk (RWRW). Both have pros and cons, which we demonstrate through a comparison to each other as well as to the "ground truth" (UNI, obtained through true uniform sampling of FB userIDs). In contrast, the traditional Breadth-First-Search (BFS) and Random Walk (RW) perform quite poorly, producing substantially biased results. In addition to offline performance assessment, we introduce online formal convergence diagnostics to assess sample quality during the data collection process. We show how these can be used to effectively determine when a random walk sample is of adequate size and quality for subsequent use (i.e., when it is safe to cease sampling). Using these methods, we collect the first, to the best of our knowledge, unbiased sample of Facebook. Finally, we use one of our representative datasets, collected through MHRW, to characterize several key properties of Facebook.
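The MHRW at the heart of the paper is short enough to sketch. The following is a minimal version, assuming a `neighbors(v)` interface (e.g., a crawler returning a user's friend list); it is an illustration of the technique, not the authors' crawler code.

```python
import random

def mhrw_sample(start, neighbors, num_steps):
    """Metropolis-Hastings random walk whose stationary distribution is
    uniform over nodes rather than proportional to degree."""
    v = start
    deg_v = len(neighbors(v))
    samples = []
    for _ in range(num_steps):
        w = random.choice(neighbors(v))
        deg_w = len(neighbors(w))
        # Move with probability min(1, deg(v)/deg(w)); otherwise stay.
        # This acceptance rule exactly cancels the bias toward
        # high-degree users that a plain random walk would have.
        if random.random() <= deg_v / deg_w:
            v, deg_v = w, deg_w
        samples.append(v)
    return samples
```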
Hybrid search schemes for unstructured peer-to-peer networks
- In Proceedings of IEEE INFOCOM
, 2005
"... Abstract — We study hybrid search schemes for unstructured peer-to-peer networks. We quantify performance in terms of number of hits, network overhead, and response time. Our schemes combine flooding and random walks, look ahead and replication. We consider both regular topologies and topologies wit ..."
Abstract
-
Cited by 100 (2 self)
- Add to MetaCart
(Show Context)
We study hybrid search schemes for unstructured peer-to-peer networks. We quantify performance in terms of number of hits, network overhead, and response time. Our schemes combine flooding, random walks, look-ahead, and replication. We consider both regular topologies and topologies with supernodes. We introduce a general search scheme, of which flooding and random walks are special instances, and show how to use locally maintained network information to improve the performance of searching. Our main findings are: (a) A small number of supernodes in an otherwise regular topology can offer sharp savings in the performance of search, both in the case of search by flooding and search by random walk, particularly when combined with 1-step replication. We quantify, analytically and experimentally, that the reason for these savings is that the search is biased towards nodes that yield more information. (b) There is a generalization of search, of which flooding and random walk are special instances, which may take further advantage of locally maintained network information and yield better performance than both flooding and random walk in clustered topologies. The method determines edge criticality and is reminiscent of fundamental heuristics from the area of approximation algorithms.
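As an illustration of the flooding-plus-random-walk combination, here is a hedged sketch: flood for a bounded number of hops, then launch independent walkers that use 1-step look-ahead into neighbors' indices. The graph and index interfaces are assumptions of this sketch; the paper's general scheme is parameterized more finely.

```python
import random

def hybrid_search(graph, index, start, target, flood_ttl=2,
                  walkers=4, walk_len=50):
    """graph: node -> list of neighbors; index: node -> set of items."""
    # Phase 1: bounded flooding around the querying node.
    frontier, seen = {start}, {start}
    for _ in range(flood_ttl):
        frontier = {w for v in frontier for w in graph[v]} - seen
        seen |= frontier
    if any(target in index[v] for v in seen):
        return True
    # Phase 2: random walks with 1-step look-ahead from the frontier.
    starts = random.sample(sorted(frontier), min(walkers, len(frontier)))
    for v in starts:
        for _ in range(walk_len):
            if target in index[v]:
                return True
            # Look-ahead: jump straight to a neighbor holding the target.
            hits = [w for w in graph[v] if target in index[w]]
            v = hits[0] if hits else random.choice(graph[v])
    return False
```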
On Unbiased Sampling for Unstructured Peer-to-Peer Networks
- In Proc. ACM IMC
, 2006
"... This paper addresses the difficult problem of selecting representative samples of peer properties (e.g., degree, link bandwidth, number of files shared) in unstructured peer-to-peer systems. Due to the large size and dynamic nature of these systems, measuring the quantities of interest on every peer ..."
Abstract
-
Cited by 81 (8 self)
- Add to MetaCart
(Show Context)
This paper addresses the difficult problem of selecting representative samples of peer properties (e.g., degree, link bandwidth, number of files shared) in unstructured peer-to-peer systems. Due to the large size and dynamic nature of these systems, measuring the quantities of interest on every peer is often prohibitively expensive, while sampling provides a natural means for estimating system-wide behavior efficiently. However, commonly-used sampling techniques for measuring peer-to-peer systems tend to introduce considerable bias for two reasons. First, the dynamic nature of peers can bias results towards short-lived peers, much as naively sampling flows in a router can lead to bias towards short-lived flows. Second, the heterogeneous nature of the overlay topology can lead to bias towards high-degree peers. We present a detailed examination of the ways that the behavior of peer-to-peer systems can introduce bias and suggest the Metropolized Random Walk with Backtracking (MRWB) as a viable and promising technique for collecting nearly unbiased samples. We conduct an extensive simulation study to demonstrate that the proposed technique works well for a wide variety of common peer-to-peer network conditions. Using the Gnutella network, we empirically show that our implementation of the MRWB technique yields more accurate samples than relying on commonly-used sampling techniques. Furthermore, we provide insights into the causes of the observed differences. The tool we have developed, ion-sampler, selects peer addresses uniformly at random using the MRWB technique. These addresses may then be used as input to another measurement tool to collect data on a particular property.
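A minimal sketch of the Metropolized Random Walk with Backtracking idea follows, assuming a `probe(peer)` call that returns a peer's neighbor list, or None if the peer has departed (churn is exactly why backtracking is needed). This conveys the general idea, not the ion-sampler implementation.

```python
import random

def mrwb(start, probe, num_steps):
    """One walk; run many independent walks to collect many samples."""
    history = [start]
    nbrs = {start: probe(start)}
    v = start
    for _ in range(num_steps):
        w = random.choice(nbrs[v])
        w_nbrs = nbrs.get(w)
        if w_nbrs is None:
            w_nbrs = probe(w)
        if w_nbrs is None:
            # Peer departed: backtrack to the previous responsive peer
            # instead of restarting (restarts would reintroduce bias).
            if len(history) > 1:
                history.pop()
                v = history[-1]
            continue
        nbrs[w] = w_nbrs
        # Metropolis rule for a uniform stationary distribution.
        if random.random() <= len(nbrs[v]) / len(w_nbrs):
            v = w
            history.append(v)
    return v  # final position is nearly uniform after enough steps
```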
Optimal and scalable distribution of content updates over a mobile social network
- In Proc. IEEE INFOCOM
, 2009
"... Number: CR-PRL-2008-08-0001 ..."
Brahms: Byzantine Resilient Random Membership Sampling
, 2008
"... We present Brahms, an algorithm for sampling random nodes in a large dynamic system prone to malicious behavior. Brahms stores small membership views at each node, and yet overcomes Byzantine attacks by a linear portion of the system. Brahms is composed of two components. The first one is a resilien ..."
Abstract
-
Cited by 47 (2 self)
- Add to MetaCart
(Show Context)
We present Brahms, an algorithm for sampling random nodes in a large dynamic system prone to malicious behavior. Brahms stores small membership views at each node and yet overcomes Byzantine attacks by a linear portion of the system. Brahms is composed of two components. The first is a resilient gossip-based membership protocol. The second uses a novel memory-efficient approach for uniform sampling from a possibly biased stream of IDs that traverse the node. We evaluate Brahms using rigorous analysis, backed by simulations, which show that our theoretical model captures the protocol's essentials. We study two representative attacks and show that with high probability, an attacker cannot create a partition between correct nodes. We further prove that each node's sample converges to a uniform one over time. To our knowledge, no such properties were proven for gossip protocols in the past.
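The memory-efficient sampling component admits a very short sketch: each sampler keeps the ID that minimizes a fixed random hash. Because an ID's hash never changes, repeated (biased) occurrences in the gossip stream do not matter, and the kept ID converges over time to a uniform choice among all IDs ever seen. The per-sampler salt below is an implementation assumption of this sketch.

```python
import hashlib

class MinHashSampler:
    """One sampler; a node runs an array of these with independent seeds."""

    def __init__(self, seed: bytes):
        self.seed = seed
        self.best = None

    def _h(self, node_id: bytes) -> bytes:
        # Fixed random hash: same ID always hashes to the same value.
        return hashlib.sha256(self.seed + node_id).digest()

    def next(self, node_id: bytes):
        # Keep whichever ID minimizes the hash, regardless of how often
        # or how maliciously an ID is pushed into the stream.
        if self.best is None or self._h(node_id) < self._h(self.best):
            self.best = node_id

    def sample(self):
        return self.best
```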
Many Random Walks Are Faster Than One
"... We pose a new and intriguing question motivated by distributed computing regarding random walks on graphs: How long does it take for several independent random walks, starting from the same vertex, to cover an entire graph? We study the cover time–the expected time required to visit every node in a ..."
Abstract
-
Cited by 46 (3 self)
- Add to MetaCart
(Show Context)
We pose a new and intriguing question motivated by distributed computing regarding random walks on graphs: how long does it take for several independent random walks, starting from the same vertex, to cover an entire graph? We study the cover time, the expected time required to visit every node in a graph at least once, and we show that for a large collection of interesting graphs, running many random walks in parallel yields a speed-up in the cover time that is linear in the number of parallel walks. We demonstrate that an exponential speed-up is sometimes possible, but that some natural graphs allow only a logarithmic speed-up. A problem related to ours (in which the walks start from some probability distribution on vertices) was previously studied in the context of space-efficient algorithms for undirected s-t-connectivity, and our results yield, in certain cases, an improvement upon some of the earlier bounds.
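The question is easy to experiment with. Below is a hedged simulation sketch, assuming an adjacency-list graph; it estimates the cover time with k parallel walkers so the speed-up the paper analyzes can be observed directly.

```python
import random

def cover_time(graph, start, k, trials=100):
    """Average steps for k parallel walkers (all starting at `start`)
    to visit every node of `graph` (node -> list of neighbors)."""
    total = 0
    for _ in range(trials):
        walkers = [start] * k
        visited = {start}
        steps = 0
        while len(visited) < len(graph):
            steps += 1
            walkers = [random.choice(graph[v]) for v in walkers]
            visited.update(walkers)
        total += steps
    return total / trials

# On an expander-like graph the speed-up is roughly linear in k; on a
# cycle it is only logarithmic, matching the paper's point that the
# achievable gain depends on the graph family.
```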
BotGrep: Finding P2P Bots with Structured Graph Analysis
"... A key feature that distinguishes modern botnets from earlier counterparts is their increasing use of structured overlay topologies. This lets them carry out sophisticated coordinated activities while being resilient to churn, but it can also be used as a point of detection. In this work, we devise t ..."
Abstract
-
Cited by 42 (3 self)
- Add to MetaCart
(Show Context)
A key feature that distinguishes modern botnets from earlier counterparts is their increasing use of structured overlay topologies. This lets them carry out sophisticated coordinated activities while being resilient to churn, but it can also be used as a point of detection. In this work, we devise techniques to localize botnet members based on the unique communication patterns arising from their overlay topologies used for command and control. Experimental results on synthetic topologies embedded within Internet traffic traces from an ISP's backbone network indicate that our techniques (i) can localize the majority of bots with a low false positive rate, and (ii) are resilient to incomplete visibility arising from partial deployment of monitoring systems and measurement inaccuracies from dynamics of background traffic.
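The detection hinges on structured P2P overlays being fast-mixing: a short random walk started inside one spreads out close to the stationary distribution, while walks in typical Internet communication graphs stay concentrated. The scoring rule below is an illustrative prefilter in that spirit, not BotGrep's actual multi-stage graph-partitioning pipeline.

```python
import numpy as np

def short_walk_scores(P, t=3):
    """P: row-stochastic transition matrix of the communication graph.

    Row i of P^t is the distribution of a t-step random walk from node
    i. A small maximum entry means the walk has already spread out,
    i.e., node i likely sits in a fast-mixing (P2P-like) region.
    """
    Pt = np.linalg.matrix_power(P, t)
    return Pt.max(axis=1)

# Usage sketch: nodes with the smallest scores are candidates for the
# fast-mixing subgraph and would be handed to finer-grained clustering.
# scores = short_walk_scores(P); candidates = np.argsort(scores)[:k]
```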
On the cover time and mixing time of random geometric graphs
- Theor. Comput. Sci
, 2007
"... The cover time and mixing time of graphs has much relevance to algorithmic appli-cations and has been extensively investigated. Recently, with the advent of ad-hoc and sensor networks, an interesting class of random graphs, namely random geo-metric graphs, has gained new relevance and its properties ..."
Abstract
-
Cited by 41 (2 self)
- Add to MetaCart
(Show Context)
The cover time and mixing time of graphs have much relevance to algorithmic applications and have been extensively investigated. Recently, with the advent of ad-hoc and sensor networks, an interesting class of random graphs, namely random geometric graphs, has gained new relevance and its properties have been the subject of much study. A random geometric graph G(n, r) is obtained by placing n points uniformly at random on the unit square and connecting two points iff their Euclidean distance is at most r. The phase transition behavior with respect to the radius r of such graphs has been of special interest. We show that there exists a critical radius r_opt such that for any r ≥ r_opt, G(n, r) has optimal cover time of Θ(n log n) with high probability, and, importantly, r_opt = Θ(r_con), where r_con denotes the critical radius guaranteeing asymptotic connectivity. Moreover, since a disconnected graph has infinite cover time, there is a phase transition and the corresponding threshold width is O(r_con). On the other hand, the radius required for rapid mixing is r_rapid = ω(r_con), and, in particular, r_rapid = Θ(1/poly(log n)). We are able to draw our results by giving a tight bound on the electrical resistance and conductance of G(n, r) via certain constructed flows.
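The connectivity phase transition is easy to observe empirically. Here is a minimal sketch that samples G(n, r) on the unit square and checks connectivity on either side of the standard threshold r_con ≈ sqrt(ln n / (π n)); the exact constant is an assumption of this sketch, since the paper works with the asymptotic threshold.

```python
import math
import random

def rgg(n, r):
    """Sample G(n, r): n uniform points on the unit square, edges
    between pairs at Euclidean distance at most r."""
    pts = [(random.random(), random.random()) for _ in range(n)]
    adj = [[] for _ in range(n)]
    r2 = r * r
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = pts[i][0] - pts[j][0], pts[i][1] - pts[j][1]
            if dx * dx + dy * dy <= r2:
                adj[i].append(j)
                adj[j].append(i)
    return adj

def connected(adj):
    seen, stack = {0}, [0]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(adj)

n = 2000
r_con = math.sqrt(math.log(n) / (math.pi * n))
# Usually disconnected below the threshold, connected above it.
print(connected(rgg(n, 0.5 * r_con)), connected(rgg(n, 2.0 * r_con)))
```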