Results 1 - 10 of 16
Towards understanding modern web traffic (extended abstract)
In Proc. ACM SIGMETRICS, 2011
"... As the nature of Web traffic evolves over time, we must update our understanding of underlying nature of today’s Web, which is necessary to improve response time, understand caching effectiveness, and to design intermediary systems, such as firewalls, security analyzers, and reporting or management ..."
Cited by 42 (4 self)
Abstract: As the nature of Web traffic evolves over time, we must update our understanding of the underlying nature of today’s Web, which is necessary to improve response time, to understand caching effectiveness, and to design intermediary systems such as firewalls, security analyzers, and reporting or management systems. In this paper, we analyze five years (2006-2010) of real Web traffic from a globally-distributed proxy system, which captures the browsing behavior of over 70,000 daily users from 187 countries. Using this data set, we examine major changes in Web traffic characteristics during this period, and also investigate the redundancy of this traffic, using both traditional object-level caching as well as content-based approaches.
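The contrast between object-level caching and the content-based approaches examined here is easy to see in code. The following is a minimal sketch of content-defined chunking, the usual building block of content-based redundancy elimination; the window size, boundary mask, rolling hash, and test data are illustrative assumptions, not the paper's parameters.

```python
import hashlib
import random

# Illustrative parameters (assumptions, not the paper's settings).
WINDOW = 48           # bytes covered by the rolling hash
MASK = (1 << 13) - 1  # boundary test => average chunk size ~8 KB
B, M = 257, (1 << 61) - 1

def chunk_boundaries(data: bytes):
    """Yield chunk end offsets wherever the rolling hash matches MASK."""
    h, bw = 0, pow(B, WINDOW, M)
    for i, byte in enumerate(data):
        h = (h * B + byte) % M
        if i >= WINDOW:  # drop the byte leaving the window
            h = (h - data[i - WINDOW] * bw) % M
        if i + 1 >= WINDOW and (h & MASK) == MASK:
            yield i + 1
    yield len(data)

def redundant_fraction(objects):
    """Fraction of bytes whose chunk was seen before, within or across objects."""
    seen, dup, total = set(), 0, 0
    for data in objects:
        start = 0
        for end in chunk_boundaries(data):
            if end <= start:
                continue
            digest = hashlib.sha1(data[start:end]).digest()
            if digest in seen:
                dup += end - start
            seen.add(digest)
            total += end - start
            start = end
    return dup / total if total else 0.0

# Two pages sharing a large common body but with different tails: object-level
# caching sees two distinct objects, while chunk-level caching sees mostly repeats.
shared = random.Random(0).randbytes(50_000)
page1 = shared + b"unique tail 1"
page2 = shared + b"unique tail 2"
print(f"chunk-level redundancy: {redundant_fraction([page1, page2]):.0%}")
```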
ExperimenTor: A Testbed for Safe and Realistic Tor Experimentation
"... Tor is one of the most widely-used privacy enhancing technologies for achieving online anonymity and resisting censorship. Simultaneously, Tor is also an evolving research network in which investigators perform experiments to improve the network’s resilience to attacks and enhance its performance. E ..."
Cited by 21 (8 self)
Abstract: Tor is one of the most widely-used privacy-enhancing technologies for achieving online anonymity and resisting censorship. Simultaneously, Tor is also an evolving research network in which investigators perform experiments to improve the network’s resilience to attacks and enhance its performance. Existing methods for studying Tor have included analytical modeling, simulations, small-scale network emulations, small-scale PlanetLab deployments, and measurement and analysis of the live Tor network. Despite the growing body of work concerning Tor, there is no widely accepted methodology for conducting Tor research in a manner that preserves realism while protecting live users’ privacy. In an effort to propose a standard, rigorous experimental framework for conducting Tor research in a way that ensures safety and realism, we present the design of ExperimenTor, a large-scale Tor network emulation toolkit and testbed. We report our early experiences with prototype ExperimenTor testbeds deployed at three research institutions.
ANDaNA: Anonymous named data networking application
In NDSS, 2011
"... Content-centric networking — also known as information-centric networking (ICN) — shifts empha-sis from hosts and interfaces (as in today’s Internet) to data. Named data becomes addressable and routable, while locations that currently store that data become ir-relevant to applications. Named Data N ..."
Cited by 20 (9 self)
Abstract: Content-centric networking, also known as information-centric networking (ICN), shifts emphasis from hosts and interfaces (as in today’s Internet) to data. Named data becomes addressable and routable, while locations that currently store that data become irrelevant to applications. Named Data Networking (NDN) is a large collaborative research effort that exemplifies the content-centric approach to networking. NDN has some innate privacy-friendly features, such as the lack of source and destination addresses on packets. However, as discussed in this paper, the NDN architecture prompts some privacy concerns mainly stemming from the semantic richness of names. We examine privacy-relevant characteristics of NDN and present an initial attempt to achieve communication privacy. Specifically, we design an NDN add-on tool, called ANDaNA, that borrows a number of features from Tor. As we demonstrate via experiments, it provides comparable anonymity with lower relative overhead.
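As a rough illustration of what borrowing features from Tor means here, the sketch below layers encryption over an NDN name for a two-hop circuit of anonymizing routers, matching the circuit length in ANDaNA's design. Fernet symmetric encryption stands in for the hybrid public-key scheme the actual tool uses, and the name and keys are invented.

```python
from cryptography.fernet import Fernet

def wrap(name: str, hop_keys):
    """Encrypt `name` once per hop, innermost layer first (onion style)."""
    blob = name.encode()
    for key in reversed(hop_keys):   # the last hop's layer goes on first
        blob = Fernet(key).encrypt(blob)
    return blob

def unwrap_one(blob: bytes, key: bytes) -> bytes:
    """Each anonymizing router strips exactly one layer."""
    return Fernet(key).decrypt(blob)

if __name__ == "__main__":
    k1, k2 = Fernet.generate_key(), Fernet.generate_key()
    onion = wrap("/example/videos/talk.mp4", [k1, k2])  # hypothetical name
    at_hop2 = unwrap_one(onion, k1)   # first router removes the outer layer
    plain = unwrap_one(at_hop2, k2)   # second router recovers the real name
    assert plain == b"/example/videos/talk.mp4"
```

Neither router alone links the consumer to the name: the first sees the consumer but only ciphertext, and the second sees the name but not the consumer.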
Quantifying the Benefits of Joint Content and Network Routing
"... Online service providers aim to provide good performance for an increasingly diverse set of applications and services. One of the most effective ways to improve service performance is to replicate the service closer to the end users. Replication alone, however, has its limits: while operators can re ..."
Cited by 2 (0 self)
Abstract: Online service providers aim to provide good performance for an increasingly diverse set of applications and services. One of the most effective ways to improve service performance is to replicate the service closer to the end users. Replication alone, however, has its limits: while operators can replicate static content, wide-scale replication of dynamic content is not always feasible or cost-effective. To improve the latency of such services, many operators turn to Internet traffic engineering. In this paper, we study the benefits of performing replica-to-end-user mappings in conjunction with active Internet traffic engineering. We present the design of PECAN, a system that controls both the selection of replicas (“content routing”) and the routes between the clients and their associated replicas (“network routing”). We emulate a replicated service that can perform both content and network routing by deploying PECAN on a distributed testbed. In our testbed, we see that jointly performing content and network routing can reduce round-trip latency by 4.3% on average over performing content routing alone (potentially reducing service response times by tens of milliseconds or more), and that most of these gains can be realized with no more than five alternate routes at each replica.
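A toy calculation shows why joint selection can beat content routing alone: the replica that is closest over its default route need not be the global minimum once alternate routes are considered. The RTT matrix below is invented for illustration and has no connection to the paper's measurements.

```python
# rtt_ms[replica][route]; route 0 is the default (BGP-chosen) path.
rtt_ms = {
    "replica-east": [48, 41, 55],
    "replica-west": [44, 52, 46],
}

# Content routing alone: pick the replica with the best default-route RTT.
content_only = min(rtt_ms, key=lambda r: rtt_ms[r][0])

# Joint content and network routing: pick the best (replica, route) pair.
joint = min(((r, i, v) for r, rtts in rtt_ms.items() for i, v in enumerate(rtts)),
            key=lambda t: t[2])

print(f"content routing only: {content_only} @ {rtt_ms[content_only][0]} ms")
print(f"joint routing       : {joint[0]} via route {joint[1]} @ {joint[2]} ms")
```

Here content routing alone picks replica-west at 44 ms, while the joint choice reaches replica-east over an alternate route at 41 ms.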
Understanding and Improving Modern Web Traffic Caching
2011
"... The WorldWide Web is one of the most popular and important Internet applications, and our daily lives heavily rely on it. Despite its importance, the current Web access is still limited for two reasons: (1) the Web has changed and grown significantly as social networking, video streaming, and file h ..."
Cited by 2 (0 self)
Abstract: The World Wide Web is one of the most popular and important Internet applications, and our daily lives rely heavily on it. Despite its importance, current Web access is still limited for two reasons: (1) the Web has changed and grown significantly as social networking, video streaming, and file hosting sites have become popular, requiring more and more bandwidth; and (2) the need for Web access has also grown, and many users in bandwidth-limited environments, such as people in the developing world or mobile device users, still suffer from poor Web access. There was a burst of research a decade ago aimed at understanding the nature of Web traffic and thus improving Web access, but unfortunately it dropped off just as the Web changed significantly. As a result, we have little understanding of the underlying nature of today’s Web traffic, and thus miss traffic optimization opportunities for improving Web access. To help improve Web access, this dissertation attempts to fill the gap between previous research and today’s Web. For a better understanding of today’s Web traffic, we first analyze five years (2006-2010) of real Web traffic from a globally-distributed proxy system, which captures the browsing behavior of over 70,000 daily users from 187 countries.
Pitfalls in HTTP Traffic Measurements and Analysis
"... Abstract. Being responsible for more than half of the total traffic volume in the Internet, HTTP is a popular subject for traffic analysis. From our experiences with HTTP traffic analysis we identified a number of pitfalls which can render a carefully executed study flawed. Often these pitfalls can ..."
Cited by 1 (0 self)
Abstract: Being responsible for more than half of the total traffic volume in the Internet, HTTP is a popular subject for traffic analysis. From our experience with HTTP traffic analysis, we identified a number of pitfalls which can render a carefully executed study flawed. Often these pitfalls can be avoided easily. Based on passive traffic measurements of 20,000 European residential broadband customers, we quantify the potential error of three issues: non-consideration of persistent or pipelined HTTP requests, mismatches between the Content-Type header field and the actual content, and mismatches between the Content-Length header and the actual transmitted volume. We find that 60% (30%) of all HTTP requests (bytes) are persistent (i.e., not the first in a TCP connection) and 4% are pipelined. Moreover, we observe a Content-Type mismatch for 35% of the total HTTP volume. In terms of Content-Length accuracy, our data shows a factor of at least 3.2 more bytes reported in the HTTP header than actually transferred.
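A minimal sketch of how these checks might look over a pre-parsed trace; the record layout (plain dicts with these keys) is an assumption for illustration, since a real analysis would work from pcaps or proxy logs.

```python
def audit(records):
    """records: one dict per HTTP request/response observed on the wire."""
    total = len(records)
    # Persistent = not the first request in its TCP connection.
    persistent = sum(1 for r in records if r["request_index_in_connection"] > 0)
    # Content-Length header disagrees with bytes actually seen on the wire.
    length_mismatch = sum(
        1 for r in records
        if r.get("content_length") is not None
        and r["content_length"] != r["body_bytes_on_wire"]
    )
    # Content-Type header disagrees with the type sniffed from the body.
    type_mismatch = sum(
        1 for r in records
        if r.get("content_type") and r.get("sniffed_type")
        and r["content_type"].split(";")[0] != r["sniffed_type"]
    )
    return {
        "persistent_fraction": persistent / total if total else 0.0,
        "content_length_mismatches": length_mismatch,
        "content_type_mismatches": type_mismatch,
    }

example = [
    {"request_index_in_connection": 0, "content_length": 512,
     "body_bytes_on_wire": 512, "content_type": "text/html",
     "sniffed_type": "text/html"},
    {"request_index_in_connection": 1, "content_length": 2048,
     "body_bytes_on_wire": 1024, "content_type": "text/plain",
     "sniffed_type": "application/zip"},
]
print(audit(example))
```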
Host Supervisor:
2012
"... Researchers who aim to evaluate proposed modifications to the Internet’s architecture face a unique set of challenges. Internet-based measurements provide limited value to such evaluations, as the quantities being measured are easily lost to ambiguity and idiosyncrasy. While simulations offer more c ..."
Abstract: Researchers who aim to evaluate proposed modifications to the Internet’s architecture face a unique set of challenges. Internet-based measurements provide limited value to such evaluations, as the quantities being measured are easily lost to ambiguity and idiosyncrasy. While simulations offer more control, Internet-like environments are difficult to construct due to the lack of ground truth in critical areas, such as topological structure and traffic patterns. This thesis develops network topology and traffic models for a simulation-based evaluation of the PURSUIT rendezvous system, the name-based interdomain routing mechanism of an information-centric future Internet architecture. Although the empirical data used to construct the models is imperfect, it is nonetheless useful for identifying invariants which can shed light upon significant architectural characteristics. The contribution of this work is twofold. In addition to being directly applicable to the evaluation of PURSUIT’s rendezvous system, the methods used in this thesis may be applied more generally to any study which aims to simulate Internet-like systems.
Case Western Reserve University
"... Abstract—This paper discusses a way to communicate without relying on fixed infrastructure at some central hub. This can be useful for bootstrapping loosely connected peer-to-peer systems, as well as for circumventingegregious policy-based blocking(e.g., for censorship purposes). Our techniques leve ..."
Abstract: This paper discusses a way to communicate without relying on fixed infrastructure at some central hub. This can be useful for bootstrapping loosely connected peer-to-peer systems, as well as for circumventing egregious policy-based blocking (e.g., for censorship purposes). Our techniques leverage the caching and aging properties of DNS records to create a covert channel of sorts that can be used to store ephemeral information. The only requirement imposed on the actors wishing to publish and/or retrieve this information is that they share a secret that only manifests outside the system and is never directly encoded within the network itself. We conduct several experiments that illustrate the efficacy of our techniques to exchange an IP address that is presumed to be a rendezvous point for future communication.
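The flavor of the technique can be sketched as follows, with heavy caveats: the paper exploits record aging (remaining TTL), while this toy version merely warms resolver caches for secret-derived names and reads bits back with a crude timing probe. The zone, threshold, and encoding are all invented for illustration.

```python
import hashlib
import socket
import time

SECRET = b"shared-secret"
CACHED_THRESHOLD_S = 0.02  # assumed: cached answers return well under 20 ms

def candidate_name(bit_index: int) -> str:
    """Derive a hostname from the shared secret; the secret never hits the wire."""
    tag = hashlib.sha256(SECRET + bit_index.to_bytes(2, "big")).hexdigest()[:12]
    return f"{tag}.example.com"  # hypothetical wildcard-resolvable zone

def publish(bits):
    """Set each 1-bit by warming the resolver cache for its hostname."""
    for i, bit in enumerate(bits):
        if bit:
            try:
                socket.gethostbyname(candidate_name(i))
            except socket.gaierror:
                pass

def retrieve(nbits):
    """Read bits back by timing lookups: a fast answer suggests a cached record."""
    bits = []
    for i in range(nbits):
        start = time.monotonic()
        try:
            socket.gethostbyname(candidate_name(i))
        except socket.gaierror:
            pass
        bits.append(1 if time.monotonic() - start < CACHED_THRESHOLD_S else 0)
    return bits
```

Note the inherent limitation of this simplification: the probe itself warms the cache, so each bit can only be read once per TTL; inspecting remaining TTLs, as the paper does, avoids that problem.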
On Browser-Level Event Logging
2012
"... In this paper we offer an initial sketch of a new vantage point we are developing to study “the Web ” and users ' interactions with it: we have instrumented the Web browser itself. The Google Chrome browser provides an API to developers that allows the building of extensions to the base functio ..."
Abstract: In this paper we offer an initial sketch of a new vantage point we are developing to study “the Web” and users’ interactions with it: we have instrumented the Web browser itself. The Google Chrome browser provides an API to developers that allows the building of extensions to the base functionality. As part of this system, Chrome allows developers to add listeners to various browser events. Our extension adds listeners that log these events. We discuss the data we obtain from Chrome, our method for addressing privacy issues in the collected data, and initial findings from observing a small set of real users’ Web browsing activities. The findings are modest in absolute terms, but serve to show the efficacy of our monitoring approach.
Understanding HTTP Traffic Performance in TDMA Mesh Networks
"... Abstract—TDMA based wireless mesh networks have gained prominence as some of the recent standards such as WiMAX, 801.11s have proposed the use of TDMA based MAC protocol for mesh networks. But as of yet there have been no attempts to study the performance of HTTP based web browsing traffic in TDMA m ..."
Abstract: TDMA-based wireless mesh networks have gained prominence as some of the recent standards, such as WiMAX and 802.11s, have proposed the use of TDMA-based MAC protocols for mesh networks. But as of yet there have been no attempts to study the performance of HTTP-based web browsing traffic in TDMA mesh networks. HTTP web browsing traffic has different characteristics compared with other types of traffic. In particular, as HTTP traffic consists of a large number of small file transfers (the median file size is 10 KB), it can impose high scheduling overhead. As we highlight, HTTP traffic requires that the RTT (round-trip time) be small and that large flows be allocated a higher share of bandwidth. Given these characteristics of HTTP traffic, it is not clear what protocol design for TDMA mesh networks performs best. In this work we compare four different TDMA MAC protocols for HTTP web browsing traffic: two follow distributed scheduling, one is centralized, and one takes a naive approach with a static fixed schedule. Comparing the different protocols enables us to understand how their different scheduling mechanisms affect HTTP traffic performance. A particularly crucial aspect that our results point out is that the performance of the two distributed protocols, specified by the recent WiMAX and 802.11s standards, is poor in comparison with the naive static approach under some commonly arising conditions, which implies that these standard protocols need further improvement. Specifically, we observed that the two distributed protocols perform well under high load and single-channel operation, but, in comparison with the static approach, their performance is quite poor in the presence of wireless packet loss or co-existing large HTTP file downloads. Likewise, we observed that none of the protocols perform well along all the dimensions we considered, implying a need to devise a better protocol that can efficiently support HTTP traffic. We believe that our results lay the foundation for further efficient protocol design for TDMA mesh networks.
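Back-of-the-envelope arithmetic illustrates the scheduling-overhead point for small transfers: under TDMA a node can only send during its own slots, so a median-sized 10 KB object spends most of its time waiting for slots rather than transmitting. The frame and slot parameters below are invented, not taken from the protocols studied.

```python
import math

FRAME_MS = 10.0    # assumed TDMA frame length
SLOT_BYTES = 1500  # assumed payload per allocated slot

def transfer_ms(obj_bytes: int, slots_per_frame: int) -> float:
    """Time to push obj_bytes when granted slots_per_frame slots each frame."""
    frames = math.ceil(obj_bytes / (SLOT_BYTES * slots_per_frame))
    return frames * FRAME_MS

obj = 10 * 1024  # the abstract's median HTTP object size
for slots in (1, 2, 4):
    print(f"{slots} slot(s)/frame -> {transfer_ms(obj, slots):.0f} ms "
          "(before any per-request RTTs)")
```

With one slot per frame the 10 KB object takes 70 ms of slot waiting alone, and each HTTP request/response handshake pays a similar per-frame delay, which is why the abstract stresses both small RTTs and giving large flows a bigger bandwidth share.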