Results 1 - 10
of
110
detecting the unexpected in distributed systems
- In NSDI’06: Proceedings of the 3rd conference on 3rd Symposium on Networked Systems Design & Implementation
"... Bugs in distributed systems are often hard to find. Many bugs reflect discrepancies between a system’s behavior and the programmer’s assumptions about that behavior. We present Pip 1, an infrastructure for comparing actual behavior and expected behavior to expose structural errors and performance pr ..."
Abstract
-
Cited by 75 (6 self)
- Add to MetaCart
Bugs in distributed systems are often hard to find. Many bugs reflect discrepancies between a system’s behavior and the programmer’s assumptions about that behavior. We present Pip 1, an infrastructure for comparing actual behavior and expected behavior to expose structural errors and performance problems in distributed systems. Pip allows programmers to express, in a declarative language, expectations about the system’s communications structure, timing, and resource consumption. Pip includes system instrumentation and annotation tools to log actual system behavior, and visualization and query tools for exploring expected and unexpected behavior 2. Pip allows a developer to quickly understand and debug both familiar and unfamiliar systems. We applied Pip to several applications, including FAB, SplitStream, Bullet, and RanSub. We generated most of the instrumentation for all four applications automatically. We found the needed expectations easy to write, starting in each case with automatically generated expectations. Pip found unexpected behavior in each application, and helped to isolate the causes of poor performance and incorrect behavior. 1
Declarative Networking: Language, Execution and Optimization
, 2006
"... The networking and distributed systems communities have recently explored a variety of new network architectures, both for applicationlevel overlay networks, and as prototypes for a next-generation Internet architecture. In this context, we have investigated declarative networking: the use of a dis ..."
Abstract
-
Cited by 57 (18 self)
- Add to MetaCart
The networking and distributed systems communities have recently explored a variety of new network architectures, both for applicationlevel overlay networks, and as prototypes for a next-generation Internet architecture. In this context, we have investigated declarative networking: the use of a distributed recursive query engine as a powerful vehicle for accelerating innovation in network architectures [23, 24, 33]. Declarative networking represents a significant new application area for database research on recursive query processing. In this paper, we address fundamental database issues in this domain. First, we motivate and formally define the Network Datalog (NDlog) language for declarative network specifications. Second, we introduce and prove correct relaxed versions of the traditional semi-na ve query evaluation technique, to overcome fundamental problems of the traditional technique in an asynchronous distributed setting. Third, we consider the dynamics of network state, and formalize the "eventual consistency" of our programs even when bursts of updates can arrive in the midst of query execution. Fourth, we present a number of query optimization opportunities that arise in the declarative networking context, including applications of traditional techniques as well as new optimizations. Last, we present evaluation results of the above ideas implemented in our P2 declarative networking system, running on 100 machines over the Emulab network testbed.
The design and implementation of a declarative sensor network system
- In ACM SenSys
, 2006
"... Sensor networks are notoriously difficult to program, given that they encompass the complexities of both distributed and embedded systems. To address this problem, we present the design and implementation of a declarative sensor network platform, DSN: a declarative language, compiler and runtime sui ..."
Abstract
-
Cited by 49 (11 self)
- Add to MetaCart
Sensor networks are notoriously difficult to program, given that they encompass the complexities of both distributed and embedded systems. To address this problem, we present the design and implementation of a declarative sensor network platform, DSN: a declarative language, compiler and runtime suitable for programming a broad range of sensornet applications. We demonstrate that our approach is a natural fit for sensor networks by specifying several very different classes of traditional sensor network protocols, services and applications entirely declaratively – these include tree and geographic routing, link estimation, data collection, event tracking, version coherency, and localization. To our knowledge, this is the first time these disparate sensornet tasks have been addressed by a single high-level programming environment. Moreover, the declarative approach accommodates the desire for architectural flexibility and simple management of limited resources. Our results suggest that the declarative approach is well-suited to sensor networks, and that it can produce concise and flexible code by focusing on what the code is doing, and not on how it is doing it.
CONMan: A Step Towards Network Manageability
- In Proc. of ACM SIGCOMM
, 2007
"... Networks are hard to manage and in spite of all the so called holistic management packages, things are getting worse. We argue that the difficulty of network management can partly be attributed to a fundamental flaw in the existing architecture: protocols expose all their internal details and hence, ..."
Abstract
-
Cited by 36 (1 self)
- Add to MetaCart
Networks are hard to manage and in spite of all the so called holistic management packages, things are getting worse. We argue that the difficulty of network management can partly be attributed to a fundamental flaw in the existing architecture: protocols expose all their internal details and hence, the complexity of the ever-evolving data plane encumbers the management plane. Guided by this observation, in this paper we explore an alternative approach and propose Complexity Oblivious Network Management (CONMan), a network architecture in which the management interface of data-plane protocols includes minimal protocol-specific information. This restricts the operational complexity of protocols to their implementation and allows the management plane to achieve high level policies in a structured fashion. We built the CON-Man interface of a few protocols and a management tool that can achieve high-level configuration goals based on this interface. Our preliminary experience with applying this tool to real world VPN configuration indicates the architecture’s potential to alleviate the difficulty of configuration management.
Declarative Networking
, 2009
"... Declarative Networking is a programming methodology that enables developers to concisely specify network protocols and services, which are directly compiled to a dataflow framework that executes the specifications. This paper provides an introduction to basic issues in declarative networking, includ ..."
Abstract
-
Cited by 27 (17 self)
- Add to MetaCart
Declarative Networking is a programming methodology that enables developers to concisely specify network protocols and services, which are directly compiled to a dataflow framework that executes the specifications. This paper provides an introduction to basic issues in declarative networking, including language design, optimization and dataflow execution. We present the intuition behind declarative programming of networks, including roots in Datalog, extensions for networked environments, and the semantics of long-running queries over network state. We focus on a sublanguage we call Network Datalog (NDlog), including execution strategies that provide crisp eventual consistency semantics with significant flexibility in execution. We also describe a more general language called Overlog, which makes some compromises between expressive richness and semantic guarantees. We provide an overview of declarative network protocols, with a focus on routing protocols and overlay networks. Finally, we highlight related work in declarative networking, and new declarative approaches to related problems.
Friday: Global comprehension for distributed replay
- In Proceedings of the Fourth Symposium on Networked Systems Design and Implementation (NSDI ’07
, 2007
"... Debugging and profiling large-scale distributed applications is a daunting task. We present Friday, a system for debugging distributed applications that combines deterministic replay of components with the power of symbolic, low-level debugging and a simple language for expressing higher-level distr ..."
Abstract
-
Cited by 26 (0 self)
- Add to MetaCart
Debugging and profiling large-scale distributed applications is a daunting task. We present Friday, a system for debugging distributed applications that combines deterministic replay of components with the power of symbolic, low-level debugging and a simple language for expressing higher-level distributed conditions and actions. Friday allows the programmer to understand the collective state and dynamics of a distributed collection of coordinated application components. To evaluate Friday, we consider several distributed problems, including routing consistency in overlay networks, and temporal state abnormalities caused by route flaps. We show via micro-benchmarks and larger-scale application measurement that Friday can be used interactively to debug large distributed applications under replay on common hardware. 1
A modular network layer for sensornets
- USENIX OSDI
, 2006
"... An overall sensornet architecture would help tame the increasingly complex structure of wireless sensornet software and help foster greater interoperability between different codebases. A previous step in this direction is the Sensornet Protocol (SP), a unifying link-abstraction layer. This paper ta ..."
Abstract
-
Cited by 23 (2 self)
- Add to MetaCart
An overall sensornet architecture would help tame the increasingly complex structure of wireless sensornet software and help foster greater interoperability between different codebases. A previous step in this direction is the Sensornet Protocol (SP), a unifying link-abstraction layer. This paper takes the natural next step by proposing a modular network-layer for sensornets that sits atop SP. This modularity eases implementation of new protocols by increasing code reuse, and enables co-existing protocols to share and reduce code and resources consumed at run-time. We demonstrate how current protocols can be decomposed into this modular structure and show that the costs, in performance and code footprint, are minimal relative to their monolithic counterparts. 1
Using Queries for Distributed Monitoring and Forensics
, 2006
"... Distributed systems are hard to build, profile, debug, and test. Monitoring a distributed system – to detect and analyze bugs, test for regressions, identify fault-tolerance problems or security compromises – can be difficult and error-prone. In this paper we argue that declarative development of di ..."
Abstract
-
Cited by 18 (1 self)
- Add to MetaCart
Distributed systems are hard to build, profile, debug, and test. Monitoring a distributed system – to detect and analyze bugs, test for regressions, identify fault-tolerance problems or security compromises – can be difficult and error-prone. In this paper we argue that declarative development of distributed systems is well suited to tackle these tasks. We present an application logging, monitoring, and debugging facility that we have built on top of the P2 system, comprising an introspection model, an execution tracing component, and a distributed query processor. We use this facility to demonstrate a range of on-line distributed diagnosis tools that range from simple, local state assertions to sophisticated global property detectors on consistent snapshots. These tools are small, simple, and can be deployed piecemeal on-line at any point during a system’s life cycle. Our evaluation suggests that the overhead of our approach to improving and monitoring running distributed systems continuously is well in tune with its benefits.
Efficient Querying and Maintenance of Network Provenance at Internet-Scale
"... Network accountability, forensic analysis, and failure diagnosis are becoming increasingly important for network management and security. Such capabilities often utilize network provenance – the ability to issue queries over network meta-data. For example, network provenance may be used to trace the ..."
Abstract
-
Cited by 17 (10 self)
- Add to MetaCart
Network accountability, forensic analysis, and failure diagnosis are becoming increasingly important for network management and security. Such capabilities often utilize network provenance – the ability to issue queries over network meta-data. For example, network provenance may be used to trace the path a message traverses on the network as well as to determine how message data were derived and which parties were involved in its derivation. This paper presents the design and implementation of ExSPAN, a generic and extensible framework that achieves efficient network provenance in a distributed environment. We utilize the database notion of data provenance to “explain ” the existence of any network state, providing a versatile mechanism for network provenance. To achieve such flexibility at Internet-scale, ExSPAN uses declarative networking in which network protocols can be modeled as continuous queries over distributed streams and specified concisely in a declarative query language. We extend existing data models for provenance developed in database literature to enable distribution at Internet-scale, and investigate numerous optimization techniques to maintain and query distributed network provenance efficiently. The ExSPAN prototype is developed using Rapid-Net, a declarative networking platform based on the emerging ns-3 toolkit. Experiments over a simulated network and an actual deployment in a testbed environment demonstrate that our system supports a wide range of distributed provenance computations efficiently, resulting in significant reductions in bandwidth costs compared to traditional approaches.
BFT Protocols Under Fire
"... Much recent work on Byzantine state machine replication focuses on protocols with improved performance under benign conditions (LANs, homogeneous replicas, limited crash faults), with relatively little evaluation under typical, practical conditions (WAN delays, packet loss, transient disconnection, ..."
Abstract
-
Cited by 16 (0 self)
- Add to MetaCart
Much recent work on Byzantine state machine replication focuses on protocols with improved performance under benign conditions (LANs, homogeneous replicas, limited crash faults), with relatively little evaluation under typical, practical conditions (WAN delays, packet loss, transient disconnection, shared resources). This makes it difficult for system designers to choose the appropriate protocol for a real target deployment. Moreover, most protocol implementations differ in their choice of runtime environment, crypto library, and transport, hindering direct protocol comparisons even under similar conditions. We present a simulation environment for such protocols that combines a declarative networking system with a robust network simulator. Protocols can be rapidly implemented from pseudocode in the high-level declarative language of the former, while network conditions and (measured) costs of communication packages and crypto primitives can be plugged into the latter. We show that the resulting simulator faithfully predicts the performance of native protocol implementations, both as published and as measured in our local network. We use the simulator to compare representative protocols under identical conditions and rapidly explore the effects of changes in the costs of crypto operations, workloads, network conditions and faults. For example, we show that Zyzzyva outperforms protocols like PBFT and Q/U under most but not all conditions, indicating that one-size-fits-all protocols may be hard if not impossible to design in practice. 1

