AVMON: Optimal and scalable discovery of consistent availability monitoring overlays for distributed systems
In Proc. ICDCS, 2007
Cited by 4 (2 self)
Abstract
This paper addresses the problem of selection and discovery of a consistent availability monitoring overlay for computer hosts in a large-scale distributed application, where hosts may be selfish or colluding. We motivate six significant goals for the problem: consistency, verifiability, and randomness in selecting the availability monitors of nodes, as well as discoverability, load-balancing, and scalability in finding these monitors. We then present a new system, called AVMON, that is the first to satisfy these six requirements. The core algorithmic contribution of this paper is a protocol for discovering the availability monitoring overlay in a scalable and efficient manner, given an arbitrary monitor selection scheme that is consistent and verifiable. We mathematically analyze the performance of AVMON’s discovery protocols, and derive an optimal variant that minimizes memory, bandwidth, computation, and discovery time of monitors. Our experimental evaluations of AVMON use three types of availability traces (synthetic, from PlanetLab, and from the Overnet peer-to-peer system) and demonstrate that AVMON works well in a variety of distributed systems.
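As a rough illustration of what a consistent, verifiable, and random monitor selection scheme could look like, the sketch below hashes an ordered pair of node identifiers and compares the result to a threshold. The function names, the use of SHA-256, and the parameters `k` (target number of monitors per node) and `n` (system size) are assumptions made for this example; the abstract does not specify the scheme itself.

```python
import hashlib

def pair_hash(monitor_id: str, target_id: str) -> int:
    """Map an ordered pair of identifiers to a 64-bit pseudo-random integer."""
    digest = hashlib.sha256(f"{monitor_id}|{target_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

def is_monitor(monitor_id: str, target_id: str, k: int, n: int) -> bool:
    """Hypothetical rule: monitor_id monitors target_id when hash / 2^64 < k / n.

    The comparison uses exact integer arithmetic. Any third party that knows
    both identifiers can recompute it (verifiable), it always gives the same
    answer (consistent), and neither node can choose the outcome (random).
    """
    if monitor_id == target_id:
        return False
    return pair_hash(monitor_id, target_id) * n < k * 2**64

# Example: find the monitors of one node in a 200-node system.
nodes = [f"node-{i}" for i in range(200)]
monitors = [m for m in nodes if is_monitor(m, "node-7", k=8, n=len(nodes))]
```

Under this assumed rule, each node expects roughly `k` monitors on average, and a claimed monitoring relationship can be checked by anyone without contacting either node.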
AVMEM - Availability-Aware Overlays for Management Operations in Non-cooperative Distributed Systems
Cited by 1 (1 self)
Abstract
Monitoring and management operations that query nodes based on their availability can be extremely useful in a variety of large-scale distributed systems containing hundreds to thousands of hosts, e.g., p2p systems, Grids, and PlanetLab. This paper presents decentralized and scalable solutions to a subset of such availability-based management tasks. Specifically, we propose AVMEM, which is the first availability-aware overlay to date. AVMEM is intended for generic non-cooperative scenarios where nodes may be selfish and may wish to route messages to a large set of other nodes, especially if the selfish node has low availability. Under this setting, our concrete contributions are the following: (1) AVMEM allows arbitrary classes of application-specified predicates to create the membership relationships in the overlay. To prevent selfish nodes from exploiting the system, we focus on predicates that are random and consistent. In other words, whether a given node y is a neighbor of a given node x is decided by a consistent and probabilistic predicate that depends solely on the identifiers and availabilities of these two nodes, without using any external inputs. (2) AVMEM protocols discover and maintain the overlay spanned by the application-specified AVMEM predicate in a scalable and fast manner. (3) We use AVMEM to execute important availability-based management operations, focusing on range-anycast, range-multicast, threshold-anycast, and threshold-multicast. AVMEM works well in the presence of selfish nodes, scales to thousands of nodes, and executes each of the targeted operations quickly and reliably. Our evaluation is driven by real-life churn traces from the Overnet p2p system, and shows that AVMEM works well in practical settings.
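The membership predicate in contribution (1) can be pictured with a small sketch. Everything below is an assumption made for illustration: the abstract does not give a concrete predicate, so `neighbor_probability` is a hypothetical application-specified rule (here, favoring nodes of similar availability), and SHA-256 is just one convenient source of a consistent pseudo-random value.

```python
import hashlib

def unit_hash(x_id: str, y_id: str) -> float:
    """Consistent pseudo-random value in [0, 1) derived only from the two identifiers."""
    digest = hashlib.sha256(f"{x_id}|{y_id}".encode("utf-8")).digest()
    # Keep 53 bits so the quotient is exact in a double and strictly below 1.0.
    return (int.from_bytes(digest[:8], "big") >> 11) / 2**53

def neighbor_probability(av_x: float, av_y: float) -> float:
    """Hypothetical rule: the closer the two availabilities, the likelier the link."""
    return max(0.0, 1.0 - abs(av_x - av_y))

def is_neighbor(x_id: str, av_x: float, y_id: str, av_y: float) -> bool:
    """y is a neighbor of x when the pair's hash falls below the rule's probability.

    The result depends only on the two identifiers and the two availabilities,
    so any node can recompute it and a selfish node cannot pick its neighbors.
    """
    if x_id == y_id:
        return False
    return unit_hash(x_id, y_id) < neighbor_probability(av_x, av_y)

# Example: availabilities are fractions of time online (made-up values).
availability = {"a": 0.9, "b": 0.85, "c": 0.2, "d": 0.5}
neighbors_of_a = [y for y, av in availability.items()
                  if is_neighbor("a", availability["a"], y, av)]
```

Swapping in a different `neighbor_probability` changes which overlay is spanned while the consistency and randomness of the decision stay the same.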
DESIGN OF AVAILABILITY-DEPENDENT DISTRIBUTED SERVICES IN LARGE-SCALE UNCOOPERATIVE SETTINGS
2009
Abstract
Availability-dependent global predicates can be efficiently and scalably realized for a class of distributed services, in spite of specific selfish and colluding behaviors, using local and decentralized protocols. Several types of large-scale distributed systems spanning the Internet have to deal with availability variations among their constituent nodes. In dealing with churn and low-availability nodes, we believe it is important to link the availability of a node to the service the node receives from the distributed system. In other words, high availability has to be incentivized with better service. There are two types of requirements for this problem. First, metrics such as message overhead, CPU usage, memory overhead, and latency need to be optimized to achieve scalability and efficiency. Second, in open distributed systems spanning multiple organizations, the protocols have to tolerate selfish and colluding nodes, i.e., low-availability nodes that attempt to receive better service. This thesis approaches this problem by explicitly linking each node’s service to its availability, via the notion of a global predicate. We present a class of novel distributed protocols that achieve a given availability-dependent global predicate, efficiently and scalably. These protocols execute in a

