Results 1 - 10 of 115
Communication-efficient distributed monitoring of thresholded counts
In Proc. of SIGMOD '06, 2006
"... Monitoring is an issue of primary concern in current and next gen-eration networked systems. For example, the objective of sensor networks is to monitor their surroundings for a variety of differ-ent applications like atmospheric conditions, wildlife behavior, and troop movements among others. Simil ..."
Cited by 78 (11 self)
Monitoring is an issue of primary concern in current and next generation networked systems. For example, the objective of sensor networks is to monitor their surroundings for a variety of different applications like atmospheric conditions, wildlife behavior, and troop movements among others. Similarly, monitoring in data networks is critical not only for accounting and management, but also for detecting anomalies and attacks. Such monitoring applications are inherently continuous and distributed, and must be designed to minimize the communication overhead that they introduce. In this context we introduce and study a fundamental class of problems called “thresholded counts” where we must return the aggregate frequency count of an event that is continuously monitored by distributed nodes with a user-specified accuracy whenever the actual count exceeds a given threshold value. In this paper we propose to address the problem of thresholded counts by setting local thresholds at each monitoring node and initiating communication only when the locally observed data exceeds these local thresholds. We explore algorithms in two categories: static thresholds and adaptive thresholds. In the static case, we consider thresholds based on a linear combination of two alternate strategies, and show that there exists an optimal blend of the two strategies that results in minimum communication overhead. We further show that this optimal blend can be found using a steepest descent search. In the adaptive case, we propose algorithms that adjust the local thresholds based on the observed distributions of updated information in the distributed monitoring system. We use extensive simulations not only to verify the accuracy of our algorithms and validate our theoretical results, but also to evaluate the performance of the two approaches. We find that both approaches yield significant savings over the naive approach of performing processing at a centralized location.
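As a rough illustration of the local-threshold idea in this abstract, the Python sketch below lets each node stay silent until its local count outgrows a per-node threshold, while a coordinator tracks a conservative upper bound on the global count. The class names, the uniform slack bound, and the single global threshold T are illustrative assumptions, not the paper's static or adaptive algorithms.

    # Illustrative only: each node reports a delta when its local slack is used up;
    # the coordinator adds worst-case slack from silent nodes to bound the global count.
    class Node:
        def __init__(self, local_threshold):
            self.count = 0
            self.reported = 0              # last value forwarded to the coordinator
            self.local_threshold = local_threshold

        def observe(self, events=1):
            self.count += events
            if self.count - self.reported >= self.local_threshold:
                delta = self.count - self.reported
                self.reported = self.count
                return delta               # communicate
            return None                    # stay silent

    class Coordinator:
        def __init__(self, global_threshold, nodes):
            self.T = global_threshold
            self.nodes = nodes
            self.known_total = 0

        def on_report(self, delta):
            self.known_total += delta
            # Every silent node may hold up to (local_threshold - 1) unreported events.
            upper_bound = self.known_total + sum(n.local_threshold - 1 for n in self.nodes)
            return upper_bound >= self.T   # True: the global count may have crossed T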
Contour Map Matching for Event Detection in Sensor Networks
In SIGMOD, 2006
"... Many sensor network applications, such as object tracking and disaster monitoring, require effective techniques for event detection. In this paper, we propose a novel event detection mechanism based on matching the contour maps of in-network sensory data distribution. Our key observation is that eve ..."
Cited by 54 (9 self)
Many sensor network applications, such as object tracking and disaster monitoring, require effective techniques for event detection. In this paper, we propose a novel event detection mechanism based on matching the contour maps of in-network sensory data distribution. Our key observation is that events in sensor networks can be abstracted into spatio-temporal patterns of sensory data and that pattern matching can be done efficiently through contour map matching. Therefore, we propose simple SQL extensions to allow users to specify common types of events as patterns in contour maps and study energy-efficient techniques of contour map construction and maintenance for our pattern-based event detection. Our experiments with synthetic workloads derived from a real-world coal mine surveillance application validate the effectiveness and efficiency of our approach.
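A toy sketch of what contour-map construction and matching could look like, assuming gridded readings and a naive cell-by-cell comparison; the function names and level scheme are made up for illustration and are not the paper's SQL extensions or maintenance techniques.

    # Illustrative only: bucket gridded readings into contour levels and match a
    # pattern cell-by-cell; not the paper's construction or maintenance techniques.
    def contour_map(readings, levels):
        """readings: {(x, y): value}; levels: sorted iso-level boundaries."""
        def level_of(v):
            return sum(1 for boundary in levels if v >= boundary)
        return {cell: level_of(v) for cell, v in readings.items()}

    def matches(observed, pattern):
        """Naive match: every cell the pattern constrains has the required level."""
        return all(observed.get(cell) == lvl for cell, lvl in pattern.items())

    readings = {(0, 0): 87.0, (0, 1): 82.5, (1, 0): 60.1, (1, 1): 44.3}
    cmap = contour_map(readings, levels=[50, 70, 85])
    hotspot = {(0, 0): 3, (0, 1): 2}       # a hypothetical "hotspot" event pattern
    print(matches(cmap, hotspot))          # True for this toy reading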
Proof sketches: Verifiable in-network aggregation
In IEEE International Conference on Data Engineering (ICDE), 2007
"... Recent work on distributed, in-network aggregation assumes a benign population of participants. Unfortunately, modern distributed systems are plagued by malicious participants. In this paper we present a first step towards verifiable yet efficient distributed, in-network aggregation in adversarial s ..."
Cited by 31 (6 self)
Recent work on distributed, in-network aggregation assumes a benign population of participants. Unfortunately, modern distributed systems are plagued by malicious participants. In this paper we present a first step towards verifiable yet efficient distributed, in-network aggregation in adversarial settings. We describe a general framework and threat model for the problem and then present proof sketches, a compact verification mechanism that combines cryptographic signatures and Flajolet-Martin sketches to guarantee acceptable aggregation error bounds with high probability. We derive proof sketches for count aggregates and extend them for random sampling, which can be used to provide verifiable approximations for a broad class of data-analysis queries, e.g., quantiles and heavy hitters. Finally, we evaluate the practical use of proof sketches, and observe that adversaries can often be restricted to much smaller violations in practice than our worst-case bounds suggest.
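The verification machinery aside, the counting core rests on Flajolet-Martin (FM) sketches, which merge by bitwise OR and so tolerate multi-path duplication. A minimal Python sketch of that FM behavior follows; the hash construction and constants are illustrative, and the cryptographic signatures that make the sketch a "proof" are omitted.

    # Illustrative Flajolet-Martin (FM) count sketch; the cryptographic signatures
    # that turn it into a verifiable "proof sketch" are omitted.
    import hashlib

    NUM_BITS = 32

    def _rho(item, seed):
        """Index of the lowest set bit of a hash of (seed, item)."""
        h = int.from_bytes(hashlib.sha256(f"{seed}:{item}".encode()).digest()[:4], "big")
        h |= 1 << NUM_BITS                 # guard bit so some bit is always set
        return (h & -h).bit_length() - 1

    def fm_new():
        return [False] * (NUM_BITS + 1)

    def fm_insert(sketch, item, seed=0):
        sketch[_rho(item, seed)] = True

    def fm_merge(a, b):
        return [x or y for x, y in zip(a, b)]   # union is bitwise OR, so mergeable

    def fm_estimate(sketch):
        r = next(i for i, bit in enumerate(sketch) if not bit)
        return 2 ** r / 0.77351            # classic FM correction constant

    s1, s2 = fm_new(), fm_new()
    for i in range(500):
        fm_insert(s1, f"node-a-{i}")
    for i in range(800):
        fm_insert(s2, f"node-b-{i}")
    print(round(fm_estimate(fm_merge(s1, s2))))  # coarse estimate of the 1300 items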
Network Imprecision: A new consistency metric for scalable monitoring
In OSDI, 2008
"... This paper introduces a new consistency metric, Network Imprecision (NI), to address a central challenge in largescale monitoring systems: safeguarding correctness despite node and network failures. To implement NI, an overlay that monitors a set of attributes also monitors its own state so that que ..."
Cited by 27 (2 self)
This paper introduces a new consistency metric, Network Imprecision (NI), to address a central challenge in large-scale monitoring systems: safeguarding correctness despite node and network failures. To implement NI, an overlay that monitors a set of attributes also monitors its own state so that queries return not only attribute values but also information about the stability of the overlay—the number of nodes whose recent updates may be missing and the number of nodes whose inputs may be double counted due to overlay reconfigurations. When NI indicates that the network is stable, query results reflect the true state of the system, but when the network is unstable, NI puts applications on notice that query results should not be trusted, allowing them to take corrective action such as filtering out inconsistent results. To implement NI’s introspection scalably, our prototype introduces a key optimization, dual-tree prefix aggregation, which exploits overlay symmetry to reduce overheads by more than an order of magnitude. Evaluation of three monitoring applications demonstrates that NI flags inaccurate results while incurring low overheads, and monitoring applications that use NI to select good information can reduce their inaccuracy by nearly a factor of five.
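A very rough Python reading of the two quantities NI exposes, assuming each node carries a last-update timestamp and a flag for possible double counting after reconfiguration; the field names, the staleness cutoff, and the stability rule are invented for illustration and do not reflect the paper's dual-tree prefix aggregation.

    # Toy reading of the NI quantities, under invented field names and rules.
    import time

    STALE_AFTER = 30.0   # seconds without an update before a node counts as missing

    def network_imprecision(last_update, maybe_duplicated, now=None):
        """last_update: {node_id: unix_time of last received update};
        maybe_duplicated: node_ids whose inputs may be double counted."""
        now = time.time() if now is None else now
        n_all = len(last_update)
        n_missing = sum(1 for t in last_update.values() if now - t > STALE_AFTER)
        n_dup = len(maybe_duplicated)
        return {"n_all": n_all, "n_missing": n_missing, "n_dup": n_dup,
                "stable": n_missing == 0 and n_dup == 0}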
Sparse data aggregation in sensor networks
In IPSN '07, 2007
"... We study the problem of aggregating data from a sparse set of nodes in a wireless sensor network. This is a common situation when a sensor network is deployed to detect relatively rare events. In such situations, each node that should participate in the aggregation knows this fact based on its own s ..."
Cited by 26 (6 self)
We study the problem of aggregating data from a sparse set of nodes in a wireless sensor network. This is a common situation when a sensor network is deployed to detect relatively rare events. In such situations, each node that should participate in the aggregation knows this fact based on its own sensor readings, but there is no global knowledge in the network of where all these interesting nodes are located. Instead of blindly querying all nodes in the network, we show how the interesting nodes can autonomously discover each other in a distributed fashion and form an ad hoc aggregation structure that can be used to compute cumulants, moments, or other statistical summaries. Key to our approach is the capability for two nodes that wish to communicate at roughly the same time to discover each other at a cost that is proportional to their network distance. We show how to build nearly optimal aggregation structures that can further deal with network volatility and compensate for the loss or duplication of data by exploiting probabilistic techniques.
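The statistical summaries mentioned above can be carried as small mergeable tuples. A minimal Python sketch under that assumption is shown below: each participating node emits (count, sum, sum of squares), partial summaries add componentwise along the aggregation structure, and the root recovers the first two moments. The discovery and tree-building machinery that is the paper's actual contribution is not shown.

    # Illustrative mergeable summary (count, sum, sum of squares) for moments;
    # the node-discovery and tree-construction machinery is not shown.
    def leaf_summary(readings):
        return (len(readings), sum(readings), sum(x * x for x in readings))

    def merge(a, b):
        return tuple(x + y for x, y in zip(a, b))

    def mean_and_variance(summary):
        n, s, ss = summary
        mean = s / n
        return mean, ss / n - mean * mean

    # Two "interesting" nodes that detected an event combine summaries en route:
    root = merge(leaf_summary([4.0, 5.0, 6.0]), leaf_summary([7.0, 8.0]))
    print(mean_and_variance(root))   # (6.0, 2.0)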
Mergeable Summaries
"... We study the mergeability of data summaries. Informally speaking, mergeability requires that, given two summaries on two data sets, there is a way to merge the two summaries into a single summary on the union of the two data sets, while preserving the error and size guarantees. This property means t ..."
Cited by 22 (7 self)
We study the mergeability of data summaries. Informally speaking, mergeability requires that, given two summaries on two data sets, there is a way to merge the two summaries into a single summary on the union of the two data sets, while preserving the error and size guarantees. This property means that the summaries can be merged in a way like other algebraic operators such as sum and max, which is especially useful for computing summaries on massive distributed data. Several data summaries are trivially mergeable by construction, most notably all the sketches that are linear functions of the data sets. But some other fundamental ones, like those for heavy hitters and quantiles, are not (known to be) mergeable. In this paper, we demonstrate that these summaries are indeed mergeable or can be made mergeable after appropriate modifications. Specifically, we show that for ε-approximate heavy hitters, there is a deterministic mergeable summary of size O(1/ε); for ε-approximate quantiles, there is a deterministic summary of size O((1/ε) log(εn)) that has a restricted form of mergeability, and a randomized one of size O((1/ε) log^{3/2}(1/ε)) with full mergeability. We also extend our results to geometric summaries such as ε-approximations and ε-kernels. We also achieve two results of independent interest: (1) we provide the best known randomized streaming bound for ε-approximate quantiles that depends only on ε, of size O((1/ε) log^{3/2}(1/ε)), and (2) we demonstrate that the MG and the SpaceSaving summaries for heavy hitters are isomorphic.
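For the heavy-hitter result, the summary in question is the Misra-Gries (MG) summary with O(1/ε) counters, and the merge analyzed in the paper adds the two counter sets and then subtracts the (k+1)-th largest count. A short Python sketch of that update and merge follows; the toy input strings are just for demonstration.

    # Misra-Gries (MG) summary with k = O(1/eps) counters and the merge step the
    # paper analyzes: add counters, subtract the (k+1)-th largest, keep positives.
    from collections import Counter

    def mg_update(summary, item, k):
        """Standard streaming MG update with at most k counters."""
        if item in summary or len(summary) < k:
            summary[item] = summary.get(item, 0) + 1
        else:
            for key in list(summary):          # decrement-all step
                summary[key] -= 1
                if summary[key] == 0:
                    del summary[key]

    def mg_merge(a, b, k):
        combined = Counter(a) + Counter(b)     # up to 2k counters
        counts = sorted(combined.values(), reverse=True)
        cut = counts[k] if len(counts) > k else 0
        return {x: c - cut for x, c in combined.items() if c > cut}

    s1, s2, k = {}, {}, 4
    for x in "aababcadaaeaf":
        mg_update(s1, x, k)
    for x in "bbcbdbbebabf":
        mg_update(s2, x, k)
    print(mg_merge(s1, s2, k))                 # frequent items keep large counts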
Suppression and failures in sensor networks: a Bayesian approach
In VLDB '07: Proceedings of the 33rd International Conference on Very Large Data Bases, 2007
"... ABSTRACT Sensor networks allow continuous data collection on unprecedented scales. The primary limiting factor of such networks is energy, of which communication is the dominant consumer. The default strategy of nodes continually reporting their data to the root results in too much messaging. Suppr ..."
Cited by 18 (0 self)
Sensor networks allow continuous data collection on unprecedented scales. The primary limiting factor of such networks is energy, of which communication is the dominant consumer. The default strategy of nodes continually reporting their data to the root results in too much messaging. Suppression stands to greatly alleviate this problem. The simplest such scheme is temporal suppression, in which a node transmits its reading only when it has changed by more than some threshold since it was last transmitted. In the absence of a report, the root can infer that the value remains within that threshold of the last reported value; hence, it is still able to derive the history of readings produced at the node. The critical weakness of suppression is message failure, to which sensor networks are particularly vulnerable. Failure creates ambiguity: a non-report may either be a suppression or a failure. Inferring the correct values for missing data and learning the parameters of the underlying process model become quite challenging. We propose a novel solution, BaySail, that incorporates the knowledge of the suppression scheme and application-level redundancy in Bayesian inference. We investigate several redundancy schemes and evaluate them in terms of in-network transmission costs and out-of-network inference efficacy, and the trade-off between these. Our experimental evaluation shows application-level redundancy outperforms retransmissions and basic sampling in both cost and accuracy of inference. The BaySail framework shows suppression schemes are generally effective for data collection, despite the presence of failures.
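A small Python sketch of temporal suppression as described above, using `delta` as a stand-in name for the suppression threshold whose symbol is lost in this abstract; the Bayesian inference and redundancy schemes that BaySail adds on top are not represented.

    # `delta` stands in for the suppression threshold whose symbol is lost above.
    class SuppressingNode:
        def __init__(self, delta):
            self.delta = delta
            self.last_reported = None

        def maybe_report(self, reading):
            if self.last_reported is None or abs(reading - self.last_reported) > self.delta:
                self.last_reported = reading
                return reading        # transmit
            return None               # suppress

    def root_belief(last_report, delta):
        """Interval the root infers for a node that has stayed silent."""
        return (last_report - delta, last_report + delta)

    node = SuppressingNode(delta=0.5)
    for r in [20.0, 20.2, 20.4, 21.1, 21.3]:
        sent = node.maybe_report(r)
        print(r, "sent" if sent is not None else "suppressed")
    print(root_belief(node.last_reported, 0.5))   # (20.6, 21.6)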
Attack Resilient Hierarchical Data Aggregation in Sensor Networks
In ACM SASN '06, 2006
"... In a large sensor network, in-network data aggregation, i.e., com-bining partial results at intermediate nodes during message routing, significantly reduces the amount of communication and hence the energy consumed. Recently several researchers have proposed ro-bust aggregation frameworks, which com ..."
Cited by 18 (2 self)
In a large sensor network, in-network data aggregation, i.e., combining partial results at intermediate nodes during message routing, significantly reduces the amount of communication and hence the energy consumed. Recently several researchers have proposed robust aggregation frameworks, which combine multi-path routing schemes with duplicate-insensitive algorithms, to accurately compute aggregates (e.g., Sum, Count, Average) in spite of message losses resulting from node and transmission failures. However, these aggregation frameworks have been designed without security in mind. Given the lack of hardware support for tamper-resistance and the unattended nature of sensor nodes, sensor networks are highly vulnerable to node compromises. We show that even if a few compromised nodes contribute false sub-aggregate values, this results in large errors in the aggregate computed at the root of the hierarchy. We present modifications to the aggregation algorithms that guard against such attacks, i.e., we present algorithms for resilient hierarchical data aggregation despite the presence of compromised nodes in the aggregation hierarchy. We evaluate the performance and costs of our approach via both analysis and simulation. Our results show that our approach is scalable and efficient.
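A toy Python example of the vulnerability the abstract points out: with naive hierarchical Sum aggregation, a single compromised intermediate node can shift the root's answer by an arbitrary amount. The tree encoding and the injected offset are illustrative; the paper's resilient algorithms are not reproduced here.

    # Naive hierarchical Sum; a compromised node reports an inflated sub-aggregate.
    def aggregate(tree, compromised=frozenset()):
        """tree = (node_id, own_reading, [child trees])."""
        node_id, reading, children = tree
        true_subtotal = reading + sum(aggregate(c, compromised) for c in children)
        return true_subtotal + 10_000 if node_id in compromised else true_subtotal

    tree = ("root", 1, [("a", 2, []), ("b", 3, [("c", 4, [])])])
    print(aggregate(tree))                     # honest total: 10
    print(aggregate(tree, compromised={"b"}))  # one bad node drives the root to 10010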
Programming Modular Robots with Locally Distributed Predicates
"... Abstract — We present a high-level language for programming modular robotic systems, based on locally distributed predicates (LDP), which are distributed conditions that hold for a connected subensemble of the robotic system. An LDP program is a collection of LDPs with associated actions which are t ..."
Cited by 16 (5 self)
We present a high-level language for programming modular robotic systems, based on locally distributed predicates (LDP), which are distributed conditions that hold for a connected subensemble of the robotic system. An LDP program is a collection of LDPs with associated actions which are triggered on any subensemble that matches the predicate. The result is a reactive programming language which efficiently and concisely supports ensemble-level programming. We demonstrate the utility of LDP by implementing three common, but diverse, modular robotic tasks.
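A minimal Python sketch of the predicate/action idea, assuming the simplest case where a "subensemble" is a pair of neighboring modules; the engine, the gradient-propagation rule, and all names are illustrative simplifications rather than the LDP language itself.

    # Pair-only toy engine: a program is a list of (predicate, action) pairs that
    # fire on any pair of neighboring modules satisfying the predicate.
    def run_ldp(modules, neighbors, program):
        """modules: {id: state dict}; neighbors: {id: set of neighbor ids}."""
        for a in modules:
            for b in neighbors[a]:
                for predicate, action in program:
                    if predicate(modules[a], modules[b]):
                        action(modules[a], modules[b])

    # Example rule: propagate a hop-count gradient outward from a seed module.
    relax = (
        lambda s, t: t["dist"] > s["dist"] + 1,
        lambda s, t: t.update(dist=s["dist"] + 1),
    )

    INF = 10**9
    modules = {1: {"dist": 0}, 2: {"dist": INF}, 3: {"dist": INF}}
    neighbors = {1: {2}, 2: {1, 3}, 3: {2}}
    for _ in range(len(modules)):      # iterate until the gradient stabilizes
        run_ldp(modules, neighbors, [relax])
    print(modules)                     # dist values settle at 0, 1, 2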
Streaming in a Connected World: Querying and Tracking Distributed Data Streams
In SIGMOD '07, 2007
"... Today, a majority of data is fundamentally distributed in nature. Data for almost any task is collected over a broad area, and streams in at a much greater rate than ever before. In particular, advances in sensor technology ..."
Cited by 16 (7 self)
Today, a majority of data is fundamentally distributed in nature. Data for almost any task is collected over a broad area, and streams in at a much greater rate than ever before. In particular, advances in sensor technology ...