Approximate data collection in sensor networks using probabilistic models
- IN ICDE
, 2006
Cited by 82 (6 self)
Wireless sensor networks are proving to be useful in a variety of settings. A core challenge in these networks is to minimize energy consumption. Prior database research has proposed to achieve this by pushing data-reducing operators like aggregation and selection down into the network. This approach has proven unpopular with early adopters of sensor network technology, who typically want to extract complete “dumps” of the sensor readings, i.e., to run “SELECT *” queries. Unfortunately, because these queries do no data reduction, they consume significant energy in current sensornet query processors. In this paper we attack the “SELECT *” problem for sensor networks. We propose a robust approximate technique called Ken that uses replicated dynamic probabilistic models to minimize communication from sensor nodes to the network’s PC base station. In addition to data collection, we show that Ken is well suited to anomaly- and event-detection applications. A key challenge in this work is to intelligently exploit spatial correlations across sensor nodes without imposing undue sensor-to-sensor communication burdens to maintain the models. Using traces from two real-world sensor network deployments, we demonstrate that relatively simple models can provide significant communication (and hence energy) savings without undue sacrifice in result quality or frequency. Choosing optimally among even our simple models is NP-hard, but our experiments show that a greedy heuristic performs nearly as well as an exhaustive algorithm.
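The replicated-model idea in this abstract can be illustrated with a minimal Python sketch. It assumes the simplest possible shared model (the next reading is predicted to equal the last transmitted value) and a hypothetical error bound `epsilon`; the function name and the model are illustrative assumptions, not Ken's actual models, which are dynamic probabilistic models that also exploit spatial correlation.

```python
def suppress_readings(readings, epsilon):
    """Simulate a sensor and a base station that share one prediction model.

    Assumption (not from the paper): the shared model predicts that the
    next reading equals the last transmitted value. The sensor transmits
    only when its true reading differs from that prediction by more than
    epsilon, so the base station's view stays within epsilon of the truth.

    Returns (transmitted, reconstructed):
      transmitted   - list of (index, value) pairs actually sent
      reconstructed - the base station's value for every time step
    """
    transmitted = []
    reconstructed = []
    prediction = None  # state of the shared model
    for i, value in enumerate(readings):
        if prediction is None or abs(value - prediction) > epsilon:
            # Prediction is too far off: send the reading, and both
            # replicas of the model update to the transmitted value.
            transmitted.append((i, value))
            prediction = value
        # Otherwise nothing is sent and the base station uses the prediction.
        reconstructed.append(prediction)
    return transmitted, reconstructed
```

In this sketch a larger `epsilon` suppresses more transmissions at the cost of a looser bound on the reconstructed values.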
Using Probabilistic Models for Data Management in Acquisitional Environments
, 2005
Cited by 35 (3 self)
Traditional database systems, particularly those focused on capturing and managing data from the real world, are poorly equipped to deal with the noise, loss, and uncertainty in data. We discuss a suite of techniques based on probabilistic models that are designed to allow databases to tolerate noise and loss. These techniques are based on exploiting correlations to predict missing values and identify outliers. Interestingly, correlations also provide a way to give approximate answers to users at a significantly lower cost and enable a range of new types of queries over the correlation structure itself. We illustrate a host of applications for our new techniques and queries, ranging from sensor networks to network monitoring to data stream management. We also present a unified architecture for integrating such models into database systems, focusing in particular on acquisitional systems where the cost of capturing data (e.g., from sensors) is itself a significant part of the query processing cost.
Online filtering, smoothing and probabilistic modeling of streaming data
- in ICDE
, 2008
Cited by 35 (3 self)
In this paper, we address the problem of extending a relational database system to facilitate efficient real-time application of dynamic probabilistic models to streaming data. We use the recently proposed abstraction of model-based views for this purpose, by allowing users to declaratively specify the model to be applied, and by presenting the output of the models to the user as a probabilistic database view. We support declarative querying over such views using an extended version of SQL that allows for querying probabilistic data. Underneath we use particle filters, a class of sequential Monte Carlo algorithms commonly used to implement dynamic probabilistic models, to represent the present and historical states of the model as sets of weighted samples (particles) that are kept up-to-date as new readings arrive. We develop novel techniques to convert the queries on the model-based view directly into queries over particle tables, enabling highly efficient query processing. Finally, we present experimental evaluation of our prototype implementation over sensor data from the Intel Lab dataset that demonstrates the feasibility of online modeling of streaming data using our system and establishes the advantages of such tight integration between dynamic probabilistic models and database systems.
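As a rough illustration of how a query over weighted samples might be evaluated, the Python sketch below assumes a hypothetical in-memory "particle table" of `(time, value, weight)` rows. The schema and function names are assumptions made for this example; the system's actual tables and its SQL extensions are not given in the abstract.

```python
from collections import defaultdict

def expected_value_by_time(particle_rows):
    """Weighted mean of the particle values at each time step.

    particle_rows: list of (time, value, weight) tuples (hypothetical schema).
    Returns a dict mapping time -> weighted mean of the values.
    """
    weighted_sum = defaultdict(float)
    total_weight = defaultdict(float)
    for t, value, weight in particle_rows:
        weighted_sum[t] += weight * value
        total_weight[t] += weight
    return {t: weighted_sum[t] / total_weight[t]
            for t in total_weight if total_weight[t] > 0}

def probability_above(particle_rows, t, threshold):
    """Estimate P(value > threshold) at time t as the share of particle
    weight lying above the threshold. Returns None if time t has no weight."""
    total = sum(w for (ti, _, w) in particle_rows if ti == t)
    above = sum(w for (ti, v, w) in particle_rows if ti == t and v > threshold)
    return above / total if total > 0 else None
```

Both functions are plain group-and-aggregate computations over the rows, which suggests why queries of this shape translate naturally into relational queries over particle tables.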
Energy conservation in wireless sensor networks: A survey
Cited by 29 (6 self)
In recent years, wireless sensor networks (WSNs) have gained increasing attention from both the research community and actual users. As sensor nodes are generally battery-powered devices, the critical issue is how to reduce the energy consumption of nodes, so that the network lifetime can be extended to reasonable times. In this paper we first break down the energy consumption for the components of a typical sensor node, and discuss the main directions for energy conservation in WSNs. Then, we present a systematic and comprehensive taxonomy of the energy conservation schemes, which are subsequently discussed in depth. Special attention has been devoted to promising solutions which have not yet obtained wide attention in the literature, such as techniques for energy-efficient data acquisition. Finally we conclude the paper with insights for research directions about energy conservation in WSNs.
Adaptive Sampling for Sensor Networks
- Proc. DMSN’04
, 2004
Cited by 24 (1 self)
A distributed data-stream architecture finds application in sensor networks for monitoring environment and activities. In such a network, large numbers of sensors deliver continuous data to a central server. The rate at which the data is sampled at each sensor affects the communication resource and the computational load at the central server. In this paper, we propose a novel adaptive sampling technique where the sampling rate at each sensor adapts to the streaming-data characteristics. Our approach employs a Kalman-Filter (KF)-based estimation technique wherein the sensor can use the KF estimation error to adaptively adjust its sampling rate within a given range, autonomously. When the desired sampling rate violates the range, a new sampling rate is requested from the server. The server allocates new sampling rates under the constraint of available resources such that KF estimation error over all the active streaming sensors is minimized. Through empirical studies, we demonstrate the flexibility and effectiveness of our model.
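The sensor-side part of this scheme can be sketched in Python as follows. This is a minimal illustration under stated assumptions: a scalar Kalman filter with a random-walk state model, and a halve-or-double rule for the sampling interval. The function name, the noise parameters `q` and `r`, the threshold, and the adjustment rule are all assumptions for this example; the abstract does not specify them, and the server-side rate allocation is omitted.

```python
def adaptive_sampling_intervals(readings, min_interval=1, max_interval=8,
                                error_threshold=0.5, q=0.01, r=0.1):
    """Return the sampling interval chosen after each reading.

    Assumptions (not from the paper): a scalar Kalman filter with a
    random-walk state model, process noise q and measurement noise r.
    The interval is halved when the estimation error exceeds
    error_threshold and doubled otherwise, clamped to
    [min_interval, max_interval]. For simplicity, `readings` is treated
    as the sequence of samples actually taken.
    """
    x, p = readings[0], 1.0      # state estimate and its variance
    interval = min_interval
    intervals = []
    for z in readings[1:]:
        p = p + q                # predict: a random walk keeps x, grows p
        error = abs(z - x)       # estimation error (innovation magnitude)
        k = p / (p + r)          # Kalman gain
        x = x + k * (z - x)      # update the estimate with the new reading
        p = (1 - k) * p          # update the variance
        if error > error_threshold:
            interval = max(min_interval, interval // 2)  # sample faster
        else:
            interval = min(max_interval, interval * 2)   # sample slower
        intervals.append(interval)
    return intervals
```

In the scheme described above, a request to the server would be issued at the points where this sketch clamps the interval to the allowed range.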
Communication-efficient online detection of network-wide anomalies
- In IEEE Conference on Computer Communications (INFOCOM)
, 2007
Cited by 23 (6 self)
There has been growing interest in building large-scale distributed monitoring systems for sensor, enterprise, and ISP networks. Recent work has proposed using Principal Component Analysis (PCA) over global traffic matrix statistics to effectively isolate network-wide anomalies. To allow such a PCA-based anomaly detection scheme to scale, we propose a novel approximation scheme that dramatically reduces the burden on the production network. Our scheme avoids the expensive step of centralizing all the data by performing intelligent filtering at the distributed monitors. This filtering reduces monitoring bandwidth overheads, but can result in the anomaly detector making incorrect decisions based on a perturbed view of the global data set. We employ stochastic matrix perturbation theory to bound such errors. Our algorithm selects the filtering parameters at local monitors such that the errors made by the detector are guaranteed to lie below a user-specified upper bound. Our algorithm thus allows network operators to explicitly balance the tradeoff between detection accuracy and the amount of data communicated over the network. In addition, our approach enables real-time detection because we exploit continuous monitoring at the distributed monitors. Experiments with traffic data from the Abilene backbone network demonstrate that our methods yield significant communication benefits while simultaneously achieving high detection accuracy.
Adaptive stream filters for entity-based queries with non-value tolerance
- in VLDB
, 2005
Cited by 22 (3 self)
We study the problem of applying adaptive filters for approximate query processing in a distributed stream environment. We propose filter bound assignment protocols with the objective of reducing communication cost. Most previous works focus on value-based queries (e.g., average) with numerical error tolerance. In this paper, we cover entity-based queries (e.g., a nearest neighbor query returns object names rather than a single value). In particular, we study non-value-based tolerance (e.g., the answer to the nearest-neighbor query should rank third or above). We investigate different non-value-based error tolerance definitions and discuss how they are applied to two classes of entity-based queries: non-rank-based and rank-based queries. Extensive experiments show that our protocols achieve significant savings in both communication overhead and server computation.
Communication-efficient tracking of distributed triggers
, 2006
Cited by 19 (7 self)
There has been growing interest in large-scale distributed monitoring systems, such as Dynamic Denial of Service attack detectors and sensornet-based environmental monitors. Recent work has posited that these infrastructures lack a critical component, namely a distributed-triggering mechanism that fires when an aggregate of remote-site behavior exceeds some threshold. For several scenarios, the trigger conditions of interest are naturally cumulative: they continuously monitor the accumulation of threshold infractions (e.g., resource overuse) over time. In this paper, we develop a novel framework and communication-efficient protocols to support distributed cumulative triggers. In sharp contrast to earlier work focusing on instantaneous violations, we introduce a general model of threshold conditions that enables us to track distributed cumulative violations over time windows of any size. In our system, a central coordinator efficiently tracks aggregate time-series data at remote sites by adaptively informing the sites how to locally filter their data and when to ship new information. Our proposed algorithmic framework allows us to: (1) provide guarantees on the coordinator’s triggering accuracy; (2) flexibly trade off communication overhead versus accuracy; and (3) develop an analytic solution for computing local filtering parameters. Our work is the first to solve the problem of communication-efficient monitoring for distributed cumulative trigger conditions using principled solutions with accuracy guarantees. We evaluate our system using time-series data generated from SNORT logs on PlanetLab nodes and demonstrate that our methods yield significant communication overhead reductions while simultaneously achieving high detection accuracy, even for highly variable data streams.
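The difference between an instantaneous and a cumulative trigger can be shown with a short Python sketch. It is a centralized, single-stream simplification: the trigger fires when the excess above `threshold`, accumulated over the last `window` samples, exceeds a `budget`. The function and parameter names are assumptions for this example, and the distributed protocol with local filtering, which is the paper's actual contribution, is not shown.

```python
from collections import deque

def cumulative_trigger(values, threshold, window, budget):
    """Return the indices at which a cumulative trigger fires.

    Simplified, centralized illustration (assumption, not the paper's
    protocol): at each step the excess max(0, v - threshold) is recorded,
    and the trigger fires when the excess summed over the most recent
    `window` samples is strictly greater than `budget`.
    """
    excess = deque(maxlen=window)  # sliding window of per-sample excess
    fired = []
    for i, v in enumerate(values):
        excess.append(max(0.0, v - threshold))
        if sum(excess) > budget:
            fired.append(i)
    return fired
```

With `values=[1, 5, 6, 1, 1, 7]`, `threshold=4`, `window=3` and `budget=2`, the sample at index 1 exceeds the threshold but does not fire the trigger on its own, whereas the accumulated excess at index 2 does.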
Information Fusion for Wireless Sensor Networks: Methods, Models, and Classifications
Cited by 17 (1 self)
Wireless sensor networks produce a large amount of data that needs to be processed, delivered, and assessed according to the application objectives. The way these data are manipulated by the sensor nodes is a fundamental issue. Information fusion arises as a response to process data gathered by sensor nodes and benefits from their processing capability. By exploiting the synergy among the available data, information fusion techniques can reduce the amount of data traffic, filter noisy measurements, and make predictions and inferences about a monitored entity. In this work, we survey the current state-of-the-art of information fusion by presenting the known methods, algorithms, architectures, and models of information fusion, and
Location-dependent queries in mobile contexts: Distributed processing using mobile agents
- IEEE TMC
, 2006
Cited by 14 (7 self)
With the current advances of mobile computing technology, we are witnessing an explosion in the development of applications that provide mobile users with a wide range of services. In this paper, we present a system that supports distributed processing of continuous location-dependent queries in mobile environments. The system that we propose presents the following main advantages: 1) it is a general solution for the processing of location-dependent queries in scenarios where not only the users issuing queries, but also other interesting objects can move; 2) it performs an efficient processing of these queries in a continuous way; 3) it is especially well adapted to environments where location data are distributed in a network and processing tasks can be performed in parallel, allowing a high scalability; and 4) it optimizes wireless communications. We use mobile agents to carry the processing tasks wherever they are needed. Thus, agents are in charge of tracking the location of interesting moving objects and refreshing the answer to a query efficiently. We evaluate the usefulness of the presented proposal showing that the system achieves a good precision and scales up well. Index Terms: Distributed processing of continuous location queries, tracking moving objects, mobile agents.

