Near-optimal sensor placements: Maximizing information while minimizing communication cost. In IPSN, 2006.
"... When monitoring spatial phenomena with wireless sensor networks, selecting the best sensor placements is a fundamental task. Not only should the sensors be informative, but they should also be able to communicate efficiently. In this paper, we present a data-driven approach that addresses the three ..."
Abstract - Cited by 152 (19 self)
When monitoring spatial phenomena with wireless sensor networks, selecting the best sensor placements is a fundamental task. Not only should the sensors be informative, but they should also be able to communicate efficiently. In this paper, we present a data-driven approach that addresses the three central aspects of this problem: measuring the predictive quality of a set of sensor locations (regardless of whether sensors were ever placed at these locations), predicting the communication cost involved with these placements, and designing an algorithm with provable quality guarantees that optimizes the NP-hard tradeoff. Specifically, we use data from a pilot deployment to build non-parametric probabilistic models called Gaussian Processes (GPs) both for the spatial phenomena of interest and for the spatial variability of link qualities, which allows us to estimate predictive power and communication cost of unsensed locations. Surprisingly, uncertainty in the representation of link qualities plays an important role in estimating communication costs. Using these models, we present a novel, polynomial-time, data-driven algorithm, pSPIEL, which selects Sensor Placements at Informative and cost-Effective Locations. Our approach exploits two important properties of this problem: submodularity, formalizing the intuition that adding a node to a small deployment can help more than adding a node to a large deployment; and locality, under which nodes that are far from each other provide almost independent information. Exploiting these properties, we prove strong approximation guarantees for our pSPIEL approach. We also provide extensive experimental validation of this practical approach on several real-world placement problems, and have built a complete system implementation on 46 Tmote Sky motes, demonstrating significant advantages over existing methods.
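As an illustration of the submodular, GP-based selection idea (a minimal sketch only, not pSPIEL itself, which additionally models link qualities, communication cost, and locality), the following Python fragment greedily picks the candidate location with the largest remaining GP posterior variance; the RBF kernel, candidate grid, and budget are invented for the example.

import numpy as np

def rbf_kernel(A, B, lengthscale=2.0, variance=1.0):
    # Squared-exponential covariance between two sets of 2-D locations.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale ** 2)

def greedy_placement(candidates, budget, noise=1e-3):
    # Repeatedly add the candidate whose GP posterior variance, given the
    # locations already chosen, is largest (a maximum-entropy greedy rule).
    chosen = []
    for _ in range(budget):
        best_i, best_var = None, -np.inf
        for i, x in enumerate(candidates):
            if i in chosen:
                continue
            prior = rbf_kernel(x[None, :], x[None, :])[0, 0]
            if chosen:
                S = candidates[chosen]
                K_SS = rbf_kernel(S, S) + noise * np.eye(len(chosen))
                k_xS = rbf_kernel(x[None, :], S)[0]
                var = prior - k_xS @ np.linalg.solve(K_SS, k_xS)
            else:
                var = prior
            if var > best_var:
                best_i, best_var = i, var
        chosen.append(best_i)
    return chosen

# Invented 10x10 grid of candidate locations; place 5 sensors.
grid = np.array([[i, j] for i in range(10) for j in range(10)], dtype=float)
print(greedy_placement(grid, budget=5))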
Approximate data collection in sensor networks using probabilistic models. In ICDE, 2006.
"... Wireless sensor networks are proving to be useful in a variety of settings. A core challenge in these networks is to minimize energy consumption. Prior database research has proposed to achieve this by pushing data-reducing operators like aggregation and selection down into the network. This approac ..."
Abstract - Cited by 148 (7 self)
Wireless sensor networks are proving to be useful in a variety of settings. A core challenge in these networks is to minimize energy consumption. Prior database research has proposed to achieve this by pushing data-reducing operators like aggregation and selection down into the network. This approach has proven unpopular with early adopters of sensor network technology, who typically want to extract complete “dumps” of the sensor readings, i.e., to run “SELECT *” queries. Unfortunately, because these queries do no data reduction, they consume significant energy in current sensornet query processors. In this paper we attack the “SELECT *” problem for sensor networks. We propose a robust approximate technique called Ken that uses replicated dynamic probabilistic models to minimize communication from sensor nodes to the network’s PC base station. In addition to data collection, we show that Ken is well suited to anomaly- and event-detection applications. A key challenge in this work is to intelligently exploit spatial correlations across sensor nodes without imposing undue sensor-to-sensor communication burdens to maintain the models. Using traces from two real-world sensor network deployments, we demonstrate that relatively simple models can provide significant communication (and hence energy) savings without undue sacrifice in result quality or frequency. Choosing optimally among even our simple models is NP-hard, but our experiments show that a greedy heuristic performs nearly as well as an exhaustive algorithm.
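The suppression mechanism behind replicated models can be pictured with the toy sketch below (Ken's actual models are dynamic and capture spatio-temporal correlations across nodes; the persistence predictor and the error bound eps here are assumptions for illustration). The node and the base station hold identical model copies, and a reading is transmitted only when it drifts from the shared prediction by more than eps.

class DualModel:
    # Identical copies run at the sensor node and at the base station.
    # A trivial "persistence" predictor: predict the last agreed-upon value.
    def __init__(self, initial):
        self.estimate = initial

    def predict(self):
        return self.estimate

    def update(self, value):
        self.estimate = value

def run_node(readings, eps=0.5):
    # Returns the (time, value) pairs the node actually transmits.
    node_model = DualModel(readings[0])
    transmitted = [(0, readings[0])]
    for t, r in enumerate(readings[1:], start=1):
        if abs(r - node_model.predict()) > eps:
            transmitted.append((t, r))   # send the reading upstream
            node_model.update(r)         # the base station applies the same update
    return transmitted

readings = [20.1, 20.2, 20.1, 21.0, 21.1, 23.5, 23.4]
print(run_node(readings, eps=0.5))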
MauveDB: supporting model-based user views in database systems. In SIGMOD, 2006.
"... Real-world data — especially when generated by distributed measurement infrastructures such as sensor networks — tends to be incomplete, imprecise, and erroneous, making it impossible to present it to users or feed it directly into applications. The traditional approach to dealing with this problem ..."
Abstract - Cited by 108 (7 self)
Real-world data — especially when generated by distributed measurement infrastructures such as sensor networks — tends to be incomplete, imprecise, and erroneous, making it impossible to present it to users or feed it directly into applications. The traditional approach to dealing with this problem is to first process the data using statistical or probabilistic models that can provide more robust interpretations of the data. Current database systems, however, do not provide adequate support for applying models to such data, especially when those models need to be frequently updated as new data arrives in the system. Hence, most scientists and engineers who depend on models for managing their data do not use database systems for archival or querying at all; at best, databases serve as a persistent raw data store. In this paper we define a new abstraction called model-based views and present the architecture of MauveDB, the system we are building to support such views. Just as traditional database views provide logical data independence, model-based views provide independence from the details of the underlying data generating mechanism and hide the irregularities of the data by using models to present a consistent view to the users. MauveDB supports a declarative language for defining model-based views, allows declarative querying over such views using SQL, and supports several different materialization strategies and techniques to efficiently maintain them in the face of frequent updates. We have implemented a prototype system that currently supports views based on regression and interpolation, using the Apache Derby open source DBMS, and we present results that show the utility and performance benefits that can be obtained by supporting several different types of model-based views in a database system.
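The idea of a model-based view can be pictured, outside any DBMS, as a virtual table whose contents come from a model that is refitted as raw tuples arrive. The sketch below is not MauveDB's syntax or implementation; the polynomial regression, its degree, and the (location, temperature) rows are assumptions for illustration.

import numpy as np

class RegressionView:
    # A "model-based view": queries read model predictions, not raw rows.
    def __init__(self, degree=2):
        self.degree = degree
        self.xs, self.ys = [], []
        self.coeffs = None

    def insert(self, x, y):
        # A new raw tuple arrives; refit the model (a real system would
        # maintain it incrementally or on demand).
        self.xs.append(x)
        self.ys.append(y)
        if len(self.xs) > self.degree:
            self.coeffs = np.polyfit(self.xs, self.ys, self.degree)

    def select(self, x):
        # Query the view at any x, even where no sensor reported a value.
        return float(np.polyval(self.coeffs, x))

view = RegressionView(degree=2)
for loc, temp in [(0, 18.0), (2, 19.5), (5, 22.1), (9, 24.0)]:
    view.insert(loc, temp)
print(view.select(4.0))   # interpolated reading at an unobserved location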
Information fusion for wireless sensor networks: methods, models, and classifications. Article ID 1267073, 2007.
A robust architecture for distributed inference in sensor networks, 2005.
"... Abstract — Many inference problems that arise in sensor networks require the computation of a global conclusion that is consistent with local information known to each node. A large class of these problems— including probabilistic inference, regression, and control problems—can be solved by message ..."
Abstract - Cited by 73 (2 self)
Many inference problems that arise in sensor networks require the computation of a global conclusion that is consistent with local information known to each node. A large class of these problems—including probabilistic inference, regression, and control problems—can be solved by message passing on a data structure called a junction tree. In this paper, we present a distributed architecture for solving these problems that is robust to unreliable communication and node failures. In this architecture, the nodes of the sensor network assemble themselves into a junction tree and exchange messages between neighbors to solve the inference problem efficiently and exactly. A key part of the architecture is an efficient distributed algorithm for optimizing the choice of junction tree to minimize the communication and computation required by inference. We present experimental results from a prototype implementation on a 97-node Mica2 mote network, as well as simulation results for three applications: distributed sensor calibration, optimal control, and sensor field modeling. These experiments demonstrate that our distributed architecture can solve many important inference problems exactly, efficiently, and robustly.
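Message passing on a junction tree can be illustrated with a tiny, centralized sketch (the paper's contribution is the distributed, failure-robust version of this computation; the two cliques and their factors below are invented). Each clique marginalizes its factor onto the separator and forwards the result to its neighbor, which combines it with its own local factor.

import numpy as np

# Two cliques over binary variables: C1 = {A, B}, C2 = {B, C}, separator {B}.
# phi1[a, b] and phi2[b, c] are made-up local factors (e.g. calibration models).
phi1 = np.array([[4.0, 1.0],
                 [1.0, 4.0]])
phi2 = np.array([[3.0, 1.0],
                 [1.0, 3.0]])

# Message C1 -> C2: marginalize the C1 factor onto the separator variable B.
msg_1_to_2 = phi1.sum(axis=0)          # shape (2,), indexed by b

# Clique belief at C2: local factor times incoming message.
belief_2 = phi2 * msg_1_to_2[:, None]  # shape (2, 2), indexed by (b, c)

# Posterior over C, consistent with the information held by both cliques.
p_c = belief_2.sum(axis=0)
p_c /= p_c.sum()
print(p_c)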
Using Probabilistic Models for Data Management in Acquisitional Environments, 2005.
"... Traditional database systems, particularly those focused on capturing and managing data from the real world, are poorly equipped to deal with the noise, loss, and uncertainty in data. We discuss a suite of techniques based on probabilistic models that are designed to allow database to tolerate noise ..."
Abstract - Cited by 56 (3 self)
Traditional database systems, particularly those focused on capturing and managing data from the real world, are poorly equipped to deal with the noise, loss, and uncertainty in data. We discuss a suite of techniques based on probabilistic models that are designed to allow databases to tolerate noise and loss. These techniques are based on exploiting correlations to predict missing values and identify outliers. Interestingly, correlations also provide a way to give approximate answers to users at a significantly lower cost and enable a range of new types of queries over the correlation structure itself. We illustrate a host of applications for our new techniques and queries, ranging from sensor networks to network monitoring to data stream management. We also present a unified architecture for integrating such models into database systems, focusing in particular on acquisitional systems where the cost of capturing data (e.g., from sensors) is itself a significant part of the query processing cost.
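How correlations support imputation and outlier detection can be sketched with a two-sensor conditional Gaussian (one of many possible model choices; the mean, covariance, and threshold below are invented).

import numpy as np

# Joint Gaussian over two correlated sensors (invented parameters).
mu = np.array([20.0, 21.0])
Sigma = np.array([[2.0, 1.8],
                  [1.8, 2.0]])

def predict_missing(observed_value):
    # E[X2 | X1 = observed] and Var[X2 | X1] from the conditional Gaussian.
    mean = mu[1] + Sigma[1, 0] / Sigma[0, 0] * (observed_value - mu[0])
    var = Sigma[1, 1] - Sigma[1, 0] ** 2 / Sigma[0, 0]
    return mean, var

def is_outlier(x1, x2, k=3.0):
    # Flag x2 if it falls more than k standard deviations from its
    # correlation-based prediction given x1.
    mean, var = predict_missing(x1)
    return abs(x2 - mean) > k * np.sqrt(var)

print(predict_missing(22.0))      # impute sensor 2 when its reading is lost
print(is_outlier(22.0, 35.0))     # an implausible pairing is flagged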
A sampling-based approach to optimizing top-k queries in sensor networks. In ICDE, 2006.
"... Wireless sensor networks generate a vast amount of data. This data, however, must be sparingly extracted to conserve energy, usually the most precious resource in battery-powered sensors. When approximation is acceptable, a model-driven approach to query processing is effective in saving energy by a ..."
Abstract - Cited by 53 (3 self)
Wireless sensor networks generate a vast amount of data. This data, however, must be sparingly extracted to conserve energy, usually the most precious resource in battery-powered sensors. When approximation is acceptable, a model-driven approach to query processing is effective in saving energy by avoiding contacting nodes whose values can be predicted or are unlikely to be in the result set. However, to optimize queries such as top-k, reasoning directly with models of joint probability distributions can be prohibitively expensive. Instead of using models explicitly, we propose to use samples of past sensor readings. Not only are such samples simple to maintain, but they are also computationally efficient to use in query optimization. With these samples, we can formulate the problem of optimizing approximate top-k queries under an energy constraint as a linear program. We demonstrate the power and flexibility of our sampling-based approach by developing a series of top-k query planning algorithms with linear programming, which are capable of efficiently producing plans with better performance and novel features. We show that our approach is both theoretically sound and practically effective on simulated and real-world datasets.
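The role of the samples can be sketched as follows: from past joint readings, estimate how often each node lands in the top-k, and only contact nodes whose inclusion probability is non-negligible. This omits the linear-programming plan optimization described above; the trace, threshold, and node count are invented.

import numpy as np

def topk_inclusion_prob(samples, k):
    # samples: (num_snapshots, num_nodes) matrix of past joint readings.
    # Returns, per node, the fraction of snapshots in which it ranked top-k.
    counts = np.zeros(samples.shape[1])
    for row in samples:
        counts[np.argsort(row)[-k:]] += 1
    return counts / len(samples)

rng = np.random.default_rng(0)
# Invented trace: node 3 runs systematically hotter than the others.
samples = rng.normal(20.0, 1.0, size=(200, 5))
samples[:, 3] += 2.5

probs = topk_inclusion_prob(samples, k=2)
must_probe = np.where(probs > 0.1)[0]   # nodes worth contacting for a top-2 query
print(probs, must_probe)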
Model-based Approximate Querying in Sensor Networks. VLDB Journal, 2005.
"... Declarative queries are proving to be an attractive paradigm for interacting with networks of wireless sensors. The metaphor that “the sensornet is a database” is problematic, however, because sensors do not exhaustively represent the data in the real world. In order to map the raw sensor readings ..."
Abstract - Cited by 51 (0 self)
Declarative queries are proving to be an attractive paradigm for interacting with networks of wireless sensors. The metaphor that “the sensornet is a database” is problematic, however, because sensors do not exhaustively represent the data in the real world. In order to map the raw sensor readings onto physical reality, a model of that reality is required to complement the readings. In this article, we enrich interactive sensor querying with statistical modeling techniques. We demonstrate that such models can help provide answers that are both more meaningful, and, by introducing approximations with probabilistic confidences, significantly more efficient to compute in both time and energy. Utilizing the combination of a model and live data acquisition raises the challenging optimization problem of selecting the best sensor readings to acquire, balancing the increase in the confidence of our answer against the communication and data acquisition costs in the network. We describe an exponential time algorithm for finding the optimal solution to this optimization problem, and a polynomial-time heuristic for identifying solutions that perform well in practice. We evaluate our approach on several real-world sensor-network data sets, taking into account the real measured data and communication quality, demonstrating that our model-based approach provides a high-fidelity representation of the real phenomena and leads to significant performance gains versus traditional data acquisition techniques.
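A stripped-down version of the acquisition-planning problem can be sketched as greedy variance reduction in a multivariate Gaussian: keep reading the sensor that most tightens the posterior on the queried attribute until a confidence target is met. This ignores the acquisition and communication costs that the article's optimization balances; the covariance matrix and variance target below are invented.

import numpy as np

def posterior_var(Sigma, target, observed):
    # Var[X_target | X_observed] for a zero-mean Gaussian with covariance Sigma.
    if not observed:
        return Sigma[target, target]
    o = list(observed)
    S_oo = Sigma[np.ix_(o, o)]
    S_to = Sigma[target, o]
    return Sigma[target, target] - S_to @ np.linalg.solve(S_oo, S_to)

def plan_acquisition(Sigma, target, max_var):
    # Greedily pick sensors to observe until the query's variance target is met.
    observed = []
    candidates = [i for i in range(Sigma.shape[0]) if i != target]
    while posterior_var(Sigma, target, observed) > max_var and candidates:
        best = min(candidates,
                   key=lambda i: posterior_var(Sigma, target, observed + [i]))
        observed.append(best)
        candidates.remove(best)
    return observed

# Invented covariance over 4 sensors; the query asks about sensor 0.
Sigma = np.array([[2.0, 1.5, 0.3, 1.2],
                  [1.5, 2.0, 0.2, 0.9],
                  [0.3, 0.2, 2.0, 0.1],
                  [1.2, 0.9, 0.1, 2.0]])
print(plan_acquisition(Sigma, target=0, max_var=0.8))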
Robust Probabilistic Inference in Distributed Systems. In UAI, 2004.
"... Probabilistic inference problems arise naturally in distributed systems such as sensor networks and teams of mobile robots. Inference algorithms that use message passing are a natural fit for distributed systems, but they must be robust to the failure situations that arise in real-world setting ..."
Abstract - Cited by 44 (5 self)
Probabilistic inference problems arise naturally in distributed systems such as sensor networks and teams of mobile robots. Inference algorithms that use message passing are a natural fit for distributed systems, but they must be robust to the failure situations that arise in real-world settings, such as unreliable communication and node failures. Unfortunately, the popular sum-product algorithm can yield very poor estimates in these settings because the nodes' beliefs before convergence can be arbitrarily different from the correct posteriors. In this paper, we present a new message passing algorithm for probabilistic inference which provides several crucial guarantees that the standard sum-product algorithm does not. Not only does it converge to the correct posteriors, but it is also guaranteed to yield a principled approximation at any point before convergence. In addition, the computational complexity of the message passing updates depends only upon the model, and is independent of the network topology of the distributed system. We demonstrate the approach with detailed experimental results on a distributed sensor calibration task using data from an actual sensor network deployment.
An Adaptive Strategy for Quality-Based Data Reduction in Wireless Sensor Networks. In Proceedings of the 3rd International Conference on Networked Sensing Systems (INSS’06), 2006.
"... Wireless sensor networks allow fine-grained observations of real-world phenomena. However, providing constant measurement updates incurs high communication costs for each individual node, resulting in increased energy depletion in the network. Data reduction strategies aim at reducing the amount of ..."
Abstract - Cited by 40 (4 self)
Wireless sensor networks allow fine-grained observations of real-world phenomena. However, providing constant measurement updates incurs high communication costs for each individual node, resulting in increased energy depletion in the network. Data reduction strategies aim at reducing the amount of data sent by each node, for example by predicting the measured values both at the source and the sink node, thus only requiring nodes to send the readings that deviate from the prediction. While effectively reducing power consumption, such techniques so far needed to rely on a priori knowledge to correctly model the expected values. Our approach instead employs an algorithm that requires no prior modeling, allowing nodes to work independently and without using global model parameters. Using the Least-Mean-Square (LMS) adaptive algorithm on a publicly available, real-world (office environment) temperature data set, we have been able to achieve up to 92% communication reduction while maintaining a minimal accuracy of 0.5 degrees Celsius.
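The dual-prediction scheme can be sketched as below (a simplification of the protocol described above; the filter order, step size, and error bound emax are assumed values, and a normalized LMS step is used purely to keep the toy example numerically stable). The source and the sink run identical filters, and a reading is transmitted only when the filter's prediction misses it by more than the accuracy bound.

import numpy as np

class LMSPredictor:
    # N-tap adaptive filter; identical copies run at the source and the sink.
    # A normalized LMS update is used here for numerical stability.
    def __init__(self, taps=4, mu=0.5):
        self.w = np.zeros(taps)        # filter weights
        self.history = np.zeros(taps)  # most recent agreed-upon values
        self.mu = mu

    def predict(self):
        return float(self.w @ self.history)

    def update(self, value):
        error = value - self.predict()
        norm = 1e-6 + self.history @ self.history
        self.w += self.mu * error * self.history / norm
        self.history = np.roll(self.history, 1)
        self.history[0] = value

def dual_prediction(readings, emax=0.5):
    # Count how many readings the source actually has to transmit.
    source = LMSPredictor()
    sent = 0
    for r in readings:
        pred = source.predict()
        if abs(r - pred) > emax:
            sent += 1            # transmit r; the sink applies the same update
            source.update(r)
        else:
            source.update(pred)  # both sides proceed with the predicted value
    return sent, len(readings)

# Invented smooth temperature trace; most readings end up suppressed.
temps = 20.0 + 2.0 * np.sin(np.linspace(0, 6 * np.pi, 300))
print(dual_prediction(temps, emax=0.5))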