Results 1 – 10 of 32
A Constant-Factor Approximation Algorithm for the Multicommodity Rent-or-Buy Problem
Abstract

Cited by 82 (8 self)
Recent work on buy-at-bulk network design has concentrated on the special case where all sinks are identical; existing constant-factor approximation algorithms for this special case make crucial use of the assumption that all commodities ship flow to the same sink vertex and do not obviously extend to the multicommodity rent-or-buy problem. Prior to our work, the best heuristics for the multicommodity rent-or-buy problem achieved only logarithmic performance guarantees and relied on the machinery of relaxed metrical task systems or of metric embeddings. By contrast, we solve the network design problem directly via a novel primal-dual algorithm.
Adaptive ordering of pipelined stream filters
, 2003
Abstract

Cited by 77 (13 self)
We consider the problem of pipelined filters, where a continuous stream of tuples is processed by a set of commutative filters. Pipelined filters are common in stream applications and capture a large class of multi-way stream joins. We focus on the problem of ordering the filters adaptively to minimize processing cost in an environment where stream and filter characteristics vary unpredictably over time. Our core algorithm, A-Greedy (for Adaptive Greedy), has strong theoretical guarantees: if stream and filter characteristics were to stabilize, A-Greedy would converge to an ordering within a small constant factor of optimal. (In experiments A-Greedy usually converges to the optimal ordering.) One very important feature of A-Greedy is that it monitors and responds to selectivities that are correlated across filters (i.e., that are non-independent), which provides the strong quality guarantee but incurs runtime overhead. We identify a three-way trade-off among provable convergence to good orderings, runtime overhead, and speed of adaptivity. We develop a suite of variants of A-Greedy that lie at different points on this trade-off spectrum. We have implemented all our algorithms in the STREAM prototype Data Stream Management System, and a thorough performance evaluation is presented.
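For independent, commutative filters with fixed statistics, the classical static optimum that an adaptive scheme of this kind converges toward is simple to state: sort filters by the rank cost / (1 − selectivity). A minimal sketch (the filter names and numbers below are illustrative, not from the paper):

```python
def order_filters(filters):
    """filters: list of (name, cost, selectivity); selectivity is the
    probability a tuple passes the filter (assumed < 1 and independent).
    The rank c / (1 - s) ordering minimizes expected per-tuple cost."""
    return sorted(filters, key=lambda f: f[1] / (1.0 - f[2]))

def expected_cost(ordered):
    """Expected per-tuple cost: each filter runs only on the fraction of
    tuples that survived all earlier filters."""
    cost, survive = 0.0, 1.0
    for _, c, s in ordered:
        cost += survive * c
        survive *= s
    return cost

# toy instance: f3 is cheap and highly selective, so it should run first
filters = [("f1", 1.0, 0.9), ("f2", 5.0, 0.1), ("f3", 2.0, 0.5)]
plan = order_filters(filters)
```

The adaptive setting in the paper is harder precisely because the costs and selectivities drift and may be correlated, so this static rank must be re-estimated online.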
Operator Placement for In-Network Stream Query Processing
In Proc. of the 24th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS)
, 2005
Abstract

Cited by 61 (0 self)
In sensor networks, data acquisition frequently takes place at low-capability devices. The acquired data is then transmitted through a hierarchy of nodes having progressively increasing network bandwidth and computational power. We consider the problem of executing queries over these data streams, posed at the root of the hierarchy. To minimize data transmission, it is desirable to perform “in-network” query processing: do some part of the work at intermediate nodes as the data travels to the root. Most previous work on in-network query processing has focused on aggregation and inexpensive filters. In this paper, we address in-network processing for queries involving possibly expensive conjunctive filters and joins. We consider the problem of placing operators along the nodes of the hierarchy so that the overall cost of computation and data transmission is minimized. We show that the problem is tractable, give an optimal algorithm, and demonstrate that a simpler greedy operator placement algorithm can fail to find the optimal solution. Finally, we define a number of interesting variations of the basic operator placement problem and demonstrate their hardness.
Efficient Search for the Top-k Probable Nearest Neighbors in Uncertain Databases
Abstract

Cited by 41 (1 self)
Uncertainty pervades many domains in our lives. Current real-life applications, e.g., location tracking using GPS devices or cell phones, multimedia feature extraction, and sensor data management, deal with different kinds of uncertainty. Finding the nearest neighbor objects to a given query point is an important query type in these applications. In this paper, we study the problem of finding objects with the highest marginal probability of being the nearest neighbors to a query object. We adopt a general uncertainty model allowing for data and query uncertainty. Under this model, we define new query semantics, and provide several efficient evaluation algorithms. We analyze the cost factors involved in query evaluation, and present novel techniques to address the trade-offs among these factors. We give multiple extensions to our techniques, including handling dependencies among data objects and answering threshold queries. We conduct an extensive experimental study to evaluate our techniques on both real and synthetic data.
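The query semantics being described — each object's marginal probability of being the nearest neighbor — can be illustrated with a brute-force Monte Carlo estimator. This sketch is purely illustrative of the semantics, not one of the paper's evaluation algorithms; the objects, locations, and uniform-distribution assumption are all hypothetical:

```python
import random

def nn_probabilities(objects, q, trials=20000, seed=0):
    """objects: dict name -> list of equally likely 1-D locations.
    Estimates, for each object, the probability that its realized
    location is the closest one to query point q."""
    rng = random.Random(seed)
    wins = {name: 0 for name in objects}
    for _ in range(trials):
        # draw one location per object (independent realizations)
        sample = {n: rng.choice(locs) for n, locs in objects.items()}
        winner = min(sample, key=lambda n: abs(sample[n] - q))
        wins[winner] += 1
    return {n: w / trials for n, w in wins.items()}

# toy instance: A is either very near or very far from q = 0
objs = {"A": [1.0, 9.0], "B": [3.0, 4.0]}
probs = nn_probabilities(objs, q=0.0)
```

Here A wins exactly when it realizes location 1.0 (probability 0.5), so both objects have marginal NN probability about 0.5; the paper's contribution is computing such probabilities efficiently rather than by sampling.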
Exploiting correlated attributes in acquisitional query processing
 In ICDE
, 2005
Abstract

Cited by 33 (6 self)
Sensor networks and other distributed information systems (such as the Web) must frequently access data that has a high per-attribute acquisition cost, in terms of energy, latency, or computational resources. When executing queries that contain several predicates over such expensive attributes, we observe that it can be beneficial to use correlations to automatically introduce low-cost attributes whose observation will allow the query processor to better estimate the selectivity of these expensive predicates. In particular, we show how to build conditional plans that branch into one or more sub-plans, each with a different ordering for the expensive query predicates, based on the runtime observation of low-cost attributes. We frame the problem of constructing the optimal conditional plan for a given user query and set of candidate low-cost attributes as an optimization problem. We describe an exponential-time algorithm for finding such optimal plans, and describe a polynomial-time heuristic for identifying conditional plans that perform well in practice. We also show how to compactly model the conditional probability distributions needed to identify correlations and build these plans. We evaluate our algorithms against several real-world sensor-network data sets, showing several-times performance increases for a variety of queries versus traditional optimization techniques.
An online algorithm for maximizing submodular functions
, 2007
Abstract

Cited by 31 (9 self)
We present an algorithm for solving a broad class of online resource allocation problems: jobs arrive one at a time, and one can complete the jobs by investing time in a number of abstract activities, according to some schedule. We assume that the fraction of jobs completed by a schedule is a monotone, submodular function of a set of pairs (v, τ), where τ is the time invested in activity v. Under this assumption, our online algorithm performs near-optimally according to two natural metrics: (i) the fraction of jobs completed within time T, for some fixed deadline T > 0, and (ii) the average time required to complete each job. We evaluate our algorithm experimentally by using it to learn, online, a schedule for allocating CPU time among solvers entered in the 2007 SAT solver competition.
Adaptive submodularity: Theory and applications in active learning and stochastic optimization
 J. Artificial Intelligence Research
, 2011
Abstract

Cited by 19 (6 self)
Many problems in artificial intelligence require adaptively making a sequence of decisions with uncertain outcomes under partial observability. Solving such stochastic optimization problems is a fundamental but notoriously difficult challenge. In this paper, we introduce the concept of adaptive submodularity, generalizing submodular set functions to adaptive policies. We prove that if a problem satisfies this property, a simple adaptive greedy algorithm is guaranteed to be competitive with the optimal policy. In addition to providing performance guarantees for both stochastic maximization and coverage, adaptive submodularity can be exploited to drastically speed up the greedy algorithm by using lazy evaluations. We illustrate the usefulness of the concept by giving several examples of adaptive submodular objectives arising in diverse AI applications, including management of sensing resources, viral marketing, and active learning. Proving adaptive submodularity for these problems allows us to recover existing results in these applications as special cases, improve approximation guarantees, and handle natural generalizations.
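The lazy-evaluation speedup mentioned here rests on a simple observation: under submodularity, marginal gains only shrink as the chosen set grows, so a stale cached gain is a valid upper bound. A minimal sketch of lazy greedy for a monotone submodular set function (the coverage objective below is an illustrative toy, not from the paper):

```python
import heapq

def lazy_greedy(candidates, gain, k):
    """gain(item, chosen) -> marginal value of adding item to chosen.
    Re-evaluates a candidate only when its cached (possibly stale)
    gain reaches the top of the max-heap."""
    chosen = []
    heap = [(-gain(c, []), c) for c in candidates]  # gains w.r.t. empty set
    heapq.heapify(heap)
    while heap and len(chosen) < k:
        neg, item = heapq.heappop(heap)
        fresh = gain(item, chosen)
        if not heap or fresh >= -heap[0][0]:
            # submodularity: every other cached value is an upper bound,
            # so no unexamined candidate can beat `fresh`
            chosen.append(item)
        else:
            heapq.heappush(heap, (-fresh, item))
    return chosen

# toy coverage instance: pick the sets covering the most new elements
sets = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}}

def cover_gain(s, chosen):
    covered = {e for x in chosen for e in sets[x]}
    return len(sets[s] - covered)
```

The adaptive setting of the paper replaces sets with policies and observations, but the same lazy bound argument carries over.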
Approximating Optimal Binary Decision Trees
Abstract

Cited by 10 (0 self)
We give a (ln n + 1)-approximation for the decision tree (DT) problem. We also show that DT does not have a PTAS unless P=NP. An instance of DT is a set of m binary tests T = (T1, ..., Tm) and a set of n items X = (X1, ..., Xn). The goal is to output a binary tree where each internal node is a test, each leaf is an item, and the total external path length of the tree is minimized. DT has a rich history in computer science, with applications ranging from medical diagnosis to experiment design. Our work, while providing the first nontrivial upper and lower bounds on approximating DT, also demonstrates that DT and a subtly different problem which also bears the name decision tree (but which we call ConDT) have fundamentally different approximation complexity. We conclude with a stronger lower bound for a third decision tree problem called MinDT.
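The DT instance defined above is concrete enough to illustrate with a toy greedy builder: recursively pick a test and split the remaining items on its outcome until each leaf holds one item. The balance heuristic below is a common choice for illustration only, not necessarily the paper's (ln n + 1)-approximation algorithm; the items and tests are made up:

```python
def build_tree(items, tests, outcome):
    """outcome(test, item) -> 0 or 1. Returns a nested (test, left, right)
    tuple with single items at the leaves. Assumes every pair of items
    is separated by some test."""
    if len(items) == 1:
        return items[0]

    def imbalance(t):
        # how far test t is from splitting the items in half
        ones = sum(outcome(t, x) for x in items)
        return abs(2 * ones - len(items))

    # only tests that actually separate the current items are usable
    usable = [t for t in tests
              if 0 < sum(outcome(t, x) for x in items) < len(items)]
    t = min(usable, key=imbalance)
    zeros = [x for x in items if outcome(t, x) == 0]
    ones = [x for x in items if outcome(t, x) == 1]
    return (t, build_tree(zeros, tests, outcome), build_tree(ones, tests, outcome))

# items are 3-bit strings; test i reads bit i
items = ["000", "011", "101", "110"]
tree = build_tree(items, [0, 1, 2], lambda t, x: int(x[t]))
```

On this instance every test splits the items 2/2, so the greedy builds a depth-2 tree with total external path length 8, which is optimal here.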
Greedy is Good: On Service Tree Placement for In-Network Stream Processing
Abstract

Cited by 10 (0 self)
This paper is concerned with reducing communication costs when executing distributed user tasks in a sensor network. We take a service-oriented abstraction of sensor networks, where a user task is composed of a set of data processing modules (called services) with dependencies. Communications in sensor networks consume significant energy and introduce uncertainty in data fidelity due to high bit error rates. These constraints are abstracted as costs on the communication graph. The goal is to place the services within the sensor network so that the communication cost in performing the task is minimized. In addition, since the lifetime of a node, the quality of network links, and the composition of the service graph may change over time, the quality of the placement must be maintained in the face of these dynamics. In this paper, we take a fresh look at what is generally considered a simple but poorly performing approach for service placement, namely the greedy algorithm. We prove that a modified greedy algorithm is guaranteed to have cost at most 8 times that of the optimum placement. In fact, the guarantee is even stronger if there is a high degree of data reduction in the service graph. The advantage of the greedy placement strategy is that when there are local changes in the service graph or when a hosting node fails, the repair only affects the placement of services that depend on the changes. Simulations suggest that in practice the greedy algorithm finds a low-cost placement. Furthermore, the cost of repairing a greedy placement decreases rapidly as a function of the proximity of the services to be aggregated.
Information acquisition and exploitation in multi-channel wireless systems
 IEEE Transactions on Information Theory
, 2007
Abstract

Cited by 8 (5 self)
A wireless system with multiple channels is considered, where each channel has several transmission states. A user learns about the instantaneous state of an available channel by transmitting a control packet in it. Since probing all channels consumes significant energy and time, a user needs to determine what and how much information it needs to acquire about the instantaneous states of the available channels so that it can maximize its transmission rate. This motivates the study of the trade-off between the cost of information acquisition and its value towards improving the transmission rate. A simple model is presented for studying this information acquisition and exploitation trade-off when the channels are multi-state, with different distributions and information acquisition costs. The objective is to maximize a utility function which depends on both the cost and value of information. Solution techniques are presented for computing near-optimal policies with succinct representation in polynomial time. These policies provably achieve at least a fixed constant factor of the optimal utility on any problem instance and, in addition, have natural characterizations. The techniques are based on exploiting the structure of the optimal policy, and on using Lagrangean relaxations to simplify the space of approximately optimal solutions.