Results 1–10 of 24
A constant-factor approximation algorithm for the multicommodity rent-or-buy problem
In Proceedings of the 43rd Annual IEEE Symposium on Foundations of Computer Science, 2002
"... We present the first constantfactor approximation algorithm for network design with multiple commodities and economies of scale. We consider the rentorbuy problem, a type of multicommodity buyatbulk network design in which there are two ways to install capacity on any given edge. Capacity can b ..."
Abstract

Cited by 83 (9 self)
 Add to MetaCart
We present the first constant-factor approximation algorithm for network design with multiple commodities and economies of scale. We consider the rent-or-buy problem, a type of multicommodity buy-at-bulk network design in which there are two ways to install capacity on any given edge. Capacity can be rented, with cost incurred on a per-unit-of-capacity basis, or bought, which allows unlimited use after payment of a large fixed cost. Given a graph and a set of source-sink pairs, we seek a minimum-cost way of installing sufficient capacity on edges so that a prescribed amount of flow can be sent simultaneously from each source to the corresponding sink. Recent work on buy-at-bulk network design has concentrated on the special case where all sinks are identical; existing constant-factor approximation algorithms for this special case make crucial use of the assumption that all commodities ship flow to the same sink vertex and do not obviously extend to the multicommodity rent-or-buy problem. Prior to our work, the best heuristics for the multicommodity rent-or-buy problem achieved only logarithmic performance guarantees and relied on the machinery of relaxed metrical task systems or of metric embeddings. By contrast, we solve the network design problem directly via a novel primal-dual algorithm.
Operator Placement for In-Network Stream Query Processing
In Proc. of the 24th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS), 2005
"... In sensor networks, data acquisition frequently takes place at lowcapability devices. The acquired data is then transmitted through a hierarchy of nodes having progressively increasing network bandwidth and computational power. We consider the problem of executing queries over these data streams, p ..."
Abstract

Cited by 58 (0 self)
 Add to MetaCart
In sensor networks, data acquisition frequently takes place at low-capability devices. The acquired data is then transmitted through a hierarchy of nodes having progressively increasing network bandwidth and computational power. We consider the problem of executing queries over these data streams, posed at the root of the hierarchy. To minimize data transmission, it is desirable to perform “in-network” query processing: do some part of the work at intermediate nodes as the data travels to the root. Most previous work on in-network query processing has focused on aggregation and inexpensive filters. In this paper, we address in-network processing for queries involving possibly expensive conjunctive filters, and joins. We consider the problem of placing operators along the nodes of the hierarchy so that the overall cost of computation and data transmission is minimized. We show that the problem is tractable, give an optimal algorithm, and demonstrate that a simpler greedy operator placement algorithm can fail to find the optimal solution. Finally, we define a number of interesting variations of the basic operator placement problem and demonstrate their hardness.
Efficient Search for the Top-k Probable Nearest Neighbors in Uncertain Databases
"... Uncertainty pervades many domains in our lives. Current reallife applications, e.g., location tracking using GPS devices or cell phones, multimedia feature extraction, and sensor data management, deal with different kinds of uncertainty. Finding the nearest neighbor objects to a given query point i ..."
Abstract

Cited by 41 (0 self)
 Add to MetaCart
Uncertainty pervades many domains in our lives. Current real-life applications, e.g., location tracking using GPS devices or cell phones, multimedia feature extraction, and sensor data management, deal with different kinds of uncertainty. Finding the nearest neighbor objects to a given query point is an important query type in these applications. In this paper, we study the problem of finding objects with the highest marginal probability of being the nearest neighbors to a query object. We adopt a general uncertainty model allowing for data and query uncertainty. Under this model, we define new query semantics, and provide several efficient evaluation algorithms. We analyze the cost factors involved in query evaluation, and present novel techniques to address the trade-offs among these factors. We give multiple extensions to our techniques, including handling dependencies among data objects, and answering threshold queries. We conduct an extensive experimental study to evaluate our techniques on both real and synthetic data.
Exploiting correlated attributes in acquisitional query processing
In ICDE, 2005
"... Sensor networks and other distributed information systems (such as the Web) must frequently access data that has a high perattribute acquisition cost, in terms of energy, latency, or computational resources. When executing queries that contain several predicates over such expensive attributes, we o ..."
Abstract

Cited by 32 (6 self)
 Add to MetaCart
Sensor networks and other distributed information systems (such as the Web) must frequently access data that has a high per-attribute acquisition cost, in terms of energy, latency, or computational resources. When executing queries that contain several predicates over such expensive attributes, we observe that it can be beneficial to use correlations to automatically introduce low-cost attributes whose observation will allow the query processor to better estimate the selectivity of these expensive predicates. In particular, we show how to build conditional plans that branch into one or more sub-plans, each with a different ordering for the expensive query predicates, based on the run-time observation of low-cost attributes. We frame the problem of constructing the optimal conditional plan for a given user query and set of candidate low-cost attributes as an optimization problem. We describe an exponential-time algorithm for finding such optimal plans, and describe a polynomial-time heuristic for identifying conditional plans that perform well in practice. We also show how to compactly model the conditional probability distributions needed to identify correlations and build these plans. We evaluate our algorithms against several real-world sensor-network data sets, showing several-times performance increases for a variety of queries versus traditional optimization techniques.
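As an illustrative sketch of the branching idea in this abstract: once a low-cost attribute has been observed, each branch of a conditional plan can order the expensive conjunctive predicates by the classic rank cost / (1 − selectivity), using selectivities conditioned on the observed value. All names and numbers below are hypothetical, and this is a simplification of the paper's optimization problem, not its algorithm.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Pred:
    name: str
    cost: float  # acquisition/evaluation cost of the expensive attribute

def plan_for(cheap_value, preds, sel):
    """Order expensive conjunctive predicates for one branch of a
    conditional plan: by increasing cost / (1 - selectivity), where
    selectivity is P(predicate passes | cheap attribute = cheap_value).
    This rank ordering is optimal when the predicates are independent."""
    return [p.name for p in sorted(
        preds, key=lambda p: p.cost / (1.0 - sel[p.name, cheap_value]))]

# Hypothetical numbers: at night the motion predicate almost never
# passes, so evaluating it first short-circuits the costly temperature
# probe; during the day it almost always passes, so it goes last.
preds = [Pred("temp_high", cost=4.0), Pred("motion", cost=1.0)]
sel = {("temp_high", "day"): 0.6, ("temp_high", "night"): 0.6,
       ("motion", "day"): 0.95, ("motion", "night"): 0.05}
plan = {v: plan_for(v, preds, sel) for v in ("day", "night")}
```

The branching plan evaluates predicates in a different order per observed value, which is the source of the savings the abstract describes.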
An online algorithm for maximizing submodular functions
2007
"... We present an algorithm for solving a broad class of online resource allocation jobs arrive one at a time, and one can complete the jobs by investing time in a number of abstract activities, according to some schedule. We assume that the fraction of jobs completed by a schedule is a monotone, submod ..."
Abstract

Cited by 28 (9 self)
 Add to MetaCart
We present an algorithm for solving a broad class of online resource allocation problems: jobs arrive one at a time, and one can complete the jobs by investing time in a number of abstract activities, according to some schedule. We assume that the fraction of jobs completed by a schedule is a monotone, submodular function of a set of pairs (v, τ), where τ is the time invested in activity v. Under this assumption, our online algorithm performs near-optimally according to two natural metrics: (i) the fraction of jobs completed within time T, for some fixed deadline T > 0, and (ii) the average time required to complete each job. We evaluate our algorithm experimentally by using it to learn, online, a schedule for allocating CPU time among solvers entered in the 2007 SAT solver competition.
Adaptive submodularity: Theory and applications in active learning and stochastic optimization
J. Artificial Intelligence Research, 2011
"... Many problems in artificial intelligence require adaptively making a sequence of decisions with uncertain outcomes under partial observability. Solving such stochastic optimization problems is a fundamental but notoriously difficult challenge. In this paper, we introduce the concept of adaptive subm ..."
Abstract

Cited by 12 (4 self)
 Add to MetaCart
Many problems in artificial intelligence require adaptively making a sequence of decisions with uncertain outcomes under partial observability. Solving such stochastic optimization problems is a fundamental but notoriously difficult challenge. In this paper, we introduce the concept of adaptive submodularity, generalizing submodular set functions to adaptive policies. We prove that if a problem satisfies this property, a simple adaptive greedy algorithm is guaranteed to be competitive with the optimal policy. In addition to providing performance guarantees for both stochastic maximization and coverage, adaptive submodularity can be exploited to drastically speed up the greedy algorithm by using lazy evaluations. We illustrate the usefulness of the concept by giving several examples of adaptive submodular objectives arising in diverse AI applications, including management of sensing resources, viral marketing, and active learning. Proving adaptive submodularity for these problems allows us to recover existing results in these applications as special cases, improve approximation guarantees, and handle natural generalizations.
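The lazy-evaluation speedup this abstract mentions relies on submodularity: a cached marginal gain can only shrink as the selected set grows, so it upper-bounds the true gain. Below is a minimal non-adaptive sketch for cardinality-constrained maximization; the toy coverage objective and all names are illustrative, not the paper's code.

```python
import heapq

def lazy_greedy(elements, f, k):
    """Pick k elements greedily, re-evaluating marginal gains lazily.

    Submodularity guarantees a stale (cached) gain upper-bounds the
    true gain, so an element whose freshly computed gain still tops
    the heap is the true greedy choice."""
    selected = []
    base = f(selected)
    # Max-heap of (-cached_gain, element); all cached gains start stale.
    heap = [(-(f([e]) - base), e) for e in elements]
    heapq.heapify(heap)
    while heap and len(selected) < k:
        _, e = heapq.heappop(heap)
        true_gain = f(selected + [e]) - base
        if not heap or true_gain >= -heap[0][0]:
            selected.append(e)          # gain still best: commit
            base += true_gain
        else:
            heapq.heappush(heap, (-true_gain, e))  # refresh and retry
    return selected

# Toy monotone submodular objective: set coverage.
sets = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6}, "d": {1, 6}}
cover = lambda chosen: len(set().union(*[sets[e] for e in chosen]))
chosen = lazy_greedy(list(sets), cover, 2)
```

Only elements whose cached gain reaches the top of the heap are ever re-evaluated, which is where the speedup comes from on large ground sets.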
Approximating Optimal Binary Decision Trees
"... Abstract. We give a (ln n + 1)approximation for the decision tree (DT) problem. We also show that DT does not have a PTAS unless P=NP. An instance of DT is a set of m binary tests T = (T1,..., Tm) and a set of n items X = (X1,..., Xn). The goal is to output a binary tree where each internal node is ..."
Abstract

Cited by 10 (0 self)
 Add to MetaCart
We give a (ln n + 1)-approximation for the decision tree (DT) problem. We also show that DT does not have a PTAS unless P = NP. An instance of DT is a set of m binary tests T = (T1, ..., Tm) and a set of n items X = (X1, ..., Xn). The goal is to output a binary tree where each internal node is a test, each leaf is an item, and the total external path length of the tree is minimized. DT has a rich history in computer science, with applications ranging from medical diagnosis to experiment design. Our work, while providing the first non-trivial upper and lower bounds on approximating DT, also demonstrates that DT and a subtly different problem which also bears the name decision tree (but which we call ConDT) have fundamentally different approximation complexity. We conclude with a stronger lower bound for a third decision tree problem called MinDT.
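For intuition about the DT objective, here is a simplified greedy heuristic (a sketch, not necessarily the algorithm achieving the (ln n + 1) bound): at each node, pick the test whose yes/no split of the remaining items is most balanced, and recurse. The bit-test example is hypothetical.

```python
def build_tree(items, tests):
    """Greedy sketch: recursively choose the test maximizing the size of
    the smaller side of its yes/no split of the remaining items."""
    if len(items) <= 1:
        return items[0] if items else None
    best = max(tests, key=lambda t: min(sum(t(x) for x in items),
                                        sum(1 - t(x) for x in items)))
    yes = [x for x in items if best(x)]
    no = [x for x in items if not best(x)]
    if not yes or not no:  # no test distinguishes these items
        return items[0]
    return (best, build_tree(yes, tests), build_tree(no, tests))

def external_path_length(tree, depth=0):
    """Sum of leaf depths: the quantity DT minimizes."""
    if not isinstance(tree, tuple):
        return depth
    _, yes, no = tree
    return (external_path_length(yes, depth + 1)
            + external_path_length(no, depth + 1))

# Hypothetical instance: 4 items distinguished by two bit tests.
tests = [lambda x, i=i: (x >> i) & 1 for i in range(2)]
tree = build_tree([0, 1, 2, 3], tests)
```

On this instance every leaf ends up at depth 2, the best possible external path length for four items.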
Greedy is Good: On Service Tree Placement for In-Network Stream Processing
"... This paper is concerned with reducing communication costs when executing distributed user tasks in a sensor network. We take a serviceoriented abstraction of sensor networks, where a user task is composed of a set of data processing modules (called services) with dependencies. Communications in sen ..."
Abstract

Cited by 9 (0 self)
 Add to MetaCart
This paper is concerned with reducing communication costs when executing distributed user tasks in a sensor network. We take a service-oriented abstraction of sensor networks, where a user task is composed of a set of data processing modules (called services) with dependencies. Communications in sensor networks consume significant energy and introduce uncertainty in data fidelity due to high bit error rates. These constraints are abstracted as costs on the communication graph. The goal is to place the services within the sensor network so that the communication cost in performing the task is minimized. In addition, since the lifetime of a node, the quality of network links, and the composition of the service graph may change over time, the quality of the placement must be maintained in the face of these dynamics. In this paper, we take a fresh look at what is generally considered a simple but poorly performing approach to service placement, namely the greedy algorithm. We prove that a modified greedy algorithm is guaranteed to have cost at most 8 times that of the optimum placement. In fact, the guarantee is even stronger if there is a high degree of data reduction in the service graph. The advantage of the greedy placement strategy is that when there are local changes in the service graph or when a hosting node fails, the repair only affects the placement of services that depend on the changes. Simulations suggest that in practice the greedy algorithm finds a low-cost placement. Furthermore, the cost of repairing a greedy placement decreases rapidly as a function of the proximity of the services to be aggregated.
Information acquisition and exploitation in multichannel wireless systems
IEEE Transactions on Information Theory, 2007
"... A wireless system with multiple channels is considered, where each channel has several transmission states. A user learns about the instantaneous state of an available channel by transmitting a control packet in it. Since probing all channels consumes significant energy and time, a user needs to det ..."
Abstract

Cited by 8 (5 self)
 Add to MetaCart
A wireless system with multiple channels is considered, where each channel has several transmission states. A user learns about the instantaneous state of an available channel by transmitting a control packet in it. Since probing all channels consumes significant energy and time, a user needs to determine what and how much information it needs to acquire about the instantaneous states of the available channels so that it can maximize its transmission rate. This motivates the study of the trade-off between the cost of information acquisition and its value towards improving the transmission rate. A simple model is presented for studying this information acquisition and exploitation trade-off when the channels are multi-state, with different distributions and information acquisition costs. The objective is to maximize a utility function which depends on both the cost and value of information. Solution techniques are presented for computing near-optimal policies with succinct representation in polynomial time. These policies provably achieve at least a fixed constant factor of the optimal utility on any problem instance and, in addition, have natural characterizations. The techniques are based on exploiting the structure of the optimal policy, and on using Lagrangean relaxations to simplify the space of approximately optimal solutions.
Multiple intents re-ranking
In Proceedings of the 41st Annual ACM Symposium on Theory of Computing, 2009
"... One of the most fundamental problems in web search is how to rerank result web pages based on user logs. Most traditional models for reranking assume each query has a single intent. That is, they assume all users formulating the same query have similar preferences over the result web pages. It is ..."
Abstract

Cited by 7 (1 self)
 Add to MetaCart
One of the most fundamental problems in web search is how to re-rank result web pages based on user logs. Most traditional models for re-ranking assume each query has a single intent; that is, they assume all users formulating the same query have similar preferences over the result web pages. It is clear that this is not true for a large portion of queries, as different users may have different preferences over the result web pages. Accordingly, a more accurate model should assume that queries have multiple intents. In this paper, we introduce the multiple intents re-ranking problem. This problem captures scenarios in which some user makes a query, and there is no information about its real search intent. In such cases, one would like to re-rank the search results in a way that minimizes the effort of all users in finding their relevant web pages. More formally, the setting of this problem consists of various types of users, each of which is interested in some subset of the search results. Moreover, each user type has a non-negative profile vector. Consider some ordering of the search results. This order sets a position for each search result, and induces a position vector of the results relevant to each user type. The overhead of a user type is the dot product of its profile vector and its induced position vector. The goal is to order the search results so as to minimize the average overhead of the users.
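The overhead definition above is straightforward to state in code. The sketch below (names and numbers hypothetical) computes the average overhead of one ordering, taking each user type's induced position vector to be the ascending positions of its relevant results.

```python
def average_overhead(order, user_types):
    """order: result ids, best position first (positions are 1-based).

    Each user type is (profile_vector, relevant_ids). Its induced
    position vector is the ascending list of positions of its relevant
    results; its overhead is the dot product with its profile vector."""
    pos = {r: i + 1 for i, r in enumerate(order)}
    total = 0.0
    for profile, relevant in user_types:
        induced = sorted(pos[r] for r in relevant)
        total += sum(w * p for w, p in zip(profile, induced))
    return total / len(user_types)

# Hypothetical instance: two user types over three results.
users = [([1.0, 0.5], {"x", "z"}),  # weights its first relevant hit most
         ([1.0], {"y"})]
avg = average_overhead(["x", "y", "z"], users)
```

Minimizing this quantity over all orderings is the optimization problem the abstract defines; the sketch only evaluates a fixed ordering.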