Results 1–10 of 132
Cost-effective outbreak detection in networks
 In KDD
, 2007
Abstract

Cited by 191 (23 self)
One of the networks is the spread of a disease, the other is product recommendations. Which is which?

Diffusion in Social Networks. A fundamental process in social networks: behaviors that cascade from node to node like an epidemic:
- News, opinions, rumors, fads, urban legends, ...
- Word-of-mouth effects in marketing: rise of new websites, free web-based services
- Virus and disease propagation
- Changes in social priorities: smoking, recycling
- Saturation news coverage: topic diffusion among bloggers
- Internet-energized political campaigns
- Cascading failures in financial markets
- Localized effects: riots, people walking out of a lecture

Empirical Studies of Diffusion. Experimental studies of diffusion have a long history:
- Spread of new agricultural practices [Ryan-Gross 1943]: adoption of a new hybrid corn among the 259 farmers in Iowa; a classical study of diffusion. The interpersonal network plays an important role in adoption, so diffusion is a social process.
- Spread of new medical practices [Coleman et al. 1966]: studied the adoption of a new drug among doctors in Illinois. Clinical studies and scientific evaluations were not sufficient to convince the doctors; it was the social power of peers that led to adoption.

Diffusion in Networks. Initially some nodes are active; active nodes spread their influence to the other nodes, and so on.
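The cascade process these slides describe can be sketched with a toy simulation. The independent cascade model below is one standard formalization of "active nodes spread their influence"; the graph, seed set, and probability p are illustrative, not taken from the paper:

```python
import random

def independent_cascade(graph, seeds, p=0.1, rng=None):
    """Each newly active node gets one chance to activate each
    inactive neighbor, independently with probability p."""
    rng = rng or random.Random(0)
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in graph.get(u, []):
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active

# Toy network: a behavior seeded at "a" cascades node to node.
g = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
spread = independent_cascade(g, ["a"], p=1.0)  # with p=1, reaches everyone
```

Running the same simulation many times with a realistic p estimates the expected spread of a seed set, which is the quantity outbreak-detection and influence-maximization methods optimize over.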
Approximation algorithms for combinatorial auctions with complement-free bidders
 In Proceedings of the 37th Annual ACM Symposium on Theory of Computing (STOC)
, 2005
Abstract

Cited by 104 (22 self)
We exhibit three approximation algorithms for the allocation problem in combinatorial auctions with complement-free bidders. The running time of these algorithms is polynomial in the number of items m and in the number of bidders n, even though the “input size” is exponential in m. The first algorithm provides an O(log m) approximation. The second algorithm provides an O(√m) approximation in the weaker model of value oracles. This algorithm is also incentive compatible. The third algorithm provides an improved 2-approximation for the more restricted case of “XOS bidders”, a class which strictly contains submodular bidders. We also prove lower bounds on the possible approximations achievable for these classes of bidders. These bounds are not tight and we leave the gaps as open problems.
Near-optimal nonmyopic value of information in graphical models
 In Annual Conference on Uncertainty in Artificial Intelligence
Abstract

Cited by 96 (18 self)
A fundamental issue in real-world systems, such as sensor networks, is the selection of observations which most effectively reduce uncertainty. More specifically, we address the long-standing problem of nonmyopically selecting the most informative subset of variables in a graphical model. We present the first efficient randomized algorithm providing a constant-factor (1 − 1/e − ε) approximation guarantee for any ε > 0 with high confidence. The algorithm leverages the theory of submodular functions, in combination with a polynomial bound on sample complexity. We furthermore prove that no polynomial-time algorithm can provide a constant-factor approximation better than (1 − 1/e) unless P = NP. Finally, we provide extensive evidence of the effectiveness of our method on two complex real-world datasets.
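For intuition, the (1 − 1/e) factor in this abstract comes from greedy maximization of a monotone submodular function. A minimal deterministic sketch of that greedy scheme follows (the paper's algorithm is randomized and adds sampling, which is where the extra ε comes from; all names and data here are illustrative):

```python
def greedy_select(candidates, objective, k):
    """Repeatedly add the candidate with the largest marginal gain.
    For a monotone submodular objective, this greedy rule achieves a
    (1 - 1/e)-approximation to the best size-k subset."""
    chosen = []
    for _ in range(k):
        best = max((c for c in candidates if c not in chosen),
                   key=lambda c: objective(chosen + [c]) - objective(chosen))
        chosen.append(best)
    return chosen

# Coverage is a standard monotone submodular objective (toy data).
sets = {"s1": {1, 2, 3}, "s2": {3, 4}, "s3": {4, 5, 6}}
cover = lambda chosen: len(set().union(*[sets[c] for c in chosen]))
picked = greedy_select(list(sets), cover, k=2)  # -> ["s1", "s3"]
```

Note how the second pick is "s3" rather than "s2": after "s1" is chosen, "s2" adds only one new element, so the marginal-gain rule skips it.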
Algorithms for Facility Location Problems with Outliers (Extended Abstract)
 In Proceedings of the 12th Annual ACM-SIAM Symposium on Discrete Algorithms
, 2000
Abstract

Cited by 70 (8 self)
Moses Charikar, Samir Khuller, David M. Mount, Giri Narasimhan. Facility location problems are traditionally investigated with the assumption that all the clients are to be provided service. A significant shortcoming of this formulation is that a few very distant clients, called outliers, can exert a disproportionately strong influence over the final solution. In this paper we explore a generalization of various facility location problems (K-center, K-median, uncapacitated facility location, etc.) to the case when only a specified fraction of the customers are to be served. What makes the problems harder is that we also have to select the subset that should get service. We provide generalizations of various approximation algorithms to deal with this added constraint. The facility location problem and the related clustering problems, k-median and k-center, are widely studied in operations research and computer science [3, 7, 22, 24, 32]. Typically in...
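The added difficulty the abstract mentions, choosing which clients to serve as well as where to open facilities, can be sketched with a simple greedy rule. This is purely illustrative (1-D points, invented data, no approximation guarantee), not the paper's algorithms:

```python
def cover_with_outliers(clients, sites, radius, fraction):
    """Greedy sketch: open one facility at a time, each time picking the
    site whose radius-ball serves the most still-unserved clients, and
    stop once the required fraction of clients is served. Whatever
    remains unserved is treated as outliers."""
    need = int(fraction * len(clients))
    served, opened = set(), []
    while len(served) < need:
        def gain(s):
            return sum(1 for c in clients
                       if c not in served and abs(c - s) <= radius)
        best = max(sites, key=gain)
        if gain(best) == 0:
            break  # remaining clients are unreachable at this radius
        opened.append(best)
        served |= {c for c in clients if abs(c - best) <= radius}
    return opened, served

# 1-D toy instance: the distant outlier at 100 never forces a facility.
clients = [0, 1, 2, 9, 10, 11, 100]
opened, served = cover_with_outliers(clients, sites=[1, 10, 100],
                                     radius=2, fraction=0.8)
```

With fraction = 1.0 the same code would be forced to open a facility at 100 just for the outlier, which is exactly the distortion the paper's formulation avoids.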
Combination Can Be Hard: Approximability of the Unique Coverage Problem
 In Proceedings of the 17th Annual ACM-SIAM Symposium on Discrete Algorithms
, 2006
Abstract

Cited by 66 (2 self)
We prove semi-logarithmic inapproximability for a maximization problem called unique coverage: given a collection of sets, find a subcollection that maximizes the number of elements covered exactly once. Specifically, assuming that NP ⊄ BPTIME(2^(n^ε)) for an arbitrary ε > 0, we prove O(1/log^σ n) inapproximability for some constant σ = σ(ε). We also prove O(1/log^(1/3−ε) n) inapproximability, for any ε > 0, assuming that refuting random instances of 3SAT is hard on average; and prove O(1/log n) inapproximability under a plausible hypothesis concerning the hardness of another problem, balanced bipartite independent set. We establish an Ω(1/log n)-approximation algorithm, even for a more general (budgeted) setting, and obtain an Ω(1/log B)-approximation algorithm when every set has at most B elements. We also show that our inapproximability results extend to envy-free pricing, an important problem in computational economics. We describe how the (budgeted) unique coverage problem, motivated by real-world applications, has close connections to other theoretical problems including max cut, maximum coverage, and radio broadcasting. In this paper we consider the approximability of the following natural maximization analog of set cover:

Unique Coverage Problem. Given a universe U = {e1, ..., en} of elements, and a collection S = {S1, ..., Sm} of subsets of U, find a subcollection S′ ⊆ S to maximize the number of elements that are uniquely covered, i.e., appear in exactly one set of S′.
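The objective itself is easy to state in code. A brute-force sketch follows, exponential in the number of sets and written purely to pin down what "covered exactly once" means (the abstract's results concern how well polynomial-time algorithms can approximate this; all names and data are illustrative):

```python
from itertools import combinations

def unique_coverage(sets):
    """Exhaustively score every subcollection by the number of elements
    it covers exactly once; return the best score and subcollection."""
    names = list(sets)
    best, best_sub = -1, ()
    for r in range(len(names) + 1):
        for sub in combinations(names, r):
            counts = {}
            for s in sub:
                for e in sets[s]:
                    counts[e] = counts.get(e, 0) + 1
            score = sum(1 for v in counts.values() if v == 1)
            if score > best:
                best, best_sub = score, sub
    return best, best_sub

# Taking both sets covers element 2 twice, so a single set is optimal.
score, sub = unique_coverage({"A": {1, 2}, "B": {2, 3}})  # -> (2, ("A",))
```

The toy instance shows why this differs from maximum coverage: adding a set can hurt, because elements it re-covers stop counting.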
Inferring Networks of Diffusion and Influence
Abstract

Cited by 61 (6 self)
Information diffusion and virus propagation are fundamental processes taking place in networks. While it is often possible to directly observe when nodes become infected, observing individual transmissions (i.e., who infects whom or who influences whom) is typically very difficult. Furthermore, in many applications the underlying network over which the diffusions and propagations spread is actually unobserved. We tackle these challenges by developing a method for tracing paths of diffusion and influence through networks and for inferring the networks over which contagions propagate. Given the times when nodes adopt pieces of information or become infected, we identify the optimal network that best explains the observed infection times. Since the optimization problem is NP-hard to solve exactly, we develop an efficient approximation algorithm that scales to large datasets and in practice gives provably near-optimal performance. We demonstrate the effectiveness of our approach by tracing information cascades in a set of 170 million blogs and news articles over a one-year period to infer how information flows through the online media space. We find that the diffusion network of news tends to have a core-periphery structure with a small set of core media sites that diffuse information to the rest of the Web. These sites tend to have stable circles of influence, with more general news media sites acting as connectors between them.
Learning diverse rankings with multi-armed bandits
 In Proceedings of the 25th ICML
, 2008
Abstract

Cited by 61 (5 self)
Algorithms for learning to rank Web documents usually assume a document’s relevance is independent of other documents. This leads to learned ranking functions that produce rankings with redundant results. In contrast, user studies have shown that diversity at high ranks is often preferred. We present two online learning algorithms that directly learn a diverse ranking of documents based on users’ clicking behavior. We show that these algorithms minimize abandonment, or alternatively, maximize the probability that a relevant document is found in the top k positions of a ranking. Moreover, one of our algorithms asymptotically achieves optimal worst-case performance even if users’ interests change.
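A hypothetical simplification of the ranked-bandits idea sketched in this abstract: run one epsilon-greedy bandit per rank position and reward only the position that receives the user's first click, so lower ranks learn to cover users the higher ranks miss. The user model, parameters, and epsilon-greedy choice are invented for illustration, not the paper's algorithms:

```python
import random

def ranked_bandit(docs, k, click_model, rounds=2000, eps=0.1, seed=0):
    """One epsilon-greedy bandit per rank. The bandit at rank i earns
    reward 1 only when its document gets the user's first click."""
    rng = random.Random(seed)
    value = [{d: 0.0 for d in docs} for _ in range(k)]
    pulls = [{d: 0 for d in docs} for _ in range(k)]
    for _ in range(rounds):
        ranking = []
        for i in range(k):
            pool = [d for d in docs if d not in ranking]
            if rng.random() < eps:
                ranking.append(rng.choice(pool))
            else:
                ranking.append(max(pool, key=lambda d: value[i][d]))
        clicked = click_model(ranking, rng)  # index of first click, or None
        for i, d in enumerate(ranking):
            pulls[i][d] += 1
            reward = 1.0 if clicked == i else 0.0
            value[i][d] += (reward - value[i][d]) / pulls[i][d]
    final = []  # greedy ranking from the learned per-rank values
    for i in range(k):
        pool = [d for d in docs if d not in final]
        final.append(max(pool, key=lambda d: value[i][d]))
    return final

# Invented user population: 60% want "a", 40% want "b"; a user clicks
# the first relevant document, or abandons if none is shown.
def users(ranking, rng):
    want = "a" if rng.random() < 0.6 else "b"
    return ranking.index(want) if want in ranking else None

top2 = ranked_bandit(["a", "b", "c"], k=2, click_model=users, rounds=500)
```

Minimizing abandonment here means the learned two-slot ranking should include both "a" and "b", since showing either one alone abandons the other user type.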
Locating network monitors: Complexity, heuristics and coverage
 In Proceedings of IEEE INFOCOM
, 2005
Abstract

Cited by 43 (0 self)
There is increasing interest in concurrent passive monitoring of IP flows at multiple locations within an IP network. The common objective of such a distributed monitoring system is to sample packets belonging to a large fraction of IP flows in a cost-effective manner by carefully placing monitors and controlling their sampling rates. In this paper, we consider the problem of where to place monitors within the network and how to control their sampling. To address the tradeoff between monitoring cost and monitoring coverage, we consider minimum-cost and maximum-coverage problems under various budget constraints. We show that all of the defined problems are NP-hard. We propose greedy heuristics, and show through experiments on synthetic and real network topologies that they provide solutions quite close to the optimal ones. In addition, our experiments show that a small number of monitors is often enough to monitor most of the traffic in an entire IP network.
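A budgeted greedy heuristic in the spirit of the ones this abstract proposes can be sketched as follows: repeatedly place the monitor with the best ratio of newly covered flows to cost that still fits the budget. The topology, costs, and flow sets are invented, and the paper's heuristics and problem variants differ in detail:

```python
def place_monitors(flows_covered, cost, budget):
    """Budgeted greedy: pick the affordable monitor with the highest
    (newly covered flows) / cost ratio until nothing useful fits."""
    placed, covered, spent = [], set(), 0.0
    while True:
        def ratio(m):
            return len(flows_covered[m] - covered) / cost[m]
        candidates = [m for m in flows_covered
                      if m not in placed
                      and spent + cost[m] <= budget
                      and flows_covered[m] - covered]
        if not candidates:
            break
        best = max(candidates, key=ratio)
        placed.append(best)
        covered |= flows_covered[best]
        spent += cost[best]
    return placed, covered

# Toy topology: one monitor on a core link covers many flows cheaply,
# which mirrors the abstract's finding that few monitors often suffice.
flows = {"core": {1, 2, 3, 4}, "edge1": {1}, "edge2": {5}}
cost = {"core": 2.0, "edge1": 1.0, "edge2": 1.0}
placed, covered = place_monitors(flows, cost, budget=3.0)
```

Note that "edge1" is never placed: every flow it sees is already covered by the core monitor, so its marginal gain is zero.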
Predicting Diverse Subsets Using Structural SVMs
Abstract

Cited by 42 (8 self)
In many retrieval tasks, one important goal involves retrieving a diverse set of results (e.g., documents covering a wide range of topics for a search query). First of all, this reduces redundancy, effectively showing more information with the presented results. Secondly, queries are often ambiguous at some level. For example, the query “Jaguar” can refer to many different topics (such as the car or the feline). A set of documents with high topic diversity ensures that fewer users abandon the query because no results are relevant to them. Unlike existing approaches to learning retrieval functions, we present a method that explicitly trains to diversify results. In particular, we formulate the learning problem of predicting diverse subsets and derive a training method based on structural SVMs.