Results 1 - 10 of 221
Maximizing the Spread of Influence Through a Social Network
In KDD, 2003
Cited by 262 (6 self)
Models for the processes by which ideas and influence propagate through a social network have been studied in a number of domains, including the diffusion of medical and technological innovations, the sudden and widespread adoption of various strategies in game-theoretic settings, and the effects of “word of mouth” in the promotion of new products. Recently, motivated by the design of viral marketing strategies, Domingos and Richardson posed a fundamental algorithmic problem for such social network processes: if we can try to convince a subset of individuals to adopt a new product or innovation, and the goal is to trigger a large cascade of further adoptions, which set of individuals should we target? We consider this problem in several of the most widely studied models in social network analysis. The optimization problem of selecting the most influential nodes is NP-hard here, and we provide the first provable approximation guarantees for efficient algorithms. Using an analysis framework based on submodular functions, we show that a natural greedy strategy obtains a solution that is provably within 63% of optimal for several classes of models; our framework suggests a general approach for reasoning about the performance guarantees of algorithms for these types of influence problems in social networks. We also provide computational experiments on large collaboration networks, showing that in addition to their provable guarantees, our approximation algorithms significantly outperform node-selection heuristics based on the well-studied notions of degree centrality and distance centrality from the field of social networks.
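The greedy strategy mentioned in this abstract can be illustrated with a small sketch. The 63% figure corresponds to the factor 1 − 1/e ≈ 0.632 that greedy selection achieves for monotone submodular objectives under a cardinality constraint. The code below is not the paper's implementation: the function names and the toy `reach` data are hypothetical, and a deterministic coverage count stands in for the expected spread of a diffusion model, which in practice would have to be estimated.

```python
from typing import Callable, FrozenSet, List, Sequence


def greedy_maximize(
    ground_set: Sequence[str],
    f: Callable[[FrozenSet[str]], float],
    k: int,
) -> List[str]:
    """Pick up to k elements, each time adding the one with the largest marginal gain."""
    chosen: List[str] = []
    current = f(frozenset())
    for _ in range(k):
        best, best_gain = None, 0.0
        for v in ground_set:
            if v in chosen:
                continue
            gain = f(frozenset(chosen + [v])) - current
            if gain > best_gain:  # strict: ties keep the earlier element
                best, best_gain = v, gain
        if best is None:  # no element improves the objective
            break
        chosen.append(best)
        current += best_gain
    return chosen


# Hypothetical toy data: the set of people each seed node would eventually reach.
reach = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1, 7}}


def coverage(seeds: FrozenSet[str]) -> int:
    """Number of distinct people reached; a monotone submodular stand-in for spread."""
    return len(set().union(*(reach[v] for v in seeds)))


seeds = greedy_maximize(list(reach), coverage, k=2)
print(seeds, coverage(frozenset(seeds)))
```

On this toy instance the greedy choice happens to cover all seven people, so it is optimal here; the guarantee only promises a constant fraction of the optimum in general.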
Combinatorial Auctions with Decreasing Marginal Utilities
2001
Cited by 108 (20 self)
This paper considers combinatorial auctions among such submodular buyers. The valuations of such buyers are placed within a hierarchy of valuations that exhibit no complementarities, a hierarchy that includes also OR and XOR combinations of singleton valuations, and valuations satisfying the gross substitutes property. Those last valuations are shown to form a zero-measure subset of the submodular valuations that have positive measure. While we show that the allocation problem among submodular valuations is NP-hard, we present an efficient greedy 2-approximation algorithm for this case and generalize it to the case of limited complementarities. No such approximation algorithm exists in a setting allowing for arbitrary complementarities. Some results about strategic aspects of combinatorial auctions among players with decreasing marginal utilities are also presented.
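One common greedy rule for allocation among submodular bidders hands out items one at a time, giving each item to the bidder whose value increases the most. The sketch below illustrates that rule; whether it matches the paper's algorithm in every detail is an assumption, and the bidders, the `budget_additive` helper, and the item values are all hypothetical. A 2-approximation here means the resulting welfare is at least half of the optimum.

```python
from typing import Callable, Dict, FrozenSet, Sequence

Valuation = Callable[[FrozenSet[str]], float]


def greedy_allocate(
    items: Sequence[str], valuations: Dict[str, Valuation]
) -> Dict[str, FrozenSet[str]]:
    """Assign each item, in the given order, to the bidder with the largest marginal value."""
    bundles: Dict[str, FrozenSet[str]] = {b: frozenset() for b in valuations}
    for item in items:
        def marginal(bidder: str) -> float:
            v = valuations[bidder]
            return v(bundles[bidder] | {item}) - v(bundles[bidder])

        winner = max(valuations, key=marginal)  # ties go to the first bidder
        bundles[winner] = bundles[winner] | {item}
    return bundles


def budget_additive(values: Dict[str, float], cap: float) -> Valuation:
    """Additive value capped at `cap`; a simple example of a submodular valuation."""
    return lambda bundle: min(cap, sum(values[i] for i in bundle))


# Hypothetical bidders with decreasing marginal utilities.
bidders = {
    "A": budget_additive({"x": 4, "y": 3, "z": 2}, cap=5),
    "B": budget_additive({"x": 3, "y": 3, "z": 3}, cap=6),
}

allocation = greedy_allocate(["x", "y", "z"], bidders)
welfare = sum(bidders[b](allocation[b]) for b in bidders)
print(allocation, welfare)
```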
Near-optimal sensor placements in Gaussian processes
In ICML, 2005
Cited by 91 (24 self)
When monitoring spatial phenomena, which can often be modeled as Gaussian processes (GPs), choosing sensor locations is a fundamental task. There are several common strategies to address this task, for example, geometry or disk models, placing sensors at the points of highest entropy (variance) in the GP model, and A-, D-, or E-optimal design. In this paper, we tackle the combinatorial optimization problem of maximizing the mutual information between the chosen locations and the locations which are not selected. We prove that the problem of finding the configuration that maximizes mutual information is NP-complete. To address this issue, we describe a polynomial-time approximation that is within (1 − 1/e) of the optimum by exploiting the submodularity of mutual information. We also show how submodularity can be used to obtain online bounds, and design branch and bound search procedures. We then extend our algorithm to exploit lazy evaluations and local structure in the GP, yielding significant speedups. We also extend our approach to find placements which are robust against node failures and uncertainties in the model. These extensions are again associated with rigorous theoretical approximation guarantees, exploiting the submodularity of the objective function. We demonstrate the advantages of our approach towards optimizing mutual information in a very extensive empirical study on two real-world data sets.
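The lazy evaluations mentioned in this abstract rest on a property of submodular functions: the marginal gain of an element can only shrink as the selected set grows, so a gain computed earlier is still a valid upper bound. The sketch below shows that idea with a priority queue. It is an illustration only: the names are hypothetical, and a toy coverage function stands in for the mutual information of a GP model.

```python
import heapq
from typing import Callable, FrozenSet, List, Sequence, Tuple


def lazy_greedy(
    ground_set: Sequence[str],
    f: Callable[[FrozenSet[str]], float],
    k: int,
) -> Tuple[List[str], int]:
    """Greedy selection that re-evaluates a marginal gain only when it reaches the top of the heap."""
    chosen: List[str] = []
    current = f(frozenset())
    # Heap entries: (-gain bound, tie-breaker, element, size of chosen when computed).
    heap = [(-(f(frozenset([v])) - current), i, v, 0) for i, v in enumerate(ground_set)]
    heapq.heapify(heap)
    evaluations = len(heap)
    while heap and len(chosen) < k:
        neg_gain, i, v, stamp = heapq.heappop(heap)
        if stamp == len(chosen):
            # The bound is exact for the current set and beats every other bound.
            if -neg_gain <= 0:
                break
            chosen.append(v)
            current += -neg_gain
        else:
            # Stale bound: recompute against the current set and push back.
            gain = f(frozenset(chosen + [v])) - current
            evaluations += 1
            heapq.heappush(heap, (-gain, i, v, len(chosen)))
    return chosen, evaluations


# Hypothetical toy data: grid cells observed by each candidate sensor location.
covers = {"s1": {1, 2, 3}, "s2": {3, 4}, "s3": {4, 5, 6, 7}, "s4": {1, 7}}


def observed(placement: FrozenSet[str]) -> int:
    return len(set().union(*(covers[s] for s in placement)))


placement, evaluations = lazy_greedy(list(covers), observed, k=2)
print(placement, evaluations)
```

On this instance the lazy variant makes 5 marginal-gain evaluations, where recomputing every remaining candidate in each round would make 4 + 3 = 7.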
Pseudo-Boolean Optimization
Discrete Applied Mathematics, 2001
Cited by 72 (2 self)
This survey examines the state of the art of a variety of problems related to pseudo-Boolean optimization, i.e. to the optimization of set functions represented by closed algebraic expressions. The main parts of the survey examine general pseudo-Boolean optimization, the specially important case of quadratic pseudo-Boolean optimization (to which every pseudo-Boolean optimization can be reduced), several other important special classes, and approximation algorithms.
Improved Approximation Algorithms for Capacitated Facility Location Problems
Cited by 68 (1 self)
In a surprising result, Korupolu, Plaxton, and Rajaraman [13] showed that a simple local search heuristic for the capacitated facility location problem (CFLP) in which the service costs obey the triangle inequality produces a solution in polynomial time which is within a factor of 8 + ε of the value of an optimal solution. By simplifying their analysis, we are able to show that the same heuristic produces a solution which is within a factor of 6(1 + ε) of the value of an optimal solution. Our simplified analysis uses the supermodularity of the cost function of the problem and the integrality of the transshipment polyhedron. Additionally...
Diversifying Search Results
2009
Cited by 62 (4 self)
We study the problem of answering ambiguous web queries in a setting where there exists a taxonomy of information and where both queries and documents may belong to more than one category according to this taxonomy. We present a systematic approach to diversifying results that aims to minimize the risk of dissatisfaction of the average user. We propose an algorithm that approximates this objective well in general, and is provably optimal for a natural special case. Furthermore, we generalize several classical IR metrics, including NDCG, MRR, and MAP, to explicitly account for the value of diversification. We demonstrate empirically that our algorithm scores higher in these generalized metrics compared to results produced by commercial search engines.
Segmentation Problems
Information Processing Letters, 1998
Cited by 57 (5 self)
We introduce and study a novel genre of optimization problems, which we call segmentation problems. Our motivation, in part, is the development of a framework for evaluating certain data mining and clustering operations in terms of their utility in decision-making. For any classical optimization problem, the corresponding segmentation problem seeks to partition a set of cost vectors into several segments, so that the overall cost is optimized. This framework contains a number of standard combinatorial clustering problems as special cases, and many segmentation problems turn out to be MAXSNP-complete even when the corresponding "un-segmented" version is easy to solve. We develop approximation algorithms for two natural and interesting problems in this class, the HYPERCUBE SEGMENTATION PROBLEM and the CATALOG SEGMENTATION PROBLEM, and present a general greedy scheme, which can be specialized to approximate a large class of segmentation problems. Finally, we indicate some connection...
Near-optimal nonmyopic value of information in graphical models
In Annual Conference on Uncertainty in Artificial Intelligence
Cited by 50 (13 self)
A fundamental issue in real-world systems, such as sensor networks, is the selection of observations which most effectively reduce uncertainty. More specifically, we address the long-standing problem of nonmyopically selecting the most informative subset of variables in a graphical model. We present the first efficient randomized algorithm providing a constant factor (1 − 1/e − ε) approximation guarantee for any ε > 0 with high confidence. The algorithm leverages the theory of submodular functions, in combination with a polynomial bound on sample complexity. We furthermore prove that no polynomial time algorithm can provide a constant factor approximation better than (1 − 1/e) unless P = NP. Finally, we provide extensive evidence of the effectiveness of our method on two complex real-world datasets.
Active learning literature survey
2010
Cited by 49 (1 self)
The key idea behind active learning is that a machine learning algorithm can achieve greater accuracy with fewer labeled training instances if it is allowed to choose the data from which it learns. An active learner may ask queries in the form of unlabeled instances to be labeled by an oracle (e.g., a human annotator). Active learning is well-motivated in many modern machine learning problems, where unlabeled data may be abundant but labels are difficult, time-consuming, or expensive to obtain. This report provides a general introduction to active learning and a survey of the literature. This includes a discussion of the scenarios in which queries can be formulated, and an overview of the query strategy frameworks proposed in the literature to date. An analysis of the empirical and theoretical evidence for active learning, a summary of several problem setting variants, and a discussion
Maximizing non-monotone submodular functions
In Proceedings of 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS), 2007
Cited by 47 (10 self)
Submodular maximization generalizes many important problems including Max Cut in directed/undirected graphs and hypergraphs, certain constraint satisfaction problems and maximum facility location problems. Unlike the problem of minimizing submodular functions, the problem of maximizing submodular functions is NP-hard. In this paper, we design the first constant-factor approximation algorithms for maximizing nonnegative submodular functions. In particular, we give a deterministic local search 1/3-approximation and a randomized 2/5-approximation algo-

