Results 1 – 9 of 9
Fast and deterministic computation of fixation probability in evolutionary graphs
 In: CIB ’11: The Sixth IASTED Conference on Computational Intelligence and Bioinformatics (accepted). IASTED, 2011
Abstract

Cited by 3 (1 self)
In evolutionary graph theory [1] biologists study the problem of determining the probability that a small number of mutants overtake a population that is structured on a weighted, possibly directed graph. Currently Monte Carlo simulations are used for estimating such fixation probabilities on directed graphs, since no good analytical methods exist. In this paper, we introduce a novel deterministic algorithm for computing fixation probabilities for strongly connected directed, weighted evolutionary graphs under the case of neutral drift, which we show to be a lower bound for the case where the mutant is more fit than the rest of the population (previously, this was only observed from simulation). We also show that, in neutral drift, fixation probability is additive under the weighted, directed case. We implement our algorithm and show experimentally that it consistently outperforms Monte Carlo simulations by several orders of magnitude, which can allow researchers to study fixation probability on much larger graphs.
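The Monte Carlo baseline that the abstract says this paper improves upon can be sketched as follows. This is a generic birth–death Moran process under neutral drift on a toy weighted directed cycle; the graph, weights, and trial count are illustrative, and this is not the paper's deterministic algorithm:

```python
import random

def moran_step(graph, state, rng):
    """One step of the birth-death Moran process on a weighted digraph.
    graph: dict node -> list of (neighbor, weight) out-edges.
    state: dict node -> True if the node currently holds the mutant type.
    Under neutral drift all fitnesses are equal, so the reproducing
    node is chosen uniformly."""
    birth = rng.choice(list(graph))
    nbrs, weights = zip(*graph[birth])
    death = rng.choices(nbrs, weights=weights, k=1)[0]
    state[death] = state[birth]  # offspring replaces the chosen neighbor

def estimate_fixation(graph, start_mutant, trials=2000, seed=0):
    """Monte Carlo estimate of the probability that a single mutant
    placed at `start_mutant` takes over the whole population."""
    rng = random.Random(seed)
    fixations = 0
    for _ in range(trials):
        state = {v: False for v in graph}
        state[start_mutant] = True
        while 0 < sum(state.values()) < len(state):
            moran_step(graph, state, rng)
        fixations += all(state.values())
    return fixations / trials

# Directed 3-cycle with unit weights: it is isothermal, so under
# neutral drift each vertex has fixation probability 1/N = 1/3.
cycle = {0: [(1, 1.0)], 1: [(2, 1.0)], 2: [(0, 1.0)]}
p = estimate_fixation(cycle, 0)
```

The slow convergence of such estimates (error shrinks only as the square root of the number of trials) is exactly why a deterministic algorithm can outperform simulation by orders of magnitude.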
Fast Influence-based Coarsening for Large Networks
Abstract

Cited by 2 (1 self)
Given a social network, can we quickly ‘zoom out’ of the graph? Is there a smaller equivalent representation of the graph that preserves its propagation characteristics? Can we group nodes together based on their influence properties? These are important problems with applications to influence analysis, epidemiology and viral marketing. In this paper, we first formulate a novel Graph Coarsening Problem to find a succinct representation of any graph while preserving key characteristics for diffusion processes on that graph. We then provide coarseNet, a fast and effective near-linear-time (in nodes and edges) algorithm for the same. Using extensive experiments on multiple real datasets, we demonstrate the quality and scalability of coarseNet, enabling us to reduce the graph by 90% in some cases without much loss of information. Finally, we also show how our method can help in diverse applications like influence maximization and detecting patterns of propagation at the level of automatically created groups on real cascade data.
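As a rough illustration of coarsening by repeated node merging, one can greedily contract the edge whose endpoints have the most similar neighborhoods. The Jaccard score below is only a simple stand-in for coarseNet's actual influence-based edge score, which is derived from diffusion characteristics:

```python
def jaccard(a, b):
    """Similarity of two sets: |intersection| / |union|."""
    return len(a & b) / len(a | b) if a | b else 0.0

def coarsen(adj, target_fraction=0.5):
    """Greedy coarsening sketch: repeatedly contract the edge whose
    endpoints have the most similar neighborhoods.
    adj: dict node -> set of neighbors (undirected).
    Returns the coarsened adjacency and a map supernode -> members."""
    adj = {u: set(vs) for u, vs in adj.items()}
    members = {u: {u} for u in adj}
    target = max(1, int(len(adj) * target_fraction))
    while len(adj) > target:
        best = max(((u, v) for u in adj for v in adj[u] if u < v),
                   key=lambda e: jaccard(adj[e[0]] - {e[1]},
                                         adj[e[1]] - {e[0]}),
                   default=None)
        if best is None:
            break
        u, v = best
        for w in adj.pop(v):          # merge v into u
            adj[w].discard(v)
            if w != u:
                adj[w].add(u)
                adj[u].add(w)
        adj[u].discard(u)             # drop any self-loop
        members[u] |= members.pop(v)
    return adj, members

# Triangle plus a pendant node, reduced to half its size.
coarse, groups = coarsen({0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}})
```

A production method would also reweight the merged edges so that diffusion quantities (e.g. the leading eigenvalue) are approximately preserved; that bookkeeping is omitted here.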
Using Generalized Annotated Programs to Solve Social Network Diffusion Optimization Problems
Abstract
There has been extensive work in many different fields on how phenomena of interest (e.g. diseases, innovation, product adoption) “diffuse” through a social network. As social networks increasingly become a fabric of society, there is a need to make “optimal” decisions with respect to an observed model of diffusion. For example, in epidemiology, officials want to find a set of k individuals in a social network which, if treated, would minimize spread of a disease. In marketing, campaign managers try to identify a set of k customers that, if given a free sample, would generate maximal “buzz” about the product. In this paper, we first show that the well-known Generalized Annotated Program (GAP) paradigm can be used to express many existing diffusion models. We then define a class of problems called Social Network Diffusion Optimization Problems (SNDOPs). SNDOPs have four parts: (i) a diffusion model expressed as a GAP, (ii) an objective function we want to optimize with respect to a given diffusion model, (iii) an integer k > 0 describing resources (e.g. medication) that can be placed at nodes, (iv) a logical condition VC that governs which nodes can have a resource (e.g. only children above the age of 5 can be treated with a given medication). We study the computational complexity of SNDOPs and show both NP-completeness results as well as results on complexity of approximation. We then develop an exact and a heuristic algorithm to solve a large class of SNDOP problems and show that our GREEDY-SNDOP algorithm achieves the best possible approximation ratio that a polynomial ...
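The greedy scheme can be sketched generically. Here simple neighborhood coverage stands in for the GAP-defined diffusion objective, and the `eligible` set plays the role of the vertex condition VC; the function name and toy objective are illustrative, not the paper's GREEDY-SNDOP:

```python
def greedy_select(adj, k, eligible=None):
    """Greedy resource placement sketch: repeatedly add the eligible
    node whose closed neighborhood covers the most yet-uncovered nodes.
    adj: dict node -> set of neighbors; eligible: nodes satisfying the
    vertex condition (defaults to all nodes)."""
    eligible = set(adj) if eligible is None else set(eligible)
    chosen, covered = [], set()
    for _ in range(k):
        best = max(eligible - set(chosen),
                   key=lambda v: len(({v} | adj[v]) - covered),
                   default=None)
        if best is None:
            break
        chosen.append(best)
        covered |= {best} | adj[best]
    return chosen

# Star graph: the hub is the single best placement...
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
unrestricted = greedy_select(star, 1)
# ...but a vertex condition can forbid it, as VC does in a SNDOP.
restricted = greedy_select(star, 1, eligible={1, 2, 3})
```

The (1 − 1/e) guarantee mentioned for such greedy algorithms relies on the objective being monotone and submodular, which coverage-style spread functions are.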
Extending the power of Datalog recursion
 The VLDB Journal, regular paper, DOI 10.1007/s00778-012-0299-1
Abstract
Supporting aggregates in recursive logic rules represents a very important problem for Datalog. To solve this problem, we propose a simple extension, called DatalogFS (Datalog extended with frequency support goals), that supports queries and reasoning about the number of distinct variable assignments satisfying given goals, or conjunctions of goals, in rules. This monotonic extension greatly enhances the power of Datalog, while preserving (i) its declarative semantics and (ii) its amenability to efficient implementation via differential fixpoint and other optimization techniques presented in the paper. Thus, DatalogFS enables the efficient formulation of queries that could not be expressed efficiently, or could not be expressed at all, in Datalog with stratified negation and aggregates. In fact, using a generalized notion of multiplicity called frequency, we show that diffusion models and PageRank computations can be easily expressed and efficiently implemented using DatalogFS.
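A frequency-support goal of the informal shape "infected(Y) holds if at least 2 distinct X satisfy edge(X, Y) and infected(X)" (pseudo-syntax, not the paper's exact notation) can be emulated by a naive monotone fixpoint. Counting distinct witnesses only ever grows as facts are added, which is the monotonicity property that keeps least-fixpoint evaluation applicable:

```python
def frequency_fixpoint(in_nbrs, seeds, threshold=2):
    """Naive least-fixpoint evaluation of a frequency-support rule:
        infected(Y) <- Y in seeds.
        infected(Y) <- at least `threshold` distinct X with
                       edge(X, Y) and infected(X).
    in_nbrs: dict node Y -> set of in-neighbors X."""
    infected = set(seeds)
    changed = True
    while changed:
        changed = False
        for y, xs in in_nbrs.items():
            if y not in infected and len(xs & infected) >= threshold:
                infected.add(y)
                changed = True
    return infected

# Node 3 has two infected in-neighbors, then 4 does; node 5 has only one.
result = frequency_fixpoint({3: {1, 2}, 4: {1, 3}, 5: {4}}, seeds={1, 2})
```

A real engine would use differential (semi-naive) evaluation, revisiting only rules whose frequency counts changed, rather than rescanning every rule per round as this sketch does.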
Using RASCAL to Find Key Villages in
, 2012
Abstract
Though not directly supported by U.S. ground forces, the villagers had received training and logistical support from U.S. Special Forces. Many hoped that the actions of the villagers could be replicated throughout the country. However, there are a limited number of U.S. Special Forces teams to conduct such missions. Can we identify a limited number of villages such that, if we provide them Special Forces support, the local revolts against the Taliban would be most likely to spread and stabilize at a large scale? In this paper, we take a network science approach to the problem. Using the tipping model from the social science literature, a network of villages created using tribal and spatial relationships, and a new software package called RASCAL (that the authors developed), we find that villages in Zabul have a significant influence on Helmand and Kandahar. In our results, villages in Zabul represented 40.00% of the influential villages for those three provinces – while representing only 26.25% of the total villages examined.
THE TIPPING MODEL AND THE HKZ VILLAGE RELATIONSHIP NETWORK
In the late 1970s, Nobel laureate Thomas Schelling introduced what is known as the “tipping model” [2]. He originally used this model to describe how neighborhoods could become racially segregated. In his example, a family of race A moves out of their household after a certain number of families of race B
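A minimal sketch of a tipping cascade on a village network, assuming a fractional-threshold variant (each village tips once at least a fraction θ of its neighbors has tipped); the network, threshold, and seed-scoring loop below are illustrative and do not reproduce the RASCAL model or the HKZ data:

```python
def tipping_cascade(nbrs, seeds, theta=0.5):
    """Fractional tipping model: a village revolts once at least a
    `theta` fraction of its neighbors has revolted.
    nbrs: dict village -> set of neighboring villages."""
    tipped = set(seeds)
    changed = True
    while changed:
        changed = False
        for v, ns in nbrs.items():
            if v not in tipped and ns and len(ns & tipped) / len(ns) >= theta:
                tipped.add(v)
                changed = True
    return tipped

# Toy network: which single seed village tips the most others?
nbrs = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4}}
best = max(nbrs, key=lambda v: len(tipping_cascade(nbrs, {v})))
```

Ranking villages by final cascade size, as in the last two lines, is one natural way to operationalize "influential villages" for a tipping-style model.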
ChoiceGAPs: Competitive Diffusion as a Massive Multi-Player Game in Social Networks
Abstract
We consider the problem of modeling competitive diffusion in real-world social networks via the notion of ChoiceGAPs, which combine choice logic programs due to Saccà and Zaniolo and Generalized Annotated Programs due to Kifer and Subrahmanian. We assume that each vertex in a social network is a player in a multi-player game (with a huge number of players) — the choice part of the ChoiceGAPs describes utilities of players for acting in various ways based on utilities of their neighbors in those and other situations. We define multi-player Nash equilibrium for such programs — but because they require some conditions that are hard to satisfy in the real world, we introduce a new model-theoretic concept of strong equilibrium. We show that strong equilibria can capture all Nash equilibria. We prove a host of complexity (intractability) results for checking existence of strong equilibria (as well as related counting complexity results), together with algorithms to find them. We then identify a class of ChoiceGAPs for which strong equilibria can be polynomially computed. We develop algorithms for computing these equilibria under various restrictions. We come up with the important concept of an estimation query which can compute quantities w.r.t. a given strong equilibrium, and approximate ranges of values (answers) across the space of strong equilibria. Even though we show that computing range answers to estimation queries exactly is intractable, we are able to identify classes of estimation queries that can be answered in polynomial time. We report on experiments we conducted with a real-world Facebook data set surrounding the 2013 Italian election, showing that our algorithms have good predictive accuracy, with an Area Under the ROC Curve that, on average, is over 0.76. NOTE: The paper is currently submitted to a journal that allows parallel submission to the CoRR repository.
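For intuition about equilibria in vertex-as-player network games, a pure Nash equilibrium of a simple coordination game can be found by best-response dynamics. The toy utility below (number of agreeing neighbors) is only a stand-in for utilities defined by ChoiceGAP rules, and this is not the paper's equilibrium-computation algorithm:

```python
def best_response_dynamics(nbrs, init, max_rounds=100):
    """Each vertex is a player choosing action 0 or 1; its utility is
    the number of neighbors choosing the same action. Players repeatedly
    switch to their best response (majority of neighbors, keeping the
    current action on ties) until no one can gain by deviating, i.e.
    until a pure Nash equilibrium is reached."""
    choice = dict(init)
    for _ in range(max_rounds):
        stable = True
        for v, ns in nbrs.items():
            ones = sum(choice[u] for u in ns)
            zeros = len(ns) - ones
            new = 1 if ones > zeros else 0 if zeros > ones else choice[v]
            if new != choice[v]:
                choice[v] = new
                stable = False
        if stable:
            return choice  # no player wants to deviate
    return choice

# Star graph: the hub joins its three neighbors, all coordinate on 1.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
eq = best_response_dynamics(star, {0: 0, 1: 1, 2: 1, 3: 1})
```

Coordination games like this are potential games, so best-response dynamics is guaranteed to converge; the `max_rounds` cap is just a safety guard.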
Modeling Human Behavior at a Large Scale
, 2012