Results 1 - 9 of 9
Core Decomposition of Uncertain Graphs
"... Core decomposition has proven to be a useful primitive for a wide range of graph analyses. One of its most appealing features is that, unlike other notions of dense subgraphs, it can be computed linearly in the size of the input graph. In this paper we provide an analogous tool for uncertain graphs, ..."
Abstract

Cited by 4 (1 self)
Core decomposition has proven to be a useful primitive for a wide range of graph analyses. One of its most appealing features is that, unlike other notions of dense subgraphs, it can be computed in time linear in the size of the input graph. In this paper we provide an analogous tool for uncertain graphs, i.e., graphs whose edges are assigned a probability of existence. The fact that core decomposition can be computed efficiently in deterministic graphs does not guarantee efficiency in uncertain graphs, where even the simplest graph operations may become computationally intensive. Here we show that core decomposition of uncertain graphs can be carried out efficiently as well. We extensively evaluate our definitions and methods on a number of real-world datasets and applications, such as influence maximization and task-driven team formation.
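The linear-time computability mentioned above refers to classical k-core peeling on deterministic graphs. As a point of reference only (this is not the paper's uncertain-graph method), a minimal Python sketch of that peeling algorithm, where a vertex's core number is the peeling level at the moment it is removed:

```python
def core_numbers(adj):
    # adj: dict vertex -> set of neighbours (undirected, symmetric)
    deg = {v: len(nbrs) for v, nbrs in adj.items()}
    alive = set(adj)
    core, k = {}, 0
    while alive:
        v = min(alive, key=lambda u: deg[u])  # current minimum-degree vertex
        k = max(k, deg[v])                    # peeling level never decreases
        core[v] = k
        alive.remove(v)
        for u in adj[v]:                      # removing v lowers its
            if u in alive:                    # surviving neighbours' degrees
                deg[u] -= 1
    return core

# a triangle {1,2,3} with a pendant vertex 4: the triangle is the 2-core
print(core_numbers({1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}))
```

The textbook linear-time version replaces the `min` scan with bucket queues indexed by degree; this quadratic sketch just keeps the logic visible.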
The Pursuit of a Good Possible World: Extracting Representative Instances of Uncertain Graphs
"... Data in several applications can be represented as an uncertain graph, whose edges are labeled with a probability of existence. Exact query processing on uncertain graphs is prohibitive for most applications, as it involves evaluation over an exponential number of instantiations. Even approximate pr ..."
Abstract

Cited by 4 (1 self)
Data in several applications can be represented as an uncertain graph, whose edges are labeled with a probability of existence. Exact query processing on uncertain graphs is prohibitive for most applications, as it involves evaluation over an exponential number of instantiations. Even approximate processing based on sampling is usually extremely expensive since it requires a vast number of samples to achieve reasonable quality guarantees. To overcome these problems, we propose algorithms for creating deterministic representative instances of uncertain graphs that maintain the underlying graph properties. Specifically, our algorithms aim at preserving the expected vertex degrees because they capture well the graph topology. Conventional processing techniques can then be applied on these instances to closely approximate the result on the uncertain graph. We experimentally demonstrate, with real and synthetic uncertain graphs, that indeed the representative instances can be used to answer, efficiently and accurately, queries based on several properties such as shortest path distance, clustering coefficient and betweenness centrality.
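The expected degree of a vertex in an uncertain graph with independent edges is simply the sum of its incident edge probabilities, which is the quantity the representative instances above aim to preserve. A minimal sketch of that computation (a hypothetical helper, not the authors' extraction algorithm):

```python
from collections import defaultdict

def expected_degrees(edges):
    """Expected degree under independent edge existence:
    E[deg(v)] = sum of the probabilities of edges incident to v."""
    deg = defaultdict(float)
    for u, v, p in edges:
        deg[u] += p
        deg[v] += p
    return dict(deg)

# b touches edges with probabilities 0.5 and 0.25, so E[deg(b)] = 0.75
print(expected_degrees([("a", "b", 0.5), ("b", "c", 0.25)]))
```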
Node classification in uncertain graphs
In SSDBM, 2014
"... In many real applications that use and analyze networked data, the links in the network graph may be erroneous, or derived from probabilistic techniques. In such cases, the node classification problem can be challenging, since the unreliability of the links may affect the final results of the class ..."
Abstract

Cited by 2 (2 self)
In many real applications that use and analyze networked data, the links in the network graph may be erroneous, or derived from probabilistic techniques. In such cases, the node classification problem can be challenging, since the unreliability of the links may affect the final results of the classification process. In this paper, we focus on situations that require the analysis of the uncertainty that is present in the graph structure. We study the novel problem of node classification in uncertain graphs, treating uncertainty as a first-class citizen. We propose two techniques based on a Bayes model, and show the benefits of incorporating uncertainty into the classification process. The experimental results demonstrate the effectiveness of our approaches.
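The abstract does not spell out the Bayes models themselves; as a strawman baseline only, one can weight neighbour label votes by edge existence probability (all names here are illustrative, not the paper's API):

```python
def prob_weighted_vote(node, adj, labels):
    """Predict a label for `node` by summing, per label, the existence
    probabilities of edges to already-labelled neighbours."""
    scores = {}
    for nbr, p in adj.get(node, []):
        lbl = labels.get(nbr)
        if lbl is not None:
            scores[lbl] = scores.get(lbl, 0.0) + p
    return max(scores, key=scores.get) if scores else None

# the single strong "spam" edge (0.9) outweighs two weak "ham" edges (0.5)
adj = {"x": [("a", 0.9), ("b", 0.2), ("c", 0.3)]}
print(prob_weighted_vote("x", adj, {"a": "spam", "b": "ham", "c": "ham"}))
```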
Efficient Computation of Feedback Arc Set at Web-Scale
"... ABSTRACT The minimum feedback arc set problem is an NPhard problem on graphs that seeks a minimum set of arcs which, when removed from the graph, leave it acyclic. In this work, we investigate several approximations for computing a minimum feedback arc set with the goal of comparing the quality of ..."
Abstract
The minimum feedback arc set problem is an NP-hard problem on graphs that seeks a minimum set of arcs which, when removed from the graph, leave it acyclic. In this work, we investigate several approximations for computing a minimum feedback arc set, with the goal of comparing the quality of the solutions and the running times. Our investigation is motivated by applications in social network analysis such as misinformation removal and label propagation. We present careful algorithmic engineering of multiple algorithms to improve the scalability of each approach. In particular, two approaches we optimize (one greedy and one randomized) provide a nice balance between feedback arc set size and running time. We experimentally compare the performance of a wide range of algorithms on a broad selection of large online networks, including Twitter, LiveJournal, and the ClueWeb12 dataset. The experiments reveal that our greedy and randomized implementations outperform the other approaches by simultaneously computing a feedback arc set of competitive size and scaling to web-scale graphs with billions of vertices and tens of billions of arcs. Finally, we extend the algorithms considered to the probabilistic case, in which arcs are realized with some fixed probability, and provide detailed experimental comparisons.
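The abstract does not detail the approximations compared; as background only, the simplest correct (though far from minimum) feedback arc set is the set of back arcs of a depth-first search, since removing every arc that closes a cycle during DFS leaves the graph acyclic:

```python
def feedback_arcs_dfs(adj):
    """Return a (not necessarily minimum) feedback arc set:
    the back arcs discovered by a DFS over the directed graph."""
    WHITE, GRAY, BLACK = 0, 1, 2     # unvisited / on stack / finished
    color = {v: WHITE for v in adj}
    back = []

    def dfs(u):
        color[u] = GRAY
        for w in adj.get(u, []):
            if color.get(w, WHITE) == GRAY:
                back.append((u, w))  # arc into the active DFS stack: a cycle
            elif color.get(w, WHITE) == WHITE:
                dfs(w)
        color[u] = BLACK

    for v in list(adj):
        if color[v] == WHITE:
            dfs(v)
    return back

# the 3-cycle 1 -> 2 -> 3 -> 1 yields a single back arc
print(feedback_arcs_dfs({1: [2], 2: [3], 3: [1]}))
```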
On Uncertain Graphs Modeling and Queries
"... ABSTRACT Largescale, highlyinterconnected networks pervade both our society and the natural world around us. Uncertainty, on the other hand, is inherent in the underlying data due to a variety of reasons, such as noisy measurements, lack of precise information needs, inference and prediction mode ..."
Abstract
Large-scale, highly interconnected networks pervade both our society and the natural world around us. Uncertainty, on the other hand, is inherent in the underlying data due to a variety of reasons, such as noisy measurements, lack of precise information needs, inference and prediction models, or explicit manipulation, e.g., for privacy purposes. Therefore, uncertain, or probabilistic, graphs are increasingly used to represent noisy linked data in many emerging application scenarios, and they have recently become a hot topic in the database research community. While many classical graph algorithms, such as reachability and shortest path queries, become #P-complete and hence more expensive on uncertain graphs, various complex queries are also emerging over uncertain networks, such as pattern matching, information diffusion, and influence maximization queries. In this tutorial, we discuss the sources of uncertain graphs and their applications, uncertainty modeling, as well as the complexities and algorithmic advances in uncertain graph processing in the context of both classical and emerging graph queries. We emphasize the current challenges and highlight some future research directions.
Fast Reliability Search in Uncertain Graphs
"... Uncertain, or probabilistic, graphs have been increasingly used to represent noisy linked data in many emerging application scenarios, and have recently attracted the attention of the database research community. A fundamental problem on uncertain graphs is reliability, which deals with the probab ..."
Abstract
Uncertain, or probabilistic, graphs have been increasingly used to represent noisy linked data in many emerging application scenarios, and have recently attracted the attention of the database research community. A fundamental problem on uncertain graphs is reliability, which deals with the probability of nodes being reachable from one another. Existing literature has exclusively focused on reliability detection, which asks to compute the probability that two given nodes are connected. In this paper we study reliability search on uncertain graphs, which we define as the problem of computing all nodes reachable from a set of query nodes with probability no less than a given threshold. Existing reliability-detection approaches are not well-suited to efficiently handle the reliability-search problem. We propose RQ-tree, a novel index based on a hierarchical clustering of the nodes in the graph, further optimized using a balanced-minimum-cut criterion. Based on RQ-tree, we define a fast filtering-and-verification online query-evaluation strategy that relies on a maximum-flow-based candidate-generation phase, followed by a verification phase consisting of either a lower-bounding method or a sampling technique. The first verification method returns no incorrect nodes, thus guaranteeing perfect precision, completely avoids sampling, and is more efficient. The second verification method instead ensures better recall. Extensive experiments on real-world uncertain graphs show that our methods are very efficient (over state-of-the-art reliability-detection methods we obtain speedups of up to five orders of magnitude) as well as accurate (precision > 0.95, with recall usually higher than 0.75).
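The baseline that index structures like the one above are designed to beat is plain Monte Carlo sampling of possible worlds: sample a deterministic instantiation, run a reachability search, and average. A self-contained sketch of that baseline (function and parameter names are illustrative):

```python
import random
from collections import deque

def reliable_set(edges, sources, threshold, samples=1000, seed=0):
    """Estimate, via Monte Carlo sampling of possible worlds, the set of
    nodes reachable from `sources` with probability >= threshold."""
    rng = random.Random(seed)
    nodes = {u for u, v, _ in edges} | {v for u, v, _ in edges} | set(sources)
    hits = {}
    for _ in range(samples):
        adj = {v: [] for v in nodes}          # one sampled possible world
        for u, v, p in edges:
            if rng.random() < p:              # edge exists with probability p
                adj[u].append(v)
        seen = set(sources)                   # BFS from all query nodes
        queue = deque(sources)
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    queue.append(y)
        for v in seen:
            hits[v] = hits.get(v, 0) + 1
    return {v for v, c in hits.items() if c / samples >= threshold}

# a certain edge s->a and an impossible edge a->b
print(reliable_set([("s", "a", 1.0), ("a", "b", 0.0)], ["s"], 0.5))
```

As the abstract notes, achieving tight guarantees this way requires a vast number of samples, which is precisely the cost the RQ-tree approach avoids.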
Uncertain Graph Processing through Representative Instances and Sparsification
"... Data in several applications can be represented as an uncertain graph, whose edges are labeled with a probability of existence. Currently, most query and mining tasks on uncertain graphs are based on MonteCarlo sampling, which is rather time consuming for the large uncertain graphs commonly found i ..."
Abstract
Data in several applications can be represented as an uncertain graph, whose edges are labeled with a probability of existence. Currently, most query and mining tasks on uncertain graphs are based on Monte Carlo sampling, which is rather time consuming for the large uncertain graphs commonly found in practice (e.g., social networks). To overcome the high cost, in this doctoral work we propose two approaches. The first extracts deterministic representative instances that capture structural properties of the uncertain graph. The query and mining tasks can then be efficiently processed using deterministic algorithms on these representatives. The second approach sparsifies the uncertain graph (i.e., reduces the number of its edges) and redistributes its probabilities, minimizing the information loss. Monte Carlo sampling applied to the reduced graph then becomes much more efficient.
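The sparsification idea can be illustrated with a toy rule (not the thesis's actual method): keep only the highest-probability edges, then rescale the kept probabilities so the expected number of edges in a sampled world is preserved:

```python
def sparsify(edges, keep_ratio):
    """Toy sparsifier: keep the top `keep_ratio` fraction of edges by
    probability, rescaled to preserve the expected edge count."""
    ranked = sorted(edges, key=lambda e: e[2], reverse=True)
    k = max(1, int(len(ranked) * keep_ratio))
    kept = ranked[:k]
    total = sum(p for _, _, p in edges)       # expected edges, original graph
    kept_total = sum(p for _, _, p in kept)   # expected edges after pruning
    scale = total / kept_total
    return [(u, v, min(1.0, p * scale)) for u, v, p in kept]

# halve the edge count; the two strongest edges absorb the pruned mass
print(sparsify([("a", "b", 0.8), ("b", "c", 0.4),
                ("c", "d", 0.2), ("d", "e", 0.6)], 0.5))
```

Sampling possible worlds from the smaller graph is then proportionally cheaper per sample, which is the efficiency gain the abstract describes.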