## Sequential Importance Sampling Algorithms for Estimating the All-terminal Reliability Polynomial of Sparse Graphs

### BibTeX

@MISC{Harris_sequentialimportance,
  author = {David G. Harris and Francis Sullivan},
  title = {Sequential Importance Sampling Algorithms for Estimating the All-terminal Reliability Polynomial of Sparse Graphs},
  year = {}
}

### Abstract

The all-terminal reliability polynomial of a graph counts its connected subgraphs of various sizes. Algorithms based on sequential importance sampling (SIS) have been proposed to estimate a graph's reliability polynomial. We show upper bounds on the relative error of three sequential importance sampling algorithms. We use these to create a hybrid algorithm, which selects the best SIS algorithm for a particular graph G and a particular coefficient of the polynomial. This hybrid algorithm is particularly effective when G has low degree. For graphs of average degree ≤ 11, it is the fastest known algorithm; for graphs of average degree ≤ 45, it is the fastest known polynomial-space algorithm. For example, when a graph has average degree 3, this algorithm estimates a coefficient to relative error ε in time O(1.26^n ε^{−2}). Although the algorithm may take exponential time, in practice it can have good performance even on medium-scale graphs. We provide experimental results that show quite practical performance on graphs with hundreds of vertices and thousands of edges. By contrast, alternative algorithms are either not rigorous or are completely impractical for such large graphs.

**ACM Subject Classification** G.2.1 Combinatorics, G.2.2 Graph theory, G.3 Probability and statistics

### Introduction

Let G be a connected undirected multigraph with vertex set V and edge set E. We define Rel(p), the all-terminal reliability polynomial of G, to be the probability that the graph remains connected when edges are removed independently with probability p. This function is a polynomial which can be factored as

$$\mathrm{Rel}(p) = \sum_{i} N_i (1-p)^i p^{m-i},$$

where N_i is the number of connected subgraphs of G with i edges. This polynomial has various physical applications, for example determining the reliability of a computer network or power grid.

\* Research supported in part by NSF Awards CNS-1010789 and CCF-1422569.
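As a concrete illustration of the definition (this is direct Monte Carlo simulation, not one of the SIS estimators studied in the paper), Rel(p) can be estimated by repeatedly deleting each edge independently with probability p and checking connectivity. The function names here are ours:

```python
import random

def connected(n, edges):
    """Union-find check that vertices 0..n-1 are connected by `edges`."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    comps = n
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            comps -= 1
    return comps == 1

def rel_mc(n, edges, p, samples=20000, seed=0):
    """Monte Carlo estimate of Rel(p): keep each edge independently with
    probability 1 - p and count how often the graph stays connected."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        kept = [e for e in edges if rng.random() >= p]
        hits += connected(n, kept)
    return hits / samples
```

For the 4-cycle, Rel(p) = (1−p)^4 + 4p(1−p)^3, so `rel_mc(4, [(0,1),(1,2),(2,3),(3,0)], 0.1)` should land near 0.9477. As the paper notes, this naive approach degrades badly when 1 − Rel(p) is tiny, which is the regime the SIS estimators target.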
Exactly computing the reliability polynomial is known to be #P-complete [20]. The best-known algorithm at this time is due to [3]; it runs in 2^n time and 2^n space, or 3^n time and polynomial space. While this is a great theoretical achievement, and works well on small graphs (n ≈ 40), it is completely impractical for larger graphs. In this paper, we give an algorithm targeting sparse graphs (graphs with average degree α = O(1)), running in polynomial space and exponential time, to estimate the reliability polynomial coefficients N_t to some relative error ε. When α is a small constant, the algorithm may be significantly faster than [3]. When α ≤ 11, this algorithm runs faster than the exponential-space variant of [3], and our algorithm is polynomial-space; in fact, our algorithm is the fastest known method in this regime. When α ≤ 45, it runs faster than the polynomial-space variant of [3]. Furthermore, our algorithm can be easily parallelized: absolutely no communication or memory is required between different processing elements.

Furthermore, in practice this algorithm seems to scale much better than the theoretical guarantees would imply. We have conducted extensive experiments on relatively large graphs, up to hundreds of vertices. The algorithm can achieve a 10% error rate on all coefficients after a few minutes of run-time. This puts real-world networks within reach. By contrast, existing algorithms such as [3] simply cannot be run on such large graphs: they would use more space or time than exists on any computer.

### Background

A variety of approaches have been proposed to estimate the reliability polynomial, or fragments of it, for a graph. Some algorithms seek to compute Rel(p_0) for a fixed probability p_0, such as [10] or [16].
The problem of estimating Rel(p_0) is related to the problem of estimating the reliability polynomial coefficients N_i, but they are not equivalent, especially in terms of evaluating the relative error. For example, Karger's algorithm [16] gives a fully-polynomial relative approximation scheme (fpras) for the problem of estimating 1 − Rel(p_0). Other algorithms, for example [19], [20], can estimate N_t in polynomial time, but only for a narrow range of values of t, such as t = O(1) or t = m − n − O(1). [5] gives an algorithm to transform a given dense graph G into a relatively sparse graph G′, with only O(n log n) edges, which has approximately the same reliability. There is no known polynomial-time algorithm to estimate arbitrary coefficients.

Many heuristic algorithms have been studied to approximate the all-terminal reliability. For example, [4] discusses an algorithm based on Markov chain Monte Carlo sampling of connected subgraphs. For the special case of directed acyclic graphs, [18] describes an algorithm based on Monte Carlo sampling for the closely related problem of estimating s−t connectivity. Algorithms based on sequential importance sampling are described in [2], [14], [8]. While these approximations can be effective in practice, typically they do not have rigorous complexity bounds. This can be especially problematic for estimation algorithms: if the algorithm is run for too few iterations, then it may produce a wrong estimate despite outwardly appearing to run successfully.

### The Tutte polynomial

The reliability polynomial is a special case of the Tutte polynomial, a very powerful graph invariant. While the reliability polynomial counts only the connected subgraphs of G, the Tutte polynomial also encodes information about the decomposition of G into its connected components.

### Our contribution

In this paper, we will propose a new algorithm to estimate the reliability polynomial coefficients N_t for graphs whose average degree α is small. The running time of this algorithm will have the form $(\chi_\alpha + o_\alpha(1))^n/\epsilon^2$, where ε is the desired relative error and χ_α is a parameter depending on α. The space required by the algorithm is polynomial in all parameters. This algorithm has a number of advantages over the existing ones, both from a theoretical and a practical point of view.

First, the theory: for small α, the algorithm may be significantly faster than Bjorklund et al. or any other known algorithm. As we will see, this algorithm will be faster than any known algorithm for α ≤ 11. At this point, the algorithm of Bjorklund et al. becomes faster, although the latter requires exponential space. Our algorithm will be faster than any known polynomial-space algorithm for α ≤ 45.¹ Note that we are considering the most general class of sparse graphs, while other algorithms of [3], [7], [11] were specialized to more restrictive classes of sparse graphs (most prominently, bounded-degree).

Second, in practice this algorithm seems to scale much better than the theoretical guarantees would imply. We have conducted extensive experiments on medium-scale graphs, up to hundreds of vertices. The algorithm can achieve a 10% error rate on all coefficients after a few minutes of run-time. This puts real-world networks within reach, whereas the alternative exponential-time algorithms would never finish their computations on such large graphs.

There is a synergy between the theoretical analysis and the practical performance. Certain key parameters must be tuned accurately for our algorithm to guarantee good performance. These parameters are difficult to set empirically, and if they are set incorrectly, the algorithm can appear to run perfectly well but nevertheless return inaccurate results.
By setting these parameters in accordance with the worst-case analysis, we tend to achieve results that are accurate in practice. This is despite the fact that the worst-case analysis, for such large graphs, is not directly relevant to practical computations.

¹ This is likely a conservative estimate of the worst-case behavior of our algorithm; an accurate estimate would require solving some open problems in extremal graph theory.

(This paper appeared in APPROX/RANDOM'15.)

### 2 General approach

The algorithm we propose is based on sequential importance sampling (SIS). Three SIS approaches were proposed in [2], [14], [8]. Experimental evidence showed that these algorithms could give quite effective estimates, at least in certain parameter ranges.

The SIS algorithms can be divided into two types. The first, which we refer to as top-down, start by removing edges from the initial graph G and checking the connectivity of the resulting subgraph. The second, which we refer to as bottom-up, start from a spanning tree of G and add edges. On any particular graph, the top-down algorithms tend to have better relative variance when t is large; the bottom-up algorithms tend to have better relative variance when t is small. (Recall N_t is the coefficient of interest.)

There is a natural strategy to combine the strengths of these algorithms: for any given graph G and any desired coefficient N_t, run all three algorithms in parallel and select the one which gives the best estimate. Unfortunately, there is no generic method to determine which statistic has the lowest variance, and in fact exponentially many samples may be required to calculate a sample variance accurately. Thus, exponential time might be incurred simply in accumulating enough statistical information to select the appropriate algorithm.

In this paper, we compute upper bounds on the precision of the three algorithms. These upper bounds play two roles.
First, they allow us to find upper bounds on the running time required to achieve any desired level of relative accuracy. The main probabilistic tool we use in this paper is the relative variance of an estimator F, given by $\mathrm{rv}(F) = \mathrm{Var}(F)/\mathrm{E}[F]^2$. As a consequence of the Chebyshev inequality, one achieves the desired precision ε with high probability after extracting Θ(rv(F)/ε²) independent samples. Hence, if we can bound the relative variance of any of the SIS algorithms as (χ + o(1))^n, this implies that the running time of the resulting hybrid algorithm will also be (χ + o(1))^n/ε². We will see that for a fixed value of the average degree α, we can achieve good bounds on χ_α.

The second, less obvious function of these upper bounds is that they allow us to create a hybrid algorithm, which computes the upper bounds and outputs the statistic with the smallest upper bound on its relative variance. That is, for any fixed graph G, these bounds give us a sensible strategy on which coefficients to estimate with the top-down algorithm and which coefficients to estimate with a bottom-up algorithm.

In practice, these SIS algorithms can be far more accurate than the worst-case analysis would predict. We will examine some examples of this in Section 5. Thus, we can evaluate this hybrid algorithm empirically, and we see that it can give useful estimates for graphs of moderate size (hundreds of vertices), while the exact exponential-time algorithms are impractical even at tens of vertices.

Once we estimate N_t, we can automatically estimate Rel(p) for any fixed value of p. For, if we estimate N_t up to relative error ε for all t, we may then estimate

$$\mathrm{Rel}(p) \approx \sum_t \hat N_t (1-p)^t p^{m-t}, \qquad (1)$$

and we note that all the summands in (1) are positive.

Despite the advantages of our algorithm, we note that it is ultimately much more specialized than that of Bjorklund et al., for three reasons. First, we only estimate the coefficients instead of computing them exactly, and we do so probabilistically instead of deterministically.
Second, this algorithm applies only to the reliability polynomial, which is a special case of the Tutte polynomial. Third, the complexity of our algorithm is exponential in the number of edges, and hence for dense graphs it is slower.

### Notation

We are given a connected undirected graph G, which may have multiple edges or self-loops. We will use m, n to refer to the number of edges and vertices in the graph G. Our algorithm seeks to estimate N_t, and t is always the index of the coefficient we are interested in.

The average degree α is given by α = 2m/n. Note that as G is connected, we must have α ≥ 2 − 2/n. In this paper, we will only be concerned with the case that α is a small constant. Hence, we will assume throughout that α ≤ C, where C is some large fixed constant. Together with the restriction that α ≥ 1, this implies that any asymptotics in m can be reduced to equivalent asymptotics in n. We will make this assumption m = Θ(n) throughout this paper.

We let K = m − n + 1 and k = t − n + 1. Note that spanning trees of G have n − 1 edges, and so k counts the distance of t from its maximal value K. This is particularly useful for algorithms which add edges to spanning trees.

We let κ(G) denote the number of (labelled) spanning trees of G. Note that by the Kirchhoff formula, κ(G) can be computed in polynomial time.

We will frequently use the entropy function in estimating binomial coefficients. To simplify the notation in these estimates we define l(x) = x ln x. By Stirling's formula, for any c ∈ [0, 1] we have

$$\ln \binom{n}{cn} = -n\,(l(c) + l(1-c)) + O(\log n). \qquad (2)$$

We will always use the notation β = t/n. Note that β ∈ [1 − 1/n, α/2].

### The Kruskal-Katona Theorem

We will frequently need to lower-bound N_t in terms of κ(G). A key technical tool to do so comes from the Kruskal-Katona Theorem. This is a basic combinatorial principle which gives bounds on the number of objects in a family of sets which is upward-closed.
Graph connectivity has this property: if a graph H is connected and H′ ⊇ H, then H′ must be connected as well. We will use a simplified version of this principle, which is slightly less accurate than the full Kruskal-Katona bound but is adequate for our purposes.²

**Theorem 1** ([17]). Let m̃ be the unique real number such that

$$\kappa(G) = \binom{\tilde m}{n-1}.$$

Then for all t ≥ n − 1 we have

$$N_t \ge \binom{\tilde m}{t}.$$

² We contrast our use of the Kruskal-Katona Theorem to that of [1], which uses this bound, or variants of it, to estimate N_t itself. We use this bound to estimate the error committed by our SIS algorithms, but these algorithms do not themselves refer to the Kruskal-Katona bounds in any way.

We can explain the intuition behind this result. If we completely ignore graph connectivity, then there are exactly $\binom{m}{t}$ subgraphs with t edges. Thus, the Kruskal-Katona Theorem states that the number of connected subgraphs and the number of subgraphs have a similar behavior as a function of t. We will use throughout the notation γ = m̃/n.

### 3 The basic algorithms

We begin by describing two very simple statistics to estimate N_t. These statistics are accurate for different ranges of the parameter t. We then discuss how to select the best statistic for a given value of t.

### Top-down algorithm bounds

Let us first consider a top-down estimate, which starts with the original graph and removes edges until it reaches a subgraph with t edges. The statistic TOPDOWN chooses a uniformly random edge set H ⊆ E with |H| = t; if H is connected, it estimates $N_t = \binom{m}{t}$; otherwise it estimates N_t = 0. Note that this algorithm is not really a "top-down" algorithm any more, since we could produce H by adding edges as well as removing edges. The statistic produced by TOPDOWN is clearly an unbiased estimator for N_t.

**Proposition 2.** Define

$$f_T = l(\alpha/2) - l(\alpha/2 - \beta) - l(\gamma) + l(\gamma - \beta).$$

Then TOPDOWN has relative variance exp(n(f_T + o(1))).

**Proof.** TOPDOWN is a scaled Bernoulli random variable with success probability $N_t/\binom{m}{t}$, so it has relative variance $\binom{m}{t}/N_t - 1$. Recalling that t = βn, m = αn/2, m̃ = γn, we apply Stirling's formula (2).
(We note here that m = Θ(n), so all asymptotic terms can be reduced to o(n).) This gives the estimate

$$\mathrm{rv} \le \exp(n(f_T + o(1))).$$

We note one key difference between this algorithm, based on estimating the reliability coefficients, and similar algorithms such as [21] which seek to compute the coefficients exactly. In this case, we are doing a Monte Carlo estimation of the number of connected subgraphs, so our algorithm is more efficient when the subgraphs are numerous. Enumerative algorithms, by contrast, must explore each subgraph individually, and so they are more efficient when the subgraphs are few.

### Bottom-up, no edge weighting

The bottom-up algorithm is based on starting with a random spanning tree and adding edges to it. This type of algorithm is discussed in [8], [14]. In Section 4, we will consider more sophisticated variants which add edges with a non-uniform probability. However, as a warm-up exercise, we consider the following simple algorithm, which we refer to as BOTTOMUP:

1. Choose a spanning tree T uniformly at random from G. (This can be done efficiently, as described in e.g. [22].)
2. Choose k = t − n + 1 edges uniformly at random from G − T, and add these edges to obtain a connected subgraph H ⊆ G. Estimate

$$\hat N_t = \frac{\kappa(G)\binom{K}{k}}{\kappa(H)}.$$

Recall that κ(G) and κ(H) can be computed in polynomial time, so this is a polynomial-time algorithm.

**Proposition 3.** Define

$$f_B = l(\alpha/2 - 1) + l(\beta) + l(\gamma - \beta) - l(\gamma - 1) - l(\beta - 1) - l(\alpha/2 - \beta).$$

Then algorithm BOTTOMUP is an unbiased estimator with relative variance at most exp(n(f_B + o(1))).

**Proof.** Let H be a connected subgraph of G with t edges. Then BOTTOMUP selects H with probability

$$P(H) = \frac{\kappa(H)}{\kappa(G)\binom{K}{k}}.$$

Summing over all such H, we see that the expected value of the estimate is

$$\mathrm{E}[\hat N_t] = \sum_H P(H)\,\frac{\kappa(G)\binom{K}{k}}{\kappa(H)} = N_t.$$

Next, we can estimate

$$\mathrm{E}[\hat N_t^2] = \sum_H P(H)\left(\frac{\kappa(G)\binom{K}{k}}{\kappa(H)}\right)^2 = \kappa(G)\binom{K}{k}\sum_H \frac{1}{\kappa(H)} \le \kappa(G)\binom{K}{k}\,N_t.$$

Hence, the relative variance is at most

$$\frac{\kappa(G)\binom{K}{k}}{N_t} - 1 \le \frac{\binom{\tilde m}{n-1}\binom{K}{k}}{\binom{\tilde m}{t}}.$$

Applying Stirling's formula (2), and recalling m = Θ(n), shows that this is at most exp(n(f_B + o(1))), as desired.

We seek to maximize f_B subject to f_T = f_B. In general, there is no closed-form solution. However, for any given value of α we can numerically optimize this as follows.
Given any value of γ, there is a unique value β such that f_T = f_B, which can be found to arbitrary accuracy via bisection. This essentially reduces the search to a single variable γ. One can verify that the resulting function f_T is unimodal in γ, and hence the maximizing value of γ can be found by a golden-section search strategy. This allows us to find values β*, γ* maximizing min(f_T, f_B) to arbitrary accuracy. The resulting value χ_α is the worst case for the running time, as a function of α. That is, when α is constant and n → ∞, we have rv ≤ χ_α^{n+o(n)} for graphs with average degree ≤ α.

### Hybrid algorithm

The following table shows bounds on χ_α for various values of α. Note that unlike algorithms which restrict to regular or bounded-degree graphs, there is no restriction on the integrality of α. For α ≤ 8, this algorithm is strictly faster than [3]; for α ≤ 35 this algorithm is faster than the polynomial-space variant of Bjorklund et al. We will improve this still further in the next section.

### 4 Bottom-up, edge weighting

We consider a variant of the bottom-up algorithm in which a weighting function is used to select the edges to add to the spanning tree, as described in [14]. Again without concerning ourselves with polynomial efficiency, we summarize this algorithm as follows:

1. Select a spanning tree T uniformly at random.
2. For i = 1, . . . , k, randomly select an edge e_i to add to T. We use the probability distribution P_i (which is conditioned on e_1, . . . , e_{i−1} and T) to select the edge e_i; this distribution is given by the edge-weighting rule described below.
3. Estimate N̂_t by importance-weighting the resulting sample.

This is a generalization of the algorithm of [14], in that we allow a weighting factor ρ ∈ [0, 1]. (In the algorithm of [14], ρ = 1.) Note that for ρ = 0, there is a uniform distribution on the new edge e_i, and so this reduces to the unweighted bottom-up algorithm of Section 3.2.
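A minimal sketch of the unweighted BOTTOMUP estimator of Section 3.2, restricted to simple connected graphs (function names are ours; Wilson's loop-erased random-walk algorithm stands in for the uniform spanning-tree sampler of [22], and κ is computed via the Kirchhoff matrix-tree formula):

```python
import math
import random
from fractions import Fraction

def kappa(n, edges):
    """Matrix-tree theorem: kappa(G) is the determinant of the
    Laplacian with row and column 0 deleted (exact, via Fractions)."""
    L = [[Fraction(0)] * n for _ in range(n)]
    for u, v in edges:
        L[u][u] += 1; L[v][v] += 1
        L[u][v] -= 1; L[v][u] -= 1
    M = [row[1:] for row in L[1:]]
    det = Fraction(1)
    for i in range(n - 1):
        piv = next((r for r in range(i, n - 1) if M[r][i] != 0), None)
        if piv is None:
            return 0
        if piv != i:
            M[i], M[piv] = M[piv], M[i]
            det = -det
        det *= M[i][i]
        for r in range(i + 1, n - 1):
            f = M[r][i] / M[i][i]
            for c in range(i, n - 1):
                M[r][c] -= f * M[i][c]
    return int(det)

def uniform_spanning_tree(n, edges, rng):
    """Wilson's algorithm: loop-erased random walks give a uniform
    spanning tree (assumes a simple connected graph)."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v); adj[v].append(u)
    in_tree = [False] * n
    in_tree[0] = True
    nxt = [None] * n
    for start in range(n):
        u = start
        while not in_tree[u]:        # random walk, recording the last exit
            nxt[u] = rng.choice(adj[u])
            u = nxt[u]
        u = start
        while not in_tree[u]:        # commit the loop-erased path
            in_tree[u] = True
            u = nxt[u]
    return {tuple(sorted((u, nxt[u]))) for u in range(1, n)}

def bottomup_sample(n, edges, t, rng):
    """One BOTTOMUP sample: kappa(G) * C(K, k) / kappa(H)."""
    tree = uniform_spanning_tree(n, edges, rng)
    rest = [e for e in edges if tuple(sorted(e)) not in tree]  # K edges
    k = t - n + 1
    H = list(tree) + rng.sample(rest, k)
    return kappa(n, edges) * math.comb(len(rest), k) / kappa(n, H)
```

On K4 with t = 5 every sample evaluates to κ(G)·C(3,2)/κ(H) = 16·3/8 = 6, which is exactly N_5 (every 5-edge subgraph of K4 is connected and has 8 spanning trees), so the estimator has zero variance there; on less symmetric graphs one averages many samples.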
Experimental evidence in [14] suggested that this algorithm had better variance for estimating N_t for small values of t. In this section, we examine this algorithm rigorously and show an upper bound which is better than that of Section 3.2.

For any connected subgraph H ⊆ G, we define edge weights λ_H(e) ≥ 0 for the edges e ∈ G − H (depending on the weighting factor ρ), along with the normalizing constant S_H = Σ_{e∈G−H} λ_H(e); and we define H_i = T ∪ {e_1, . . . , e_{i−1}}, where T is the spanning tree selected at step (1) of this algorithm. Thus, we have P_i(e) = λ_{H_i}(e)/S_{H_i} for i = 1, . . . , k.

We first show that this is an unbiased estimator. Any T, e_1, . . . , e_k is selected with probability

$$p = \frac{1}{\kappa(G)} P_1(e_1) P_2(e_2) \cdots P_k(e_k).$$

Summing over T, e_1, . . . , e_k, we compute the expected value of the statistic, which is exactly N_t; so the statistic is unbiased.

Next, we consider the mean square of the statistic. To interpret this quantity, consider the following random process. We select a connected subgraph H with t edges, uniformly at random among all such subgraphs. Next, we select a random spanning tree T of H and a random permutation of the remaining edges e_1, . . . , e_k ∈ H − T. We then output a random variable R determined by T, e_1, . . . , e_k; the relative variance of the weighted bottom-up estimate is governed by the expectation of R. For any connected subgraph H ⊆ G with βn edges, we have R ≤ exp(n(f_{B2} + o(1))), for an explicit function f_{B2} of α, β, γ and ρ. This then implies that the relative variance of the bottom-up algorithm is bounded by rv ≤ exp(n(f_{B2} + o(1))).

As before, for any given α we seek β*, γ* so that the resulting upper bound min(f_T, f_{B2}) is maximized. This expression is too complicated for us to solve in closed form, or even to prove that all relevant functions have the appropriate smoothness to allow a rigorous numerical analysis. However, for any fixed ρ ∈ [0, 1] we can approximately solve it numerically. Using off-the-shelf numerical libraries, we optimize f_{B2} subject to f_T = f_{B2}. We can furthermore set ρ to minimize the resulting f_{B2}. For any average degree α, we select an optimal parameter ρ*.
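The bisection-plus-golden-section procedure itself is straightforward to sketch. Note the caveats: the closed forms below for f_T and f_B are our own entropy-form reconstructions of Propositions 2 and 3 (assumptions, not the paper's verified expressions), and the feasible range of γ, which in the paper comes from extremal bounds on spanning-tree counts, is simply taken here as an input interval; any numbers this produces are illustrative only.

```python
import math

def l(x):
    """l(x) = x ln x, extended by l(0) = 0."""
    return 0.0 if x <= 0 else x * math.log(x)

# Assumed entropy-form exponents for the two estimators (reconstructions).
def f_T(beta, gamma, alpha):
    return l(alpha / 2) - l(alpha / 2 - beta) - l(gamma) + l(gamma - beta)

def f_B(beta, gamma, alpha):
    return (l(alpha / 2 - 1) + l(beta) + l(gamma - beta)
            - l(gamma - 1) - l(beta - 1) - l(alpha / 2 - beta))

def crossing_beta(gamma, alpha):
    """Bisection for f_T = f_B: the difference is decreasing in beta
    on (1, gamma), so a sign change pins down the crossing."""
    lo, hi = 1.0 + 1e-9, gamma - 1e-9
    for _ in range(100):
        mid = (lo + hi) / 2
        if f_T(mid, gamma, alpha) > f_B(mid, gamma, alpha):
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def chi(alpha, gamma_lo, gamma_hi):
    """Golden-section search over the feasible gamma-interval,
    maximizing min(f_T, f_B) at the crossing point."""
    def g(gamma):
        beta = crossing_beta(gamma, alpha)
        return min(f_T(beta, gamma, alpha), f_B(beta, gamma, alpha))
    phi = (math.sqrt(5) - 1) / 2
    a, b = gamma_lo, gamma_hi
    c, d = b - phi * (b - a), a + phi * (b - a)
    for _ in range(80):
        if g(c) > g(d):
            b, d = d, c
            c = b - phi * (b - a)
        else:
            a, c = c, d
            d = a + phi * (b - a)
    return math.exp(g((a + b) / 2))   # rv grows roughly as chi^n
```

Taking min(f_T, f_B) at the bisection result also keeps the search well-defined when the two curves do not cross inside the interval, in which case the bisection converges to an endpoint.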
With these optimized parameters, for α ≤ 11 this algorithm is strictly faster than [3]; for α ≤ 45 it is faster than the polynomial-space variant of [3].

### 5 Practical Performance

One key advantage of this algorithm is that it can be used on real-world graphs with up to hundreds of vertices. In this case, the worst-case analysis would indicate exponentially low accuracy. However, in practice the accuracy may be much better than this.

For our first test case, we generated Erdős-Rényi graphs of average degree 10 and ran the algorithm as specified in Appendix A. (Qualitatively similar results are seen for other edge densities.) We tabulate the estimated relative error as well as the running time of a single iteration. For the most part, implementing this algorithm requires only slight modifications to the codes of [14], [15]; see these for more details.

The relative variance is clearly growing exponentially with n, but the rate of growth (about 1.05^n) is much slower than the bound of 1.89^n indicated by the worst-case analysis. Hence for graphs of moderate size n ≈ 200 this algorithm remains quite practical.

A similar result was seen for Barabasi-Albert random graphs, again of average degree 10. The relative variance is significantly higher in this case, but the rate of increase remains slowly exponential, about 1.075^n. The running times of these algorithms are relatively modest, growing at a rate of about n^{1.5}.

Recall that to achieve a relative error ε, we must repeat this algorithm for a number of samples T = rv × ε^{−2}. In order to achieve a relative error of, say, 10% on the Erdős-Rényi graph with n = 150, we would need to run for approximately 200000 iterations; this would entail a running time of about 300 seconds. (And furthermore this work could be completely parallelized.) Hence this algorithm provides a quite practical method for estimating graph reliability on medium-scale graphs.
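The iteration-count arithmetic above can be checked directly; here `rv_growth` is the empirically fitted per-vertex growth rate of the relative variance (about 1.05 for the Erdős-Rényi runs), and the function name is ours:

```python
def samples_needed(rv_growth, n, eps):
    """Chebyshev-style sample count T = rv * eps^-2,
    modeling rv as rv_growth ** n."""
    return rv_growth ** n / eps ** 2

# Erdős-Rényi example from the text: n = 150, 10% relative error.
T = samples_needed(1.05, 150, 0.10)   # on the order of 10^5 iterations
```

This lands in the same ballpark as the roughly 200000 iterations quoted in the text; the exact figure depends on the fitted growth rate, which 1.05 only approximates.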
### Possible further improvements

The parameters (specifically, the choice of β) suggested by our worst-case analysis are not optimal for these sample graphs. By choosing parameters better, we could reduce the error by orders of magnitude. There seem to be three main reasons these bounds are not tight in practice.

First, the top-down algorithm of [15] is much more accurate than the simple top-down algorithm we analyze in this paper.

Second, our estimate for the accuracy of the bottom-up algorithm is too conservative. The bottom-up error should be discounted by a factor of E[1/κ(H)], where H is the subgraph chosen by the bottom-up algorithm. In the worst case, the expected value of this term might be very large if some subgraphs H have many spanning trees and some have few. In practice, all the subgraphs H tend to have about the same number of spanning trees, and so E[1/κ(H)] is small.

Third, our method of setting the parameter ρ depends on estimating κ(H ∪ e_1 ∪ · · · ∪ e_i), where H is a subgraph of G and the e_i are edges in G − H. It is currently an open problem in graph theory to determine tight bounds in this case. We are forced to use an upper bound for κ(H ∪ e_1 ∪ · · · ∪ e_i) which is much larger than necessary. This causes us to set ρ to an excessively large value.

### Conclusion

We have shown exponential bounds on the relative variance of three SIS algorithms for estimating the graph reliability polynomial. These bounds are simple computable functions of G. By choosing the algorithm which minimizes the upper bound, we define a hybrid algorithm with worst-case relative variance O(χ_α^n). Hence with O(χ_α^n/ε²) work, one can, with high probability, estimate the graph reliability polynomial to relative error ε. Although this is exponential, we believe it is the fastest known algorithm for estimating the graph reliability polynomial when the average degree α is small.
Note that this bound on relative variance depended on bounding the number of spanning trees of certain sparse graphs. As this is an open problem in graph theory, the bounds we use are far from tight. It is likely that the true behavior of this algorithm is much better than the indicated values of χ_α.

In practice, these SIS algorithms tend to exhibit exponential relative variance on real-world graphs [14], [15]; however, the errors increase much more slowly than the worst-case analysis predicts. Hence, on many medium-scale graphs (n ≈ 200) these algorithms can give a quite practical approach to estimating graph reliability. In these cases, exact exponential-time algorithms such as [3] are absolutely infeasible.

### Acknowledgments

Thanks to Isabel Beichl and Aravind Srinivasan for helpful comments and revisions.