## Optimal scaling of a gradient method for distributed resource allocation (2006)

Venue: Journal of Optimization Theory and Applications

Citations: 9 (2 self)

### BibTeX

@ARTICLE{Xiao06optimalscaling,

author = {L. Xiao and S. Boyd},

title = {Optimal scaling of a gradient method for distributed resource allocation},

journal = {Journal of Optimization Theory and Applications},

year = {2006},

volume = {129},

pages = {469--488}

}

### Abstract

We consider a class of weighted gradient methods for distributed resource allocation over a network. Each node of the network is associated with a local variable and a convex cost function; the sum of the variables (resources) across the network is fixed. Starting with a feasible allocation, each node updates its local variable in proportion to the differences between the marginal costs of itself and its neighbors. We focus on how to choose the proportional weights on the edges (scaling factors for the gradient method) to make this distributed algorithm converge and on how to make the convergence as fast as possible. We give sufficient conditions on the edge weights for the algorithm to converge monotonically to the optimal solution; these conditions have the form of a linear matrix inequality. We give some simple, explicit methods to choose the weights that satisfy these conditions. We derive a guaranteed convergence rate for the algorithm and find the weights that minimize this rate by solving a semidefinite program. Finally, we extend the main results to problems with general equality constraints and problems with block separable objective function.

Key Words. Distributed optimization, resource allocations, weighted gradient methods, convergence rates, semidefinite programming.
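As an illustration of the update described in the abstract, here is a minimal Python sketch; it is not the authors' implementation. It assumes quadratic costs $f_i(x_i) = (1/2)a_i(x_i - c_i)^2$ (so $f_i'' = a_i$), a four-node path graph, and one constant edge weight $\alpha$ chosen strictly inside the range $-1/\max_i d_i u_i < \alpha < 0$ that appears in the citation contexts of this record. The function name `center_free_allocation` and all numerical values are hypothetical.

```python
def center_free_allocation(a, c, edges, x0, steps=2000):
    """Run x_i <- x_i - sum_j W_ij * (f_j'(x_j) - f_i'(x_i)) with W_ij = alpha on every edge.

    Assumes quadratic costs f_i(x) = 0.5 * a[i] * (x - c[i])**2 (illustrative choice).
    """
    n = len(a)
    neighbors = [[] for _ in range(n)]
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)

    # f_i'' = a[i], so u_i = a[i] bounds the curvature; pick alpha strictly
    # inside (-1 / max_i(d_i * u_i), 0), where d_i is the degree of node i.
    alpha = -0.9 / max(len(neighbors[i]) * a[i] for i in range(n))

    def total_cost(x):
        return sum(0.5 * a[i] * (x[i] - c[i]) ** 2 for i in range(n))

    x = list(x0)
    history = [total_cost(x)]
    for _ in range(steps):
        grad = [a[i] * (x[i] - c[i]) for i in range(n)]  # marginal costs
        x = [
            x[i] - alpha * sum(grad[j] - grad[i] for j in neighbors[i])
            for i in range(n)
        ]
        history.append(total_cost(x))
    return x, history


# Hypothetical data: path graph 0-1-2-3, all 4 units of resource start at node 0.
a = [1.0, 2.0, 3.0, 4.0]
c = [1.0, 0.0, -1.0, 2.0]
x, history = center_free_allocation(a, c, [(0, 1), (1, 2), (2, 3)], [4.0, 0.0, 0.0, 0.0])
marginal = [a[i] * (x[i] - c[i]) for i in range(4)]
print(sum(x), marginal)  # the sum stays at 4; marginal costs should be nearly equal
```

Each node only reads the marginal costs of its graph neighbors, so the total resource is conserved at every step and no central coordinator is needed.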

### Citations

4682 |
Matrix Analysis
- Horn, Johnson
- 1985
Citation Context: ...clude that $x(t)$ is optimal. Next, we derive the guaranteed convergence rate (13). Note that, with the condition (11) and the inequality (17), we have $f(x(t+1)) - f(x(t)) \le -(1/2)\lambda_{n-1}(V)\|e(t)\|^2$. (18) We shall derive another inequality relating $f(x(t))$, $f^*$, and $\|e(t)\|^2$. To do so, we use again the Taylor expansion of $f$ at $x(t)$. Using the assumption (2), we have $f(y) \ge f(x(t)) + \nabla f(x(t))^T (y - x(t))$...

2256 |
Equation of state calculations by fast computing machines
- Metropolis, Rosenbluth, et al.
- 1953
Citation Context: ...ing the right-hand side over $z$ yields $f(y) \ge f(x(t)) - (1/2)\|e(t)\|^2$. Since this is true for all feasible $y$, it is certainly true for $x^*$. In other words, we have $f^* \ge f(x(t)) - (1/2)\|e(t)\|^2$. (19) Now combining the inequalities (18) and (19) yields $f(x(t+1)) - f^* \le [1 - \lambda_{n-1}(V)][f(x(t)) - f^*]$, (20) [JOTA: VOL. 129, NO. 3, JUNE 2006, p. 477] which gives the desired results (12) and (13). Sinc...

391 | Algebraic Graph Theory
- Godsil, Royle
- 2001
Citation Context: ... node $i$. From the condition (24), we can deduce a range of $\alpha$ that guarantees the convergence of the algorithm, $-1/\left(\max_{i\in\mathcal{N}} d_i u_i\right) < \alpha < 0$. (25) Actually, it is also safe to set $\alpha = -1/\left(\max_{i\in\mathcal{N}} d_i u_i\right)$ (26) unless all the $d_i u_i$'s are equal. This can be verified by considering the irreducibly diagonally dominant property of the matrix $2U^{-1} - W$; see e.g., Ref. 18, Section 6.2. We call the constant weight i...

219 |
Interior Point Algorithms: Theory and Analysis
- Ye
- 1997
Citation Context: ...the optimal solution. 2.1. Conditions for Symmetric Weights. When the weight matrix $W$ is symmetric, the convergence conditions reduce to $W = W^T$, $W\mathbf{1} = 0$, (21) $2W + (1/n)\mathbf{1}\mathbf{1}^T \succ 0$, (22) $2U^{-1} - W \succ 0$. (23) To see this, we first rewrite the LMI (14) for symmetric $W$, $\begin{bmatrix} 2W + (1/n)\mathbf{1}\mathbf{1}^T & W \\ W & U^{-1} \end{bmatrix} \succ 0$. Applying Schur complements, this LMI is equivalent to $2W + (1/n)\mathbf{1}\mathbf{1}^T \succ$...

130 | Network Optimization: Continuous and Discrete Models, Athena Scientific
- Bertsekas
- 1998
Citation Context: ...ing order, $\lambda_1(Z) \ge \lambda_2(Z) \ge \cdots \ge \lambda_n(Z)$, where $\lambda_i(Z)$ denotes the $i$th largest eigenvalue of $Z$. Theorem 2.1. If the weight matrix $W$ satisfies $\mathbf{1}^T W = 0$, $W\mathbf{1} = 0$, and $\lambda_{n-1}\left(L^{1/2}(W + W^T - W^T U W)L^{1/2}\right) > 0$, (11) then the algorithm (5) converges to the optimal solution $x^*$ of problem (1) and the objective function decreases monotonically. In fact, we have $f(x(t)) - f^* \le \eta(W)^t [f(x(0)) - f^*]$ (12) with a g...

123 |
Parallel and Distributed Computation
- Bertsekas
- 1989
Citation Context: ...d directly. The center-free algorithm considered in this paper belongs to a more general class of gradient-like algorithms studied in Ref. 15. In this paper, we give sufficient conditions weaker than (10) for the center-free algorithm to converge and optimize the edge weights to get fast convergence. Our method is closely related to the approach in Ref. 16, where the problem of finding the fastest ...

116 |
A microeconomic approach to optimal resource allocation in distributed computer systems
- Kurose, Simha
Citation Context: ...on problem (1) has a unique optimal solution $x^*$. Let $\nabla f(x) = (f_1'(x_1), \ldots, f_n'(x_n))$ denote the gradient of $f$ at $x$. The optimality conditions for this problem are $\mathbf{1}^T x^* = c$, $\nabla f(x^*) = p^* \mathbf{1}$, (3) where $p^*$ is the (unique) optimal Lagrange multiplier. In a centralized setup, many methods can be used to solve the problem (1) or equivalently the optimality con...

97 |
Network Flows and Monotropic Optimization
- Rockafellar
- 1984
Citation Context: ...pace of $W$. Assuming that the weights satisfy (8), we have $W_{ii} = -\sum_{j\in\mathcal{N}_i} W_{ij}$, which can be substituted into equation (4) to get $x_i(t+1) = x_i(t) - \sum_{j\in\mathcal{N}_i} W_{ij}\left(f_j'(x_j(t)) - f_i'(x_i(t))\right)$, $i = 1,\ldots,n$. (9) Thus, the change in the local variable at each step is given by a weighted sum of the differences between its own derivative value and those of its neighbors. The equation (9) has a simple inter...

61 | What Do We Know about the Metropolis Algorithm?
- Diaconis, Saloff-Coste
- 1998
Citation Context: ... it is certainly true for $x^*$. In other words, we have $f^* \ge f(x(t)) - (1/2)\|e(t)\|^2$. (19) Now combining the inequalities (18) and (19) yields $f(x(t+1)) - f^* \le [1 - \lambda_{n-1}(V)][f(x(t)) - f^*]$, (20) which gives the desired results (12) and (13). Since all $x(t)$ are feasible, we always have $f(x(t)) - f(x(t+1)) \le f(x(t)) - f^*$. Applying the two inequalitie...

41 |
Handbook of Semidefinite Programming: Theory, Algorithms and Applications
- Wolkowicz, Saigal, et al.
- 1999
Citation Context: ...rty can be expressed as $|2/u_i - W_{ii}| > \sum_{j\in\mathcal{N}_i} |W_{ij}|$, $i = 1,\ldots,n$. If all the off-diagonal elements of $W$ are nonpositive and $W_{ii} = -\sum_{j\in\mathcal{N}_i} W_{ij}$, the above condition becomes $\sum_{j\in\mathcal{N}_i} |W_{ij}| < 1/u_i$, $i = 1,\ldots,n$. (24) When all the numbers $u_i$ are equal or simply take their maximum, as assumed in Ref. 7, the inequality (24) is exactly the third condition in (10). 3. Simple Weight Selections. In this section, we ...

13 |
A class of center-free resource allocation algorithms
- Ho, Servi, et al.
- 1980
Citation Context: ...rates $x(t)$ of the algorithm are feasible, i.e., satisfy $\mathbf{1}^T x(t) = c$ for all $t$. With the assumption that $x(0)$ is feasible, this requirement will be met provided the weight matrix satisfies $\mathbf{1}^T W = 0$, (7) since we then have $\mathbf{1}^T x(t+1) = \mathbf{1}^T x(t) - \mathbf{1}^T W \nabla f(x(t)) = \mathbf{1}^T x(t)$. We will also require, naturally, that the optimal point $x^*$ is a fixed point of (5), i.e.,...

7 |
Planning without prices
- Heal
- 1969
Citation Context: ...ces. We use the following family of functions: $f_i(x_i) = (1/2)a_i(x_i - c_i)^2 + \log[1 + \exp(b_i(x_i - d_i))]$, $i = 1,\ldots,n$, with the coefficients $a_i, b_i, c_i, d_i$ generated randomly with uniform distributions on $[0, 2]$, $[-2, 2]$, $[-10, 10]$, $[-10, 10]$, respectively. The second derivatives of these functions are $f_i''(x_i) = a_i + b_i^2 \exp[b_i(x_i - d_i)]/[1 + \exp(b_i(x_i - d_i))]^2$, $i = 1,\ldots,n$, which have the following l...

3 |
Asynchronous gradient algorithms for a class of convex separable network flow problems
- Baz
- 1996
Citation Context: ...and the objective function decreases monotonically. In fact, we have $f(x(t)) - f^* \le \eta(W)^t [f(x(0)) - f^*]$ (12) with a guaranteed convergence rate $\eta(W) = 1 - \lambda_{n-1}\left(L^{1/2}(W + W^T - W^T U W)L^{1/2}\right)$. (13) Moreover, the condition (11) is equivalent to the strict linear matrix inequality (LMI) $\begin{bmatrix} W + W^T + (1/n)\mathbf{1}\mathbf{1}^T & W^T \\ W & U^{-1} \end{bmatrix} \succ 0$. (14) Proof. Let $\Delta x(t) = x(t+1) - x$...

3 |
Distributed asynchronous deterministic and stochastic gradient optimization algorithms
- Tsitsiklis, Bertsekas, et al.
- 1986
Citation Context: ...t))W] $\times\, \nabla f(x(t))$. Using the assumption (2), we have $\nabla^2 f(z(t)) \preceq U$, so $f(x(t+1)) \le f(x(t)) - (1/2)\nabla f(x(t))^T (W + W^T - W^T U W)\nabla f(x(t)) = f(x(t)) - (1/2)\nabla f(x(t))^T L^{-1/2} V L^{-1/2} \nabla f(x(t))$, (15) where $V = L^{1/2}(W + W^T - W^T U W)L^{1/2}$. (16) From conditions (7) and (8), i.e., $\mathbf{1}^T W = 0$ and $W\mathbf{1} = 0$, we conclude that $L^{-1/2}\mathbf{1}$ is an eigenvector of the symmetric matrix $V$ associated with the eigenva...

1 |
Decentralization and Computation in Resource Allocation, in: Essays in Economics and Econometrics, edited by Pfouts
- Arrow, Hurwicz
- 1960
Citation Context: ...t. $\sum_{i=1}^n x_i = c$, (1b) where $c \in \mathbb{R}$ is a given constant. We can think of $x_i$ as the amount of some resource located at node $i$ and interpret $-f_i$ as the local (concave) utility function. The problem (1) is to find an allocation of the resource that maximizes the total utility $\sum_{i=1}^n -f_i(x_i)$. In this paper, we are interested in distributed algorithms for solving this problem, where each node is only ...

1 |
Interior-Point Polynomial Algorithms
- Nesterov, Nemirovskii
Citation Context: ...ithm to converge to the optimal solution. 2.1. Conditions for Symmetric Weights. When the weight matrix $W$ is symmetric, the convergence conditions reduce to $W = W^T$, $W\mathbf{1} = 0$, (21) $2W + (1/n)\mathbf{1}\mathbf{1}^T \succ 0$, (22) $2U^{-1} - W \succ 0$. (23) To see this, we first rewrite the LMI (14) for symmetric $W$, $\begin{bmatrix} 2W + (1/n)\mathbf{1}\mathbf{1}^T & W \\ W & U^{-1} \end{bmatrix} \succ 0$. Applying Schur complements, this LMI is equivalent...
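
The symmetric-weight conditions (21) to (23) quoted in the citation contexts can be checked numerically for a candidate weight matrix. The following is a minimal pure-Python sketch that tests positive definiteness through a Cholesky factorization. The helper names, the four-node path graph, and the curvature bounds `u` are all assumptions made for illustration and do not come from the paper.

```python
import math


def is_positive_definite(M, tol=1e-12):
    """Return True if the symmetric matrix M admits a Cholesky factorization."""
    n = len(M)
    chol = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = M[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            if i == j:
                if s <= tol:  # non-positive pivot: not positive definite
                    return False
                chol[i][i] = math.sqrt(s)
            else:
                chol[i][j] = s / chol[j][j]
    return True


def symmetric_weights_ok(W, u, tol=1e-9):
    """Check W = W^T, W1 = 0, 2W + (1/n)11^T > 0 and 2U^{-1} - W > 0, with U = diag(u)."""
    n = len(W)
    symmetric = all(abs(W[i][j] - W[j][i]) <= tol for i in range(n) for j in range(n))
    zero_rows = all(abs(sum(row)) <= tol for row in W)
    cond22 = [[2.0 * W[i][j] + 1.0 / n for j in range(n)] for i in range(n)]
    cond23 = [[(2.0 / u[i] if i == j else 0.0) - W[i][j] for j in range(n)] for i in range(n)]
    return symmetric and zero_rows and is_positive_definite(cond22) and is_positive_definite(cond23)


def path_weights(n, alpha):
    """Constant edge weight alpha on a path graph; the diagonal makes each row sum to zero."""
    W = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        W[i][i + 1] = W[i + 1][i] = alpha
        W[i][i] -= alpha
        W[i + 1][i + 1] -= alpha
    return W


u = [1.0, 2.0, 3.0, 4.0]  # hypothetical upper bounds on f_i''
print(symmetric_weights_ok(path_weights(4, -0.15), u))  # small weight: conditions hold
print(symmetric_weights_ok(path_weights(4, -1.0), u))   # weight too large: (23) fails
```

A Cholesky test is used here only to keep the sketch dependency-free; an eigenvalue routine from a numerical library would serve equally well.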