Results

**1 - 6** of **6**

### A Multilevel Bilinear Programming Algorithm for the Vertex Separator Problem

### Scheduling Storms and Streams in the Cloud

Motivated by emerging big streaming data processing paradigms (e.g., Twitter Storm, Streaming MapReduce), we investigate the problem of scheduling graphs over a large cluster of servers. Each graph is a job, where nodes represent compute tasks and edges indicate data flows between these compute tasks. Jobs (graphs) arrive randomly over time, and upon completion, leave the system. When a job arrives, the scheduler needs to partition the graph and distribute it over the servers to satisfy load balancing and cost considerations. Specifically, neighboring compute tasks in the graph that are mapped to different servers incur load on the network; thus a mapping of the jobs among the servers incurs a cost that is proportional to the number of "broken edges". We propose a low-complexity randomized scheduling algorithm that, without service preemptions, stabilizes the system with graph arrivals/departures; more importantly, it allows a smooth trade-off between minimizing average partitioning cost and average queue lengths. Interestingly, to avoid service preemptions, our approach does not rely on a Gibbs sampler; instead, we show that the corresponding limiting invariant measure has an interpretation stemming from a loss system.
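
To make the "broken edges" cost concrete, the following minimal sketch counts the edges of a job graph whose endpoints are mapped to different servers. The function name `broken_edge_cost`, the task labels, and the server indices are illustrative assumptions, not taken from the paper, and the sketch covers only the cost term, not the proposed scheduling algorithm.

```python
from typing import Dict, Hashable, Iterable, Tuple

Task = Hashable


def broken_edge_cost(
    edges: Iterable[Tuple[Task, Task]],
    placement: Dict[Task, int],
) -> int:
    """Count edges whose two endpoint tasks sit on different servers.

    `placement` maps each task to a (hypothetical) server index.
    """
    return sum(1 for u, v in edges if placement[u] != placement[v])


# Hypothetical job: a 4-cycle of compute tasks split across two servers.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "d")]
placement = {"a": 0, "b": 0, "c": 1, "d": 1}
cost = broken_edge_cost(edges, placement)
```

Under the abstract's description, the network cost of a mapping is proportional to this count, so a scheduler would trade it off against queue lengths.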

### Streaming Graph Partitioning in the Planted Partition Model

The sheer increase in the size of graph data has created a lot of interest in developing efficient distributed graph processing frameworks. Popular existing frameworks such as GraphLab and Pregel rely on balanced graph partitioning in order to minimize communication and achieve work balance. In this work we contribute to the recent research line of streaming graph partitioning [30, 31, 34], which computes an approximately balanced k-partitioning of the vertex set of a graph using a single pass over the graph stream using degree-based criteria. This graph partitioning framework is well tailored to processing large-scale and dynamic graphs. In this work we introduce the use of higher length walks for streaming graph partitioning and show that their use incurs a minor computational cost which can significantly improve the quality of the graph partition. We perform an average case analysis of our algorithm using the planted partition model [7, 25]. We complement the recent results of Stanton [30] by showing that our proposed method recovers the true partition with high probability even when the gap of the model tends to zero as the size of the graph grows. Furthermore, among the wide range of choices for the length of the walks we show that the proposed length is optimal. Finally, we perform simulations which indicate that our asymptotic results hold even for small graph sizes.
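
As a rough illustration of the single-pass setting, the sketch below assigns each arriving vertex to the part that already holds most of its neighbors, discounted by how full that part is. This is a generic one-hop greedy rule chosen for illustration; it is not the higher-length-walk method the abstract proposes, and the names `stream_partition`, `capacity`, and the example graph are assumptions.

```python
from typing import Dict, Hashable, Iterable, List, Sequence, Tuple

Vertex = Hashable


def stream_partition(
    stream: Iterable[Tuple[Vertex, Sequence[Vertex]]],
    k: int,
    capacity: int,
) -> Dict[Vertex, int]:
    """Assign each streamed vertex to one of k parts in a single pass.

    Each stream item is (vertex, neighbor list). A part's score is the number
    of already-placed neighbors it holds, scaled by its remaining capacity.
    """
    assignment: Dict[Vertex, int] = {}
    sizes: List[int] = [0] * k
    for vertex, neighbors in stream:
        best_part, best_key = None, None
        for part in range(k):
            if sizes[part] >= capacity:
                continue  # part is full, skip it
            placed = sum(1 for u in neighbors if assignment.get(u) == part)
            score = placed * (1 - sizes[part] / capacity)
            key = (score, -sizes[part])  # break ties toward the emptier part
            if best_key is None or key > best_key:
                best_part, best_key = part, key
        if best_part is None:
            raise ValueError("all parts are full; increase capacity")
        assignment[vertex] = best_part
        sizes[best_part] += 1
    return assignment


# Hypothetical input: two disjoint triangles, streamed in interleaved order.
adjacency = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4, 5], 4: [3, 5], 5: [3, 4]}
order = [0, 3, 1, 4, 2, 5]
assignment = stream_partition(((v, adjacency[v]) for v in order), k=2, capacity=3)
```

The rule only looks at direct neighbors that have already arrived; the abstract's contribution is to use longer walks at a minor extra cost, which this sketch does not attempt.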

### A Continuous Refinement Strategy for the Multilevel Computation of Vertex Separators

### An improved, easily computable combinatorial lower bound for weighted graph bipartitioning

2014

There has recently been much progress on exact algorithms for the (un)weighted graph (bi)partitioning problem using branch-and-bound and related methods. In this note we present and improve an easily computable, purely combinatorial lower bound for the weighted bipartitioning problem. The bound is computable in O(n log n + m) time steps for weighted graphs with n vertices and m edges. In the branch-and-bound setting, the bound for each new subproblem can be updated in O(n + (m/n) log n) time steps amortized over a series of n branching steps; a rarely triggered tightening of the bound requires search on the graph of unassigned vertices and can take from O(n + m) to O(nm + n^2 log n) steps depending on implementation and possible bound quality. Representing a subproblem uses O(n) space. Although the bound is weak, we believe that it can be advantageous in a parallel setting to be able to generate many subproblems fast, possibly outweighing the advantages of tighter, but much more expensive (algebraic, spectral, flow) lower bounds. We use a recent priority task-scheduling framework to give a parallel implementation, and show the relative improvements in bound quality and solution speed by the different contributions of the lower bound. A detailed comparison with standardized input graphs to other lower bounds and frameworks is pending. Detailed investigations of branching and subproblem selection rules are likewise not the focus here, but various options are discussed.

### Graph Clustering Evaluation Metrics as Software Metrics

Graph clustering evaluation (GCE) metrics quantify the quality of clusters obtained by graph clustering (community detection) algorithms. In this paper we argue that GCE metrics can be applied to graph representations of software systems in order to evaluate the degree of cohesiveness of software entities. In contrast to widely known cohesion measures used in software engineering, GCE metrics do not ignore external dependencies among software entities, but contrast them with internal dependencies to quantify cohesion. Using the theoretical framework of cohesion measurement in software engineering introduced by Briand et al. we investigate the properties of GCE metrics. Our analysis shows that GCE metrics are theoretically sound with respect to the monotonicity and merge property, but also reveals that they possess certain limitations whose importance is discussed in the paper. Finally, we propose a set of research questions for further empirical studies on this topic.
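
One way to picture a metric that weighs external against internal dependencies is conductance, a standard cluster-quality measure from the graph clustering literature. The abstract does not say which GCE metrics the paper studies, so the choice of conductance, the function name, and the toy dependency graph below are all assumptions made for illustration.

```python
from typing import Hashable, Iterable, Set, Tuple

Node = Hashable


def conductance(edges: Iterable[Tuple[Node, Node]], cluster: Set[Node]) -> float:
    """Conductance of `cluster` in an undirected graph given as an edge list.

    Defined as (edges leaving the cluster) divided by the smaller of the two
    degree sums (inside the cluster, outside the cluster). Lower values mean
    the cluster depends less on the rest of the graph.
    """
    cut = 0      # edges with exactly one endpoint in the cluster
    vol_in = 0   # sum of degrees of nodes inside the cluster
    vol_out = 0  # sum of degrees of nodes outside the cluster
    for u, v in edges:
        u_in, v_in = u in cluster, v in cluster
        vol_in += int(u_in) + int(v_in)
        vol_out += int(not u_in) + int(not v_in)
        if u_in != v_in:
            cut += 1
    denom = min(vol_in, vol_out)
    return cut / denom if denom else 0.0


# Hypothetical dependency graph: methods of class A form the cluster,
# with a single dependency on class B.
edges = [("A.f", "A.g"), ("A.g", "A.h"), ("A.f", "A.h"),
         ("A.h", "B.x"), ("B.x", "B.y")]
score = conductance(edges, {"A.f", "A.g", "A.h"})
```

Read as a software metric, a class whose members mostly depend on each other would get a low score, while one dominated by outside dependencies would get a high one.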