Sorting on the OTIS-Mesh
- Proc. 14th Int’l Parallel and Distributed Processing Symp
, 2000
Cited by 11 (0 self)
In this paper we present sorting algorithms on the recently introduced $N^2$-processor OTIS-Mesh, a network with diameter $4\sqrt{N} - 3$ consisting of $N$ connected meshes of size $\sqrt{N} \times \sqrt{N}$. We show that k-k sorting can be done in $8\sqrt{N} + O(N^{1/3})$ steps for $k = 1, 2, 3, 4$ and in $2k\sqrt{N} + O(kN^{1/3})$ steps for $k > 4$, with constant buffer size for all $k$. We show how our algorithms can be modified to achieve $4\sqrt{N} + O(N^{1/3})$ steps for $k = 1, 2, 3, 4$ and $k\sqrt{N} + O(kN^{1/3})$ steps for $k > 4$ in the average case. Finally we show a lower bound of $\max\{4\sqrt{N}, \frac{1}{\sqrt{2}} k\sqrt{N}\}$ steps for k-k sorting.
1. Introduction
Several models for parallel machines were studied in the past, and it has turned out that no model is ideal for all applications. An especially well-studied topology is the mesh of processors, or mesh-connected array, a simple architecture that fulfills the demands of VLSI technology quite well. The two-dimensional mesh is ideally suited for seve...
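To get a feel for how the bounds quoted in this abstract compare, the following minimal sketch evaluates only their leading terms for a given N and k. It assumes the garbled formulas read as 8*sqrt(N) and 2k*sqrt(N) (worst case), 4*sqrt(N) and k*sqrt(N) (average case), max(4*sqrt(N), k*sqrt(N)/sqrt(2)) (lower bound) and 4*sqrt(N) - 3 (diameter). The lower-order O(N^(1/3)) terms are dropped, and the function name `otis_mesh_leading_terms` is hypothetical, not taken from the paper.

```python
import math


def otis_mesh_leading_terms(n: int, k: int) -> dict:
    """Leading terms of the k-k sorting bounds on an N^2-processor OTIS-Mesh.

    Assumption: formulas are read from the abstract as described in the
    lead-in; the O(N^(1/3)) and O(k N^(1/3)) terms are ignored.
    """
    if n < 1 or k < 1:
        raise ValueError("n and k must be positive")
    root = math.sqrt(n)
    return {
        "diameter": 4 * root - 3,
        "worst_case": 8 * root if k <= 4 else 2 * k * root,
        "average_case": 4 * root if k <= 4 else k * root,
        "lower_bound": max(4 * root, k * root / math.sqrt(2)),
    }


if __name__ == "__main__":
    for k in (1, 4, 8):
        print(k, otis_mesh_leading_terms(16, k))
```

Under this reading, the worst-case leading term is twice the lower bound for small k, and the ratio approaches 2*sqrt(2) for large k.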
Analysis of Bernstein's Factorization Circuit
, 2002
Cited by 7 (2 self)
In [1], Bernstein proposed a circuit-based implementation of the matrix step of the number field sieve factorization algorithm. These circuits offer an asymptotic cost reduction under the measure "construction cost × run time". We evaluate the cost of these circuits, in agreement with [1], but argue that compared to previously known methods these circuits can factor integers that are 1.17 times larger, rather than 3.01 as claimed (and even this, only under the non-standard cost measure).
How Helpers Hasten h-Relations
- In European Symposium on Algorithms
, 2000
Cited by 5 (1 self)
We study the problem of exchanging a set of messages among a group of processors using the model of simplex communication. Each processor has a unidirectional connection into a fast network. Messages may consist of different numbers of packets. Let h denote the maximum number of packets that a processor must send and receive. If all the packets need to be delivered directly, at least h communication steps are needed to solve the problem. We show that by allowing forwarding, only h + O(1) time steps are needed to exchange all the messages, and this is optimal. Our work was motivated by the importance of irregular message exchanges in distributed-memory parallel computers, but it can also be viewed as an answer to an open problem on scheduling file transfers posed by Coffman, Garey, Johnson, and LaPaugh in 1985.
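The quantity h defined in this abstract can be computed directly from a message set. The sketch below is only an illustration: it assumes "send and receive" means the sum of packets a processor sends plus the packets it receives (a natural reading under simplex communication, where a processor does one or the other in a step), and the `(src, dst, packets)` tuple format and the name `h_of` are hypothetical.

```python
from collections import defaultdict
from typing import Iterable, Tuple

Message = Tuple[int, int, int]  # (source processor, destination processor, packets)


def h_of(messages: Iterable[Message]) -> int:
    """Maximum, over all processors, of packets sent plus packets received.

    Assumption: h counts sends and receives together per processor.
    """
    load = defaultdict(int)
    for src, dst, packets in messages:
        if packets < 0:
            raise ValueError("packet counts must be non-negative")
        load[src] += packets  # packets this processor must send
        load[dst] += packets  # packets this processor must receive
    return max(load.values(), default=0)


if __name__ == "__main__":
    # Processor 1 sends 2 packets and receives 3, so it carries the maximum load.
    print(h_of([(0, 1, 3), (1, 2, 2), (2, 0, 1)]))
```

The function only measures the instance; it does not construct the forwarding schedule the paper describes.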
Optimal Load-Balancing
- in Proceedings of IEEE Infocom
, 2005
Cited by 4 (2 self)
This paper is about load-balancing packets across multiple paths inside a switch, or across a network. It is motivated by the recent interest in load-balanced switches. Load-balanced switches provide an appealing alternative to crossbars with centralized schedulers. A load-balanced switch has no scheduler, is particularly amenable to optics, and -- most relevant here -- guarantees 100% throughput. A uniform mesh is used to load-balance packets uniformly across all 2-hop paths in the switch. In this paper we explore whether this particular method of load-balancing is optimal in the sense that it achieves the highest throughput for a given capacity of interconnect. The method we use allows the load-balanced switch to be compared with ring, torus and hypercube interconnects, too. We prove that for a given interconnect capacity, the load-balancing mesh has the maximum throughput. Perhaps surprisingly, we find that the best mesh is slightly non-uniform, or biased, and has a throughput of N/(2N-1), where N is the number of nodes.
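The closing formula N/(2N-1) is easy to tabulate. The following small sketch evaluates it exactly for a few node counts; it only restates the expression quoted in the abstract (it does not model the switch), and the function name `biased_mesh_throughput` is hypothetical.

```python
from fractions import Fraction


def biased_mesh_throughput(n: int) -> Fraction:
    """Evaluate N / (2N - 1), the throughput quoted for the biased mesh."""
    if n < 1:
        raise ValueError("n must be a positive number of nodes")
    return Fraction(n, 2 * n - 1)


if __name__ == "__main__":
    for n in (2, 4, 16, 1024):
        t = biased_mesh_throughput(n)
        print(n, t, float(t))
```

Since N/(2N-1) = 1/2 + 1/(2(2N-1)), the value stays slightly above 1/2 and approaches it as N grows.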
Lattice networks: Capacity limits, optimal routing and queueing behavior
Cited by 1 (1 self)
Lattice networks are widely used in regular settings like grid computing, distributed control, satellite constellations, and sensor networks. Thus, limits on capacity, optimal routing policies, and performance with finite buffers are key issues and are addressed in this paper. In particular, we study the routing algorithms that achieve the maximum rate per node for infinite and finite buffers in the nodes and different communication models, namely uniform communications, central data gathering and border data gathering. In the case of nodes with infinite buffers, we determine the capacity of the network and we characterize the set of optimal routing algorithms that achieve capacity. In the case of nodes with finite buffers, we approximate the queue network problem and obtain the distribution on the queue size at the nodes. This distribution allows us to study the effect of routing on the queue distribution and derive the algorithms that achieve the maximum rate. Index Terms: Border data gathering, data gathering, lattice networks, network capacity, queueing theory, routing, square grid, torus, uniform communication.
Short cut eulerian routing of datagrams in all optical point-to-point networks
, 2001
Cited by 1 (0 self)
In this paper we describe routing functions for optical packets in point-to-point networks. These functions are based on Eulerian tours. We first define different measures to handle the efficiency of this routing. Then, we describe an algorithm to compute these measures. Moreover, we present such an Eulerian routing in the Square Mesh and we prove that the induced paths are close to the optimal (shortest paths) ones. We also construct a large family of digraphs having an optimal Eulerian routing.
Research Proposal Extension
Introduction
We request renewal of access to KFA-Julich parallel computing facilities, and in particular the Cray-T3E systems, in order to develop further our research on parallel system interconnects. Our past activities and future work within our project's areas, (a) parallel simulation of ATM routers and networks and (b) communication and consistency benchmarks of the Cray-T3E system architecture, are briefly described below.
2 ATM Switch and Network Simulator
The first part of the proposal considers the implementation of a discrete-event, parallel simulator of ATM switches and large ATM networks. Parallel simulation of ATM and communication networks offers natural parallelism. Events from different switches which occur during the same clock cycle are completely independent. They can be scheduled to different processors without any conflicts. Interprocessor communication is only needed to support duality of events at the end of each simulated clock-cycle. A data parallel approa
Store-and-Forward Multicast Routing on the Mesh
, 2005
We study the complexity of routing a set of messages with multiple destinations (multicast routing) on an n-node square mesh under the store-and-forward model. A standard argument proves that $\Omega(\sqrt{cn})$ time is required to route n messages, where each message is generated by a distinct node and at most c messages are to be delivered to any individual node. The obvious approach of simply replicating each message into the appropriate number of unicast (single-destination) messages and routing these independently does not yield an optimal algorithm. We provide both randomized and deterministic algorithms for multicast routing, which use constant-size buffers at each node. The randomized algorithm attains optimal performance, while the deterministic algorithm is slower by a factor of $O(\log^2 n)$. We also describe an optimal deterministic algorithm that, however, requires large buffers of size $O(c)$.
Algorithms for Data Migration
- ALGORITHMICA
, 2007
The data migration problem is the problem of computing a plan for moving data objects stored on devices in a network from one configuration to another. Load balancing or changing usage patterns might necessitate such a rearrangement

