Analysis of Bernstein's Factorization Circuit
, 2002
Cited by 14 (2 self)
In [1], Bernstein proposed a circuit-based implementation of the matrix step of the number field sieve factorization algorithm. These circuits offer an asymptotic cost reduction under the measure "construction cost × run time". We evaluate the cost of these circuits, in agreement with [1], but argue that compared to previously known methods these circuits can factor integers that are 1.17 times larger, rather than 3.01 as claimed (and even this, only under the nonstandard cost measure).
Sorting on the OTIS-Mesh
Proc. 14th Int'l Parallel and Distributed Processing Symp.
, 2000
Cited by 12 (0 self)
In this paper we present sorting algorithms on the recently introduced $N^2$-processor OTIS-Mesh, a network with diameter $4\sqrt{N} - 3$ consisting of $N$ connected meshes of size $\sqrt{N} \times \sqrt{N}$. We show that $k$-$k$ sorting can be done in $8\sqrt{N} + O(N^{1/3})$ steps for $k = 1, 2, 3, 4$ and in $2k\sqrt{N} + O(kN^{1/3})$ steps for $k > 4$, with constant buffer size for all $k$. We show how our algorithms can be modified to achieve $4\sqrt{N} + O(N^{1/3})$ steps for $k = 1, 2, 3, 4$ and $k\sqrt{N} + O(kN^{1/3})$ steps for $k > 4$ in the average case. Finally we show a lower bound of $\max\{4\sqrt{N}, \frac{1}{\sqrt{2}} k\sqrt{N}\}$ steps for $k$-$k$ sorting. 1. Introduction Several models for parallel machines were studied in the past, and it has turned out that no model fits all applications ideally. An especially well-studied topology is the mesh of processors, or mesh-connected array, which is a simple architecture that fulfills the demands of VLSI technology quite well. The two-dimensional mesh is ideally suited for seve...
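The diameter quoted in this abstract can be checked by brute force on small instances. The sketch below is not taken from the paper. It assumes the usual OTIS wiring, in which processor p of group g has one optical link to processor g of group p (no optical link when g equals p), and it assumes 4-neighbor electronic links inside each group. The function names and the parameter `side` (standing for the square root of N) are hypothetical.

```python
from collections import deque
from itertools import product


def otis_mesh_neighbors(node, side):
    """Yield the neighbors of node = (group, proc); each part is a (row, col) pair."""
    group, proc = node
    row, col = proc
    # Electronic links: 4-neighbor mesh inside the group (assumption).
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        r, c = row + dr, col + dc
        if 0 <= r < side and 0 <= c < side:
            yield (group, (r, c))
    # Optical link, assumed transpose rule: (g, p) <-> (p, g) when g != p.
    if group != proc:
        yield (proc, group)


def eccentricity(start, side):
    """Largest breadth-first-search distance from start to any node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in otis_mesh_neighbors(node, side):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return max(dist.values())


def otis_mesh_diameter(side):
    """Diameter of the network with side*side groups of side x side meshes."""
    cells = list(product(range(side), repeat=2))
    return max(eccentricity((g, p), side) for g in cells for p in cells)


if __name__ == "__main__":
    for side in (2, 3, 4):
        # Second and third columns agree if the wiring assumption matches the paper's.
        print(side, otis_mesh_diameter(side), 4 * side - 3)
```

Exhaustive search is only practical for small `side`, since the network has side^4 processors.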
How Helpers Hasten h-Relations
In European Symposium on Algorithms
, 2000
Cited by 5 (1 self)
We study the problem of exchanging a set of messages among a group of processors using the model of simplex communication. Each processor has a unidirectional connection into a fast network. Messages may consist of different numbers of packets. Let h denote the maximum number of packets that a processor must send and receive. If all the packets need to be delivered directly, at least h communication steps are needed to solve the problem. We show that by allowing forwarding, only h + O(1) time steps are needed to exchange all the messages, and this is optimal. Our work was motivated by the importance of irregular message exchanges in distributed-memory parallel computers, but it can also be viewed as an answer to an open problem on scheduling file transfers posed by Coffman, Garey, Johnson, and LaPaugh in 1985.
Optimal Load-Balancing
In Proceedings of IEEE Infocom
, 2005
Cited by 4 (2 self)
This paper is about load-balancing packets across multiple paths inside a switch, or across a network. It is motivated by the recent interest in load-balanced switches. Load-balanced switches provide an appealing alternative to crossbars with centralized schedulers. A load-balanced switch has no scheduler, is particularly amenable to optics, and (most relevant here) guarantees 100% throughput. A uniform mesh is used to load-balance packets uniformly across all 2-hop paths in the switch. In this paper we explore whether this particular method of load-balancing is optimal in the sense that it achieves the highest throughput for a given capacity of interconnect. The method we use allows the load-balanced switch to be compared with ring, torus and hypercube interconnects, too. We prove that for a given interconnect capacity, the load-balancing mesh has the maximum throughput. Perhaps surprisingly, we find that the best mesh is slightly non-uniform, or biased, and has a throughput of N/(2N-1), where N is the number of nodes.
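To get a feel for the closing formula, the throughput N/(2N-1) can be tabulated exactly for a few node counts. This is only an evaluation of the expression quoted in the abstract, not a model of the switch. The function name is hypothetical, and reading the result as a fraction of the interconnect capacity is an assumption.

```python
from fractions import Fraction


def biased_mesh_throughput(num_nodes):
    """Evaluate N / (2N - 1) exactly for a mesh with num_nodes nodes."""
    if num_nodes < 1:
        raise ValueError("num_nodes must be a positive integer")
    return Fraction(num_nodes, 2 * num_nodes - 1)


if __name__ == "__main__":
    for n in (2, 4, 16, 256):
        value = biased_mesh_throughput(n)
        # The value stays above 1/2 and decreases toward it as N grows.
        print(n, value, float(value))
```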
Lattice networks: Capacity limits, optimal routing and queueing behavior
Cited by 3 (1 self)
Lattice networks are widely used in regular settings like grid computing, distributed control, satellite constellations, and sensor networks. Thus, limits on capacity, optimal routing policies, and performance with finite buffers are key issues and are addressed in this paper. In particular, we study the routing algorithms that achieve the maximum rate per node for infinite and finite buffers in the nodes and different communication models, namely uniform communications, central data gathering and border data gathering. In the case of nodes with infinite buffers, we determine the capacity of the network and we characterize the set of optimal routing algorithms that achieve capacity. In the case of nodes with finite buffers, we approximate the queue network problem and obtain the distribution on the queue size at the nodes. This distribution allows us to study the effect of routing on the queue distribution and derive the algorithms that achieve the maximum rate. Index Terms: Border data gathering, data gathering, lattice networks, network capacity, queueing theory, routing, square grid, torus, uniform communication.
Lattice Sensor Networks: Capacity Limits, Optimal Routing and Robustness to Failures
Cited by 3 (0 self)
We study network capacity limits and optimal routing algorithms for regular sensor networks, namely square and torus grid sensor networks, in both the static case (no node failures) and the dynamic case (node failures). For static networks, we derive upper bounds on the network capacity and then characterize and provide optimal routing algorithms whose rate per node is equal to this upper bound, thus obtaining the exact analytical expression for the network capacity. For dynamic networks, the unreliability of the network is modeled in two ways: a Markovian node failure and an energy-based node failure. Depending on the probability of node failure that is present in the network, we propose to use a particular combination of two routing algorithms, the first one being optimal when there are no node failures at all and the second one being appropriate when the probability of node failure is high. The combination of these two routing algorithms defines a family of randomized routing algorithms, each of them being suitable for a given probability of node failure.
Short cut Eulerian routing of datagrams in all optical point-to-point networks
, 2001
Cited by 1 (0 self)
In this paper we describe routing functions for optical packets in point-to-point networks. These functions are based on Eulerian tours. We first define different measures to handle the efficiency of this routing. Then, we describe an algorithm to compute these measures. Moreover, we present such an Eulerian routing in the Square Mesh and we prove that the induced paths are close to the optimal (shortest paths) ones. We also construct a large family of digraphs having an optimal Eulerian routing.
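As background for the Eulerian tours mentioned in this abstract, the sketch below builds one Eulerian circuit of a square mesh in which every link is treated as two opposite arcs. It uses Hierholzer's algorithm, a standard textbook method, and is not claimed to be the routing function or the measure-computing algorithm of the paper. The function names and the two-arcs-per-link modeling are assumptions.

```python
def mesh_arcs(side):
    """Map each node (row, col) of a side x side mesh to its out-neighbors.

    Every undirected link contributes one arc in each direction, so
    in-degree equals out-degree at every node.
    """
    arcs = {}
    for r in range(side):
        for c in range(side):
            out = []
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < side and 0 <= nc < side:
                    out.append((nr, nc))
            arcs[(r, c)] = out
    return arcs


def eulerian_circuit(arcs, start):
    """Hierholzer's algorithm for a strongly connected, balanced digraph.

    Returns a list of nodes; consecutive pairs use every arc exactly once.
    """
    remaining = {v: list(out) for v, out in arcs.items()}
    stack, circuit = [start], []
    while stack:
        v = stack[-1]
        if remaining[v]:
            # Follow an unused arc out of v.
            stack.append(remaining[v].pop())
        else:
            # v has no unused arcs left: it is final in the reversed tour.
            circuit.append(stack.pop())
    circuit.reverse()
    return circuit


if __name__ == "__main__":
    tour = eulerian_circuit(mesh_arcs(3), (0, 0))
    print(len(tour) - 1, "arcs traversed, starting and ending at", tour[0])
```

A packet following such a tour visits every arc, so any destination is eventually reached; how short cuts are taken along the tour is the subject of the paper and is not modeled here.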
Research Proposal Extension
Introduction We request renewal of access to KFA-Julich parallel computing facilities, and in particular the Cray-T3E systems, in order to develop further our research on parallel system interconnects. Our past activities and future work within our project's areas, (a) parallel simulation of ATM routers and networks and (b) communication and consistency benchmarks of the Cray-T3E system architecture, are briefly described below. 2 ATM Switch and Network Simulator The first part of the proposal considers the implementation of a discrete-event, parallel simulator of ATM switches and large ATM networks. Parallel simulation of ATM and communication networks offers natural parallelism. Events from different switches which occur during the same clock cycle are completely independent. They can be scheduled to different processors without any conflicts. Interprocessor communication is only needed to support duality of events at the end of each simulated clock cycle. A data parallel approa
Store-and-Forward Multicast Routing on the Mesh
, 2005
We study the complexity of routing a set of messages with multiple destinations (multicast routing) on an n-node square mesh under the store-and-forward model. A standard argument proves that $\Omega(\sqrt{cn})$ time is required to route n messages, where each message is generated by a distinct node and at most c messages are to be delivered to any individual node. The obvious approach of simply replicating each message into the appropriate number of unicast (single-destination) messages and routing these independently does not yield an optimal algorithm. We provide both randomized and deterministic algorithms for multicast routing, which use constant-size buffers at each node. The randomized algorithm attains optimal performance, while the deterministic algorithm is slower by a factor of $O(\log^2 n)$. We also describe an optimal deterministic algorithm that, however, requires large buffers of size $O(c)$.