Results 1  10
of
37
Scheduling Distributed Applications: The SimGrid Simulation Framework
 In Proceedings of the Third IEEE International Symposium on Cluster Computing and the Grid (CCGrid’03
, 2003
"... Abstract — Since the advent of distributed computer systems an active field of research has been the investigation of scheduling strategies for parallel applications. The common approach is to employ scheduling heuristics that approximate an optimal schedule. Unfortunately, it is often impossible to ..."
Abstract

Cited by 141 (29 self)
 Add to MetaCart
(Show Context)
Abstract — Since the advent of distributed computer systems an active field of research has been the investigation of scheduling strategies for parallel applications. The common approach is to employ scheduling heuristics that approximate an optimal schedule. Unfortunately, it is often impossible to obtain analytical results to compare the efficacy of these heuristics. One possibility is to conducts large numbers of backtoback experiments on real platforms. While this is possible on tightlycoupled platforms, it is infeasible on modern distributed platforms (i.e. Grids) as it is laborintensive and does not enable repeatable results. The solution is to resort to simulations. Simulations not only enables repeatable results but also make it possible to explore wide ranges of platform and application scenarios. In this paper we present the SimGrid framework which enables the simulation of distributed applications in distributed computing environments for the specific purpose of developing and evaluating scheduling algorithms. This paper focuses on SimGrid v2, which greatly improves on the first version of the software with more realistic networkmodels and topologies. SimGrid v2 also enables the simulation of distributed scheduling agents, which has become critical for current scheduling research in largescale platforms. After describing and validating these features, we present a case study by which we demonstrate the usefulness of SimGrid for conducting scheduling research. I.
Scheduling Strategies for MasterSlave Tasking on Heterogeneous Processor Grids
, 2002
"... In this paper, we consider the problem of allocating a large number of independent, equalsized tasks to a heterogeneous "grid" computing platform. We use a nonoriented graph to model a grid, where resources can have different speeds of computation and communication, as well as different ..."
Abstract

Cited by 102 (33 self)
 Add to MetaCart
(Show Context)
In this paper, we consider the problem of allocating a large number of independent, equalsized tasks to a heterogeneous "grid" computing platform. We use a nonoriented graph to model a grid, where resources can have different speeds of computation and communication, as well as different overlap capabilities. We show how to determine the optimal steadystate scheduling strategy for each processor (the fraction of time spent computing and the fraction of time spent communicating with each neighbor). This result holds for a quite general framework, allowing for cycles and multiple paths in the interconnection graph, and allowing for several masters. Because
BandwidthCentric Allocation of Independent Tasks on Heterogeneous Platforms
 In International Parallel and Distributed Processing Symposium (IPDPS’2002). IEEE Computer
, 2001
"... In this paper, we consider the problem of allocating a large number of independent, equalsized tasks to a heterogenerous "grid" computing platform. Such problems arise in collaborative computing eorts like SETI@home. We use a tree to model a grid, where resources can have dierent speeds ..."
Abstract

Cited by 85 (29 self)
 Add to MetaCart
In this paper, we consider the problem of allocating a large number of independent, equalsized tasks to a heterogenerous "grid" computing platform. Such problems arise in collaborative computing eorts like SETI@home. We use a tree to model a grid, where resources can have dierent speeds of computation and communication, as well as dierent overlap capabilities. We dene a base model, and show how to determine the maximum steadystate throughput of a node in the base model, assuming we already know the throughput of the subtrees rooted at the node's children. Thus, a bottomup traversal of the tree determines the rate at which tasks can be processed in the full tree. The best allocation is bandwidthcentric: if enough bandwidth is available, then all nodes are kept busy; if bandwidth is limited, then tasks should be allocated only to the children which have suciently small communication times, regardless of their computation power. We then show how nodes with other capabilities ones that allow more or less overlapping of computation and communication than the base model can be transformed to equivalent nodes in the base model. We also show how to handle a more general communication model. Finally, we present simulation results of several demanddriven task allocation policies that show that our bandwidthcentric method obtains better results than allocating tasks to all processors on a rstcome, rst serve basis. Key words: heterogeneous computer, allocation, scheduling, grid, metacomputing. Corresponding author: Jeanne Ferrante The work of Larry Carter and Jeanne Ferrante was performed while visiting LIP. 1 1
A tale of two sieves
 Notices Amer. Math. Soc
, 1996
"... It is the best of times for the game of factoring large numbers into their prime factors. In 1970 it was barely possible to factor “hard ” 20digit numbers. In 1980, in the heyday of the BrillhartMorrison continued fraction factoring algorithm, factoring of 50digit numbers was becoming commonplace ..."
Abstract

Cited by 49 (2 self)
 Add to MetaCart
(Show Context)
It is the best of times for the game of factoring large numbers into their prime factors. In 1970 it was barely possible to factor “hard ” 20digit numbers. In 1980, in the heyday of the BrillhartMorrison continued fraction factoring algorithm, factoring of 50digit numbers was becoming commonplace. In 1990 my own quadratic sieve factoring algorithm had doubled the length of the numbers that could be factored, the record having 116 digits. By 1994 the quadratic sieve had factored the famous 129digit RSA challenge number that had been estimated in Martin Gardner’s 1976 Scientific American column to be safe for 40 quadrillion years (though other estimates around then were more modest). But the quadratic sieve is no longer the champion. It was replaced by Pollard’s number field sieve in the spring of 1996, when that method successfully split a 130digit RSA challenge number in about 15 % of the time the quadratic sieve would have taken. In this article we shall briefly meet these factorization algorithms—these two sieves—and some of the many people who helped to develop them. In the middle part of this century, computational issues seemed to be out of fashion. In most books the problem of factoring big numbers
Parallel Algorithms for Integer Factorisation
"... The problem of finding the prime factors of large composite numbers has always been of mathematical interest. With the advent of public key cryptosystems it is also of practical importance, because the security of some of these cryptosystems, such as the RivestShamirAdelman (RSA) system, depends o ..."
Abstract

Cited by 44 (17 self)
 Add to MetaCart
The problem of finding the prime factors of large composite numbers has always been of mathematical interest. With the advent of public key cryptosystems it is also of practical importance, because the security of some of these cryptosystems, such as the RivestShamirAdelman (RSA) system, depends on the difficulty of factoring the public keys. In recent years the best known integer factorisation algorithms have improved greatly, to the point where it is now easy to factor a 60decimal digit number, and possible to factor numbers larger than 120 decimal digits, given the availability of enough computing power. We describe several algorithms, including the elliptic curve method (ECM), and the multiplepolynomial quadratic sieve (MPQS) algorithm, and discuss their parallel implementation. It turns out that some of the algorithms are very well suited to parallel implementation. Doubling the degree of parallelism (i.e. the amount of hardware devoted to the problem) roughly increases the size of a number which can be factored in a fixed time by 3 decimal digits. Some recent computational results are mentioned – for example, the complete factorisation of the 617decimal digit Fermat number F11 = 2211 + 1 which was accomplished using ECM.
Improvements to the general number field sieve for discrete logarithms in prime fields
 Mathematics of Computation
, 2003
"... Abstract. In this paper, we describe many improvements to the number field sieve. Our main contribution consists of a new way to compute individual logarithms with the number field sieve without solving a very large linear system for each logarithm. We show that, with these improvements, the number ..."
Abstract

Cited by 26 (2 self)
 Add to MetaCart
Abstract. In this paper, we describe many improvements to the number field sieve. Our main contribution consists of a new way to compute individual logarithms with the number field sieve without solving a very large linear system for each logarithm. We show that, with these improvements, the number field sieve outperforms the gaussian integer method in the hundred digit range. We also illustrate our results by successfully computing discrete logarithms with GNFS in a large prime field. 1.
Recent progress and prospects for integer factorisation algorithms
 In Proc. of COCOON 2000
, 2000
"... Abstract. The integer factorisation and discrete logarithm problems are of practical importance because of the widespread use of public key cryptosystems whose security depends on the presumed difficulty of solving these problems. This paper considers primarily the integer factorisation problem. In ..."
Abstract

Cited by 24 (1 self)
 Add to MetaCart
(Show Context)
Abstract. The integer factorisation and discrete logarithm problems are of practical importance because of the widespread use of public key cryptosystems whose security depends on the presumed difficulty of solving these problems. This paper considers primarily the integer factorisation problem. In recent years the limits of the best integer factorisation algorithms have been extended greatly, due in part to Moore’s law and in part to algorithmic improvements. It is now routine to factor 100decimal digit numbers, and feasible to factor numbers of 155 decimal digits (512 bits). We outline several integer factorisation algorithms, consider their suitability for implementation on parallel machines, and give examples of their current capabilities. In particular, we consider the problem of parallel solution of the large, sparse linear systems which arise with the MPQS and NFS methods. 1
Optimizing the steadystate throughput of scatter and reduce operations on heterogeneous platforms
, 2005
"... ..."
ECM on Graphics Cards
"... Abstract. This paper reports recordsetting performance for the ellipticcurve method of integer factorization: for example, 604.99 curves/second for ECM stage 1 with B1 = 8192 for 280bit integers on a single PC. The stateoftheart GMPECM software handles 171.42 curves/second for ECM stage 1 with ..."
Abstract

Cited by 16 (4 self)
 Add to MetaCart
(Show Context)
Abstract. This paper reports recordsetting performance for the ellipticcurve method of integer factorization: for example, 604.99 curves/second for ECM stage 1 with B1 = 8192 for 280bit integers on a single PC. The stateoftheart GMPECM software handles 171.42 curves/second for ECM stage 1 with B1 = 8192 for 280bit integers using all four cores of a 2.4GHz Core 2 Quad Q6600. The extra speed takes advantage of extra hardware, specifically two NVIDIA GTX 280 graphics cards, using a new ECM implementation introduced in this paper. Our implementation uses Edwards curves, relies on new parallel addition formulas, and is carefully tuned for the highly parallel GPU architecture. On a single GTX 280 the implementation performs 22.66 million modular multiplications per second for a general 280bit modulus. GMPECM, using all four cores of a Q6600, performs 17.91 million multiplications per second. This paper also reports speeds on other graphics processors: for example,
A Montgomerylike Square Root for the Number Field Sieve
, 1998
"... The Number Field Sieve (NFS) is the asymptotically fastest factoring algorithm known. It had spectacular successes in factoring numbers of a special form. Then the method was adapted for general numbers, and recently applied to the RSA130 number [6], setting a new world record in factorization. Th ..."
Abstract

Cited by 14 (3 self)
 Add to MetaCart
(Show Context)
The Number Field Sieve (NFS) is the asymptotically fastest factoring algorithm known. It had spectacular successes in factoring numbers of a special form. Then the method was adapted for general numbers, and recently applied to the RSA130 number [6], setting a new world record in factorization. The NFS has undergone several modifications since its appearance. One of these modifications concerns the last stage: the computation of the square root of a huge algebraic number given as a product of hundreds of thousands of small ones. This problem was not satisfactorily solved until the appearance of an algorithm by Peter Montgomery. Unfortunately, Montgomery only published a preliminary version of his algorithm [15], while a description of his own implementation can be found in [7]. In this paper, we present a variant of the algorithm, compare it with the original algorithm, and discuss its complexity.