Results 11-20 of 34
Bipartite Modular Multiplication
, 2005
Cited by 6 (0 self)
This paper proposes a new fast method for calculating modular multiplication. The calculation is performed using a new representation of residue classes modulo M that enables the splitting of the multiplier into two parts. These two parts are then processed separately, in parallel, potentially doubling the calculation speed. The upper part and the lower part of the multiplier are processed using the interleaved modular multiplication algorithm and the Montgomery algorithm, respectively. Conversions back and forth between the original integer set and the new residue system can be performed at speeds up to twice that of the Montgomery method without the need for precomputed constants. This new method is suitable both for hardware implementation and for software implementation in a multiprocessor environment. Although this paper focuses on the application of the new method in the integer field, the technique used to speed up the calculation can also easily be adapted for operation in the binary extended field GF(2^m).
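The split described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names, the bit-serial loops, and the use of Python's `%` operator in place of hardware subtractors are assumptions made purely for illustration. The sketch assumes an odd modulus `m` and a split point `k`, and computes x·y·2^(-k) mod m as the sum of an interleaved (MSB-first) product over the upper part of the multiplier and a Montgomery (LSB-first) product over the lower part.

```python
def interleaved_modmul(x, y, m):
    """MSB-first interleaved multiplication: returns (x * y) mod m."""
    acc = 0
    for i in reversed(range(y.bit_length())):
        acc = (acc << 1) % m          # double the partial result, reduce
        if (y >> i) & 1:
            acc = (acc + x) % m       # add the multiplicand, reduce
    return acc


def montgomery_modmul(x, y, m, k):
    """LSB-first Montgomery multiplication over the k low bits of y.

    Returns (x * y * 2^(-k)) mod m. Requires m to be odd.
    """
    acc = 0
    for i in range(k):
        if (y >> i) & 1:
            acc += x
        if acc & 1:                   # make acc even so the halving is exact
            acc += m
        acc >>= 1
    return acc % m


def bipartite_modmul(x, y, m, k):
    """Illustrative split: x * y * 2^(-k) mod m, with y = y_hi * 2^k + y_lo."""
    y_hi = y >> k
    y_lo = y & ((1 << k) - 1)
    hi = interleaved_modmul(x, y_hi, m)       # x * y_hi mod m
    lo = montgomery_modmul(x, y_lo, m, k)     # x * y_lo * 2^(-k) mod m
    return (hi + lo) % m
```

The two partial products are independent of each other, which is what would allow them to run in parallel on separate hardware units or processors; in this sequential sketch they are simply computed one after the other.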
XTR Implementation on Reconfigurable Hardware
 of Lecture Notes in Computer Science
, 2004
Cited by 5 (1 self)
Recently, Lenstra and Verheul proposed an efficient cryptosystem called XTR. This system represents elements of F*_{p^6} with order dividing p^2 − p + 1 by their trace over F_{p^2}. Compared with the usual representation, this one achieves a ratio of three between security size and manipulated data. Consequently, very promising performance compared with RSA and ECC is expected. In this paper, we deal with hardware implementation of XTR, and more precisely with Field Programmable Gate Arrays (FPGAs). The intrinsic parallelism of such a device is combined with efficient modular multiplication algorithms to obtain effective implementation(s) of XTR with respect to time and area. We also compare our implementations with hardware implementations of RSA and ECC. This shows that XTR achieves a very high level of speed with small area requirements: an XTR exponentiation is carried out in less than 0.21 ms at a frequency beyond 150 MHz.
Towards an FPGA Architecture Optimized for Public-Key Algorithms
 in The SPIE’s Symposium on Voice, Video, and Data Communications
, 1999
Cited by 5 (1 self)
Cryptographic algorithms are constantly evolving to meet security needs, and modular arithmetic is an integral part of these algorithms, especially in the case of public-key cryptosystems. To achieve optimal system performance while maintaining physical security, it is desirable to implement cryptographic algorithms in hardware. However, many public-key cryptographic algorithms require the implementation of modular arithmetic, specifically modular multiplication, for operands of 1024 bits in length. Additionally, algorithm agility is required to support algorithm-independent protocols, a feature of most modern security protocols. Reprogrammability, particularly in-system reprogrammability, is critical in enabling the switching between cryptographic algorithms required for algorithm-independent protocols. Field Programmable Gate Arrays (FPGAs) are a viable option for achieving this goal. Ideally, the targeted FPGA will have been designed with the architectural requirements for wideoper...
Moduli for Testing Implementations of the RSA Cryptosystem
 in IEEE 14th Symposium on Computer Arithmetic
, 1999
Cited by 4 (0 self)
Comprehensive testing of any implementation of the RSA cryptosystem requires the use of a number of moduli with specific properties. It is shown how to generate a sufficient variety of these to enable testing which will justify high confidence in the correctness of both the design and the operation of hardware implementations. The tests avoid the necessity of another implementation for comparison. Many of these moduli are also suitable for testing software implementations. Furthermore, the methods apply equally well to other similar modular-arithmetic-based cryptosystems which use exponentiation, such as Diffie-Hellman key exchange.
Key Words: Computer arithmetic, cryptography, RSA modulus, testing, correctness, verification, implementation validation benchmark.
1 Introduction
The RSA cryptosystem [5] is widely used for key exchange and increasingly for the long term storage of sensitive data. A large number of such systems have been designed and built in both software and hardware. ...
Design of Long Integer Arithmetic Units for Public-Key Algorithms
 Proc. of EUROSMART Security Conference 2000
, 2000
Cited by 3 (1 self)
For many years the terms RSA and Public-Key Cryptography were used more or less synonymously. Consequently, long integer arithmetic units for public-key cryptography were designed to support mainly this specific algorithm. Today, however, the requirements on such an arithmetic unit have changed and are much harder to fulfil than in the past. This is due to growing interest in new public-key algorithms and recent developments in cryptanalysis, progress in the factorization of long integers, and new attacks like the timing attack, the differential fault analysis, the SPA, and the DPA. We require immunity against these attacks and also optimal support of public-key algorithms based on elliptic curves. In this paper we describe and compare several design approaches for such arithmetic units for use in smart cards and security ICs with respect to performance and chip area. Our conclusion is that a design approach based on a fast parallel/serial adder is still the best solution in view of the requirements.
Faster and smaller hardware implementation of XTR
 In Proceedings of SPIE, Symposium on Optics & photonics, Advanced Signal Processing Algorithms, Architectures, and Implementations
, 2006
Cited by 1 (0 self)
Modular multiplication is the core of most Public Key Cryptosystems and therefore its implementation plays a crucial role in the overall efficiency of asymmetric cryptosystems. Hardware approaches provide advantages over software in the framework of efficient dedicated accelerators. The concerns of the designers are mainly the die size, frequency, latency (throughput) and power consumption of those solutions. We show in this paper how Booth recoding, pipelining, Montgomery modular multiplication and carry-save adders offer an attractive solution for hardware modular multiplication. Although most of the techniques described hereafter are state-of-the-art, the combination described here is unique and particularly efficient in the context of constrained hardware design of the XTR cryptosystem. Our solution is implemented on an FPGA platform and compared with previous results. The area-time ratio is improved by around a factor of 3.
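As a brief illustration of one building block named in this abstract, the following sketch models a carry-save adder on Python integers. It is a generic textbook construction, not the design from the paper; the function name `carry_save_add` is hypothetical. A carry-save adder compresses three operands into a sum word and a carry word without propagating carries across bit positions, which is why it is attractive inside the accumulation loop of a hardware multiplier.

```python
def carry_save_add(a, b, c):
    """Compress three non-negative integers into (sum_word, carry_word).

    Invariant: sum_word + carry_word == a + b + c.
    Each bit position is handled independently, so no carry ripples.
    """
    sum_word = a ^ b ^ c                              # per-bit sum without carries
    carry_word = ((a & b) | (a & c) | (b & c)) << 1   # per-bit majority, shifted up
    return sum_word, carry_word


# Usage: the single slow carry-propagating addition is deferred to the end.
s, c = carry_save_add(13, 7, 9)
total = s + c
```

Only the final `s + c` requires a full carry-propagating addition; inside a multiplication loop, the pair (sum word, carry word) would be fed back as two of the three inputs at each step.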
Duality between Multiplication and Modular Reduction
, 2005
Cited by 1 (0 self)
This paper presents a duality between the classical optimally speeded up multiplication algorithm and some "fast" reduction algorithm. For this, the multiplier is represented by the unique signed-digit representation with minimal Hamming weight using Reitwiesner's multiplier recoding algorithm. In fact, the present paper proves that this optimal multiplier recoding technique naturally translates into a canonical modular reduction technique. Thus, the resulting reduction algorithm is optimal with respect to its average-time complexity as well. Besides these two new results, our proof of the transfer theorem serves another interesting purpose: the reason that the considered reduction algorithm from [Sed] is so little known might lie in the fact that it is rather unintuitive and no proper understanding was available so far. Therefore, our proper mathematical derivation/explanation remedies this lack of understanding.
Keywords: Computer arithmetic, Booth recoding, Canonical signed-digit representation, Modular reduction, Multiplication, Minimum Hamming weight, Optimal algorithm, Signed digit representation, Reitwiesner recoding.
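The recoding mentioned in this abstract can be sketched as follows. This is the standard right-to-left computation of the canonical signed-digit (non-adjacent) form, given here only as an illustration; it is not taken from the paper, and the function name `canonical_signed_digits` is hypothetical. Digits are drawn from {-1, 0, 1}, no two adjacent digits are both nonzero, and the number of nonzero digits is minimal.

```python
def canonical_signed_digits(n):
    """Return the canonical signed-digit form of n >= 0, least significant digit first."""
    digits = []
    while n > 0:
        if n & 1:
            # Choose +1 when n = 1 (mod 4), -1 when n = 3 (mod 4),
            # so that the next digit is guaranteed to be zero.
            d = 2 - (n % 4)
            n -= d
        else:
            d = 0
        digits.append(d)
        n >>= 1
    return digits


# Example: 7 = 8 - 1, so two nonzero digits instead of three in binary.
print(canonical_signed_digits(7))  # [-1, 0, 0, 1]
```

In a multiplier, each nonzero digit costs one addition or subtraction of the multiplicand, so fewer nonzero digits means fewer operations on average.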
Hardware for Computing Modular Multiplication Algorithm
, 1998
Cited by 1 (0 self)
This paper examines the characteristics of an alternative architecture for computing a modular multiplication based on Montgomery's algorithm, useful in performing the RSA Public Key Cryptosystem. An experimental 12×12-bit modular multiplier prototype has been designed with this architecture. It is fabricated by AMS using 0.6 µm CMOS technology. The architecture, its operation and some simulation results are presented. The evaluation is provided according to the functionality. The active area size is 1.33 × 0.93 mm² and contains about 4100 transistors.
1. Introduction
VLSI circuits that accelerate the encryption and decryption of messages using the RSA encryption technique [1] and circuits capable of performing long-wordlength modular multiplication at very high speed attract much interest for cryptography applications. Modular exponentiation is the main and most frequently used operation for processing hidden information; for that reason, modular exponentiation plays impor...
RSA encryption using extended modular arithmetic on the Quicksilver COSM adaptive computing machine
 IEEE Symposium on Field Programmable Custom Computing Machines (FCCM 03)
, 2003
Cited by 1 (0 self)
Modular arithmetic is typically the computational bottleneck in a hardware implementation of public key cryptography algorithms. This paper focuses on an implementation of modular multiplication on the Quicksilver COSM adaptive computing machine as a runtime-reconfigurable user authentication context candidate. The design is targeted specifically to the COSM adaptive computing machine, taking into account the underlying architecture of the device.
Efficient Software-Implementation of Finite Fields with Applications to Cryptography
 ACTA APPL MATH
Cited by 1 (0 self)
In this work, we present a survey of efficient techniques for software implementation of finite field arithmetic especially suitable for cryptographic applications. We discuss different algorithms for three types of finite fields and their special versions popularly used in cryptography: binary fields, prime fields and extension fields. Implementation details of the algorithms for field addition/subtraction, field multiplication, field reduction and field inversion are discussed for each of these fields. The efficiency of these different algorithms depends largely on the underlying microprocessor architecture. Therefore, a careful choice of the appropriate set of algorithms has to be made for a software implementation depending on the performance requirements and available resources.
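To make the binary-field case concrete, here is a minimal sketch of shift-and-add multiplication in GF(2^m) with interleaved reduction. It is a generic textbook method, not necessarily one of the algorithms surveyed in the paper, and the function name `gf2m_mul` is hypothetical. Field elements are stored as integers whose bits are polynomial coefficients; addition is XOR, and `poly` is the irreducible polynomial including its x^m term.

```python
def gf2m_mul(a, b, poly, m):
    """Multiply a and b in GF(2^m) modulo the irreducible polynomial `poly`.

    Assumes a and b are already reduced (both less than 2^m).
    """
    result = 0
    while b:
        if b & 1:
            result ^= a          # add (XOR) the current multiple of a
        b >>= 1
        a <<= 1                  # multiply a by x
        if (a >> m) & 1:
            a ^= poly            # reduce when the degree reaches m
    return result


# Example in GF(2^8) with the AES polynomial x^8 + x^4 + x^3 + x + 1 (0x11B):
print(hex(gf2m_mul(0x57, 0x83, 0x11B, 8)))  # 0xc1
```

This bit-serial form is easy to follow but slow in software; word-level techniques such as table lookups or windowing would normally be preferred, with the best choice depending on the target processor.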