A scalable and unified multiplier architecture for finite fields GF(p) and GF(2^m)
- In Cryptographic Hardware and Embedded Systems (CHES 2000), LNCS
, 2000
Cited by 35 (11 self)
We describe a scalable and unified architecture for a Montgomery multiplication module which operates in both types of finite fields GF(p) and GF(2^m). The unified architecture requires only slightly more area than that of the multiplier architecture for the field GF(p). The multiplier is scalable, which means that a fixed-area multiplication module can handle operands of any size, and also, the word size can be selected based on the area and performance requirements. We utilize the concurrency in the Montgomery multiplication operation by employing a pipelining design methodology. We also describe a scalable and unified adder module to carry out concomitant operations in our implementation of the Montgomery multiplication. The upper limit on the precision of the scalable and unified Montgomery multiplier is dictated only by the available memory to store the operands and internal results, and the module is capable of performing infinite-precision Montgomery multiplication in both types of finite fields. Key Words: Prime fields, binary extension fields, multiplication, Montgomery multiplication, scalability, hardware implementation.
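The idea of one multiplier serving both field types can be illustrated with a small software sketch. This is not the paper's architecture: it is a bit-serial (radix-2) Montgomery loop in which the only difference between the two fields is whether addition propagates carries (integer addition for GF(p)) or not (XOR for GF(2^m)). The function name `dual_field_montgomery` and its parameters are hypothetical.

```python
def dual_field_montgomery(a, b, modulus, bits, binary_field):
    """Bit-serial Montgomery product a * b * 2^(-bits) (or x^(-bits)).

    Assumptions (illustrative only, not taken from the paper):
      - GF(p):   modulus is an odd integer p, with a, b < p < 2**bits.
      - GF(2^m): modulus is an irreducible polynomial of degree m = bits,
                 encoded as an integer bit vector with constant term 1;
                 a and b are polynomials of degree < m.
    """
    # The "unified adder": XOR is addition without carry propagation.
    if binary_field:
        add = lambda u, v: u ^ v
    else:
        add = lambda u, v: u + v

    acc = 0
    for i in range(bits):
        if (a >> i) & 1:          # add b when the current bit of a is set
            acc = add(acc, b)
        if acc & 1:               # make the accumulator divisible by 2 (or x)
            acc = add(acc, modulus)
        acc >>= 1                 # exact division by 2 (or x)

    # Only the integer case can overshoot the modulus by one multiple.
    if not binary_field and acc >= modulus:
        acc -= modulus
    return acc
```

For example, with the polynomial x^8 + x^4 + x^3 + x + 1 (encoded as `0x11B`), multiplying any 8-bit `a` by `0x1B` (which is x^8 reduced by that polynomial) returns `a`, since the factor x^8 cancels the x^(-8) introduced by the Montgomery reduction.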
A Scalable Architecture for Montgomery Multiplication
- Lecture Notes in Computer Science
, 1999
Cited by 33 (6 self)
This paper describes the methodology and design of a scalable Montgomery multiplication module. There is no limitation on the maximum number of bits manipulated by the multiplier, and the selection of the word-size is made according to the available area and/or desired performance. We describe the general view of the new architecture, analyze hardware organization for its parallel computation, and discuss design tradeoffs which are useful to identify the best hardware configuration.
A Scalable Architecture for Modular Multiplication Based on Montgomery's Algorithm
- IEEE TRANSACTIONS ON COMPUTERS
, 2003
Cited by 27 (1 self)
This paper presents a scalable architecture for the computation of modular multiplication, based on the Montgomery multiplication (MM) algorithm. A word-based version of MM is presented and used to explain the main concepts in the hardware design. The proposed multiplier is able to work with any precision of the input operands, limited only by memory or control constraints. Its architecture gives enough freedom to select the word size and the degree of parallelism to be used, according to the available area and/or desired performance. Design trade-offs are analyzed in order to identify adequate hardware configurations for a given area or bandwidth requirement.
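A word-based Montgomery multiplication can be sketched in software as follows. This is a generic word-serial formulation, not necessarily the exact algorithm of the paper: one operand is scanned one word at a time, and each step clears the lowest word of the accumulator by adding a suitable multiple of the modulus. The names `montgomery_multiply`, `word_bits`, and `num_words` are hypothetical, and `pow(modulus, -1, base)` assumes Python 3.8 or later.

```python
def montgomery_multiply(x, y, modulus, word_bits, num_words):
    """Return x * y * R^(-1) mod modulus, with R = 2**(word_bits * num_words).

    Assumes an odd modulus and x, y < modulus < R.
    """
    if modulus % 2 == 0:
        raise ValueError("modulus must be odd")

    base = 1 << word_bits
    mask = base - 1
    # Precomputed constant: -modulus^(-1) mod 2^word_bits.
    m_prime = (-pow(modulus, -1, base)) % base

    acc = 0
    for i in range(num_words):
        # Take the i-th word of x and accumulate the partial product.
        x_word = (x >> (i * word_bits)) & mask
        acc += x_word * y
        # Choose q so that the lowest word of acc + q * modulus is zero.
        q = ((acc & mask) * m_prime) & mask
        # Exact division by the word base.
        acc = (acc + q * modulus) >> word_bits

    # The accumulator stays below 2 * modulus, so one subtraction suffices.
    if acc >= modulus:
        acc -= modulus
    return acc
```

The operand length enters only through `num_words`, while `word_bits` fixes the width of each step; this mirrors the separation between precision and word size that the abstract describes.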
A Systolic, Linear-Array Multiplier for a Class of Right-Shift Algorithms
- IEEE Transactions on computers
, 1994
Cited by 26 (0 self)
A very simple multiplier cell is developed for use in a linear, purely systolic array forming a digit-serial multiplier for unsigned or 2's complement operands. Each cell produces two digit-product terms and accumulates these into a previous sum of the same weight, developing the product least significant digit first. Grouping two terms per cell, the ratio of active elements to latches is low, and only ⌈n/2⌉ cells are needed for a full n by n multiply. A modulo-multiplier is then developed by incorporating a Montgomery type of modulo-reduction. Two such multipliers interconnect to form a purely systolic modulo exponentiator, capable of performing RSA encryption at very high clock frequencies, but with a low gate count and small area. It is also shown how the multiplier, with some simple back-end connections, can compute modular inverses and perform modular division for a power of two as modulus. Keywords: Systolic array, digit-serial multiplier, 2's complement multiplic...
An RNS Montgomery Modular Multiplication Algorithm
, 1998
Cited by 18 (3 self)
We present a new RNS modular multiplication for very large operands. The algorithm is based on Montgomery's method adapted to mixed radix, and is performed using a Residue Number System. By choosing the moduli of the RNS system reasonably large, and implementing the system on a ring of fairly simple processors, the carry-free nature of RNS arithmetic achieves an effect corresponding to a redundant high-radix implementation. The algorithm can be implemented to run in O(n) time on O(n) processors, where n is the number of moduli in the RNS system. 1 Introduction Many cryptosystems employ modular multiplication with very large numbers [RSA78, FS86]. Different algorithms have been proposed in the literature [Bri90, Kor93, Wal93, Tak93, SV93, Oru95]. Most of them use redundant radix number systems and Montgomery's modular multiplication [Mon85]. On the other hand, the Residue Number System (RNS) is also of particular interest because of the parallel and carry-free nature of its arithmeti...
Design and Implementation of a Coprocessor for Cryptography Applications
, 1997
Cited by 15 (0 self)
In this paper, an ASIC suitable for cryptography applications based on modular arithmetic techniques is presented. These applications, such as digital signature (DSA) and public key encryption and decryption (RSA), use modular exponentiation as their basic operation. This ASIC works as a coprocessor with a special set of instructions specialized for handling high-precision integers, as well as for the rapid evaluation of modular multiplications and exponentiations. The algorithm, the hardware architecture, the design methodology and the results are described in detail. 1. Introduction Security has become a key issue in the world of electronic communication. Besides how fast data are transmitted, the security of these data through the communication channel arises as one of the most important problems. However, the time overhead due to data encryption and decryption should not impose a bottleneck in the communication process. Public key cryptography (RSA), as well as othe...
Modular exponentiation using parallel multipliers
- Proceedings of the 2003 IEEE International Conference on Field Programmable Technology (FPT)
, 2003
Cited by 7 (0 self)
A field programmable gate array (FPGA) semi-systolic implementation of a modular exponentiation unit, suitable for use in implementing the RSA public key cryptosystem, is presented. The design is carefully matched with features of the FPGA architecture, utilizing embedded 18×18-bit multipliers on the FPGA and employing a carry save addition scheme. Using this architecture, a 1024-bit modular exponentiation can operate at 90 MHz on a Xilinx XC2V3000-6 device and perform a 1024-bit RSA decryption in 0.66 ms with the Chinese Remainder Theorem.
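The role of the Chinese Remainder Theorem in RSA decryption can be shown with a short sketch. It is a textbook formulation (Garner's recombination), not the FPGA design described above: two exponentiations are carried out modulo the half-size primes and then combined. The function name `rsa_crt_decrypt` is hypothetical, and `pow(q, -1, p)` assumes Python 3.8 or later.

```python
def rsa_crt_decrypt(ciphertext, p, q, d):
    """Decrypt with the private exponent d using the two prime factors p, q."""
    # Reduced exponents: each exponentiation works on half-size operands.
    d_p = d % (p - 1)
    d_q = d % (q - 1)
    q_inv = pow(q, -1, p)  # q^(-1) mod p, normally precomputed once

    m_p = pow(ciphertext, d_p, p)
    m_q = pow(ciphertext, d_q, q)

    # Garner's formula recombines the two half-size results.
    h = (q_inv * (m_p - m_q)) % p
    return m_q + h * q
```

With the toy parameters p = 61, q = 53, e = 17, d = 2753, calling `rsa_crt_decrypt(pow(65, 17, 61 * 53), 61, 53, 2753)` recovers the message 65. Real keys use primes of 512 bits or more; these values are for illustration only.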
Modular Multiplication and Base Extensions in Residue Number Systems
- IN 15TH IEEE SYMPOSIUM ON COMPUTER ARITHMETIC
, 2001
Cited by 4 (1 self)
We present a new RNS modular multiplication for very large operands. The algorithm is based on Montgomery's method adapted to residue arithmetic. By choosing the moduli of the RNS system reasonably large, an effect corresponding to a redundant high-radix implementation is achieved, due to the carry-free nature of residue arithmetic. The actual computation in the multiplication takes place in constant time, where the unit of time is a few simple residue operations. However, it is necessary to convert values from one residue system into another twice, operations which take O(n) time on O(n) processors, where n is the number of moduli in the RNS systems. Thus these conversions are the bottlenecks of the method, and any future improvements in RNS base conversions, or the use of particular residue systems, can immediately be applied.
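The carry-free property mentioned above can be seen in a minimal sketch of plain residue arithmetic. It does not implement the paper's Montgomery-based algorithm or its base extensions: it only shows that multiplication acts independently on each residue channel, while converting back to a positional number needs all channels at once. The moduli and function names are hypothetical, and `math.prod` and `pow(..., -1, m)` assume Python 3.8 or later.

```python
from math import prod

# Hypothetical pairwise-coprime base; dynamic range is 13*17*19*23 = 96577.
MODULI = (13, 17, 19, 23)


def to_rns(value, moduli=MODULI):
    """Represent an integer by its residues modulo each base element."""
    return tuple(value % m for m in moduli)


def rns_mul(a, b, moduli=MODULI):
    """Multiply channel by channel; no carries pass between channels."""
    return tuple((x * y) % m for x, y, m in zip(a, b, moduli))


def from_rns(residues, moduli=MODULI):
    """Reconstruct the integer with the Chinese Remainder Theorem."""
    total = prod(moduli)
    result = 0
    for r, m in zip(residues, moduli):
        partial = total // m
        # partial * partial^(-1) is 1 modulo m and 0 modulo the other moduli.
        result += r * partial * pow(partial, -1, m)
    return result % total
```

For instance, `from_rns(rns_mul(to_rns(123), to_rns(456)))` gives 56088, which equals 123 × 456 because the product stays below the dynamic range of this base.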
A Hardware Algorithm for Modular Multiplication/ Division
- IEEE TRANSACTIONS ON COMPUTERS
, 2005
Cited by 2 (0 self)
A mixed radix-4/2 algorithm for modular multiplication/division suitable for VLSI implementation is proposed. The algorithm is based on the Montgomery method for modular multiplication and on the extended Binary GCD algorithm for modular division. Both algorithms are modified and combined into the proposed algorithm so that almost all the hardware components are shared. The new algorithm carries out both calculations using simple operations such as shifts, additions, and subtractions. The radix-2 signed-digit representation is used to avoid carry propagation in all additions and subtractions. A modular multiplier/divider based on the algorithm performs an n-bit modular multiplication/division in O(n) clock cycles where the length of the clock cycle is constant and independent of n. The modular multiplier/divider has a linear array structure with a bit-slice feature and can be implemented with much smaller hardware than that necessary to implement both multiplier and divider separately.
Koç, "Dual-field multiplier architecture for cryptographic applications"
- in The Thirty-seventh Annual Asilomar Conference on Signals, Systems, and Computers
Cited by 1 (0 self)
The multiplication operation in finite fields GF(p) and GF(2^n) is the most often used and time-consuming operation in the hardware and software realizations of public-key cryptographic systems, particularly elliptic curve cryptography. We propose a new hardware architecture for fast and efficient execution of the multiplication operation in this paper. The proposed architecture is scalable, i.e., it can handle operands of any size, limited only by input/output and scratch space size, not by the computational unit. It can also be configured to fit the available chip area for the desired performance. Our proposed architecture computes multiplication faster in GF(2^n) than in GF(p), which conforms with the premise of GF(2^n) for hardware realizations.

