Security as a new dimension in embedded system design
In Proceedings of the 41st Design Automation Conference (DAC ’04), 2004
Cited by 40 (4 self)
The growing number of instances of breaches in information security in the last few years has created a compelling case for efforts towards secure electronic systems. Embedded systems, which will be ubiquitously used to capture, store, manipulate, and access data of a sensitive nature, pose several unique and interesting security challenges. Security has been the subject of intensive research in the areas of cryptography, computing, and networking. However, despite these efforts, security is often misconstrued by designers as the hardware or software implementation of specific cryptographic algorithms and security protocols. In reality, it is an entirely new metric that designers should consider throughout the design process, along with other metrics such as cost, performance, and power. This paper is intended to introduce embedded system designers and design tool developers to the challenges involved in designing
A Scalable Architecture for Modular Multiplication Based on Montgomery's Algorithm
IEEE Transactions on Computers, 2003
Cited by 34 (2 self)
This paper presents a scalable architecture for the computation of modular multiplication, based on the Montgomery multiplication (MM) algorithm. A word-based version of MM is presented and used to explain the main concepts in the hardware design. The proposed multiplier is able to work with any precision of the input operands, limited only by memory or control constraints. Its architecture gives enough freedom to select the word size and the degree of parallelism to be used, according to the available area and/or desired performance. Design trade-offs are analyzed in order to identify adequate hardware configurations for a given area or bandwidth requirement.
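To make the idea of a word-based Montgomery multiplication concrete, the following is a minimal software sketch of a generic word-serial variant. It is not the architecture or the exact algorithm from the paper; the function name `montgomery_multiply` and its parameters are assumptions chosen for illustration. It assumes an odd modulus, operands smaller than the modulus, and Python 3.8+ for the modular inverse via `pow`.

```python
def montgomery_multiply(x, y, modulus, word_bits, num_words):
    """Return x * y * R^-1 mod modulus, where R = 2^(word_bits * num_words).

    Illustrative sketch: assumes modulus is odd, modulus < R,
    and 0 <= x, y < modulus.
    """
    base = 1 << word_bits
    mask = base - 1
    # m_prime = -modulus^-1 mod 2^word_bits (exists because modulus is odd)
    m_prime = (-pow(modulus, -1, base)) % base

    acc = 0
    for i in range(num_words):
        # Consume one word of x per iteration, least significant first.
        x_word = (x >> (i * word_bits)) & mask
        acc += x_word * y
        # Choose q so that the low word of (acc + q * modulus) is zero.
        q = ((acc & mask) * m_prime) & mask
        # The division by 2^word_bits is exact, so a shift suffices.
        acc = (acc + q * modulus) >> word_bits

    # The accumulator stays below 2 * modulus, so one subtraction is enough.
    if acc >= modulus:
        acc -= modulus
    return acc
```

Changing `word_bits` and `num_words` while keeping their product fixed changes only how the work is sliced, which loosely mirrors the freedom in word size that the abstract describes.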
Instruction Set Extensions for Fast Arithmetic in Finite Fields GF(p) and GF(2^m)
Cryptographic Hardware and Embedded Systems (CHES 2004), 2004
Cited by 14 (6 self)
Instruction set extensions are a small number of custom instructions specifically designed to accelerate the processing of a given kind of workload such as multimedia or cryptography. Enhancing a general-purpose RISC processor with a few application-specific instructions to facilitate the inner loop operations of public-key cryptosystems can result in a significant performance gain. In this paper we introduce a set of five custom instructions to accelerate arithmetic operations in finite fields GF(p) and GF(2^m). The custom instructions can be easily integrated into a standard RISC architecture like MIPS32 and require only little extra hardware. Our experimental results show that an extended MIPS32 core is able to perform an elliptic curve scalar multiplication over a 192-bit prime field in 36 msec, assuming a clock speed of 33 MHz. An elliptic curve scalar multiplication over the binary field GF(2^191) takes only 21 msec, which is approximately six times faster than a software implementation on a standard MIPS32 processor.
Instruction Set Extension for Fast Elliptic Curve Cryptography Over Binary Finite Fields GF(2^m)
In Proceedings of the 14th IEEE International Conference on Application-Specific Systems, Architectures and Processors (ASAP 2003), 2003
Cited by 13 (7 self)
The performance of elliptic curve (EC) cryptosystems depends essentially on efficient arithmetic in the underlying finite field. Binary finite fields GF(2^m) have the advantage of “carry-free” addition. Multiplication, on the other hand, is rather costly since polynomial arithmetic is not supported by general-purpose processors. In this paper we propose a combined hardware/software approach to overcome this problem. First, we outline that multiplication of binary polynomials can be easily integrated into a multiplier datapath for integers without significant additional hardware. Then, we present new algorithms for multiple-precision arithmetic in GF(2^m) based on the availability of an instruction for single-precision multiplication of binary polynomials. The proposed hardware/software approach is considerably faster than a “conventional” software implementation and well suited for constrained devices like smart cards. Our experimental results show that an enhanced 16-bit RISC processor is able to generate a 191-bit ECDSA signature in less than 650 msec when the core is clocked at 5 MHz.
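As a rough illustration of what a single-precision multiplication of binary polynomials computes, and how a multiple-precision product can be assembled from it, consider the sketch below. The names `clmul` and `poly_mul_words` are hypothetical, the schoolbook word loop is a stand-in rather than the paper's algorithms, and reduction modulo the field polynomial is deliberately left out.

```python
def clmul(a, b):
    """Carry-less product of two binary polynomials stored as integers.

    Bit i of an integer is the coefficient of x^i; partial products are
    combined with XOR instead of addition, so no carries propagate.
    """
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def poly_mul_words(a_words, b_words, word_bits):
    """Schoolbook multiple-precision product built from word-sized clmul calls.

    Operands are little-endian lists of words, each below 2^word_bits.
    In hardware, clmul on single words would be the custom instruction.
    """
    mask = (1 << word_bits) - 1
    out = [0] * (len(a_words) + len(b_words))
    for i, a_word in enumerate(a_words):
        for j, b_word in enumerate(b_words):
            product = clmul(a_word, b_word)  # at most 2 * word_bits - 1 bits
            out[i + j] ^= product & mask          # low half
            out[i + j + 1] ^= product >> word_bits  # high half
    return out
```

For instance, `clmul(0b11, 0b11)` corresponds to (x + 1)^2 = x^2 + 1 over GF(2), because the two middle terms cancel under XOR.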
An efficient and scalable radix-4 modular multiplier design using recoding techniques
Proc. Asilomar Conf. Signals, Systems, and Computers, 2003
Cited by 8 (0 self)
This paper presents the algorithm and architecture of a scalable radix-4 Montgomery Multiplier. The straightforward implementation of a radix-4 design based on the techniques already published results in a poor solution. In this paper we present an algorithm and architecture for the scalable radix-4 multiplier that makes use of two types of digit recoding in order to generate an efficient solution. The word-by-word algorithm used in the multiplier gives the designer the freedom to select the level of parallelism according to the available area. Experimental results are shown to demonstrate that the proposed radix-4 Montgomery Multiplier design has a better area/performance tradeoff than previous radix-2 and radix-8 scalable designs.
Scalable and unified hardware to compute Montgomery inverse in GF(p) and GF(2^n)
Cryptographic Hardware and Embedded Systems (CHES 2002), 4th International Workshop, 2003
Cited by 7 (1 self)
Computing the inverse of a number in finite fields GF(p) or GF(2^n) is equally important for cryptographic applications. This paper proposes a novel scalable and unified architecture for a Montgomery inverse hardware that operates in both GF(p) and GF(2^n) fields. We adjust and modify a GF(2^n) Montgomery inverse algorithm to accommodate multi-bit shifting hardware, making it very similar to a previously proposed GF(p) algorithm. The architecture is intended to be scalable, which allows the hardware to compute the inverse of long precision numbers in a repetitive way. After implementing this unified design it was compared with other designs. The unified hardware was found to be eight times smaller than another reconfigurable design, with comparable performance. Even though the unified design consumes slightly more area and it is slightly slower than the scalable inverter implementations for GF(p) only, it is a practical solution whenever arithmetic in the two finite fields is needed.
Evaluating Instruction Set Extensions for Fast Arithmetic on Binary Finite Fields
Proc. Int. Conf. Application-Specific Systems, Architectures, and Processors (ASAP), 2004
Cited by 5 (1 self)
Binary finite fields GF(2^n) are very commonly used in cryptography, particularly in public-key algorithms such as Elliptic Curve Cryptography (ECC). On word-oriented programmable processors, field elements are generally represented as polynomials with coefficients from {0, 1}. Key arithmetic operations on these polynomials, such as squaring and multiplication, are not supported by integer-oriented processor architectures. Instead, these are implemented in software, causing a very large fraction of the cryptography execution time to be dominated by a few elementary operations. For example, more than 90% of the execution time of 163-bit ECC may be consumed by two simple field operations: squaring and multiplication. A few
A Performance Evaluation of ARM ISA Extension for Elliptic Curve Cryptography Over Binary Finite Fields
In Proceedings of the Sixteenth Symposium on Computer Architecture and High Performance Computing (SBAC-PAD 2004), Foz do Iguaçu
Cited by 5 (1 self)
In this paper, we present an evaluation of a possible ARM instruction set extension for Elliptic Curve Cryptography (ECC) over binary finite fields GF(2^m). The use of elliptic curve cryptography is becoming common in the embedded domain, where its reduced key size at a security level equivalent to standard public-key methods (such as RSA) allows for power consumption savings and more efficient operation. The ARM processor was selected because it is widely used for embedded system applications. We developed an ECC benchmark set with three widely used public-key algorithms: Diffie-Hellman for key exchange, the digital signature algorithm, and the ElGamal method for encryption/decryption. We analyzed the major bottlenecks at function level and evaluated the performance improvement when we introduce some simple architectural support in the ARM ISA. Results of our experiments show that the use of a word-level multiplication instruction over the binary field allows for an average 33% reduction of the total number of dynamically executed instructions, while execution time improves by the same amount when projective coordinates are used.
Hardware Implementation of a Montgomery Modular Multiplier in a Systolic Array
Cited by 5 (2 self)
This paper describes a hardware architecture for the modular multiplication operation which is efficient for bit lengths suitable for both commonly used types of Public Key Cryptography (PKC), i.e. ECC and RSA cryptosystems. The challenge of current PKC implementations is to deal with long numbers (160-2048 bits) in order to achieve system efficiency as well as security. RSA, still the most popular PKC, has at its root the modular exponentiation operation. Modular exponentiation consists of repeated modular multiplications, which is also the basic operation for ECC protocols. The solution proposed in this work uses a systolic array implementation and can be used for arbitrary precisions. We also present modular exponentiation based on Montgomery's Multiplication Method (MMM).
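The statement that modular exponentiation consists of repeated modular multiplications can be sketched in software as follows. This is a plain left-to-right square-and-multiply loop over a Montgomery product, offered only as an illustration: it does not model the systolic array from the paper, and the function names `mont_exp` and `mont_mul` are assumptions. It requires an odd modulus greater than 1 and Python 3.8+.

```python
def mont_exp(base, exponent, modulus):
    """Compute base^exponent mod modulus using only Montgomery products.

    Illustrative sketch: modulus must be odd and greater than 1,
    exponent must be non-negative.
    """
    r_bits = modulus.bit_length()
    r = 1 << r_bits                      # R = 2^r_bits > modulus
    r_mask = r - 1
    m_prime = (-pow(modulus, -1, r)) % r  # -modulus^-1 mod R

    def mont_mul(a, b):
        # Montgomery product: a * b * R^-1 mod modulus, for a, b < modulus.
        t = a * b
        q = ((t & r_mask) * m_prime) & r_mask
        u = (t + q * modulus) >> r_bits   # exact division by R
        return u - modulus if u >= modulus else u

    r_squared = (r * r) % modulus
    x_mont = mont_mul(base % modulus, r_squared)  # base in Montgomery form
    acc = r % modulus                             # 1 in Montgomery form

    # Scan exponent bits from most to least significant.
    for bit in bin(exponent)[2:]:
        acc = mont_mul(acc, acc)         # square
        if bit == "1":
            acc = mont_mul(acc, x_mont)  # multiply
    # Multiplying by 1 removes the factor R and leaves Montgomery form.
    return mont_mul(acc, 1)
```

The only operation used inside the loop is `mont_mul`, which is why a fast modular multiplier dominates the cost of the whole exponentiation.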
Instruction Set Extensions for Pairing-Based Cryptography
2007
Cited by 4 (3 self)
A series of recent algorithmic advances has delivered highly effective methods for pairing evaluation and parameter generation. However, the resulting multitude of options means many different variations of base field must ideally be supported on the target platform. Typical hardware accelerators in the form of coprocessors possess neither the flexibility nor the scalability to support fields of different characteristic and order. On the other hand, extending the instruction set of a general-purpose processor by custom instructions for field arithmetic makes it possible to combine the performance of hardware with the flexibility of software. To this end, we investigate the integration of a tri-field multiply-accumulate (MAC) unit into a SPARC V8 processor core to support arithmetic in F_p, F_{2^n} and F_{3^n}. Besides integer multiplication, the MAC unit can also execute dedicated multiply and MAC instructions for binary and ternary polynomials. Our results show that the tri-field MAC unit adds only a small size overhead while significantly accelerating arithmetic in F_{2^n} and F_{3^n}, which sheds new light on the relative performance of F_p, F_{2^n} and F_{3^n} in the context of pairing-based cryptography.