Results 1 - 10
of
21
Efficient and secure algorithms for GLV-based scalar multiplication and their implementation on GLV-GLS curves (or Keep Calm . . . )
, 2013
"... We propose efficient algorithms and formulas that improve the performance of side-channel protected elliptic curve computations, with special focus on scalar multiplication exploiting the Gallant-Lambert-Vanstone (CRYPTO 2001) and Galbraith-Lin-Scott (EUROCRYPT 2009) methods. Firstly, by adapting ..."
Abstract
-
Cited by 22 (5 self)
- Add to MetaCart
We propose efficient algorithms and formulas that improve the performance of side-channel protected elliptic curve computations, with special focus on scalar multiplication exploiting the Gallant-Lambert-Vanstone (CRYPTO 2001) and Galbraith-Lin-Scott (EUROCRYPT 2009) methods. Firstly, by adapting Feng et al.’s recoding to the GLV setting, we derive new regular algorithms for variable-base scalar multiplication that offer protection against simple side-channel and timing attacks. Secondly, we propose an efficient algorithm for fixed-base scalar multiplication that is also protected against sidechannel attacks by combining Feng et al.’s recoding with Lim-Lee’s comb method. Thirdly, we propose an efficient technique that interleaves ARM-based and NEON-based multiprecision operations over an extension field, as typically found on GLS curves and pairing computations, to improve performance on modern ARM processors. Finally, we showcase the efficiency of the proposed techniques by implementing a state-of-the-art GLV-GLS curve in twisted Edwards form defined over F p 2, which supports a four dimensional decomposition of the scalar and is fully protected against timing attacks. Analysis and performance results are reported for modern x64 and ARM processors. For instance, using a precomputed table of only 512 bytes, we compute a variable-base scalar multiplication in 92,000 and 244,000 cycles on an Intel Ivy Bridge and an ARM Cortex-A15 processor (respect.); using an off-line precomputed
Elligator: Elliptic-curve points indistinguishable from uniform random strings
"... Censorship-circumvention tools are in an arms race against censors. The censors study all traffic passing into and out of their controlled sphere, and try to disable censorshipcircumvention tools without completely shutting down the Internet. Tools aim to shape their traffic patterns to match unbloc ..."
Abstract
-
Cited by 16 (1 self)
- Add to MetaCart
(Show Context)
Censorship-circumvention tools are in an arms race against censors. The censors study all traffic passing into and out of their controlled sphere, and try to disable censorshipcircumvention tools without completely shutting down the Internet. Tools aim to shape their traffic patterns to match unblocked programs, so that simple traffic profiling cannot identify the tools within a reasonable number of traces; the censors respond by deploying firewalls with increasingly sophisticated deep-packet inspection. Cryptography hides patterns in user data but does not evade censorship if the censor can recognize patterns in the cryptography itself. In particular, elliptic-curve cryptography often transmits points on known elliptic curves, and those points are easily distinguishable from uniform random strings of bits. This paper introduces high-security high-speed ellipticcurve systems in which elliptic-curve points are encoded so as to be indistinguishable from uniform random strings. 1.
High-Performance Scalar Multiplication using 8-Dimensional GLV/GLS Decomposition
"... Abstract. This paper explores the potential for using genus 2 curves over quadratic extension fields in cryptography, motivated by the fact that they allow for an 8-dimensional scalar decomposition when using a combination of the GLV/GLS algorithms. Besides lowering the number of doublings required ..."
Abstract
-
Cited by 9 (2 self)
- Add to MetaCart
(Show Context)
Abstract. This paper explores the potential for using genus 2 curves over quadratic extension fields in cryptography, motivated by the fact that they allow for an 8-dimensional scalar decomposition when using a combination of the GLV/GLS algorithms. Besides lowering the number of doublings required in a scalar multiplication, this approach has the advantage of performing arithmetic operations in a 64-bit ground field, making it an attractive candidate for embedded devices. We found cryptographically secure genus 2 curves which, although susceptible to index calculus attacks, aim for the standardized 112-bit security level. Our implementation results on both high-end architectures (Ivy Bridge) and low-end ARM platforms (Cortex-A8) highlight the practical benefits of this approach. 1
NaCl on 8-bit AVR Microcontrollers
"... Abstract. This paper presents first results of the NetworkingandCryptography library (NaCl) on the 8-bit AVR family of microcontrollers. We show that NaCl, which has so far been optimized mainly for different desktop and server platforms, is feasible on resource-constrained devices while being very ..."
Abstract
-
Cited by 8 (2 self)
- Add to MetaCart
(Show Context)
Abstract. This paper presents first results of the NetworkingandCryptography library (NaCl) on the 8-bit AVR family of microcontrollers. We show that NaCl, which has so far been optimized mainly for different desktop and server platforms, is feasible on resource-constrained devices while being very fast and memory efficient. Our implementation shows that encryption using Salsa20 requires 277 cycles/byte, authentication usingPoly1305 needs211cycles/byte,aCurve25519scalar multiplication needs 22954657 cycles, signing of data using Ed25519 needs 23211611 cycles, and verification can be done within 32937940 cycles. All implemented primitives provide at least 128-bit security, run in constant time, do not use secret-data-dependent branch conditions, and are open to the public domain (no usage restrictions).
NEON implementation of an attribute-based encryption scheme
"... Abstract. In 2011, Waters presented a ciphertext-policy attribute-based encryption protocol that uses bilinear pairings to provide control access mechanisms, where the set of user’s attributes is specified by means of a linear secret sharing scheme. Some of the applications foreseen for this protoco ..."
Abstract
-
Cited by 6 (1 self)
- Add to MetaCart
(Show Context)
Abstract. In 2011, Waters presented a ciphertext-policy attribute-based encryption protocol that uses bilinear pairings to provide control access mechanisms, where the set of user’s attributes is specified by means of a linear secret sharing scheme. Some of the applications foreseen for this protocol lie in the context of mobile devices such a smartphones and tablets, which in a majority of instances are powered by an ARM processor supporting the NEON vector set of instructions. In this paper we present the design of a software cryptographic library that implements a 127-bit security level attribute-based encryption scheme over mobile devices equipped with a 1.4GHz Exynos 4 Cortex-A9 processor and a developing board that hosts a 1.7 GHz Exynos 5 Cortex-A15 processor. For the latter platform and taking advantage of the inherent parallelism of the NEON vector instructions, our library computes a single optimal pairing over a Barreto-Naehrig curve approximately 2 times faster than the best timings previously reported on ARM platforms at this level of security. Further, using a 6-attribute access formula our library is able to encrypt/decrypt a text/ciphertext in less than 4.49mS and 15.67mS, respectively.
Montgomery Multiplication Using Vector Instructions
, 2013
"... Abstract. In this paper we present a parallel approach to compute interleaved Montgomery multiplication. This approach is particularly suitable to be computed on 2-way single instruction, multiple data platforms as can be found on most modern computer architectures in the form of vector instruction ..."
Abstract
-
Cited by 3 (0 self)
- Add to MetaCart
(Show Context)
Abstract. In this paper we present a parallel approach to compute interleaved Montgomery multiplication. This approach is particularly suitable to be computed on 2-way single instruction, multiple data platforms as can be found on most modern computer architectures in the form of vector instruction set extensions. We have implemented this approach for tablet devices which run the x86 architecture (Intel Atom Z2760) using SSE2 instructions as well as devices which run on the ARM platform (Qualcomm MSM8960, NVIDIA Tegra 3 and 4) using NEON instructions. When instantiating modular exponentiation with this parallel version of Montgomery multiplication we observed a performance increase of more than a factor of 1.5 compared to the sequential implementation in OpenSSL for the classical arithmetic logic unit on the Atom platform for 2048-bit moduli. Key words: Montgomery multiplication, SIMD, software implementation, vector instructions 1
MinimaLT: Minimal-latency Networking Through Better Security ABSTRACT
"... Minimal Latency Tunneling (MinimaLT) is a new network protocol that provides ubiquitous encryption for maximal confidentiality, including protecting packet headers. MinimaLT provides server and user authentication, extensive Denial-of-Service protections, privacy-preserving IP mobility, and fast key ..."
Abstract
-
Cited by 3 (1 self)
- Add to MetaCart
(Show Context)
Minimal Latency Tunneling (MinimaLT) is a new network protocol that provides ubiquitous encryption for maximal confidentiality, including protecting packet headers. MinimaLT provides server and user authentication, extensive Denial-of-Service protections, privacy-preserving IP mobility, and fast key erasure. We describe the protocol, demonstrate its performance relative to TLS and unencrypted TCP/IP, and analyze its protections, including its resilience against DoS attacks. By exploiting the properties of its cryptographic protections, MinimaLT is able to eliminate three-way handshakes and thus create connections faster than unencrypted TCP/IP.
Soc it to EM: electromagnetic side-channel attacks on a complex system-on-chip. Cryptology ePrint Archive, Report 2015/561
, 2015
"... Abstract. Increased complexity in modern embedded systems has pre-sented various important challenges with regard to side-channel attacks. In particular, it is common to deploy SoC-based target devices with high clock frequencies in security-critical scenarios; understanding how such features align ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
(Show Context)
Abstract. Increased complexity in modern embedded systems has pre-sented various important challenges with regard to side-channel attacks. In particular, it is common to deploy SoC-based target devices with high clock frequencies in security-critical scenarios; understanding how such features align with techniques more often deployed against simpler de-vices is vital from both destructive (i.e., attack) and constructive (i.e., evaluation and/or countermeasure) perspectives. In this paper, we in-vestigate electromagnetic-based leakage from three different means of executing cryptographic workloads (including the general purpose ARM core, an on-chip co-processor, and the NEON core) on the AM335x SoC. Our conclusion is that addressing challenges of the type above is feasible, and that key recovery attacks can be conducted with modest resources.
mind.org
"... The following full text is a preprint version which may differ from the publisher's version. ..."
Abstract
- Add to MetaCart
(Show Context)
The following full text is a preprint version which may differ from the publisher's version.
FourQNEON: Faster Elliptic Curve Scalar Multiplications on ARM Processors
"... Abstract. We present a high-speed, high-security implementation of the recently proposed elliptic curve FourQ (ASIACRYPT 2015) for 32-bit ARM processors with NEON support. Exploiting the versatile and compact arithmetic of this curve, we design a vectorized implementation that achieves high-perform ..."
Abstract
- Add to MetaCart
(Show Context)
Abstract. We present a high-speed, high-security implementation of the recently proposed elliptic curve FourQ (ASIACRYPT 2015) for 32-bit ARM processors with NEON support. Exploiting the versatile and compact arithmetic of this curve, we design a vectorized implementation that achieves high-performance across a large variety of ARM platforms. Our software is fully protected against timing and cache attacks, and showcases the impressive speed of FourQ when compared with other curve-based alternatives. For example, one single variable-base scalar multiplication is computed in about 235,000 Cortex-A8 cycles or 132,000 Cortex-A15 cycles which, compared to the results of the fastest genus 2 Kummer and Curve25519 implementations on the same platforms, offer speedups between 1.3x-1.7x and between 2.1x-2.4x, respectively. In comparison with the NIST standard curve K-283, we achieve speedups above 4x and 5.5x.