Digit-Set Conversions: Generalizations and Applications
 IEEE Transactions on Computers
, 1995
Cited by 22 (5 self)
The problem of digit set conversion for fixed radix is investigated for the case of converting into a nonredundant, as well as into a redundant digit set. Conversion may be from very general digit sets, and covers as special cases multiplier recodings, additions and certain multiplications. We generalize known algorithms for conversions into nonredundant digit sets, as well as apply conversion to generalize the O(log n) time algorithm for conditional sum addition using parallel prefix computation, and a comparison is made with standard carry-lookahead techniques. Examples on multi-operand addition are used to illustrate the generality of this approach. O(1) time algorithms for converting into redundant digit sets are generalized based on a very simple lemma, which provides a framework for all conversions into redundant digit sets. Applications in multiplier recoding and partial product accumulation are used here as exemplifications. Keywords: Computer arithmetic, digit set conversio...
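As a minimal sketch of the simplest case this abstract mentions, the following converts a radix-r number whose digits come from an arbitrary integer digit set into the conventional nonredundant set {0, ..., r-1} by sequential carry propagation. The function names and the least-significant-first list layout are assumptions made for illustration; this is a plain sequential model, not the paper's parallel-prefix or O(1) algorithms.

```python
def to_nonredundant(digits, radix):
    """Convert little-endian digits (arbitrary integers) to digits in 0..radix-1.

    Returns (converted_digits, carry_out). The represented value equals
    sum(d * radix**i for converted digits) + carry_out * radix**len(digits).
    """
    out = []
    carry = 0
    for d in digits:
        # Floor division guarantees 0 <= digit < radix even for negative totals.
        carry, digit = divmod(d + carry, radix)
        out.append(digit)
    return out, carry


def value(digits, radix):
    """Value of a little-endian digit list in the given radix."""
    return sum(d * radix**i for i, d in enumerate(digits))
```

Multi-operand addition fits the same pattern: summing three binary numbers column by column yields digits in {0, 1, 2, 3}, and `to_nonredundant(column_sums, 2)` turns that redundant result back into ordinary bits plus a carry-out.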
Reduced Power Dissipation Through Truncated Multiplication
 in IEEE Alessandro Volta Memorial Workshop on Low Power Design
, 1999
Cited by 19 (5 self)
Reducing the power dissipation of parallel multipliers is important in the design of digital signal processing systems. In many of these systems, the products of parallel multipliers are rounded to avoid growth in word size. The power dissipation and area of rounded parallel multipliers can be significantly reduced by a technique known as truncated multiplication. With this technique, the least significant columns of the multiplication matrix are not used. Instead, the carries generated by these columns are estimated. This estimate is added with the most significant columns to produce the rounded product. This paper presents the design and implementation of parallel truncated multipliers. Simulations indicate that truncated parallel multipliers dissipate between 29 and 40 percent less power than standard parallel multipliers for operand sizes of 16 and 32 bits. 1: Introduction High-speed parallel multipliers are fundamental building blocks in digital signal processing systems [1]. In...
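A rough behavioral illustration of the idea described above, assuming unsigned operands: partial-product bits in the least significant columns are skipped, and a caller-supplied constant stands in for the carry estimate. The function name and the `correction` parameter are hypothetical, and the sketch models only the arithmetic, not the paper's hardware design or its estimation method.

```python
def truncated_multiply(a, b, n, dropped, correction=0):
    """Approximate a * b for n-bit unsigned a and b.

    Partial-product bits a_i * b_j whose column index i + j is below
    `dropped` are ignored. `correction` is an assumed constant added in
    place of the carries those columns would have produced.
    """
    total = correction
    for i in range(n):
        for j in range(n):
            if i + j >= dropped and (a >> i) & 1 and (b >> j) & 1:
                total += 1 << (i + j)
    return total
```

With `dropped=0` and `correction=0` the function reduces to an exact multiply, which gives a convenient baseline for measuring the error introduced by a particular choice of `dropped` and `correction`.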
Parallel Saturating Fractional Arithmetic Units
 in 9th Great Lakes Symposium on VLSI
, 1999
Cited by 12 (6 self)
This paper describes the designs of a saturating adder, multiplier, single MAC unit, and dual MAC unit with one cycle latencies. The dual MAC unit can perform two saturating MAC operations in parallel and accumulate the results with saturation. Specialized saturation logic ensures that the output of the dual MAC unit is identical to the result of the operations performed serially with saturation after each multiplication and each addition.
Integer Multiplication with Overflow Detection or Saturation
 IEEE Transactions on Computers
, 2000
Cited by 7 (2 self)
High-speed multiplication is frequently used in general-purpose and application-specific computer systems. These systems often support integer multiplication, where two n-bit integers are multiplied to produce a 2n-bit product. To prevent growth in word length, processors typically return the n least significant bits of the product and a flag that indicates whether or not overflow has occurred. Alternatively, some processors saturate results that overflow to the most positive or most negative representable number. This paper presents efficient methods for performing unsigned or two's complement integer multiplication with overflow detection or saturation. These methods have significantly less area and delay than conventional methods for integer multiplication with overflow detection or saturation.
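The two behaviors this abstract contrasts can be stated as a small reference model. The sketch below computes the full product and then checks its range, which corresponds to the conventional approach the paper improves on rather than to the paper's own methods; all function names are assumptions made for illustration.

```python
def _limits(n, signed):
    """Smallest and largest values representable in n bits."""
    if signed:
        return -(1 << (n - 1)), (1 << (n - 1)) - 1
    return 0, (1 << n) - 1


def multiply_with_overflow(a, b, n, signed=True):
    """Return (n least significant bits of a * b, overflow flag)."""
    lo, hi = _limits(n, signed)
    product = a * b
    overflow = not (lo <= product <= hi)
    wrapped = product & ((1 << n) - 1)
    if signed and wrapped >> (n - 1):
        wrapped -= 1 << n  # reinterpret the n-bit pattern as two's complement
    return wrapped, overflow


def multiply_saturating(a, b, n, signed=True):
    """Return a * b clamped to the n-bit representable range."""
    lo, hi = _limits(n, signed)
    return max(lo, min(hi, a * b))
```

A model like this is mainly useful as a golden reference when checking a faster design that avoids forming the full 2n-bit product.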
A Fast Parallel Squarer Based on Divide-and-Conquer
 IEEE Journal of Solid-State Circuits
, 1995
Cited by 6 (0 self)
Fast and small squarers are needed in many applications such as image compression. A new family of high performance parallel squarers based on the divide-and-conquer method is reported. Our main result was realizing the basis cases of the divide-and-conquer recursion by using optimized n-bit primitive squarers, where n is in the range of 2 to 6. This method reduced the gate count and provided shorter critical paths. A chip implementing an 8-bit squarer was designed, fabricated and successfully tested, resulting in 24 MOPS using a 2 µm CMOS fabrication technology. This squarer had two additional features: increased number of squaring operations per unit circuit area, and the potential for reduced power consumption per squaring operation. 1 Introduction The need to square numbers arises in a large number of image processing algorithms. For example, in many subband vector quantization systems (e.g. [1]), the L2 norm calculations in the vector quantizer can involve the order of 288 mil...
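The recursion behind a divide-and-conquer squarer can be sketched in software. Writing x = h * 2^k + l gives x^2 = h^2 * 2^(2k) + 2 * h * l * 2^k + l^2, so squaring reduces to two smaller squarings and one cross product. The function name, the `base_bits` parameter, and the use of Python's `*` as a stand-in for an optimized primitive squarer are all assumptions; the paper's circuit-level structure is not reproduced here.

```python
def square_dc(x, bits, base_bits=4):
    """Square an unsigned `bits`-bit integer by splitting it into halves."""
    if bits <= base_bits:
        return x * x  # stands in for an optimized primitive squarer
    half = bits // 2
    low = x & ((1 << half) - 1)
    high = x >> half
    return (
        (square_dc(high, bits - half, base_bits) << (2 * half))  # h^2 * 2^(2k)
        + ((high * low) << (half + 1))                           # 2 * h * l * 2^k
        + square_dc(low, half, base_bits)                        # l^2
    )
```

The doubling of the cross term becomes a one-bit shift, which is part of why a dedicated squarer can be cheaper than a general multiplier.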
Combined Unsigned and Two's Complement Saturating Multipliers
, 2000
Cited by 2 (1 self)
In many digital signal processing and multimedia applications, results that overflow are saturated to the most positive or most negative representable number. This paper presents efficient techniques for performing saturating n-bit integer multiplication on unsigned and two's complement numbers. Unlike conventional techniques for saturating multiplication, which compute a 2n-bit product and then examine the n most significant product bits to determine if overflow has occurred, the techniques presented in this paper compute only the (n + 1) least significant bits of the product. Specialized overflow detection units, which operate in parallel with the multiplier, determine if overflow has occurred and the product should be saturated. These techniques are applied to designs for saturating array multipliers that perform either unsigned or two's complement saturating integer multiplication, based on an input control signal. Compared to array multipliers that use conventional methods for sa...
A New Recursive Multi-bit Recoding Algorithm for High-Speed and Low-Power Multiplier
 ISSN 1546-1998, American Scientific Publishers (ASP)
, 2012
Cited by 1 (1 self)
In this paper, a new recursive multi-bit recoding multiplication algorithm is introduced. It provides a general space-time partitioning of the multiplication problem that not only enables a drastic reduction of the number of partial products (n/r), but also eliminates the need for pre-computing odd multiples of the multiplicand in higher-radix (β ≥ 8) multiplication. Based on a mathematical proof that any higher radix β = 2^r can be recursively derived from a combination of two or more lower radices, a series of generalized radix β = 2^r multipliers are generated by means of primary radices: 2^1, 2^2, 2^5, and 2^8. A variety of higher-radix (2^3 to 2^32) two's complement 64x64 bit serial/parallel multipliers are implemented on Virtex-6 FPGA and characterized in terms of multiply-time, energy consumption per multiply-operation, and area occupation for r value varying from 2 to 64. Compared to the reference algorithm, savings of 8%, 52%, and 63% are respectively obtained in terms of speed, power, and area. In addition, a new low-power and highly flexible radix 2^r adapted technique for multi-precision multiplication is presented.
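To make the n/r reduction in partial products concrete, here is a sketch of standard Booth-style radix-2^r recoding of a two's complement multiplier. It is not the paper's recursive algorithm (in particular, it does not avoid odd multiples of the multiplicand); the function names and argument order are assumptions made for illustration.

```python
def recode(y, r, n):
    """Recode an n-bit two's complement integer y into ceil(n / r) signed
    digits, least significant first, each in [-2**(r-1), 2**(r-1)]."""
    digits = []
    prev_msb = 0
    for j in range(-(-n // r)):  # ceil(n / r) digit positions
        # Python's >> sign-extends negative integers, so high groups are valid.
        group = (y >> (r * j)) & ((1 << r) - 1)
        msb = group >> (r - 1)
        # Treat the group as signed and add the top bit of the group below it.
        digits.append(group - (msb << r) + prev_msb)
        prev_msb = msb
    return digits


def multiply_recoded(x, y, r, n):
    """Multiply x by n-bit y using one partial product per recoded digit."""
    return sum((d * x) << (r * j) for j, d in enumerate(recode(y, r, n)))
```

For r = 2 this is the familiar radix-4 modified Booth recoding with digits in {-2, ..., 2}; for larger r the digit range grows, which is where the cost of forming odd multiples arises.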
High-Speed and Low-Power PID Structures for Embedded Applications
 Proceedings of the 21st edition of the International Workshop on Power and Timing Modeling, Optimization and Simulation PATMOS, LNCS 6951
Cited by 1 (1 self)
In embedded control applications, control-rate and energy-consumption are two critical design issues. This paper presents a series of high-speed and low-power finite-word-length PID controllers based on a new recursive multiplication algorithm. Compared to published results under the same conditions, savings of 431% and 20% are respectively obtained in terms of control-rate and dynamic power consumption. In addition, the new multiplication algorithm generates scalable PID structures that can be tailored to the desired performance and power budget. All PIDs are implemented at RTL level as technology-independent reusable IP-cores. They are reconfigurable according to two compile-time constants: set-point word-length and latency.
Arithmetic, pp. 168-174, IEEE Computer Society, 1997. [41] M. J. Schulte and E. E. Swartzlander, "Hardware Designs for Exactly Rounded Elementary Functions,"
9, IEEE Computer Society, 1993. [27] J. Fandrianto, "Algorithm for High Speed Shared Radix 4 Division and Radix 4 Square-Root," in Proc. 8th IEEE Symposium on Computer Arithmetic, pp. 73-79, IEEE Computer Society, 1987. [28] C. V. Ramamoorthy, J. R. Goodman, and K. H. Kim, "Some Properties of Iterative Square-Rooting Methods Using High-Speed Multiplication," IEEE Transactions on Computers, vol. C-21, pp. 837-847, 1972. [29] M. J. Flynn, "On Division by Functional Iteration," IEEE Transactions on Computers, vol. C-19, pp. 702-706, 1970. [30] S. Oberman and M. Flynn, "Division Algorithms and Implementations," IEEE Transactions on Computers, vol. 46, pp. 833-854, August 1997. [31] P. Soderquist and M. Leeser, "An Area/Performance Comparison of Subtractive and Multiplicative Divide/Square Root Implementations," in Proc. 12th IEEE Symposium on Computer Arithmetic (S. Knowles and W. H. McAllister, eds.), IEEE Computer Society,
ZOTBinary: A new . . .
, 2013
In this paper we present a new numbering system with an efficient application to BigInteger multiplication. The paper starts with an introduction to a new redundant positional numbering system known as "BigDigit Numbering System" (BDNS). With BDNS, a new nonredundant positional numbering system known as ZOTBinary is proposed. ZOTBinary has a low Hamming weight with an average of 23.8% nonzero symbols, and is therefore highly suitable for BigInteger calculation, especially for BigInteger multiplication. To harvest this benefit from the ZOTBinary representation, a new BigInteger multiplication algorithm, ZOTCM, which is based on the Classical multiplication algorithm, is proposed. Our results show that, compared with the Classical multiplication algorithm, ZOTCM is about 12 times faster for multiplying 128-bit numbers and at least 16 times faster for multiplying numbers longer than 32,000 bits. Our results also show that ZOTCM is about 20 to 3 times faster than the Karatsuba multiplication algorithm for multiplying numbers ranging from 128 bits to 32,000 bits long. From these findings, it is clear that ZOTCM is a valuable addition to BigInteger multiplication algorithms, and it is also believed that the ZOTBinary representation can benefit many other BigInteger calculations.