Results 1 - 9 of 9
Reduced Power Dissipation Through Truncated Multiplication
In IEEE Alessandro Volta Memorial Workshop on Low Power Design, 1999
Cited by 26 (7 self)
Reducing the power dissipation of parallel multipliers is important in the design of digital signal processing systems. In many of these systems, the products of parallel multipliers are rounded to avoid growth in word size. The power dissipation and area of rounded parallel multipliers can be significantly reduced by a technique known as truncated multiplication. With this technique, the least significant columns of the multiplication matrix are not used. Instead, the carries generated by these columns are estimated. This estimate is added to the most significant columns to produce the rounded product. This paper presents the design and implementation of parallel truncated multipliers. Simulations indicate that truncated parallel multipliers dissipate between 29 and 40 percent less power than standard parallel multipliers for operand sizes of 16 and 32 bits. 1. Introduction: High-speed parallel multipliers are fundamental building blocks in digital signal processing systems [1]. In...
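The idea in the abstract above can be sketched in software. The following is a minimal illustration, not the paper's design: the abstract does not say how the carries are estimated, so this sketch assumes a constant correction equal to the expected value of the discarded columns for uniformly random operands. The function names and the parameter `k` (number of extra columns kept below the rounding point) are hypothetical.

```python
def truncated_multiply(a, b, n, k):
    """Estimate the rounded upper n bits of a*b for n-bit unsigned a and b.

    Only the n + k most significant columns of the partial product matrix
    are summed; the discarded columns are replaced by a constant estimate.
    """
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    cut = n - k  # columns 0 .. cut-1 are never formed

    # Sum only the partial product bits a_i * b_j that fall in kept columns.
    kept = 0
    for i in range(n):
        for j in range(n):
            if i + j >= cut and (a >> i) & 1 and (b >> j) & 1:
                kept += 1 << (i + j)

    # Assumed correction (not taken from the paper): column c < n holds
    # c + 1 bits, each equal to 1 with probability 1/4 for uniformly random
    # operands. The expected value of the discarded columns is rounded to a
    # multiple of 2**cut so it can be injected into the kept columns.
    expected_times_4 = sum((c + 1) << c for c in range(cut))
    correction = ((expected_times_4 + (2 << cut)) // (4 << cut)) << cut

    rounding = 1 << (n - 1)  # round to nearest at the n-bit boundary
    return (kept + correction + rounding) >> n


def rounded_multiply(a, b, n):
    """Reference: exact product rounded to its upper n bits."""
    return (a * b + (1 << (n - 1))) >> n
```

With `k = n` no columns are discarded and the result matches `rounded_multiply`. Smaller `k` removes more of the partial product matrix, which corresponds to the hardware saving described in the abstract, at the cost of a small error in the rounded result.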
Hybrid Signed-Digit number systems: A unified framework for redundant number representations with bounded carry propagation chains
1993
Cited by 20 (6 self)
Abstract. A novel hybrid number representation is proposed in this paper. It includes the two's complement representation and the signed-digit representation as special cases. The hybrid number representations proposed are capable of bounding the maximum length of carry propagation chains during addition to any desired value between 1 and the entire word length. The framework reveals a continuum of number representations between the two extremes of two's complement and signed-digit number systems and allows a unified performance analysis of the entire spectrum of implementations of adders, multipliers and the like. We present several static CMOS implementations of a two-operand adder which employ the proposed representations. We then derive quantitative estimates of area (in terms of the required number of transistors) and the maximum carry propagation delay for such an adder. The analysis clearly illustrates the tradeoffs between area and execution time associated with each of the possible representations. We also discuss adder trees for parallel multipliers and show that the proposed representations lead to compact adder trees with fast execution times. In practice, the area available to a designer is often limited. In such cases, the designer can select the particular hybrid representation that yields the most suitable implementation (fastest, lowest power consumption, etc.) while satisfying the area constraint. Similarly, if the worst case delay is predetermined, the designer can select a hybrid representation that minimizes area or power under the delay constraint. Index Terms: Bounded carry propagation, carry-free addition, hybrid signed-digit number system, redundant number representation, signed-digit numbers, static CMOS implementation.
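As background for the abstract above, the sketch below shows carry-free addition at the fully signed-digit extreme of the continuum it describes, where the carry chain is bounded to a single position. It uses the standard radix-2 signed-digit addition rule with digits in {-1, 0, 1}; it does not implement the hybrid representation itself, and the function names are hypothetical.

```python
def sd_value(digits):
    """Value of a radix-2 signed-digit number, least significant digit first."""
    return sum(d * (1 << i) for i, d in enumerate(digits))


def sd_add(x, y):
    """Add two equal-length signed-digit numbers with digits in {-1, 0, 1}.

    Returns len(x) + 1 digits. Each output digit depends only on input
    positions i, i-1 and i-2, so no carry ripples along the word.
    """
    if len(x) != len(y):
        raise ValueError("operands must have the same number of digits")
    n = len(x)
    transfer = [0] * (n + 1)  # transfer[i] enters position i
    interim = [0] * (n + 1)   # interim sum digit at position i

    for i in range(n):
        p = x[i] + y[i]
        # If both lower digits are non-negative, the incoming transfer is
        # 0 or +1; otherwise it is 0 or -1. The interim digit is chosen so
        # that interim + transfer always stays within {-1, 0, 1}.
        lower_nonneg = i == 0 or (x[i - 1] >= 0 and y[i - 1] >= 0)
        if p == 2:
            t, w = 1, 0
        elif p == -2:
            t, w = -1, 0
        elif p == 1:
            t, w = (1, -1) if lower_nonneg else (0, 1)
        elif p == -1:
            t, w = (0, -1) if lower_nonneg else (-1, 1)
        else:
            t, w = 0, 0
        transfer[i + 1] = t
        interim[i] = w

    return [interim[i] + transfer[i] for i in range(n + 1)]
```

In the hybrid scheme described in the abstract, only some positions would carry signed digits, so a carry could ripple through the unsigned positions between them; this sketch covers only the case where every digit is signed.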
Area × Delay (A·T) Efficient Multiplier Based on an Intermediate Hybrid Signed-Digit (HSD-1) Representation
Proc. of the 14th IEEE International Symposium on Computer Arithmetic, 1999
Cited by 3 (2 self)
Intermediate Signed Digit (SD) representation can facilitate fast and compact VLSI implementations of partial product accumulation trees. It achieves a reduction ratio of 2:1 at every level and also leads to more regular layouts. Its disadvantage is that the number of bit lines that need to be routed can be high. This can lead to a significant area overhead, especially at smaller feature sizes where the wire/interconnect area and delay can be dominant. A Hybrid Signed Digit (HSD) representation lets some of the digits be unsigned bits, thereby reducing the number of bit lines. By arbitrarily varying the positions of and distances between consecutive signed digits, this representation can trade off latency for area and offers a continuum of choices between the two's complement representation on the one hand and the fully Signed Digit (FSD or simply SD) representation on the other. In this paper, we illustrate an A·T (area × delay) efficient multiplier based on the HSD-1 representation, which is one of the many possible HSD formats, wherein every alternate digit is signed and the rest are unsigned (ordinary) bits. It is seen that multipliers based on the HSD-1 format require more transistors than those based on the FSD format. However, they require fewer bit lines to be routed, which substantially reduces the interconnect area, thereby leading to a reduction in the total VLSI area and a lower A·T product. The design reaffirms that the interconnect area can be significant, especially at small feature sizes.
A DFT Technique for Testing High-Speed Circuits with Arbitrarily Slow Testers
2001
Abstract. This paper presents a design for testability (DFT) technique for testing high-speed circuits with a low-speed test mode clock. With this technique, the test mode clock frequency can be reduced with virtually no lower limit. Even with the reduced speed requirement on the automatic test equipment (ATE), our method facilitates the test of the rated-speed timing and allows performance binning. A CMOS implementation of the DFT hardware with 50 ps timing accuracy is presented. To demonstrate the effectiveness of the technique, we designed a 16-bit, 1.4 GHz pipelined multiplier as a test vehicle. Simulations using a test clock frequency much lower than the rated clock frequency show that delay faults of sizes as small as 50 ps are detected and that the new test technique provides correct performance binning.
A Complementary GaAs (CGaAs™) 32-bit Multiply Accumulate Unit
A high speed one-cycle, 32-bit multiply, 64-bit accumulate unit is presented in Complementary GaAs (CGaAs™) technology. A tree of 4:2 compressors is used to collect the partial products and a carry select adder is used to determine the final result. Radix-4 Booth encoding is utilized to reduce the partial product tree size. Differential cascode voltage switch logic (DCVSL) is used on critical paths. A description of CGaAs technology, including its inherent radiation hardness, is provided as a background to the discussion. Finally, a study of some of the implications of designing in CGaAs is presented, including logic styles, circuit issues, design methodology, and their effect on performance.
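The abstract above mentions Radix-4 Booth encoding as the way the partial product tree is reduced. The sketch below shows the standard recoding in software: an n-bit two's complement multiplier is rewritten as n/2 digits in {-2, -1, 0, 1, 2}, so only half as many partial products need to be summed. This illustrates the general technique, not the circuit in the paper, and the function names are hypothetical.

```python
def booth_radix4_digits(y, n):
    """Recode an n-bit two's complement integer y (n even) into n/2 digits.

    Digit i is y[2i-1] + y[2i] - 2*y[2i+1], with y[-1] taken as 0, and the
    digits satisfy sum(d_i * 4**i) == y.
    """
    if n <= 0 or n % 2:
        raise ValueError("n must be a positive even number")
    if not -(1 << (n - 1)) <= y < (1 << (n - 1)):
        raise ValueError("y does not fit in n bits")
    bits = y & ((1 << n) - 1)  # two's complement bit pattern of y
    digits = []
    prev = 0  # the implicit bit y[-1]
    for i in range(0, n, 2):
        b0 = (bits >> i) & 1
        b1 = (bits >> (i + 1)) & 1
        digits.append(prev + b0 - 2 * b1)
        prev = b1
    return digits


def booth_multiply(x, y, n):
    """Multiply x by an n-bit two's complement y using n/2 partial products."""
    total = 0
    for i, d in enumerate(booth_radix4_digits(y, n)):
        # Each partial product is 0, +-x or +-2x, shifted by two bits per digit.
        total += (d * x) << (2 * i)
    return total
```

For a 32-bit multiplier operand this recoding yields 16 partial products instead of 32, which is the reduction in tree size the abstract refers to.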
A 32-bit Multiply-Accumulate Circuit in Complementary GaAs (CGaAs)
1998
This paper describes a high speed, one-cycle, 32-bit multiply, 64-bit accumulate unit in Complementary GaAs (CGaAs™) technology. A tree of 4:2 compressors is used to collect the partial products and a carry select adder is used to determine the final result. Radix-4 Booth encoding is utilized to reduce the partial product tree size. Differential cascode voltage switch logic (DCVSL) is used throughout the circuit. A description of CGaAs technology, including its inherent radiation hardness, is provided as a rationale for many of the design decisions. The design methodology, including planned verification of the final device, is also discussed. 1. Introduction: The emerging demand for satellite communications is moving the market for radiation hardened devices out of military applications and into the mainstream. Both of these markets have need for micro- and digital-signal processors which are capable of surviving a hostile radiation environment. In addition, the satellite co...