Optimal Left-to-right Binary Signed-Digit Recoding
, 2000
Cited by 30 (3 self)
This paper describes new methods for producing optimal binary signed-digit representations. This can be useful in the fast computation of exponentiations. Contrary to existing algorithms, the digits are scanned from left to right (i.e., from the most significant position to the least significant position). This may lead to better performance in both hardware and software.
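To make "binary signed-digit representation" concrete, here is a minimal Python sketch of the classical non-adjacent form (NAF), which uses digits in {-1, 0, 1} and has the fewest nonzero digits among such representations. Note that this is the standard right-to-left recoding, not the left-to-right method the paper proposes; the function names `naf` and `naf_value` are illustrative assumptions.

```python
def naf(n):
    """Return the non-adjacent form of a non-negative integer n.

    Digits are in {-1, 0, 1}, least significant first.
    This is the classical right-to-left recoding, shown only for
    illustration; it is NOT the paper's left-to-right algorithm.
    """
    digits = []
    while n > 0:
        if n & 1:
            # n = 1 (mod 4) -> digit 1; n = 3 (mod 4) -> digit -1
            d = 2 - (n % 4)
            n -= d
        else:
            d = 0
        digits.append(d)
        n >>= 1
    return digits


def naf_value(digits):
    """Evaluate a signed-digit string (least significant digit first)."""
    return sum(d * (1 << i) for i, d in enumerate(digits))
```

For example, 7 is recoded as 8 - 1, so an exponentiation by 7 needs one multiplication and one division (or multiplication by an inverse) instead of two multiplications after the squarings.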
Design Issues In High Performance Floating Point Arithmetic Units
, 1996
Cited by 17 (3 self)
In recent years computer applications have increased in their computational complexity. The industry-wide usage of performance benchmarks, such as SPECmarks, forces processor designers to pay particular attention to implementation of the floating point unit, or FPU. Special purpose applications, such as high performance graphics rendering systems, have placed further demands on processors. High speed floating point hardware is a requirement to meet these increasing demands. This work examines the state-of-the-art in FPU design and proposes techniques for improving the performance and the performance/area ratio of future FPUs. In recent FPUs, emphasis has been placed on designing ever-faster adders and multipliers, with division receiving less attention. The design space of FP dividers is large, comprising five different classes of division algorithms: digit recurrence, functional iteration, very high radix, table look-up, and variable latency. While division is an infrequent operation...
VLSI Implementation of Discrete Wavelet Transform
, 1996
Cited by 15 (0 self)
This paper presents a VLSI implementation of the discrete wavelet transform (DWT). The architecture is simple, modular, and cascadable for computation of one- or multi-dimensional DWT. It comprises four basic units: input delay, filter, register bank, and control unit. The proposed architecture is systolic in nature and performs both high-pass and low-pass coefficient calculations with only one set of multipliers. In addition, it requires only a small amount of on-chip interface circuitry for interconnection to a standard communication bus. A detailed analysis of the effect of finite precision of data and wavelet filter coefficients on the accuracy of the DWT coefficients is presented. The architecture has been simulated in VLSI and has a hardware utilization efficiency of 87.5%. Being systolic in nature, the architecture can compute DWT at a data rate of N × 10^6 samples/sec corresponding to a clock speed of N MHz. 1. Introduction In the last decade, there has been an enormous increase in the appl...
Reduced Power Dissipation Through Truncated Multiplication
- in IEEE Alessandro Volta Memorial Workshop on Low Power Design
, 1999
Cited by 15 (5 self)
Reducing the power dissipation of parallel multipliers is important in the design of digital signal processing systems. In many of these systems, the products of parallel multipliers are rounded to avoid growth in word size. The power dissipation and area of rounded parallel multipliers can be significantly reduced by a technique known as truncated multiplication. With this technique, the least significant columns of the multiplication matrix are not used. Instead, the carries generated by these columns are estimated. This estimate is added to the most significant columns to produce the rounded product. This paper presents the design and implementation of parallel truncated multipliers. Simulations indicate that truncated parallel multipliers dissipate between 29 and 40 percent less power than standard parallel multipliers for operand sizes of 16 and 32 bits. 1: Introduction High-speed parallel multipliers are fundamental building blocks in digital signal processing systems [1]. In...
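The column-dropping idea can be sketched in software as follows. This is a behavioral illustration only, assuming unsigned operands; the function name `truncated_product`, the parameter `k` (number of extra columns kept below the result), and the constant `correction` are hypothetical, and the constant merely stands in for the paper's carry estimate, which the abstract does not specify.

```python
def truncated_product(a, b, n, k, correction=0):
    """Estimate the upper n bits of the 2n-bit product a * b.

    Only partial-product bits in columns of weight 2**(n - k) and above
    are summed; lower columns are discarded. `correction` is a
    caller-supplied constant (an assumption of this sketch) that is
    meant to compensate for the carries the discarded columns would
    have produced.
    """
    cutoff = n - k  # columns below this index are not formed
    total = correction
    for i in range(n):
        if (a >> i) & 1:
            for j in range(n):
                if (b >> j) & 1 and i + j >= cutoff:
                    total += 1 << (i + j)
    return total >> n
```

With `k = n` every column is kept and the result equals the exact upper half of the product; with smaller `k` and no correction the result can only underestimate it, which is why an estimate of the missing carries is added in practice.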
Integer Multiplication with Overflow Detection or Saturation
- IEEE Transactions on Computers
, 2000
Cited by 6 (2 self)
High-speed multiplication is frequently used in general-purpose and application-specific computer systems. These systems often support integer multiplication, where two n-bit integers are multiplied to produce a 2n-bit product. To prevent growth in word length, processors typically return the n least significant bits of the product and a flag that indicates whether or not overflow has occurred. Alternatively, some processors saturate results that overflow to the most positive or most negative representable number. This paper presents efficient methods for performing unsigned or two's complement integer multiplication with overflow detection or saturation. These methods have significantly less area and delay than conventional methods for integer multiplication with overflow detection or saturation. Keywords: Overflow, saturation, two's complement, unsigned, integer, array multipliers, tree multipliers, computer arithmetic. I. Introduction Most modern computers directly support multi...
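The two behaviors described above (low n bits plus an overflow flag, or saturation) can be modeled as below. This sketch follows the conventional approach of forming the full product and then checking its range; it does not reproduce the paper's reduced-area methods. The function names `mul_overflow` and `mul_saturate` are illustrative assumptions.

```python
def _limits(n, signed):
    """Smallest and largest values representable in n bits."""
    if signed:
        return -(1 << (n - 1)), (1 << (n - 1)) - 1
    return 0, (1 << n) - 1


def mul_overflow(a, b, n, signed=True):
    """Return (n least significant bits of a * b, overflow flag)."""
    lo, hi = _limits(n, signed)
    p = a * b  # full-width product (conventional method)
    low = p & ((1 << n) - 1)
    if signed and low >= (1 << (n - 1)):
        low -= 1 << n  # reinterpret the low bits as two's complement
    return low, not (lo <= p <= hi)


def mul_saturate(a, b, n, signed=True):
    """Return a * b clamped to the representable n-bit range."""
    lo, hi = _limits(n, signed)
    return max(lo, min(hi, a * b))
```

For 8-bit two's complement operands, `mul_saturate(100, 2, 8)` clamps to 127, while `mul_overflow(100, 2, 8)` returns the wrapped low byte together with a set overflow flag.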
Design and Implementation of a 16 by 16 Low-Power Two's Complement Multiplier
- in Proc. 2000 IEEE Int. Symp. Circuits and Systems
, 2000
Cited by 5 (0 self)
This paper describes the design and implementation of a high-speed low-power 16 by 16 two's complement parallel multiplier. The multiplier uses optimized radix-4 Booth encoders to generate the partial products, and an array of strategically placed (3,2), (5,3), and (7,4) counters to reduce the partial products to sum and carry vectors. The more significant bits of the product are computed from left to right using a modified Ercegovac-Lang converter. An implementation of the multiplier in 0.25-µm static CMOS technology has an area of 0.126 mm^2, a measured delay of 4.39 ns, and an average power dissipation of 0.110 mW/MHz at 2.5 Volts and 100 °C.
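As background for the radix-4 Booth encoders mentioned above, the following sketch shows textbook radix-4 Booth recoding of a two's complement multiplier: each overlapping triplet of bits yields one digit in {-2, -1, 0, 1, 2}, halving the number of partial products. It models the arithmetic only, not the paper's optimized encoder circuits; the function names `booth_radix4` and `booth_multiply` are illustrative assumptions.

```python
def booth_radix4(y, n):
    """Radix-4 Booth digits of the n-bit two's complement integer y.

    Returns n // 2 digits in {-2, -1, 0, 1, 2}, least significant
    first, such that sum(d * 4**i) == y. Requires n to be even.
    """
    assert n % 2 == 0
    assert -(1 << (n - 1)) <= y < (1 << (n - 1))
    bits = y & ((1 << n) - 1)  # two's complement bit pattern
    digits = []
    prev = 0  # bit to the right of the current pair (y[-1] = 0)
    for i in range(0, n, 2):
        b0 = (bits >> i) & 1
        b1 = (bits >> (i + 1)) & 1
        digits.append(prev + b0 - 2 * b1)
        prev = b1
    return digits


def booth_multiply(x, y, n):
    """Multiply x by the n-bit value y using one partial product per digit."""
    return sum(d * x * (4 ** i) for i, d in enumerate(booth_radix4(y, n)))
```

A 16-bit multiplier therefore produces 8 partial products, each of which is 0, ±x, or ±2x, all obtainable by shifting and negation.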
Mixed Swing Techniques for Low Energy/Operation Datapath Circuits
, 1997
Cited by 5 (0 self)
The portable communications industry’s vision of integrating a complete multimedia complex on a single die, coupled with the desktop computing industry’s vision of integrating multimedia functionality into general-purpose microprocessors has transformed lowering the power dissipation of digital signal processing (DSP) datapath circuits into an increasingly important challenge in current and future fabrication processes. Fully-static CMOS logic accompanied with supply voltage scaling has enjoyed widespread usage in lowering datapath power dissipation over the last decade. However, fundamental limitations preclude device threshold voltage scaling under the constant drain-source field scaling paradigm in future deep-submicron processes, imposing limitations on voltage scaling. This has motivated a strong necessity for exploring new methodologies to lower the power dissipation of next-generation high-speed datapath circuits. This thesis investigates Mixed Swing techniques for reducing the power dissipation of static CMOS datapath operators while retaining their high performance, or
Technology scaling effects on multipliers
- IEEE Trans
, 1996
Cited by 3 (0 self)
Since integrated circuits were invented, fabrication engineers have been able to steadily decrease the dimensions of the devices (transistors). These reductions in the minimum feature sizes have resulted in improved performance. In addition, the dimensions of the interconnect used to connect the active transistors have also scaled. The decreasing dimensions of the physical devices cause the capacitances and resistances of the different parts of the multiplier to change. Therefore, the relative delay due to each part of the multiplier changes. In addition, the different encoding schemes used to generate the partial products and the different topologies used in the reduction of the partial products affect the total latency of the multiplier. This paper examines the effects of the smaller device dimensions on multipliers. It shows that the interconnect is becoming more important and that automatic generation of partial products provides the minimum latency for small feature sizes. Index Terms: Feature size, multipliers, Booth encoding, topology.
Integer Multiplication With Overflow Detection Or Saturation
- Master's thesis, Lehigh University, 19 Memorial Dr
, 2000
Cited by 2 (2 self)
1 Introduction
  1.1 Multiplication
  1.2 Overflow
  1.3 Saturation
  1.4 Thesis Overview
2 Previous Research
  2.1 Unsigned Parallel Multipliers
    2.1.1 Unsigned Array Multipliers
    2.1.2 Unsigned Tree Multipliers
  2.2 Two's Complement Multipliers
    2.2.1 Two's Complement Array Multipliers
    2.2.2 Two's Complement Tree Multipliers
3 Overflow Detection and Saturation for Unsigned Integer Multiplication
  3.1 General Design Approach
  3.2 Unsigned Arr...
Combined Unsigned and Two's Complement Saturating Multipliers
, 2000
Cited by 1 (1 self)
In many digital signal processing and multimedia applications, results that overflow are saturated to the most positive or most negative representable number. This paper presents efficient techniques for performing saturating n-bit integer multiplication on unsigned and two's complement numbers. Unlike conventional techniques for saturating multiplication, which compute a 2n-bit product and then examine the n most significant product bits to determine if overflow has occurred, the techniques presented in this paper compute only the (n + 1) least significant bits of the product. Specialized overflow detection units, which operate in parallel with the multiplier, determine if overflow has occurred and the product should be saturated. These techniques are applied to designs for saturating array multipliers that perform either unsigned or two's complement saturating integer multiplication, based on an input control signal. Compared to array multipliers that use conventional methods for sa...

