## Multiprecision Division on an 8-Bit Processor (1997)

Venue: Proc. 13th IEEE Symp. Computer Arithmetic, IEEE CS

Citations: 5 (5 self)

### BibTeX

@INPROCEEDINGS{Rice97multiprecisiondivision,

author = {Eric Rice and Richard Hughey},

title = {Multiprecision Division on an 8-Bit Processor},

booktitle = {Proc. 13th IEEE Symp. Computer Arithmetic},

year = {1997},

pages = {74--81},

publisher = {IEEE CS}

}

### Abstract

Small processors can be especially useful in massively parallel architectures. This paper considers multiprecision division algorithms on an 8-bit processor (the Kestrel processor, currently in fabrication) that includes a small amount of memory and an 8-bit multiplier. We evaluate several variations of the Newton-Raphson reciprocal approximation method for use with division. Our final single-precision algorithm requires 41 cycles to divide two 24-bit numbers to produce a 26-bit result. The double-precision version requires 98 cycles to divide two 53-bit numbers to produce a 55-bit result. This low cycle count is the result of several techniques, including low-precision arithmetic, early introduction of dividends, and simple yet good initial reciprocal estimates.

1. Introduction

This paper presents a study of division on an 8-bit processor. It is motivated by the Kestrel architecture, an 8-bit parallel processor tuned to sequence analysis [8]. The word size is a natural choice for seq...
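The reciprocal-based division named in the abstract can be illustrated with a minimal Python sketch. It uses exact rational arithmetic, so it does not model the byte-wide, low-precision arithmetic or the cycle counts the paper reports. The helper names (`initial_estimate`, `reciprocal`, `divide`, `significand`), the linear initial estimate, and the sample operands are all assumptions made for illustration, not taken from the paper.

```python
from fractions import Fraction


def initial_estimate(b):
    # Hypothetical linear estimate for 1 <= b < 2 (an assumption, not the
    # paper's estimate): the relative error 1 - b*r0 has magnitude <= 1/8.
    return Fraction(3, 2) - b / 2


def reciprocal(b, iterations):
    # Newton-Raphson refinement r <- r * (2 - b*r); each step squares
    # the relative error 1 - b*r.
    r = initial_estimate(b)
    for _ in range(iterations):
        r = r * (2 - b * r)
    return r


def divide(a, b, iterations=4):
    # Quotient approximated as dividend times reciprocal of the divisor.
    return a * reciprocal(b, iterations)


def significand(n, bits=24):
    # Interpret an integer with its top bit set as a value in [1, 2).
    return Fraction(n, 1 << (bits - 1))


a = significand(0xC90FDB)  # arbitrary 24-bit sample dividend
b = significand(0xADF854)  # arbitrary 24-bit sample divisor
q = divide(a, b)
print("approximate quotient:", float(q))
print("relative error below 2**-26:", abs(1 - q * b / a) < Fraction(1, 2**26))
```

With a starting error of at most 1/8, four squarings bring the relative error to at most 2^-48 in exact arithmetic, which is more than the 26 bits quoted for the single-precision case. A byte-oriented implementation would presumably need fewer or cheaper steps by truncating intermediate values, which this sketch makes no attempt to reproduce.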

### Citations

4331 | Computer Architecture: A Quantitative Approach (3rd ed.)
- Hennessy, Patterson
- 2003

Citation Context ...ng a lookup table for initial reciprocal estimates on Kestrel. Finally, the table includes estimated instruction counts for three alternative methods: the well-known SRT division algorithm in radix 4 [7], Ercegovac and Lang's multiprecision division algorithm [2], and radix 2 nonrestoring division (the SP numbers in these cases are for only 24 bits of precision, and assume an appropriately modified K...

384 | What every computer scientist should know about floating-point arithmetic
- Goldberg
- 1991

Citation Context ...d estimate can produce. We evaluate methods according to two primary targets: single-precision and double-precision floating-point numbers. The significands of interest are thus 24 bits and 53 bits [6]. In both cases, as shall be seen, division must be carried out to additional precision to enable proper rounding. Our results are presented as follows. After a brief discussion of savings available d...

108 | Division and Square Root: Digit-Recurrence Algorithms and Implementations
- Ercegovac, Lang
- 1994

Citation Context ...at multiplicative algorithms (which typically double the precision on each iteration) as methods of efficiently implementing division, rather than additive algorithms such as digit-recurrence methods [3] and SRT division. Tuning algorithms to byte calculations is frequently not discussed in the literature [1, 9]. Typically methods are discussed in terms of the number of operations, not size of operat...

94 | The Art of Computer Programming, Volume 2
- Knuth
- 1998

Citation Context ...ntly implementing division, rather than additive algorithms such as digit-recurrence methods [3] and SRT division. Tuning algorithms to byte calculations is frequently not discussed in the literature [1, 9]. Typically methods are discussed in terms of the number of operations, not size of operations, though there are exceptions [12, 2, 10]. Since the analysis found was lacking for our purposes, we evalu...

37 | On Division by Functional Iteration
- Flynn
- 1970

Citation Context ... In Newton-Raphson methods, a recurrence, such as $r_{i+1} = r_i (2 - b r_i)$, is evaluated from an initial reciprocal estimate $r_0$ to form a series of increasingly exact reciprocal estimates to $1/b$ [5]. These estimates converge quadratically or better (depending on the equation used), this being the prime advantage of Newton-Raphson over digit-at-a-time methods for moderately-sized numbers. Its dis...

28 | Fast Division Using Accurate Quotient Approximations to Reduce the Number of Iterations
- Wong, Flynn
- 1992

Citation Context ...ms to byte calculations is frequently not discussed in the literature [1, 9]. Typically methods are discussed in terms of the number of operations, not size of operations, though there are exceptions [12, 2, 10]. Since the analysis found was lacking for our purposes, we evaluated various suggested forms of multiplicative division. The basic principles of each were examined in terms of how accuracy needs to b...

23 | Kestrel: A programmable array for sequence analysis
- Hirschberg, Hughey, et al.
- 1996

Citation Context ...eciprocal estimates. 1. Introduction This paper presents a study of division on an 8-bit processor. It is motivated by the Kestrel architecture, an 8-bit parallel processor tuned to sequence analysis [8]. The word size is a natural choice for sequence analysis applications: characters may require 2 (DNA and RNA), 5 (protein), or 8 (text) bits, and the dynamic programming calculation at the core of ma...

9 | On optimal iterative schemes for high-speed division
- Krishnamurthy
- 1970

Citation Context ...ms to byte calculations is frequently not discussed in the literature [1, 9]. Typically methods are discussed in terms of the number of operations, not size of operations, though there are exceptions [12, 2, 10]. Since the analysis found was lacking for our purposes, we evaluated various suggested forms of multiplicative division. The basic principles of each were examined in terms of how accuracy needs to b...

7 | A Division Method Using a Parallel Multiplier
- Ferrari
- 1967

Citation Context ...ton-Raphson can be extended to: $r_{i+1} = r_i \times [1 + (1 - b r_i) + (1 - b r_i)^2 + \cdots + (1 - b r_i)^n]$, (2) which is a more general equation, with equation 1 corresponding to $n = 1$ [4]. Looking more closely at equation 2, an inner iteration is suggested to obtain an increasingly accurate value by which the previous $r_i$ should be multiplied, each such inner iteration requiring a mul...

6 | Digital computer arithmetic
- Cavanagh
- 1984

Citation Context ...ntly implementing division, rather than additive algorithms such as digit-recurrence methods [3] and SRT division. Tuning algorithms to byte calculations is frequently not discussed in the literature [1, 9]. Typically methods are discussed in terms of the number of operations, not size of operations, though there are exceptions [12, 2, 10]. Since the analysis found was lacking for our purposes, we evalu...

1 | Multiplication/division/square root module for massively parallel computers
- Ercegovac, Lang
- 1993

Citation Context ...ms to byte calculations is frequently not discussed in the literature [1, 9]. Typically methods are discussed in terms of the number of operations, not size of operations, though there are exceptions [12, 2, 10]. Since the analysis found was lacking for our purposes, we evaluated various suggested forms of multiplicative division. The basic principles of each were examined in terms of how accuracy needs to b...

1 | Economical iterative range-transformation schemes for division
- Krishnamurthy
- 1971

Citation Context ...g a simple hardware implementation, the extended NR equation should never be used with more than a few inner iterations. In fact, echoing conclusions reached earlier using less byte-oriented analysis [11], it appears that one loses little (if anything) by limiting consideration to just equation 1, $r_{i+1} = r_i \times [1 + (1 - b r_i)]$, which doubles the accuracy, and $r_{i+1} = r_i \times [1 +$...
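
The extended Newton-Raphson recurrence quoted in the citation contexts above (equation 2, with an inner sum of $n + 1$ terms) can be sketched in Python as follows. This is an illustrative sketch in exact rational arithmetic, not the paper's byte-level implementation; the function name `extended_step` and the sample values are assumptions. The inner sum is accumulated Horner-style, so each inner iteration costs one multiplication, and one outer step raises the relative error $1 - b r_i$ to the power $n + 1$.

```python
from fractions import Fraction


def extended_step(b, r, n):
    # One outer step of r <- r * [1 + e + e^2 + ... + e^n], with e = 1 - b*r.
    e = 1 - b * r
    s = Fraction(1)
    for _ in range(n):
        # Horner form: one multiply per inner iteration.
        s = 1 + e * s
    return r * s


b = Fraction(3, 2)   # sample divisor (assumption)
r0 = Fraction(1, 2)  # sample initial estimate (assumption), error e = 1/4
for n in range(1, 4):
    r1 = extended_step(b, r0, n)
    print(n, r1, 1 - b * r1)  # the new error equals (1/4) ** (n + 1)
```

With `n = 1` the step reduces to the plain recurrence $r_{i+1} = r_i (2 - b r_i)$, which squares the error; `n = 2` cubes it at the cost of one extra multiply per outer step.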