### Table 6: Hits by algorithmic problem, with implementation ratings

1998

"... In PAGE 2: ... The world has been divided into a total of 75 fundamental algorithmic problems, partitioned among data structures, numerical algorithms, combinatorial algorithms, graph algorithms, hard problems, and computational geometry. See Table 6 or http://www.... In PAGE 3: ... Table 3 reports the 15 most popular and least popular algorithmic problems, as measured by the number of hits the associated pages received. Hit counts for all of the 75 problems appear in Table 6. Several observations can be drawn from this data: Shortest path (with 3660 hits) was the most popular of the algorithmic problems over the course of the study.... ..."

Cited by 5

### Table 1: Resource allocation used

2004

"... In PAGE 40: ...2 Syntax For the moment, agent programming takes the form of a set of annotations that result in assembly-level transformations. These transformations are not yet implemented in a compiler, they are performed by hand, but the corresponding static transformations are straightforward and can be automated.... Table 1: Annotations for agent programming (AP Annotation: Semantic). // AP divide: Divide agent (before loops and procedure calls); // AP shared: Atomic access to variable (before statement using variable); // AP reduction: Variable for storing the result of a reduction (before statement using variable). In PAGE 60: ... Con- ... Table 1. Simplescalar simulator parameters. Fetch queue size: 64; Branch predictor: comb. of bimodal and 2-level; Bimodal predictor size: 2048; Level 1 predictor: 1024 entries, history 10; Level 2 predictor: 4096 entries; BTB size: 2048 sets, 2-way; Branch mispredict penalty: at least 12 cycles; Fetch width: 8 (across up to 2 basic blocks); Dispatch and commit width: 16; Issue queue size: 15 per cluster (int and fp, each); Register file size: 30 per cluster (int and fp, each); Re-order Buffer size: 480; Integer ALUs/mult-div: 1/1 (in each cluster); FP ALUs/mult-div: 1/1 (in each cluster); L1 I-cache: 32KB 2-way; L1 D-cache: 32KB 2-way set-associative, 6 cycles, 4-way word-interleaved; L2 unified cache: 2MB 8-way, 25 cycles; I and D TLB: 128 entries, 8KB page size; Memory latency: 160 cycles for the first chunk.... In PAGE 66: ...ith in this paper. Using 0.13 micron technology, we have assumed features such as a clock rate of 3 GHz, a relatively large L2 cache, and other parameters scaled up accordingly. Our parameters for a single-context superscalar processor, SMT processor, and two CMP configurations are shown in Table 1. Our models assume the same parameters for SMT as the single-context superscalar and CMP resources are mostly ... In PAGE 67: ... Table 1. Processor parameters.... 
In PAGE 81: ...13a5 0.13 a5 Machine Width 4 wide fetch, 4 wide issue, 4 wide commit Window Size 128 entry RUU 64 entry RUU 64 entry load/store queue 32 entry load/store queue Branch Misprediction Latency 19 cycles 12 cycles L1 Icache 16K, 4-way 16K, 4-way 32 byte lines 32 byte lines 2 cycle latency 3 cycle latency L1 Data Cache 8K, 4-way 16 K, 4-way 32 byte lines 32 byte lines 2 cycle latency 3 cycle latency L2 Combined 512K, 8-way 512K, 8-way (Shared) 128 byte lines 128 byte lines 10 cycle latency 7 cycle latency Memory 128 bit wide 128 bit wide 92 cycle latency 41 cycle latency BTB 4096 entry, 4-way set-associative 512 entry, 4-way set-associative 32 entry return address stack 32 entry return address stack TLB 128 entry (I), 128 entry (D) 64 entry (I), 64 entry (D) 4-way set-associative 4-way set-associative 30 cycle miss latency 30 cycle miss latency Functional Units and 2 Int ALU (1/1), 1 Int Mult (2/2) / Div(2/2) 1 Int ALU (1/1), 1 Int Mult (2/2) / Div(2/2) Latency (total/issue) 4 Load/Store (2/1), 1 FP Add (5/3) 2 Load/Store (2/1), 1 FP Add (5/3) 1 FP Mult (6/5) / Div (6/5) / Sqrt (6/5) 1 FP Mult (6/5) / Div (6/5) / Sqrt (6/5) Table 1: Simulation Parameters 5 ms. Processing is returned to Core 1 and the process repeats itself.... In PAGE 89: ... Data shown in Table 1 from a previous study [Cameron99Tutorial] indicates that for some processors, counters are essentially asymptotically accurate, reaching a steady-state value which is close to actual event counts at large granularities. Results such as those in Table 1 are sufficient for coarse-grain profiling or averaging, but would introduce a large amount of noise into a phase detection system for optimization or adaptation. Another use of performance counters is in software testing, particularly for isolation of performance bugs.... In PAGE 89: ... 
[Zagha96SC] Event Generator Event Generator Event Generator Event Generator Central Event Collector (Monitor Unit) Custom Routing Figure 1: Current performance monitoring. Table 1: Counter inaccuracy. [Cameron99Tutorial] 99,054 100,097 100,000 10,055 9,997 10,000 54 950 1,000 54 957 100 53 956 10 Meas.... In PAGE 99: ...Previous Stream Previous Stream Current Address hash Address Indexed Table Path Indexed Table Decoded Address Tag Length Next Stream Hysteresis Tag Length Next Stream Hysteresis Decoded Length Decoded Valid (2nd Level) (1st Level) Figure 2. The cascaded design of the next stream predictor. Table 1. Configuration of the simulated processors 4-wide processor 8-wide processor fetch width 4 instructions 8 instructions rename/commit width 4 instructions 8 instructions integer issue width 4 instructions 8 instructions floating point issue width 4 instructions 8 instructions load/store issue width 2 instructions 4 instructions fetch target queue 4 entries 4 entries integer issue queue 32 entries 64 entries floating point issue queue 32 entries 64 entries load/store issue queue 32 entries 64 entries reorder buffer 128 entries 256 entries integer registers 96 160 floating point registers 96 160 L1 instruction cache 64 Kbytes, 2-way associative, (4*fetch width) byte block, 3 cycle latency L1 data cache 64 Kbytes, 2-way associative, 64 byte block, 3 cycle latency L2 unified cache 1 Mbyte, 4-way associative, 128 byte block, 16 cycle latency main memory latency 350 cycles 1024 entry, 4-way associative first level next stream predictor 4096 entry, 4-way associative second level... In PAGE 100: ... We simulate two processor setups, a 4-wide and an 8-wide superscalar processor, both having a 20-stage pipeline. The main values of these setups are shown in Table 1. Our first level instruction cache uses wide cache lines, that is, four times the processor fetch width, as described in [12].... In PAGE 107: ... 
For an overriding perceptron, all partial sums in flight in the pipeline need to be checkpointed. See Table 1 for the formulas used to determine the amount of state to be checkpointed. Since the partial sums are distributed across the whole predictor in pipeline latches, the checkpointing tables and associated circuitry must also be distributed.... In PAGE 114: ... The wire length, L, is a function of the number of functional units being bypassed while the resistance (Rmetal) and capacitance (Cmetal) per unit length remain constant. Plugging in their parameter values and wire length estimations into this equation produces delays for various bypass widths, shown in Table 1. Using their assumption of non-scalability, we use this as the bypass delay at 180nm and 90nm.... In PAGE 114: ... From these numbers, it is evident that the length term is dominant as the delay grows exponentially with more bypassed units. Table 1. Calculated bypass delays for various processor widths.... ..."

### Table 2. Numerical results

"... In PAGE 10: ... Thus they are able to provide much better starting points so that the inefficiencies we observe are somewhat academic. In Table 2 we collect the numerical results achieved. For each data set the least-squares algorithms DN2GB of Dennis et al.... In PAGE 10: ... However the numerical tests show that the solution of systems of nonlinear equations usually requires not more than 6 iterations with respect to the very small final accuracy. In Table 2 we list the name of the test case, the least-squares code DN2GB or DFNLP and the final weighted residual. The subsequent two columns list the number of function and gradient evaluations of the outer least-squares algorithm until a termination condition is satisfied, i.... In PAGE 14: ...K. Schittkowski Table 2. (continued) problem code r? nf nrf i DN2GB .... ..."

### Table 6: Hits by algorithmic problem, with implementation ratings

"... In PAGE 2: ... The world has been divided into a total of 75 fundamental algorithmic problems, partitioned among data structures, numerical algorithms, combinatorial algorithms, graph algorithms, hard problems, and computational geometry. See Table 6 or http://www.... ..."

### Table 10. Numerical results for stochastic nonlinear programs.

in On the implementation of a log-barrier progressive hedging method for multistage stochastic programs

"... In PAGE 20: ... For each (j; i), we generate dji according to the uniform distribution on [0; 1]. The numerical results are presented in Table 10. The problem with nonlinear constraints added to the linear problem LPi is named NLPi.... ..."

### Table 5 Resource allocation template

"... In PAGE 8: ... Team Presentation - A team presentation to the Client of the Report 3. Individual Critical Analysis of the problem-solving processes Resource allocation For the first phase a template was established, Table 5. This sets out the required resources for planning purposes.... ..."

### Table 1: Numerical results for Algorithm 3.1 (continued)

"... In PAGE 19: ... We used all complementarity problems and all available starting points from the MCPLIB test problem collection by Dirkse and Ferris [6]. Our results are summarized in Table 1, where we present the following data: problem: name of test example in MCPLIB; n: dimension of test example; SP: number of starting point in the M-file cpstart.m; k: number of iterations; ksuc: number of successful iterations; N: number of Newton steps; F -ev.... In PAGE 19: ...: number of Jacobian evaluations of F; 2(xf): value of 2(x) at the final iterate x = xf; kr 2(xf)k: value of kr 2(x)k at the final iterate x = xf; B: number of iterations using a backtracking step. Table 1: Numerical results for Algorithm 3.1 problem n SP k ksuc N F -ev.... In PAGE 21: ...19 The results in Table 1 are quite promising. The number of iterations seems comparable to the full-dimensional trust-region method by Jiang et al.... ..."

### Table 1. Results from sensor network domain for dynamic resource allocation problems.

"... In PAGE 14: ...termed the RMS error of up to 3 units as acceptable. Table 1 presents our results from the implementation with the Mapping II in Section 6. Experiments were conducted in different dynamic situations varying the type of resource allocation problem, the number of nodes/targets, and the configuration.... ..."
