Results 1 - 10 of 217,894
Table 1: Resource allocation used
2004
"... In PAGE 40: ...2 Syntax For the moment, agent programming takes the form of a set of annotations that result in assembly-level transformations. These transformations are not yet implemented in a compiler, they are performed by hand, but the corresponding static transformations AP Annotation Semantic // AP divide Divide agent (before loops and procedure calls) // AP shared Atomic access to variable (before statement using variable) // AP reduction Variable for storing the result of a reduction (before statement using variable) Table1 : Annotations for agent programming. are straightforward and can be automated.... In PAGE 60: ... Con- Fetch queue size 64 Branch predictor comb. of bimodal and 2-level Bimodal predictor size 2048 Level 1 predictor 1024 entries, history 10 Level 2 predictor 4096 entries BTB size 2048 sets, 2-way Branch mispredict penalty at least 12 cycles Fetch width 8 (across up to 2 basic blocks) Dispatch and commit width 16 Issue queue size 15 per cluster (int and fp, each) Register file size 30 per cluster (int and fp, each) Re-order Buffer size 480 Integer ALUs/mult-div 1/1 (in each cluster) FP ALUs/mult-div 1/1 (in each cluster) L1 I-cache 32KB 2-way L1 D-cache 32KB 2-way set-associative, 6 cycles, 4-way word-interleaved L2 unified cache 2MB 8-way, 25 cycles I and D TLB 128 entries, 8KB page size Memory latency 160 cycles for the first chunk Table1 . Simplescalar simulator parameters.... In PAGE 66: ...ith in this paper. Using 0.13 micron technology, we have assumed features such as a clock rate of 3 GHz, a relatively large L2 cache, and other parameters scaled up accordingly. Our parameters for a single-context superscalar processor, SMT processor, and two CMP configurations are shown in Table1 . Our models assume the same parameters for SMT as the single- context superscalar and CMP resources are mostly ... In PAGE 67: ... Table1 . Processor parameters.... 
In PAGE 81: ...13a5 0.13 a5 Machine Width 4 wide fetch, 4 wide issue, 4 wide commit Window Size 128 entry RUU 64 entry RUU 64 entry load/store queue 32 entry load/store queue Branch Misprediction Latency 19 cycles 12 cycles L1 Icache 16K, 4-way 16K, 4-way 32 byte lines 32 byte lines 2 cycle latency 3 cycle latency L1 Data Cache 8K, 4-way 16 K, 4-way 32 byte lines 32 byte lines 2 cycle latency 3 cycle latency L2 Combined 512K, 8-way 512K, 8-way (Shared) 128 byte lines 128 byte lines 10 cycle latency 7 cycle latency Memory 128 bit wide 128 bit wide 92 cycle latency 41 cycle latency BTB 4096 entry, 4-way set-associative 512 entry, 4-way set-associative 32 entry return address stack 32 entry return address stack TLB 128 entry (I), 128 entry (D) 64 entry (I), 64 entry (D) 4-way set-associative 4-way set-associative 30 cycle miss latency 30 cycle miss latency Functional Units and 2 Int ALU (1/1), 1 Int Mult (2/2) / Div(2/2) 1 Int ALU (1/1), 1 Int Mult (2/2) / Div(2/2) Latency (total/issue) 4 Load/Store (2/1), 1 FP Add (5/3) 2 Load/Store (2/1), 1 FP Add (5/3) 1 FP Mult (6/5) / Div (6/5) / Sqrt (6/5) 1 FP Mult (6/5) / Div (6/5) / Sqrt (6/5) Table 1: Simulation Parameters 5 ms. Processing is returned to Core 1 and the process repeats itself.... In PAGE 89: ... Data shown in Table 1 from a previous study [Cameron99Tutorial] indicates that for some processors, counters are essentially asymptotically accurate, reaching a steady-state value which is close to actual event counts at large granularities. Results such as those in Table 1 are sufficient for coarse-grain profiling or averaging, but would introduce a large amount of noise into a phase detection system for optimization or adaptation. Another use of performance counters is in software testing, particularly for isolation of performance bugs.... In PAGE 89: ... 
[Zagha96SC] Event Generator Event Generator Event Generator Event Generator Central Event Collector (Monitor Unit) Custom Routing Figure 1: Current performance monitoring. Table 1: Counter inaccuracy. [Cameron99Tutorial] 99,054 100,097 100,000 10,055 9,997 10,000 54 950 1,000 54 957 100 53 956 10 Meas.... In PAGE 99: ...Previous Stream Previous Stream Current Address hash Address Indexed Table Path Indexed Table Decoded Address Tag Length Next Stream Hysteresis Tag Length Next Stream Hysteresis Decoded Length Decoded Valid (2nd Level) (1st Level) Figure 2. The cascaded design of the next stream predictor Table 1. Configuration of the simulated processors 4-wide processor 8-wide processor fetch width 4 instructions 8 instructions rename/commit width 4 instructions 8 instructions integer issue width 4 instructions 8 instructions floating point issue width 4 instructions 8 instructions load/store issue width 2 instructions 4 instructions fetch target queue 4 entries 4 entries integer issue queue 32 entries 64 entries floating point issue queue 32 entries 64 entries load/store issue queue 32 entries 64 entries reorder buffer 128 entries 256 entries integer registers 96 160 floating point registers 96 160 L1 instruction cache 64 Kbytes, 2-way associative, (4*fetch width) byte block, 3 cycle latency L1 data cache 64 Kbytes, 2-way associative, 64 byte block, 3 cycle latency L2 unified cache 1 Mbyte, 4-way associative, 128 byte block, 16 cycle latency main memory latency 350 cycles 1024 entry, 4-way associative first level next stream predictor 4096 entry, 4-way associative second level... In PAGE 100: ... We simulate two processor setups, a 4-wide and an 8-wide superscalar processor, both having a 20-stage pipeline. The main values of these setups are shown in Table 1. Our first level instruction cache uses wide cache lines, that is, four times the processor fetch width, as described in [12].... In PAGE 107: ... 
For an overriding perceptron, all partial sums in flight in the pipeline need to be checkpointed. See Table 1 for the formulas used to determine the amount of state to be checkpointed. Since the partial sums are distributed across the whole predictor in pipeline latches, the checkpointing tables and associated circuitry must also be distributed.... In PAGE 114: ... The wire length, L, is a function of the number of functional units being bypassed while the resistance (Rmetal) and capacitance (Cmetal) per unit length remain constant. Plugging in their parameter values and wire length estimations into this equation produces delays for various bypass widths, shown in Table 1. Using their assumption of non-scalability, we use this as the bypass delay at 180nm and 90nm.... In PAGE 114: ... From these numbers, it is evident that the length term is dominant as the delay grows exponentially with more bypassed units. Table 1. Calculated bypass delays for various processor widths.... ..."
Table 1. Results from sensor network domain for dynamic resource allocation problems.
"... In PAGE 14: ...termed the RMS error of upto 3 units as acceptable. Table1 presents our results from the implementation with the Mapping II in Sec- tion 6. Experiments were conducted in different dynamic situations varying the type of resource allocation problem, the number of nodes/targets, and the configuration.... ..."
Table 1. Results from sensor network domain for dynamic resource allocation problems.
"... In PAGE 14: ...Table1 presents our results from the implementation with the Mapping II in Sec- tion 6. Experiments were conducted in different dynamic situations varying the type of resource allocation problem, the number of nodes/targets, and the configuration.... ..."
Table 2: Results from sensor network domain for dynamic resource allocation problems.
in Distributed Resource Allocation: Formalization, Complexity Results and Mappings to Distributed CSPs
"... In PAGE 24: ... We have done extensive tests using this simulator to further validate the DyDisCSP formalization. Table2 presents some experimental results from the implementation in the simulator. One evaluation criteria in distributed sensor networks is how accurately targets are tracked.... ..."
Table 2: Results from sensor network domain for dynamic resource allocation problems.
2002
"... In PAGE 24: ... We have done extensive tests using this simulator to further validate the DyDisCSP formalization. Table2 presents some experimental results from the implementation in the simulator. One evaluation criteria in distributed sensor networks is how accurately targets are tracked.... ..."
Table 3.1: SCI communication library
2007
Table 1. Example resource allocations.
1999
"... In PAGE 8: ... Plotted in Figure 4 are the approximation data points for each task after the convex hull frontier procedure is called in asrmd1. Table1... ..."
Cited by 61