### Table 1: Sparse Term-Document Matrix Specifications. By using the reduced model in (2), usually with 100 ≤ k ≤ 200, minor differences in terminology are ... [1] Semantic structure refers to the correlation structure in the way in which individual words appear in documents; "semantic" implies only the fact that terms in a document may be taken as referents to the document itself or to its topic. [2] Special thanks to Sue Dumais from Bell Communications Research (Bellcore), Morristown, NJ, for providing the various sparse matrices from Latent Semantic Indexing (LSI) studies.

1992

"... In PAGE 4: ... We note that r and c are the average number of nonzeros per row and column, respectively. The Density of each sparse matrix listed in Table 1 is defined to be the ratio (Nonzeros)/(Rows × Columns).... In PAGE 5: ... As discussed in [4] and [12], LSI using the sparse SVD can be more robust and economical than straight term-overlap methods. However, in practice, one must compute at least the 100-200 largest singular values and corresponding singular vectors of sparse matrices having characteristics similar to those of the matrices in Table 1. In addition, it is not necessarily the case that rank(A) = n for the m × n term-document matrix A; this is due to errors caused by term extraction, spelling, or duplication of documents. Regarding the numerical precision of the desired singular triplets for LSI, recent tests using a few of the databases listed in Table 1 have revealed that, for the i-th singular triplet {u_i, σ_i, v_i}, a residual satisfying 10^-6 ≤ ‖A v_i − σ_i u_i‖ ≤ 10^-3 will suffice. Finally, as the desire for using LSI on larger and larger databases or archives grows, fast algorithms for computing the sparse singular value decomposition will become of paramount importance.... In PAGE 7: ... Figure 1 depicts typical nonzero patterns of the sparse matrices arising from information retrieval and seismic tomography applications, where each nonzero element is given by a single dot. The matrix in 1(a) is the ADI database matrix (374 × 82) listed in Table 1. The nearly dense rows reflect words such as "computer" which commonly occur in each document found in that particular database.... In PAGE 8: ... 1. (a) Nonzero pattern of the 374 × 82 database matrix (ADI in Table 1) from an information retrieval application.
(b) Nonzero pattern of the first 718 rows of a 1436 × 330 Jacobian matrix from a sample seismic travel tomography application in which subsurface velocities are needed.... In PAGE 31: ... A modified version of the Concentrix 3.0 operating system is used on this particular Alliant FX/80. In Table 12, we illustrate the dominant sub-algorithms or tasks associated with each of four sparse SVD methods when we determine the 100-largest singular triplets of the medical abstract database matrix, MED, from the Bellcore collection in Table 1 on the Cray-2S/4-128. To obtain these profiles, we invoke the flowtrace compiler option (see [9]) on only 1 CPU (profiling is not currently available for multiple-CPU programs).... In PAGE 32: ... Table 13 indicates the profiles of our four methods on 8 processors of the Alliant FX/80. Again, we seek the 100-largest singular triplets of the 5831 × 1033 MED matrix from Table 1 with residuals (6) less than or equal to 10^-3. For each of the methods, we compute singular triplets via eigenpairs of A^T A only.... In PAGE 33: ...

| Algorithm | LASVD B | LASVD A^T A | BLSVD B | BLSVD A^T A | SISVD B̃ | SISVD A^T A | TRSVD B̃ | TRSVD A^T A |
|---|---|---|---|---|---|---|---|---|
| SPMXV | 54 | 72 | 42 | 77 | 86 | 88 | 88 | 85 |
| ORTHG | 4 | 2 | 43 | 12 | 11 | 7 | 1 | 2 |
| (IM)TQL2 | 34 | 12 | - | - | - | - | - | - |
| QR | - | - | 5 | - | - | - | - | - |
| CG (BLAS1) | - | - | - | - | - | - | 4 | 3 |

Table 12: Profile of the four methods for computing the 100-largest singular triplets of the 5831 × 1033 MED matrix in Table 1 on the Cray-2S/4-128.
Eigensystems of the original (B) or modified (B̃) 2-cyclic matrix, and the matrix A^T A, are approximated by each method.... In PAGE 33: ... Table 13 also indicates that a significant proportion of time (24% of total CPU time) is spent in the level-2 (matrix-vector) and level-3 (matrix-matrix) BLAS kernels. The outer block Lanczos recursion for A^T A (see Table 11), as with the outer recursion in Table 9, primarily consists of these higher-level BLAS kernels (also supplied by the Alliant FX/Series Scientific Library), which are designed for execution on all 8 processors of the Alliant FX/80. The modified Gram-Schmidt procedure we employ for re-orthogonalization is also driven by the higher-level BLAS kernels.... In PAGE 33: ... Although it did not appear in the profile of BLSVD on the Cray-2S/4-128, multiplication by the Krylov projection matrix, J_k, which is a symmetric block-tridiagonal matrix (see Section 3.5) for the recursion in A^T A, requires just under 10% of the total CPU time on the Alliant FX/80. The computation of the eigenpairs of the resulting symmetric tridiagonal matrix (B_k in the inner Lanczos iteration from Table 10) by the EISPACK routine TQL2 is not as demanding in BLSVD (8%) as in LASVD (14%), since we incorporate a bound... In PAGE 34: ... In this particular experiment, we selected an initial block size of b = 30 and maintained a maximum dimension of c = 150 for the Krylov subspace in each outer iteration. As in Table 12, we again note the similarity in the profiles of SISVD and TRSVD on the Alliant FX/80 in Table 13. Both methods are clearly dominated by sparse matrix-vector multiplications (SPMXV).... In PAGE 34: ... In the next section, we assess the degree of parallelism exhibited by each of the four sparse iterative methods on 8 processors of the Alliant FX/80.
Percentage of Total CPU Time:

| Algorithm | LASVD | BLSVD | SISVD | TRSVD |
|---|---|---|---|---|
| SPMXV | 27 | 46 | 63 | 69 |
| ORTHG | - | - | - | 1 |
| BLAS2(3) | - | 24 | 8 | 14 |
| DSBMV | - | 9 | - | - |
| (IM)TQL2 | 14 | 8 | 3 | - |
| TRED2 | - | - | 3 | - |
| DAXPY | 17 | - | 14 | - |
| DCOPY | 20 | - | - | - |
| DDOT | 2 | - | - | - |
| DNRM2 | - | - | - | 1 |

Table 13: Profile of the four methods for computing the 100-largest singular triplets of the 5831 × 1033 MED matrix in Table 1 on the Alliant FX/80. Only eigensystems of A^T A are approximated.... In PAGE 35: ...the 8 processors inactive before, during, or after its execution. Given c̄_o, we also define the concurrency efficiency, E_c, of a particular method by E_c = c̄_o / 8. In Table 14, we indicate the breakdown in percentage of total (user) CPU time spent on exactly j processors by each of the four sparse SVD methods when the 100-largest triplets of the MED matrix in Table 1 are approximated via eigenpairs of A^T A. Essentially, this data reveals the effective distribution of parallelism (or utilization of multiple processors) among the four methods across the 8 processors of the Alliant FX/80.... In PAGE 35: ... BLSVD is a close third with 56%, and LASVD is the least parallel with only 37%. This ranking is not too surprising, since the subspace methods, TRSVD and SISVD, have only a few dominant sub-algorithms (SPMXV and BLAS2[3] from Table 13) which can be easily parallelized on the Alliant FX/80.
It is important to note, however, that although LASVD with the A^T A operator may be the least parallel of the four methods in this case, it can still be the fastest method on the Alliant FX/80 (see [4]).... In PAGE 36: ...4 67 LASVD 3.7 47 Table 15: Average concurrency (c̄_o) and efficiency (E_c) of sparse SVD methods on the Alliant FX/80 when computing the 100-largest triplets of the 5831 × 1033 MED matrix in Table 1 via the eigensystem of A^T A. 4.... In PAGE 37: ...

| | LASVD | BLSVD | SISVD | TRSVD |
|---|---|---|---|---|
| SPMXV | 3.0 | 3.0 | 3.1 | 3.6 |
| ORTHG | - | - | - | 4.8 |
| BLAS2(3) | - | 5.0 | 5.3 | 5.5 |
| DSBMV | - | 2.0 | - | - |
| TQL2 | - | 2.8 | 3.5 | - |
| IMTQL2 | 4.3 | - | - | - |
| TRED2 | 3.3 | - | - | - |
| DAXPY | 5.0 | - | 4.4 | - |
| DCOPY | 3.6 | - | - | - |
| DDOT | 7.7 | - | - | - |
| DNRM2 | - | - | - | 5.5 |
| Overall Speedup | 3.0 | 3.2 | 3.4 | 4.0 |

Table 16: Speedups of the four methods (and their sub-algorithms) for computing the 100-largest singular triplets of the 5831 × 1033 MED matrix in Table 1 on the Alliant FX/80. Speedups indicated are the ratio of (user) CPU time on 1 processor to CPU time on 8 processors.... In PAGE 37: ... If we compute eigensystems of B or B̃, the Bellcore matrix TECH will pose the greatest memory constraint, while the Amoco matrix AMOCO2 will require the largest amount of memory if we approximate the eigensystem of A^T A. In Table 17, we include the memory requirements for SISVD... In PAGE 39: ... For clarification purposes, SISVD and SISVD2 denote subspace iteration using eigensystems of B̃ and B, respectively, where B̃ is given by (9) and B is the 2-cyclic matrix defined by (5). Due to memory limitations in working with B or B̃ (see Table 17), we only consider eigensystems of either A^T A or 2I_n − A^T A for each method on the Alliant FX/80. 5.... In PAGE 45: ... To assess the specific gains in performance (time reduction) associated with the eigenvalue problem of order n, we list the speed improvements for LASVD and BLSVD (and the other candidate methods) when eigensystems of A^T A are approximated in Table 21.
The limited improvement for BLSVD in this case stems from the fact that, although less time is spent in re-orthogonalization (see Table 12), the number of outer iteration steps for the A^T A-based recursion (see Table 11) can be as much as 1.5 times greater than the number of outer iterations for the cyclic-based hybrid recursion (Tables 9 and 10). Hence, the deflation associated with larger gaps among the p = 100 eigenvalues of A^T A is not quite as efficient.... In PAGE 46: ...

SVD via eigensystems of B, B̃:

| Matrix | LASVD | BLSVD | SISVD | SISVD2 | TRSVD |
|---|---|---|---|---|---|
| AMOCO1 | 27 | 32 | 102 | 94 | 43 |
| MED | 139 | 103 | 1269 | 333 | 136 |
| CISI | 143 | 120 | 1276 | 515 | 187 |
| TIME | 147 | 127 | 1616 | 634 | 300 |
| CRAN | 167 | 117 | 1105 | 874 | 176 |
| TECH | 479 | 486 | 5598 | 1405 | 636 |
| AMOCO2 | 654 | 360 | 4250 | 1160 | 508 |

SVD via eigensystems of A^T A:

| Matrix | LASVD | BLSVD | SISVD | TRSVD |
|---|---|---|---|---|
| AMOCO1 | 10 | 26 | 23 | 19 |
| MED | 16 | 86 | 88 | 52 |
| CISI | 23 | 120 | 137 | 76 |
| TIME | 13 | 87 | 93 | 56 |
| CRAN | 22 | 117 | 120 | 77 |
| TECH | 89 | 490 | 605 | 292 |
| AMOCO2 | 48 | 360 | 801 | 384 |

Table 20: Cray-2S/4-128 CPU times (in seconds) for determining the 100-largest singular triplets of the matrices in Tables 1 and 2.
Having completed our comparisons for the first row of Table 18, we proceed down the second row and compare the performance of our methods on the Alliant FX/80 (matrices TECH and AMOCO2 omitted due to memory limitations). The A^T A implementation of LASVD is on average 2.7 times faster than BLSVD, and 2.4 times faster than TRSVD (via eigensystems of 2I_n − A^T A), on the Alliant FX/80.... In PAGE 46: ... The effective parallelization of TRSVD and SISVD (see Table 15) tends to produce comparable times for both methods (see Table 22) for the moderate-order matrices. The concurrency efficiency (see Table 15) is consistently high for TRSVD, and the parallel conjugate gradient (CG) iterations for solving (21)... ..."
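The LSI computation excerpted above, a truncated SVD of a sparse term-document matrix with each retained triplet {u_i, σ_i, v_i} checked against the residual bound ‖A v_i − σ_i u_i‖ ≤ 10^-3, can be sketched in a few lines. This is only an illustrative sketch using SciPy's ARPACK-based `svds` on a random matrix with the ADI dimensions (374 × 82); the matrix contents, the rank k = 20, and the choice of SciPy are all assumptions, not the paper's Lanczos, block Lanczos, subspace iteration, or trace minimization implementations.

```python
# Hedged sketch: truncated sparse SVD for LSI, with the residual check
# described in the text. The random matrix stands in for a real
# term-document matrix; k is kept small here (the paper uses 100-200).
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import svds

m, n = 374, 82                       # dimensions of the ADI matrix (Table 1)
A = sp.random(m, n, density=0.05, random_state=42, format="csr")

k = 20                               # number of singular triplets to compute
u, s, vt = svds(A, k=k)              # singular values returned in ascending order

# Residual check in the spirit of the text: ||A v_i - sigma_i u_i|| <= 1e-3
for i in range(k):
    residual = np.linalg.norm(A @ vt[i] - s[i] * u[:, i])
    assert residual <= 1e-3
```

The rank-k approximation A ≈ U_k Σ_k V_k^T built from these triplets is what LSI queries against, which is why only the k largest singular values are needed rather than a full SVD.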

Cited by 4
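The concurrency statistics defined in the excerpts above (the average concurrency c̄_o, i.e. the CPU-time-weighted average number of active processors, and the concurrency efficiency E_c = c̄_o / 8 on the 8-processor Alliant FX/80) amount to a small weighted average. A minimal sketch, with an invented time distribution rather than the actual Table 14 data:

```python
# Hedged sketch of the concurrency measures: pct[j] is the percentage of
# total CPU time spent running on exactly j+1 processors (invented numbers).
def concurrency_stats(pct, nprocs=8):
    """Return (average concurrency c_o, concurrency efficiency c_o/nprocs)."""
    c_o = sum((j + 1) * p / 100.0 for j, p in enumerate(pct))
    return c_o, c_o / nprocs

# Example: a method that spends 65% of its CPU time on all 8 processors.
c_o, e_c = concurrency_stats([5, 2, 3, 5, 5, 5, 10, 65])
```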

### Table 2. Recognition rate in % words, before/after training. Writer: codes; it.. are Italian writers, ir.. are Irish writers (a). The recognizer output is a list of words sorted in descending order of match quality. Topword: % of correct words at the top of the output list. Top-5: % of correct words found in the topmost 5 words of the recognizer output list. Top-10: % of correct words found in the topmost 10 words. Nwords: number of words in the test set. Nlabeled: number of manually labeled words in the training set. Nlexicon: number of words in the lexicon used in recognition. (a) With special thanks to Olivetti, Naples, and Captec, Dublin, who kindly provided the data within the framework of Esprit project P5204 Papyrus.

"... In PAGE 4: ... With special thanks to Olivetti, Naples, and Captec, Dublin, who kindly provided the data within the framework of Esprit project P5204 Papyrus. 4 Results Table 2 shows untrained and trained recognition results. Looking at the Topword recognized column in Table 2, roughly four types of writers can be identified. The table is sorted from high to low initial recognition rate.... In PAGE 5: ... All individual letters must be classified correctly by the system for a match to occur. On average, these recognition results are higher than in the comparable Untrained condition of the international test in Table 2 (45% vs. 29%). It is not certain whether this result is due to international style differences or to the more reliable and slightly higher sampling rate of 100 Hz in the Dutch test set.... ..."
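The Topword / Top-5 / Top-10 figures defined in the caption are straightforward to compute from ranked recognizer output. A minimal sketch with invented words and candidate lists (not the Papyrus data):

```python
# Hedged sketch: top-k word recognition rate from ranked candidate lists.
# Each item pairs the true word with the recognizer's output list, which is
# sorted in descending order of match quality (all data invented).
def top_k_rate(results, k):
    """Percentage of test words whose truth appears in the top k candidates."""
    hits = sum(1 for truth, candidates in results if truth in candidates[:k])
    return 100.0 * hits / len(results)

results = [
    ("cat", ["cat", "cot", "car"]),
    ("dog", ["bog", "dog", "dig"]),
    ("sun", ["son", "sin", "sum", "sub", "sun"]),
    ("map", ["mop", "nap", "mad"]),
]
topword = top_k_rate(results, 1)   # only "cat" is ranked first
top5 = top_k_rate(results, 5)      # "cat", "dog", "sun" appear in the top 5
```

By construction Topword ≤ Top-5 ≤ Top-10, which is the pattern visible in the table.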

### Table 2 Comparison of execution times (CPU seconds). The above numerical experiments confirm the advantages of the two-level Schwarz method compared to other known domain decomposition algorithms. For more computational experiments related to this approach, we refer to [17], which includes a detailed description of the implementation procedures as well as comparisons of different solution algorithms when used to solve various applied problems. Acknowledgements. The author would like to express his gratitude to Professor Yu. A. Kuznetsov (Russian Academy of Sciences, Moscow) and Mr. J. Toivanen (University of Jyväskylä) for fruitful discussions related to the topic. The latter also deserves special thanks for offering an efficient fictitious-domain solver for the numerical experiments.

"... In PAGE 15: ... The subproblems of the Schwarz methods were solved with rather high accuracy (ε = 10^-8), while the tolerance in the outer iterative processes (the classical and monotonical Schwarz methods) was chosen to be equal to 10^-6 in the ‖·‖_∞ norm. Table 1 gives the iteration history for the Schwarz methods and Table 2 the corresponding execution times with two different overlaps, of size O(1/16) and O(1/8). For the sake of comparison, we have also included the results obtained by the projected SOR method ('PSOR').... ..."
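The outer-iteration structure described in the snippet (subdomain solves, an overlap of fixed size, and a stopping tolerance on successive iterates) can be illustrated on a toy problem. The sketch below runs the classical alternating Schwarz method for the 1D model problem −u″ = 1, u(0) = u(1) = 0, with two overlapping subdomains; the grid size, overlap width, and dense direct subdomain solves are all assumptions for illustration, not the paper's fictitious-domain solver or its actual problem setting.

```python
# Hedged sketch: classical alternating (multiplicative) Schwarz iteration
# for -u'' = 1 on (0,1) with u(0) = u(1) = 0, two overlapping subdomains.
import numpy as np

N = 63                        # interior grid points; h = 1/(N+1)
h = 1.0 / (N + 1)
f = np.ones(N)                # right-hand side f = 1 at the interior nodes
u = np.zeros(N + 2)           # current iterate, including boundary values

def solve_subdomain(u, lo, hi):
    """Dirichlet solve of -u'' = f on nodes lo..hi, boundary data from u."""
    n = hi - lo + 1
    A = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
    b = f[lo - 1:hi].copy()
    b[0] += u[lo - 1] / h**2      # left Dirichlet datum from current iterate
    b[-1] += u[hi + 1] / h**2     # right Dirichlet datum from current iterate
    u[lo:hi + 1] = np.linalg.solve(A, b)

mid, ov = (N + 1) // 2, 8     # subdomain split point and overlap width
for sweep in range(200):
    previous = u.copy()
    solve_subdomain(u, 1, mid + ov)           # left subdomain solve
    solve_subdomain(u, mid - ov, N)           # right subdomain solve
    if np.max(np.abs(u - previous)) < 1e-6:   # outer tolerance, as in the text
        break
```

The converged iterate matches the exact solution u(x) = x(1 − x)/2 at the grid nodes, and a larger overlap speeds up the outer iteration, which is the trade-off the O(1/16) vs. O(1/8) comparison in the snippet is probing.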

### Table 2. Recognition rate in % words, before/after training. Writer: codes; it.. are Italian writers, ir.. are Irish writers. The recognizer output is a list of words sorted in descending order of match quality. Topword: % of correct words at the top of the output list. Top-5: % of correct words found in the topmost 5 words of the recognizer output list. Top-10: % of correct words found in the topmost 10 words. Nwords: number of words in the test set. Nlabeled: number of manually labeled words in the training set. Nlexicon: number of words in the lexicon used in recognition. With special thanks to Olivetti, Naples, and Captec, Dublin, who kindly provided the handwriting data within the framework of Esprit project P5204 Papyrus.

"... In PAGE 3: ... The training procedure was the same as in the first test. RESULTS & DISCUSSION Word recognition rate as a function of training: Table 2 shows the untrained and trained recognition results. Looking at the "Topword recognized" column, roughly four types of writers can be identified.... ..."

### Table 2: Percentage of connections violating timing constraints after detailed routing completion. 5 Conclusions We have presented a timing-driven router for FPGAs with segments of various lengths. The router is based on the hierarchical strategy and suited to the special properties of FPGA routing architectures. Experimental results show that our router is very effective in reducing the number of connections violating timing constraints. Acknowledgments The authors would like to thank Steve Brown and Baharam Fallah of the University of Toronto for providing us with the benchmark circuits, Nick Haruyama for helpful discussions, and Cherng-Shiuan Wang for implementing the detailed router.

"... In PAGE 10: ...nd loose cases for timing constraints. If the delay bound B(t_i) was less than d_min, then we set B(t_i) = d_min. Each circuit was routed by the algorithm and the percentage of source-sink pairs violating the delay bounds was computed. The results are shown in the column "Timing-driven" of Table 2. For the purpose of comparison, we also routed the circuits by the same routing algorithm with the cost function for the linear assignment set to minimize the wire length and the delay-bound distributions/redistributions turned off, i.e., C^(3)_ij in Equation (4) was set according to the cost function illustrated in Figure 8(b); this leads to a non-timing-driven routing approach [19, 25, 30]. The results are given in the column "Non-timing-driven" of Table 2. For all the circuits, the timing-driven routing algorithm substantially reduced the percentage of connections violating the delay bounds.... ..."
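The reported metric, the percentage of source-sink pairs whose routed delay exceeds its delay bound B(t_i), is simple to compute once per-connection delays are known. A minimal sketch with invented delays and bounds (the clamping of small bounds to a minimum follows the rule quoted above):

```python
# Hedged sketch: percentage of connections violating their delay bounds.
# Delays and bounds below are invented, not data from the benchmark circuits.
def violation_pct(delays, bounds, d_min=0.0):
    """Clamp each bound up to at least d_min, then count delay violations."""
    clamped = [max(b, d_min) for b in bounds]
    bad = sum(1 for d, b in zip(delays, clamped) if d > b)
    return 100.0 * bad / len(delays)

delays = [1.2, 0.8, 2.5, 1.9, 3.1]
bounds = [1.5, 1.0, 2.0, 2.0, 3.0]
pct = violation_pct(delays, bounds)   # two of the five connections violate
```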

### Table 3. Results on the flat-tire domain, and the easy and hard 8-puzzle problems. Blackbox was run in its default mode, with -solver graphplan (BlackboxGP), and with -solver walksat (BlackboxWS). The planners implemented apply to problems for which the objective is to maximize the probability of satisfying the problem goals within the given time window, or the related goal of minimizing expected completion time. More general MDP problems are often given by specifying rewards for performing certain (noop or non-noop) actions. It appears that some of the kinds of information propagated here should be useful in those more general settings, and it would be interesting to see if this generalization could be made without a sacrifice in performance. Acknowledgements. This research is sponsored in part by NSF National Young Investigator grant CCR-9357793, NSF grant CCR-9732705, and an AT&T / Lucent Special Purpose Grant in Science and Technology. We would like to thank the anonymous reviewers for their detailed, thoughtful, and helpful comments.

1999

"... In PAGE 11: ... We consider the goal of achieving board state ABCDEFGH (reading left to right, top to bottom) from two different initial states: one in which a solution requires 18 steps and one in which a solution requires 30 steps (this is the case of initial board HGFEDCBA). Results are given in Table 3. Note that PGraphplan is the fastest of all planners tested (even the deterministic ones) on this problem.... ..."
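The 18- and 30-step instances quoted above are optimal solution lengths for the letter-coded 8-puzzle, and such lengths can be recovered by plain breadth-first search over the 9!/2 reachable states. A minimal sketch (the `'_'` blank marker and string board encoding are our own conventions, not the paper's):

```python
# Hedged sketch: optimal 8-puzzle solution length via breadth-first search.
# Boards are 9-character strings read left-to-right, top-to-bottom, with
# '_' marking the blank; the goal places the blank in the last cell.
from collections import deque

GOAL = "ABCDEFGH_"
NEIGHBORS = {0: (1, 3), 1: (0, 2, 4), 2: (1, 5),
             3: (0, 4, 6), 4: (1, 3, 5, 7), 5: (2, 4, 8),
             6: (3, 7), 7: (4, 6, 8), 8: (5, 7)}

def solve(start):
    """Return the minimum number of moves from start to GOAL, or None."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        state, depth = frontier.popleft()
        if state == GOAL:
            return depth
        blank = state.index("_")
        for nb in NEIGHBORS[blank]:
            cells = list(state)
            cells[blank], cells[nb] = cells[nb], cells[blank]
            nxt = "".join(cells)
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return None
```

For the hard instance one would call `solve("HGFEDCBA_")`; reversing the eight tiles is an even permutation with the blank fixed, so the instance is solvable and BFS returns its optimal depth.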

Cited by 59

### Table 2.2: Different roles of industrial service suppliers by Kalliokoski et al. (2003).

"... In PAGE 4: ... I am grateful to Professor Harri Ehtamo for supervising the thesis, as well as to the whole staff of the Systems Analysis Laboratory for providing high-quality education and an inspiring work environment. Professor Vesa Salminen of Lappeenranta University of Technology and all other participants of BestServ Round Table 2: Service Contract Models deserve special thanks for interesting discussions and for sharing their expertise as well as a dose of realism. I am deeply indebted to my parents; this work would not have been possible without your efforts.... ..."

### Table 2.3: Service process matrix by Schmenner (1986).

"... In PAGE 4: ... I am grateful to Professor Harri Ehtamo for supervising the thesis, as well as to the whole staff of the Systems Analysis Laboratory for providing high-quality education and an inspiring work environment. Professor Vesa Salminen of Lappeenranta University of Technology and all other participants of BestServ Round Table 2: Service Contract Models deserve special thanks for interesting discussions and for sharing their expertise as well as a dose of realism. I am deeply indebted to my parents; this work would not have been possible without your efforts.... ..."