### Table 1: Z-score statistics for structural RNA compared to random RNA of the same expected dinucleotide frequency using Algorithm 3. Columns: RNA type, Number of sequences, Mean, Stdev, Max, Min.

"... In PAGE 9: ...9 deviation, maximum and minimum Z-score for each investigated class of RNA. For Table 1, we computed Z-scores with respect to random RNA of the same expected dinucleotide frequency, using Algorithm 3, while in Table 2 we computed Z-scores with respect to random RNA of the same (exact) dinucleotide frequency using the provably correct Altschul-Erikson Algorithm 4. Since we correct an assertion of [Workman and Krogh, 1999] concerning tRNA, we implemented their method of computing p-values and list in Table 2 the p-values for all investigated classes of RNA.... In PAGE 13: ... Work of [Seffens and Digby, 1999] and of [Workman and Krogh, 1999] together provide strong evidence that the mononucleotide shuffle Algorithm 2 and 0th order Markov chain Algorithm 1 should never be used when computing Z-scores. The slight discrepancy between Table 1 and 2 for 3' UTR regions of mRNA suggests that Algorithm 4 should be used if possible over Algorithm 3, when computing Z-scores. Additionally, based on new mathematical results concerning asymptotic comportment of random RNA (see the Appendix), we define the concept of asymptotic Z-score (see Definition 6 in Section on Materials and Methods), and show how to radically reduce the computation time for moving window, whole genome algorithms which compute Z-scores of window contents.... In PAGE 15: ... Unless otherwise stated, we generated 1000 random RNAs per (real) RNA sequence, for each experiment. Using the mono- and dinucleotide frequencies for tRNA from Table 1, we generated random RNAs for each of the 530 tRNA in the database of [Sprinzl et al., 1998] according to two methods, which we respectively dub First-order Markov (Algorithm 3) and Dinucleotide Shuffle (Algorithm 4), and computed the mfe using RNAfold.... ..."
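
The snippets above describe scoring a real RNA's minimum free energy (mfe) against the mfe values of randomized sequences. A minimal Python sketch of that Z-score step follows. It assumes the mfe values have already been computed elsewhere (for example with RNAfold), and the function name `z_score` and the use of the sample standard deviation are illustrative assumptions, not details taken from the cited paper.

```python
import statistics
from typing import Sequence


def z_score(real_value: float, random_values: Sequence[float]) -> float:
    """Z-score of real_value against values obtained from randomized sequences.

    Assumption: the sample (n - 1) standard deviation is used; the cited
    paper may use a different convention.
    """
    if len(random_values) < 2:
        raise ValueError("need at least two random values")
    mean = statistics.mean(random_values)
    stdev = statistics.stdev(random_values)  # sample standard deviation
    if stdev == 0:
        raise ValueError("random values have zero variance")
    return (real_value - mean) / stdev


if __name__ == "__main__":
    # Hypothetical mfe values in kcal/mol, for illustration only.
    real_mfe = -30.0
    random_mfes = [-20.0, -22.0, -18.0, -20.0]
    print(round(z_score(real_mfe, random_mfes), 3))
```

A strongly negative value indicates that the real sequence folds into a lower-energy structure than its randomized counterparts do.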

### Table 2: Biological RNA structures.

2007

"... In PAGE 5: ...) We study the behaviour of the algorithm on biological structures since it will have an impact in biological applications such as ribozyme design. Because of the limited availability of true biological structures, we generated structures with biological characteristics based on the set of real structures listed in Table 2. The statistics reported in Table 3 summarise salient structural properties of these naturally occurring RNAs.... In PAGE 5: ... Figure 2a shows the median expected run time for different structure lengths (where the median is over the structures in a set and the expectation is over multiple runs of the algorithm on a given structure), as well as the expected run time for the structure at the 10% and the 90% quantile for the biologically motivated structures. We also show the expected run times for the set of real biological structures summarised in Table 2. Notice that the empirical complexity for designing these real structures fits well within the range of complexity observed for our biologically motivated sets of structures, which provides some evidence that the probabilistic model underlying these sets is reasonably plausible for the purposes of this study.... In PAGE 6: ... Columns: Hairpins | Stems | 2-Branch loops | Multiloops | Bulges. Size: [4,8] | [3,12] | [4,11] | [6,17] | [1,3]. Number: - | - | [1,8] | [0,5] | [0,0.17]*. Branches: - | - | - | [3,4] | -. Properties of the structures from Table 2; the intervals specify the minimal and maximal values observed for the respective features. These parameters were used to generate structures with biological properties.... In PAGE 9: ...5 Performance of RNA-SSD with different number and locations of primary base constraints In a second series of experiments, we studied the correlation between the number of bases constrained and the performance of the RNA-SSD algorithm. The experiments were conducted using some biological structures from Table 2 as well as biologically motivated structures. Table 4 shows some features of these structures.... In PAGE 14: ... Structures with biological characteristics were generated with the help of an RNA structure generator [9] that allows us to directly control salient properties of the structures being generated, including the overall size as well as the number and size of bulge, internal, and multiloops, and the length of stems. In order to determine these properties, we selected from the biological literature ten structures that are consistent with experimental evidence and empirical data, ranging from 60 to 600 bases in length (see Table 2). Average values of each of the features captured in the parameters of the RNA structure generator over our set of structures were used to roughly summarise the structural properties of naturally occurring RNAs (see Table 3).... ..."

### Table 1: Statistics of the training and test sets of 100 tRNA sequences each. The average identity in an alignment is the average pairwise identity of all aligned symbol pairs, with gap/symbol alignments counted as mismatches. Primary sequence information content is calculated according to [48]. Calculating pairwise mutual information content is an NP-complete problem of finding an optimum partition of columns into pairs. A lower bound is calculated by using the model construction procedure to find an optimal partition subject to a non-pseudoknotting restriction. An upper bound is calculated as the sum of the single best pairwise covariation for each position, divided by two; this includes all pairwise tertiary interactions but overcounts because it does not guarantee a disjoint set of pairs. For the meaning of multiple alignment accuracy of ClustalV, see the text.

"... In PAGE 12: ...Table 1). The accuracy of standard pairwise sequence alignment [46] begins to sharply drop off for pairs of tRNA sequences less than about 65% identical (data not shown).... In PAGE 12: ... ClustalV, a popular and reliable multiple sequence alignment program [47], produces poor alignments for all these data sets which range from 37% to 63% identical to the trusted alignments. This is almost as bad as one gets from uninformative alignments; removing all gaps from the sequences gives 'alignments' which are about 30% identical to the correct alignment (Table 1). [We measure alignment identity as the fraction of aligned symbol pairs in the trusted alignment that are also aligned in the other alignment.... In PAGE 12: ... These information measures indicate how much extra information may be gained by models which capture pairwise second-order information. There is almost as much information in the secondary structure of tRNA as in the primary sequence consensus (Table 1). A secondary structure representation of tRNA such as a CM should use twice as much information about tRNA sequences as a primary sequence representation such as an HMM or a profile.... In PAGE 12: ... A secondary structure representation of tRNA such as a CM should use twice as much information about tRNA sequences as a primary sequence representation such as an HMM or a profile. Table 1 also shows an estimate of how much additional pairwise information is available from tertiary contacts that a CM does not capture. We calculated an upper bound on the total pairwise correlation information that includes all pairwise contacts, not just those consistent with classical nested secondary structure.... In PAGE 12: ... We calculated an upper bound on the total pairwise correlation information that includes all pairwise contacts, not just those consistent with classical nested secondary structure. This number is less than 10% greater than the figure for secondary structure (Table 1). A CM captures over 90% of all pairwise information in tRNA sequences.... In PAGE 17: ... However, the contribution of tertiary interactions is not crucial for database searching purposes. We show (Table 1) that tertiary structure contributes at most two or three bits of pairwise correlation information to tRNAs, compared to 30-40 bits in primary sequence consensus and 30 bits of secondary structure pairwise correlation information. We expect these rough proportions to be about the same for most RNAs; pseudoknots may be functionally important in RNA structures but they usually account for relatively few base pairs.... ..."
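
The snippet above defines alignment identity as the fraction of aligned symbol pairs in the trusted alignment that are also aligned in the other alignment. A minimal Python sketch of one way to compute that measure follows. It assumes alignments are given as lists of equal-length gapped strings with `-` as the gap character and the same sequences in the same row order; the function names `aligned_pairs` and `alignment_identity` are hypothetical and not taken from the cited paper.

```python
from itertools import combinations
from typing import List, Set, Tuple

Residue = Tuple[int, int]  # (row index, ungapped position within that row)


def aligned_pairs(alignment: List[str], gap: str = "-") -> Set[Tuple[Residue, Residue]]:
    """Collect every pair of residues that share a column in the alignment."""
    if not alignment or len({len(row) for row in alignment}) != 1:
        raise ValueError("alignment rows must be non-empty and of equal length")
    counters = [0] * len(alignment)  # next ungapped position per row
    pairs: Set[Tuple[Residue, Residue]] = set()
    for col in range(len(alignment[0])):
        present = []
        for row, seq in enumerate(alignment):
            if seq[col] != gap:
                present.append((row, counters[row]))
                counters[row] += 1
        # Rows are visited in ascending order, so each pair has a stable orientation.
        pairs.update(combinations(present, 2))
    return pairs


def alignment_identity(trusted: List[str], other: List[str]) -> float:
    """Fraction of residue pairs aligned in `trusted` that are also aligned in `other`."""
    trusted_pairs = aligned_pairs(trusted)
    if not trusted_pairs:
        raise ValueError("trusted alignment contains no aligned pairs")
    return len(trusted_pairs & aligned_pairs(other)) / len(trusted_pairs)


if __name__ == "__main__":
    trusted = ["AC-G", "ACUG"]
    other = ["ACG-", "ACUG"]
    print(alignment_identity(trusted, other))
```

Indexing residues by their ungapped position means the comparison does not depend on column numbering, so two alignments of different widths can still be compared.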

### Table 1. CARNAC output for Delta/Epsilon Purple Bacteria RNase P RNA

"... In PAGE 4: ... We folded each sequence against each other. The results are displayed in Table 1. Despite the poor sequence similarity, the pseudoknots and the variations in structure, more than half of the structure is predicted, with 85% correct on average.... In PAGE 4: ...desulfuricans to be the reference organism. This corresponds to the first column of Table 1. We used DYNALIGN and FOLDALIGN exactly in the same way as CARNAC: we computed all pairwise foldings.... ..."

### Table 7. Best bps Structure mfold vs. Best bps Structure P-RnaPredict (EA): True Positive Base Pairs

"... In PAGE 6: ... We have used the mfold default setting of 5% sub-optimality and searched the produced sub-optimal foldings for structures with high bp overlap compared to the known structures (Tables 7 and 8). Table 7 compares the best bp structure found by mfold with that of P-RnaPredict in terms of true positive base pairs. The performance of both approaches is identical for S.... ..."

### Table 1. Characteristics of RNA structural elements and motifs

"... In PAGE 4: ... Characteristics of RNA structural elements vs. motifs are summarized in Table 1. Specific RNA structural elements are listed and described in Table 2.... ..."

### Table 2 (cont.): Grouping of Likeness Profiles used in Fast Searching

"... In PAGE 5: ...Table 2. Grouping of Likeness Profiles used in Fast Searching.... In PAGE 6: ... In a detailed one-on-one conformation search it is one of these seven groupings of features that is, by default, used in the alignment. Likewise, a default subset of these features is used in the fast 3-D search as shown in column 2 of Table 2. Each of the groupings and their composite features are discussed; however, it should be noted that these default sets can be overridden in favor of any combination of the 495 parameters: Local takes into account the local conformational features of a pentapeptide and is useful for detecting local similarities.... In PAGE 9: ... Input Form for Detailed Structure Alignment. Notice that the grouping of searchable parameters corresponds to that given in Table 2. Additional parameters which are defined are the decision level and the type of superposition.... ..."

### Table 1. Folding Algorithms for RNA Secondary Structures.

"... In PAGE 5: ... A variety of computer programs predicting RNA secondary structures have been published. A very brief overview is given in Table 1. Two public domain packages for RNA folding are currently available by anonymous ftp: Zuker's mfold [25] and the Vienna RNA Package [26].... ..."

### Table 1. Folding Algorithms for RNA Secondary Structures.

"... In PAGE 6: ...: Analysis of RNA Sequence-Structure Maps A variety of computer programs predicting RNA secondary structures have been published. A very brief overview is given in Table 1. Two public domain packages for RNA folding are currently available by anonymous ftp: Zuker's mfold [24] and the Vienna RNA Package [25].... ..."

### TABLE 2. Overview of APEC O1 genome

2006