### Table 1: Example queries and results. 2 Shortcomings of Search Engines In this section we identify and discuss four sources of ineffectiveness of traditional search engines. These include:

2000

"... In PAGE 3: ... Users do not usually have the patience to toil through more than the first thirty hits returned by a search engine. Table 1 illustrates this problem by showing the results obtained from querying four popular search engines with some sample queries. In each row of the table, we show a search goal (Goal), the keyword-based query submitted (Query), the search engine that processed the query (Search Engine), the number of pages returned (# of hits), the number of relevant pages that satisfied our goal in the first thirty hits... In PAGE 4: ...ovember day in some year. (The numbers 19 and 1997 are logically ignored by the engines.) One would find that all these pages share the same prefix in their URL's, and that they belong to a single logical cluster of a big hyper-document. The last column of Table 1 refers to the number of clusters that the first thirty hits can be grouped into. From the table we see that the answer sets are huge.... In PAGE 4: ... Of course, one would argue that the screening would stop as soon as one good recommendation is found. Still, as suggested by Table 1, the first relevant page may not be found until a couple dozens pages have been examined, many more if one is unlucky. Also, the first relevant page may not be the best page that can be found in the answer set.... In PAGE 4: ... More screening is required if one would like to compare relevant hits looking for a better match. Besides illustrating the large answer set problem, Table 1 also gives us a hint on how to avoid overwhelming the users with the numerous recommendations. The last column of the table suggests that the large number of pages can be grouped into a small number of logical clusters.... In PAGE 5: ... The implication is that users cannot afford to examine only the first few, or any small subset, of the answer set. Table 1 illustrates this problem by showing the number of pages among the first 30 hits that are relevant to a search goal. We see that, for some queries, the numbers are less than honorable.... ..."

Cited by 1
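
The snippet above observes that many hits share a URL prefix and can be grouped into a small number of logical clusters. As a rough illustration only, the sketch below groups hits by host plus leading path segments. The function name `cluster_by_url_prefix`, the `depth` parameter, and the example URLs are all hypothetical, and the quoted paper's own clustering rule is not given in the excerpt.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def cluster_by_url_prefix(urls, depth=1):
    """Group URLs by host plus the first `depth` directory segments.

    Assumption: a shared directory prefix marks one logical cluster.
    """
    clusters = defaultdict(list)
    for url in urls:
        parts = urlsplit(url)
        segments = [s for s in parts.path.split("/") if s]
        # Drop the last segment (the page itself) so sibling pages share a key.
        prefix = "/".join(segments[:-1][:depth])
        clusters[(parts.netloc.lower(), prefix)].append(url)
    return dict(clusters)


# Hypothetical hits: three pages of one hyper-document and one unrelated page.
hits = [
    "http://example.org/calendar/1997/nov19.html",
    "http://example.org/calendar/1997/nov20.html",
    "http://example.org/calendar/1998/nov19.html",
    "http://example.com/travel/index.html",
]
clusters = cluster_by_url_prefix(hits)
```

Presenting one representative per cluster, rather than every hit, is one way to shorten the list a user has to screen.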

### Table 3: Average number of nodes searched in trie, rounded to the nearest integer while searching through a set of 500 random graphs of 6 vertices.

1994

"... In PAGE 9: ... In our example, when searching for graphs within a distance of 3, a depth of close to 30 seems to be the optimum choice. Table 3 shows the average number of internal nodes examined. When the cost of measuring the distance between two objects is very high, even moving through hundreds or thousands of internal nodes may not add much to the total time.... ..."

Cited by 7

### Table 2. Tagging information loss through multiple layers

2001

"... In PAGE 4: ... Even when normal markings are introduced, the quality of the image is reduced. Table 2 identifies how much tagging information is lost through subsequent taggings and other transforms. Figure 6 (left half) shows an enlarged spot in the image, where the placement of all three layers of tags is visible.... ..."

Cited by 2

### Table 1: Algorithms for identifying web communities. EXACT-FLOW-COMMUNITY augments the web graph in three steps: an artificial source, a0 , is added with infinite capacity edges routed to all seed vertices in a72 ; each preexisting edge is made bidirectional and rescaled to a constant value a73 ; and all vertices except the source, sink, and seed vertices are routed to the artificial sink with unit capacity. After augmenting the web graph, a residual flow graph is produced by a maximum flow procedure. All vertices accessible from a0 through non-zero positive edges form the desired result and satisfy our definition of a community. APPROXIMATE-FLOW-COMMUNITY takes a set of seed web sites as input, crawls to a fixed depth including inbound hyperlinks as well as outbound hyperlinks (with inbound hyperlinks found by querying search engines), applies EXACT-FLOW-COMMUNITY to the induced graph from the crawl, ranks the sites in the community by the number of edges each has inside of the community, adds the highest ranked non-seed sites to the seed set, and iterates the procedure. The first iteration may only identify a very small community; however, when new seeds are added, increasingly larger communities can be identified. Note that a73

2002

"... In PAGE 3: ... Note that a73 is heuristically chosen. Our method works without an explicit sink site via graph augmentation as described in Table 1. See [4] for the corresponding theorem and proof.... In PAGE 3: ... Thus, separating the source from the sink finds a community that is strongly connected internally, but relatively disconnected externally to the rest of the graph. Table 1 also shows an approximate version of the approach, APPROXIMATE-FLOW-COMMUNITY, which uses a subset of the web graph found by a fixed depth crawl that follows both inbound and outbound hyperlinks. Results are improved on each iteration by reseeding the algorithm with additional web sites found in the earlier steps.... ..."

Cited by 76
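
The three augmentation steps in the caption above (artificial source, bidirectional constant-capacity edges, unit-capacity edges to an artificial sink) can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes an undirected edge list as input and uses a plain Edmonds-Karp maximum flow as the "maximum flow procedure". The labels `_source` and `_sink`, the parameter name `k` for the constant edge capacity, and the toy graph are all hypothetical.

```python
from collections import defaultdict, deque

INF = float("inf")


def reachable(cap, source):
    """Breadth-first search over edges with positive residual capacity."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v, c in cap[u].items():
            if c > 0 and v not in parent:
                parent[v] = u
                queue.append(v)
    return parent


def exact_flow_community(edges, seeds, k):
    """Return the vertices reachable from the artificial source after max flow."""
    source, sink = "_source", "_sink"  # hypothetical labels for the added vertices
    seeds = set(seeds)
    cap = defaultdict(lambda: defaultdict(int))  # residual capacities
    vertices = set()
    for u, v in edges:
        vertices.update((u, v))
        cap[u][v] = k  # each preexisting edge becomes bidirectional
        cap[v][u] = k  # with the same constant capacity
    for s in seeds:
        cap[source][s] = INF  # infinite-capacity edges from the source to seeds
    for v in vertices - seeds:
        cap[v][sink] = 1  # unit-capacity edges from non-seed vertices to the sink
    while True:
        parent = reachable(cap, source)
        if sink not in parent:
            # No augmenting path remains: the reachable set is the community.
            return set(parent) - {source}
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        flow = min(cap[a][b] for a, b in path)
        for a, b in path:
            cap[a][b] -= flow
            cap[b][a] += flow


# Toy graph: a triangle {a, b, c} joined by one edge to a denser group {d, e, f, g}.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f"), ("e", "g"), ("f", "g")]
community = exact_flow_community(edges, {"a"}, k=3)
```

On this toy graph the cheapest cut severs the single bridge edge plus the two unit sink edges of `b` and `c`, so the seed's triangle is returned. A smaller `k` makes cutting right around the seed cheaper, which is consistent with the snippet's remark that the constant is chosen heuristically.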

### Table I contains in octal notation codes identified through exhaustive search, for the a26 -PAM reordered

### Table 2. In each of the entries of Table 3, Ki (T) will denote that Ki in that lexicon points to site/sites that give relevant results when searched by the entire query string (along with the other associated keywords and other specific query information tagged to the keyword Ki, in the string), while Ki (F) will denote otherwise.

"... In PAGE 8: ... We consider a set of training samples, which are sets of keywords identified from filtering user query search string samples that can be collected. Table 2 below gives the samples we will consider in the example... ..."

### Table 4: Change propagation equations for propagating aggregate-change tables

"... In PAGE 12: ... 5.3 Change Propagation Equations For the purposes of change propagation equations shown in Table 4, we assume that an aggregate-change table has been generated (as shown in Section 5.1) at the first aggregate operator in a view expression, in response to insertions and/or deletions at a base relation.... In PAGE 12: ...1) at the first aggregate operator in a view expression, in response to insertions and/or deletions at a base relation.10 Table 4 gives change propagation equations for 10As mentioned before, when two or more base relations are updated simultaneously (or a relation appears more than once in a view expression), we handle the updates in an arbitrary order.... In PAGE 14: ... Singularity Points. We call the operator nodes in a view expression tree, where none of the refresh equations in Table 4 apply, as singularity points. Aggregate-change tables cannot be propagated through singularity points.... In PAGE 14: ... Aggregate-change tables cannot be propagated through singularity points. For example, a selection on the result of an aggregate function is a singularity point, because it will not satisfy the condition Attr(p) G given in the first row of Table 4. Consider a view V and a singularity point V1, which is a subexpression of V, in the expression tree of V.... In PAGE 14: ... Hence, the presence of singularity points in an expression tree does not preclude application of our change-table techniques for incremental maintenance. Theorem 1 Assume that the refresh operator used in the expression of the third column in Table 4 is an aggregate-refresh operator. Then, (1) the change propagation equations given in Table 4 for propagation of aggregate-change tables are correct, i.... In PAGE 14: ... Theorem 1 Assume that the refresh operator used in the expression of the third column in Table 4 is an aggregate-refresh operator. Then, (1) the change propagation equations given in Table 4 for propagation of aggregate-change tables are correct, i.e.... In PAGE 14: ... Proof: We refer to the expression E1 tU 2E1 as the change equation throughout this proof. As shown in the Table 4, the expression in the fourth column is referred to as the refresh equation. Selection: V = p(E1).... In PAGE 16: ... Thus, we have V1 = storeID;itemID;SumSISales=sum(price);NumSISales=count( )( date gt;1=1=95sales) V 0 2 = V1 1 stores V2 = city;SumCiSales=sum(SumSISales);NumCiSales=sum(NumSISales)(V 0 2) V 0 3 = V1 1 items V3 = category;SumCaSales=sum(SumSISales);NumCaSales=sum(NumSISales)(V 0 3) where the (virtual) views V 0 2 and V 0 3 have been added for better illustration of how the aggregate-change tables propagate. We use the change propagation equations of Table 4 to derive the maintenance expressions for V2 and V3, in response to changes in sales, as follows. In all the equations below, U is of the form <(SUM, f), (COUNT, f)> and p is of the form ((LHS.COUNT + RHS.COUNT) = 0),13 where SUM is the aggregated attribute (SumSISales, SumCiSales, or SumCaSales) in the corresponding view, COUNT is the count attribute (NumSISales, NumCiSales, or NumCaSales) depending on the view, and f(x, y) = x + y for all x, y.... In PAGE 19: ... Note that (Attrs(U) \ G) = and (Attrs(U) [ G) = Attrs(V ). 2 The refresh equations given in Table 4 correctly propagate an outerjoin-change table as well, except for the case of propagation through the outerjoin operator, for which we derive a different equation below. Consider a view V = E1 fo .... In PAGE 19: ... Let be ( G ^ p; FALSE), where p is a predicate, and G is a set of attributes common to E1 and 2E1. The following row, which replaces the row (7) in Table 4, shows how to propagate an outerjoin-change table 2E1 through the outerjoin operator. 7b E1 fo .... In PAGE 19: .../JE2) Attrs(J) G As already mentioned, is ( G ^ p; FALSE) in the row above. Also, 1 = (J 1; FALSE), where J 1 is Attrs(E2) ^ (( G ^ p) _ (V e12Attrs(E1)LHS:e1 = NULL)); and U1 = <(A1, f), (A2, f), ..., (Ak, f)>, where {A1, A2, ..., Ak} = Attrs(E1) and f(x, y) = y for all x, y. Theorem 3 Assume that the refresh operator used in the expression of the third column in Table 4 is an outerjoin-refresh operator. (1) The change propagation equations given in Table 4, with the following two changes, correctly propagate outerjoin-change tables.... In PAGE 19: ... Also, 1 = (J 1; FALSE), where J 1 is Attrs(E2) ^ (( G ^ p) _ (V e12Attrs(E1)LHS:e1 = NULL)); and U1 = <(A1, f), (A2, f), ..., (Ak, f)>, where {A1, A2, ..., Ak} = Attrs(E1) and f(x, y) = y for all x, y. Theorem 3 Assume that the refresh operator used in the expression of the third column in Table 4 is an outerjoin-refresh operator. (1) The change propagation equations given in Table 4, with the following two changes, correctly propagate outerjoin-change tables. Disregard the condition in column 6 of the first row (selection view) Replace the seventh row by row (7b) given above (2) The refresh operator derived in each of the refresh equations (column 4) is also an outerjoin-refresh operator, except for the case of propagation through an aggregate operator (sixth row) where the derived operator is an aggregate-refresh operator.... In PAGE 24: ... The above improvement makes propagation of an aggregate and outerjoin-change table through a union operator very efficient. In essence, the refresh equation of the fifth row in Table 4 will not involve E1 or E2. When computing the refresh equation, the view contents are used, whenever possible, to compute a required subexpression using minimum number of source queries.... ..."
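
The snippet above gives the update specification for the example views as pairs (SUM, f) and (COUNT, f) with f(x, y) = x + y, and a predicate that tests whether the combined count is zero. The sketch below shows one plausible reading of that specification: per-group deltas are added to the view, and a group is dropped when its count reaches zero. It is an illustration under stated assumptions, not the paper's refresh operator. The function name `refresh_aggregate_view`, the dictionary layout, and the sample data are hypothetical, and treating the zero-count predicate as a deletion condition is an assumption.

```python
def refresh_aggregate_view(view, change_table):
    """Apply per-group (sum, count) deltas to an aggregate view.

    Both arguments map a group key to a (SUM, COUNT) pair.
    """
    refreshed = dict(view)
    for group, (delta_sum, delta_count) in change_table.items():
        old_sum, old_count = refreshed.get(group, (0, 0))
        # f(x, y) = x + y for both the SUM and the COUNT attribute.
        new_sum = old_sum + delta_sum
        new_count = old_count + delta_count
        if new_count == 0:
            # Assumed reading of the predicate (LHS.COUNT + RHS.COUNT) = 0:
            # no base rows remain for this group, so it leaves the view.
            refreshed.pop(group, None)
        else:
            refreshed[group] = (new_sum, new_count)
    return refreshed


# Hypothetical city-level sales view: city -> (SumCiSales, NumCiSales).
view = {"Paris": (100, 4), "Rome": (30, 1)}
# Net effect of insertions (positive) and deletions (negative) per city.
change_table = {"Paris": (20, 1), "Rome": (-30, -1), "Oslo": (15, 2)}
new_view = refresh_aggregate_view(view, change_table)
```

Because only the change table and the stored view are read, the base relation is not rescanned, which is the point of propagating change tables instead of recomputing the aggregate.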

### Table 2. Distribution of search pattern of queries

"... In PAGE 4: ... The logic for the automatic search pattern identification can be summarized as in Figure 1. Also see Table 2 for distribution of queries with respect to search patterns in the training dataset. The levels of the search pattern factor used in MLR are 1 through 7.... ..."

### Table 5: Time to identify relevant Web pages for given queries.

2006

"... In PAGE 41: ... The time the user spent using the classified-by-category interface and the classified-in-a-list interface is shown in Table 5. Next, each tester was asked to do exactly the same using the user-defined hierarchy of their corresponding user (see Table 5). For users that have defined their own thematic hierarchy, searching was more than 60% faster than searching of testers that have not defined the categories themselves.... ..."