Agrep - A Fast Approximate Pattern-Matching Tool
 In Proc. of USENIX Technical Conference
, 1992
"... Searching for a pattern in a text file is a very common operation in many applications ranging from text editors and databases to applications in molecular biology. In many instances the pattern does not appear in the text exactly. Errors in the text or in the query can result from misspelling or fr ..."
Abstract

Cited by 156 (6 self)
In this paper we describe a new tool, called agrep, for approximate pattern matching. Agrep is based on a new efficient and flexible algorithm for approximate string matching. Agrep is also competitive with other tools for exact string matching; it includes many options that make searching more powerful
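As a rough illustration of what "approximate pattern matching" means here, the sketch below reports every position in a text where some substring ending there matches the pattern with at most k edits (insertions, deletions, substitutions). It uses the classic column-by-column dynamic program, not the algorithm agrep itself is based on, and the function name `approx_find` is a hypothetical helper introduced only for this example.

```python
def approx_find(pattern, text, k):
    """Return end indices j such that some substring of text ending at j
    is within edit distance k of pattern (illustrative sketch only)."""
    m = len(pattern)
    # col[i] = best edit distance between pattern[:i] and a substring
    # ending at the current text position; a match may start anywhere,
    # so the cost of matching the empty pattern prefix is always 0.
    col = list(range(m + 1))
    ends = []
    for j, ch in enumerate(text):
        prev_diag = col[0]  # value of col[i-1] from the previous column
        col[0] = 0
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            new = min(prev_diag + cost,  # match or substitution
                      col[i] + 1,        # extra character in the text
                      col[i - 1] + 1)    # pattern character missing from text
            prev_diag = col[i]
            col[i] = new
        if col[m] <= k:
            ends.append(j)
    return ends
```

For example, `approx_find("pattern", "a patern here", 1)` returns `[7]`, the index where the misspelled "patern" ends. The running time is proportional to the product of pattern and text lengths, which is the kind of cost that faster tools aim to reduce.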
Network Centric Warfare: Developing and Leveraging Information Superiority
 Command and Control Research Program (CCRP), US DoD
, 2000
"... the mission of improving DoD’s understanding of the national security implications of the Information Age. Focusing upon improving both the state of the art and the state of the practice of command and control, the CCRP helps DoD take full advantage of the opportunities afforded by emerging technolo ..."
Abstract

Cited by 308 (5 self)
technologies. The CCRP pursues a broad program of research and analysis in information superiority, information operations, command and control theory, and associated operational concepts that enable us to leverage shared awareness to improve the effectiveness and efficiency of assigned missions. An important
Automatic Query Expansion Using SMART: TREC 3
 In Proceedings of The Third Text REtrieval Conference (TREC-3)
"... The Smart information retrieval project emphasizes completely automatic approaches to the understanding and retrieval of large quantities of text. We continue our work in TREC 3, performing runs in the routing, adhoc, and foreign language environments. Our major focus is massive query expansion: ad ..."
Abstract

Cited by 202 (4 self)
investigations into combining global similarities, giving an overall indication of how a document matches a query, with local similarities identifying a smaller part of the document which matches the query. Using an overlapping text window definition of "local", we achieve a 16% improvement
Efficient exact set-similarity joins
 in Proc. of the 32nd Intl. Conf. on Very Large Data Bases
, 2006
"... Given two input collections of sets, a set-similarity join (SSJoin) identifies all pairs of sets, one from each collection, that have high similarity. Recent work has identified SSJoin as a useful primitive operator in data cleaning. In this paper, we propose new algorithms for SSJoin. Our algorithm ..."
Abstract

Cited by 133 (7 self)
algorithms have two important features: They are exact, i.e., they always produce the correct answer, and they carry precise performance guarantees. We believe our algorithms are the first to have both features; previous algorithms with performance guarantees are only probabilistically approximate. We
Fast Algorithms for Approximate Fréchet Matching Queries in Geometric Trees
, 2013
"... Let T be a tree that is embedded in the plane and let ∆ > 0 be a real number. The aim is to preprocess T into a data structure, such that for any query polygonal path Q, we can decide if T contains a path P whose Fréchet distance δ_F(P, Q) to Q is less than ∆. For any real number ε > 0, we prese ..."
Abstract
present an efficient data structure that solves an approximate version of this problem for the case when T is c-packed and each of the edges of T and Q has length Ω(∆): If the data structure returns NO, then there is no such path P. If it returns YES, then δ_F(P, Q) ≤ (1 + ε)∆ if Q is a line segment
Exact and Approximate Algorithms for Unordered Tree Matching
"... We consider the problem of comparison between unordered trees, i.e., trees for which the order among siblings is unimportant. The criterion for comparison is the distance as measured by a weighted sum of the costs of deletion, insertion and relabel operations on tree nodes. Such comparisons ..."
Abstract
to approximate solutions. The algorithms are based on probabilistic hill climbing and bipartite matching techniques. The paper evaluates the accuracy and time efficiency of the heuristics by applying them to a set of trees transformed from industrial parts based on a previously proposed morphological model.
Simple and efficient algorithm for approximate dictionary matching
, 2010
"... This paper presents a simple and efficient algorithm for approximate dictionary matching designed for similarity measures such as cosine, Dice, Jaccard, and overlap coefficients. We propose this algorithm, called CPMerge, for the τ-overlap join of inverted lists. First we show that this task is solv ..."
Abstract

Cited by 8 (0 self)
is solvable exactly by a τ-overlap join. Given inverted lists retrieved for a query, the algorithm collects fewer candidate strings and prunes unlikely candidates to efficiently find strings that satisfy the constraint of the τ-overlap join. We conducted experiments of approximate dictionary matching on three
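To make the τ-overlap join concrete, here is a minimal sketch under stated assumptions: strings are represented as sets of padded character trigrams (an assumption of this example, not something the excerpt specifies), and the join is done naively by counting how many of the query's inverted lists each string appears in. This is a baseline, not CPMerge, which by the description above prunes candidates rather than counting over every list. All function names are hypothetical.

```python
from collections import Counter, defaultdict


def trigrams(s):
    """Set of character trigrams of s, padded so short strings get features."""
    padded = f"$${s}$$"
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def build_index(strings):
    """Inverted lists: feature -> ids of the strings containing it."""
    index = defaultdict(list)
    for sid, s in enumerate(strings):
        for g in trigrams(s):
            index[g].append(sid)
    return index


def tau_overlap_join(index, query_features, tau):
    """Ids of strings sharing at least tau features with the query."""
    counts = Counter()
    for g in query_features:
        for sid in index.get(g, ()):
            counts[sid] += 1
    return sorted(sid for sid, c in counts.items() if c >= tau)
```

With `index = build_index(["apple", "apply", "maple"])`, the call `tau_overlap_join(index, trigrams("apple"), 4)` returns `[0, 1]`: "apple" shares all seven of its trigrams with the query and "apply" shares four, while "maple" shares only three. How τ is derived from a similarity threshold depends on the chosen measure and is left as an input here.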
Partitioning based algorithms for approximate and exact Iceberg Queries
, 1998
"... In many applications it is necessary to identify items which occur frequently within the data set, which may be a materialized or non-materialized relation. Such queries were recently denoted as iceberg queries. Several algorithms for computing iceberg queries were presented, including an approximati ..."
Abstract
an approximation algorithm based on concise sampling, and an exact algorithm based on sampling combined with multiple hash functions. In this paper, we propose a new approach for approximating iceberg queries, using a hash based partitioning technique. The data set is partitioned using a hash function into value
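The general idea of hash-based partitioning for an iceberg query can be sketched as follows. This is an illustrative toy, assuming in-memory lists stand in for partitions and using CRC32 as the hash; it is not the algorithm proposed in the paper, whose details are cut off in the excerpt. Because every occurrence of a value hashes to the same partition, each partition can be counted independently. The name `iceberg` and the parameter `num_partitions` are hypothetical.

```python
from collections import Counter
import zlib


def iceberg(items, threshold, num_partitions=4):
    """Return {value: count} for values occurring at least threshold times."""
    # Step 1: route each item to a partition by a deterministic hash,
    # so all copies of the same value end up together.
    partitions = [[] for _ in range(num_partitions)]
    for item in items:
        h = zlib.crc32(str(item).encode("utf-8"))
        partitions[h % num_partitions].append(item)

    # Step 2: count each partition on its own and keep the frequent values.
    frequent = {}
    for part in partitions:
        for value, count in Counter(part).items():
            if count >= threshold:
                frequent[value] = count
    return frequent
```

For instance, `iceberg(["a"] * 5 + ["b"] * 2 + ["c"] * 3, threshold=3)` yields `{"a": 5, "c": 3}`. In a real setting each partition would be sized to fit in memory and processed in a separate pass over the data.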
Exact and Fast Collision Detection
, 1994
"... this document: v: vector or point, almost always v ∈ R; P: polyhedron, or just a graphical object; V_P, E_P, F_P: the set of vertices, edges, and polygons of P, resp ..."
Abstract

Cited by 6 (0 self)
Combining evidence using p-values: Application to sequence homology searches
 Bioinformatics
, 1998
"... Motivation: To illustrate an intuitive and statistically valid method for combining independent sources of evidence that yields a p-value for the complete evidence, and to apply it to the problem of detecting simultaneous matches to multiple patterns in sequence homology searches. Results: In seque ..."
Abstract

Cited by 226 (13 self)
