Results 1 -
1 of
1
The LikeIt Intelligent String Comparison Facility
- NEC Research Institute
, 1997
"... A highly-efficient ANSI-C facility is described for intelligently comparing a query string with a series of database strings. The bipartite weighted matching approach taken tolerates ordering violations that are problematic for simple automaton or string edit distance methods---yet common in practic ..."
Abstract
-
Cited by 16 (0 self)
- Add to MetaCart
A highly-efficient ANSI-C facility is described for intelligently comparing a query string with a series of database strings. The bipartite weighted matching approach taken tolerates ordering violations that are problematic for simple automaton or string edit distance methods---yet common in practice. The method is character and polygraph based and does not require that words are properly formed in a query. Database characters are processed at a rate of approximately 2.5 million per second using a 200MHz Pentium Pro processor. A subroutine-level API is described along with an simple executable utility supporting both command-line and Web interfaces. An optimized Web interface is also reported consisting of a daemon that preloads multiple databases, and a corresponding CGI stub. The daemon may be initiated manually or via inetd. Keywords: String Comparison/Similarity, Text/Database Search/Retrieval, Bipartite Matching/Assignment, Edit Distance. Both authors are with the NEC Research I...

