Approximate String Joins in a Database (Almost) for Free - Erratum (2003)

by Luis Gravano, Panagiotis G. Ipeirotis, H. V. Jagadish, Nick Koudas, S. Muthukrishnan, Divesh Srivastava
Venue: VLDB
Citations: 210 (16 self)

BibTeX

@INPROCEEDINGS{Gravano03approximatestring,
    author = {Luis Gravano and Panagiotis G. Ipeirotis and H. V. Jagadish and Nick Koudas and S. Muthukrishnan and Divesh Srivastava},
    title = {Approximate String Joins in a Database (Almost) for Free - Erratum},
    booktitle = {VLDB},
    year = {2003},
    pages = {491--500}
}


Abstract

case the result returned by the Figure 1 query is incomplete and suffers from "false negatives," in contrast to our claim to the contrary in [GIJ+01b]. In general, the string pairs that are omitted are pairs of short strings. Even when these strings match within a small edit distance, the match tends to be meaningless (e.g., "IBM" matches "ACM" within edit distance 2). However, when it is absolutely necessary to have no false negatives, we can modify the SQL query in Figure 1 so that it produces the correct results. Since the false negatives are only pairs of short strings, we can join all pairs of these short strings, using only the length filter, and UNION the result with the result of the SQL query described in [GIJ+01b]. We list the modified query in Figure 2.

2 Experimental Results

We now experimentally measure the number of false negatives from which the query in [GIJ+01b] (Figure 1) can suffer. For the experiments we use the same thre
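
The fix described in the abstract can be illustrated with a small in-memory sketch. This is not the authors' SQL from Figure 1 or Figure 2, which is not reproduced on this page. It only mimics the idea: a join on shared q-grams cannot produce a pair that shares no q-gram, so short pairs are joined separately using only the length filter and the two results are unioned. The function names, the padding characters `#` and `$`, and the cutoff `(k - 1) * q + 1` for what counts as a "short" string are assumptions made for this example.

```python
from itertools import product


def edit_distance(a: str, b: str) -> int:
    """Standard dynamic-programming Levenshtein distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution or match
        prev = cur
    return prev[-1]


def qgrams(s: str, q: int) -> set:
    """q-grams of s after padding with q-1 assumed pad characters on each side."""
    padded = "#" * (q - 1) + s + "$" * (q - 1)
    return {padded[i:i + q] for i in range(len(padded) - q + 1)}


def qgram_join(left, right, k: int, q: int) -> set:
    """Simplified stand-in for a q-gram join: pairs sharing no q-gram never appear."""
    result = set()
    for a, b in product(left, right):
        if abs(len(a) - len(b)) > k:          # length filter
            continue
        if not (qgrams(a, q) & qgrams(b, q)):  # an equi-join on q-grams misses these
            continue
        if edit_distance(a, b) <= k:
            result.add((a, b))
    return result


def short_string_join(left, right, k: int, q: int) -> set:
    """Join all pairs of short strings using only the length filter."""
    limit = (k - 1) * q + 1                    # assumed cutoff for "short"
    short_left = [s for s in left if len(s) <= limit]
    short_right = [s for s in right if len(s) <= limit]
    result = set()
    for a, b in product(short_left, short_right):
        if abs(len(a) - len(b)) <= k and edit_distance(a, b) <= k:
            result.add((a, b))
    return result


def corrected_join(left, right, k: int, q: int) -> set:
    """UNION of the q-gram join and the short-string join."""
    return qgram_join(left, right, k, q) | short_string_join(left, right, k, q)
```

With `k = 2` and `q = 3`, the strings "ab" and "xy" are within edit distance 2 but share no padded q-gram, so `qgram_join` misses the pair while `corrected_join` reports it. Such matches are usually meaningless, as the abstract notes, which is why the extra join matters only when no false negatives can be tolerated.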

Keyphrases

false negative, approximate string join, free erratum, short string, sql query, string pair, small string, small edit distance, length filter, edit distance, appropriate modification, ibm match acm, modified query, experimental result, correct result
