Adaptive Duplicate Detection Using Learnable String Similarity Measures (2003)

by Mikhail Bilenko, Raymond J. Mooney
Venue: In Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2003)
Citations: 180 - 11 self

Documents Related by Co-Citation

254 The Merge/Purge Problem for Large Databases – Mauricio A. Hernández, Salvatore Stolfo - 1995
161 Interactive Deduplication using Active Learning – Sunita Sarawagi, Anuradha Bhamidipaty - 2002
311 A theory for record linkage – I P Fellegi, A B Sunter - 1969
200 Efficient Clustering of High-Dimensional Data Sets with Application to Reference Matching – Andrew McCallum, Kamal Nigam, Lyle H. Ungar - 2000
97 Eliminating Fuzzy Duplicates in Data Warehouses – Rohit Ananthakrishna, Surajit Chaudhuri, Venkatesh Ganti - 2002
96 Learning to Match and Cluster Large High-Dimensional Data Sets For Data Integration – William W. Cohen, Jacob Richman - 2002
130 Identity Uncertainty and Citation Matching – Hanna Pasula, Bhaskara Marthi, Brian Milch, Stuart Russell, Ilya Shpitser - 2003
154 An Efficient Domain-Independent Algorithm for Detecting Approximately Duplicate Database Records – Alvaro Monge, Charles Elkan - 1997
92 Learning Domain-Independent String Transformation Weights for High Accuracy Object Identification – Sheila Tejada, Craig A. Knoblock - 2002
106 Automatic linkage of vital records – H B Newcombe, M J Kennedy, S J Axford, A P James - 1959
172 The State of Record Linkage and Current Research Problems – William E. Winkler - 1999
122 The field matching problem: Algorithms and applications – Alvaro Monge, Charles Elkan - 1996
77 Learning Object Identification Rules for Information Integration – Sheila Tejada, Craig A. Knoblock, Steven Minton - 2001
49 Iterative record linkage for cleaning and integration – I Bhattacharya, L Getoor - 2004
130 Robust and efficient fuzzy match for online data cleaning – Surajit Chaudhuri, Kris Ganjam, Venkatesh Ganti, Rajeev Motwani - 2003
1548 Conditional random fields: Probabilistic models for segmenting and labeling sequence data – John Lafferty - 2001
276 Discriminative probabilistic models for relational data – Ben Taskar - 2002
99 Conditional models of identity uncertainty with application to noun coreference – Andrew McCallum, Ben Wellner - 2004
193 Integration of Heterogeneous Databases Without Common Domains Using Queries Based on Textual Similarity – William W. Cohen - 1998