Data Integration Using Similarity Joins and a Word-Based Information Representation Language (2000)

by William W. Cohen
Venue: ACM TRANSACTIONS ON INFORMATION SYSTEMS
Citations: 83 - 9 self

Active Bibliography

214 Integration of Heterogeneous Databases Without Common Domains Using Queries Based on Textual Similarity – William W. Cohen - 1998
25 WHIRL: A Word-based Information Representation Language – William W. Cohen - 1999
10 Knowledge Integration for Structured Information Sources Containing Text (Extended Abstract) – William W. Cohen - 1997
91 Learning Object Identification Rules for Information Integration – Sheila Tejada, Craig A. Knoblock, Steven Minton - 2001
2 Reasoning about Textual Similarity in a Web-Based Information Access System – William W. Cohen - 1999
91 Collective entity resolution in relational data – Indrajit Bhattacharya, Lise Getoor - 2006
7 Learnable Similarity Functions and Their Applications to Clustering and Record Linkage – Mikhail Bilenko - 2004
8 Learning importance of relationships for reference disambiguation – Dmitri V. Kalashnikov, Sharad Mehrotra - 2004
34 Exploiting relationships for object consolidation – Zhaoqi Chen, Dmitri V. Kalashnikov, Sharad Mehrotra - 2005
45 Domain-independent data cleaning via analysis of entity-relationship graph – Dmitri V. Kalashnikov, Sharad Mehrotra - 2006
54 Hardening Soft Information Sources – William W. Cohen, Henry Kautz, David McAllester - 2000
128 Learning to Match and Cluster Large High-Dimensional Data Sets For Data Integration – William W. Cohen, Jacob Richman - 2002
237 Database Techniques for the World-Wide Web: A Survey – Daniela Florescu, Alon Levy, Alberto Mendelzon - 1998
53 Joins that Generalize: Text Classification Using WHIRL – William Cohen, Haym Hirsh - 1998
54 A Web-based Information System that Reasons with Structured Collections of Text – William W. Cohen - 1998
3 An Adaptive and Efficient Algorithm for Detecting Approximately Duplicate Database Records – Alvaro E. Monge - 2000
37 Learning to Combine Trained Distance Metrics for Duplicate Detection in Databases – Mikhail Bilenko, Raymond J. Mooney - 2002
258 Efficient Clustering of High-Dimensional Data Sets with Application to Reference Matching – Andrew McCallum, Kamal Nigam, Lyle H. Ungar - 2000
237 Adaptive Duplicate Detection Using Learnable String Similarity Measures – Mikhail Bilenko, Raymond J. Mooney - 2003