The field matching problem: Algorithms and Applications (1996)
Citations: | 196 - 4 self |
Citations
4015 |
Introduction to Modern Information Retrieval
- Salton, McGill
- 1983
Citation Context ...t. A group of size 1 is a single field for which there was no match in a dataset. The performance of a field matching algorithm can be evaluated by viewing the problem in information retrieval terms (Salton and McGill 1983). Given a set of possibly equivalent fields, consider each field in turn to be a query, and rank all other fields according to their degree of match as computed by the field matching algorithm. The a... |
2221 |
Identification of common molecular subsequences
- Smith, Waterman
- 1981
Citation Context ...ared with each atomic string of B. An important optimization is to apply memoization to remember the results of recursive calls which have already been made. The Smith-Waterman algorithm This method (Smith and Waterman 1981) is a dynamic programming algorithm. It was first developed to find optimal alignments between related DNA or protein sequences. The Smith-Waterman algorithm has three main adjustable parameters. Giv... |
652 | A comparative analysis of methodologies for database schema integration.
- Batini, Lenzerini, et al.
- 1986
Citation Context ...in relational databases. In order to perform a join between two relations, one must first determine which columns refer to the same category of entities. This is known as the schema matching problem (Batini et al. 1986; Kim et al. 1993). Given a solution to the schema matching problem, one still needs to determine whether two specific tuples, i.e. field values, are equivalent. This is the problem studied in this pa... |
360 | The merge/purge problem for large databases
- Hernandez, Stolfo
- 1995
Citation Context ...ate different pieces of information about the same taxpayer when social security numbers are missing or incorrect. In general, field matching is the central issue in the so-called "merge/purge" task (Hernandez and Stolfo 1995): identifying and combining multiple records, from one database or many, that concern the same entity but are distinct because of data entry errors. Published previous work deals with special cases o... |
85 | On Resolving Schematic Heterogeneity in Multidatabase Systems., Distributed and Parallel Databases. - Kim, Choi, et al. - 1993 |
83 | Category translation: Learning to understand information on the internet.
- Perkowitz, Etzioni
- 1995
Citation Context ...gory of entities. This is known as the schema matching problem (Batini et al. 1986; Kim et al. 1993). Given a solution to the schema matching problem, one still needs to determine whether two specific tuples, i.e. field values, are equivalent. This is the problem studied in this paper. In general the field matching problem is to determine whether or not two field values are syntactic alternatives that designate the same semantic entity. A solution to the field matching problem can be applied to solve the schema matching problem. The “information learning agent” (ILA) of Etzioni and Perkowitz (1995) learns the schema of other information sources based on the known schema of one source. To learn a new schema, equivalent pieces of information must be detected. The ILA can match (206) 616-1845 and 616.1845 for example, but details of the matching method are not given by Etzioni and Perkowitz (1995). General matching methods are the topic of this paper. The field matching problem Many information sources, e.g. relational databases or worldwide web pages, provide information about the same real-world entities, but designate these entities differently. We refer to a designator of an entity as... |
4 |
Matchmaker ... matchmaker ... find me the address (exact address match processing). Telephone Engineer and Management
- Ace, Marvel, et al.
- 1992
Citation Context ...base or many, that concern the same entity but are distinct because of data entry errors. Published previous work deals with special cases of the field matching problem, involving customer addresses (Ace et al. 1992), census records (Slaven 1992), or variant entries in a lexicon (Jacquemin and Royaute 1994). The most similar work to ours is due to Hernandez and Stolfo (1995), for the "merge/purge" task. After cl... |
3 | Integrating external information sources to guide worldwide web information retrieval - Monge, Elkan - 1996 |
3 |
The set theory matching system: an application to ethnographic research.
- Slaven
- 1992
Citation Context ...e entity but are distinct because of data entry errors. Published previous work deals with special cases of the field matching problem, involving customer addresses (Ace et al. 1992), census records (Slaven 1992), or variant entries in a lexicon (Jacquemin and Royaute 1994). The most similar work to ours is due to Hernandez and Stolfo (1995), for the "merge/purge" task. After clustering tuples using indices,... |
1 | find me the address (exact address match processing). - matchmaker - 1992 |
1 | Integrating external information sources to guide worldwide web information retrieval. - Monge, Elkan - 1994 |