The State of Record Linkage and Current Research Problems
- Statistical Research Division, U.S. Census Bureau, 1999
Cited by 172 (7 self)
This paper provides an overview of methods and systems developed for record linkage. Modern record linkage begins with the pioneering work of Newcombe and is especially based on the formal mathematical model of Fellegi and Sunter. In their seminal work, Fellegi and Sunter introduced many powerful ideas for estimating record linkage parameters and other ideas that still influence record linkage today. Record linkage research is characterized by its synergism of statistics, computer science, and operations research. Many difficult algorithms have been developed and put in software systems. Record linkage practice is still very limited. Some limits are due to existing software. Other limits are due to the difficulty in automatically estimating matching parameters and error rates, with current research highlighted by the work of Larsen and Rubin.
Keywords: computer matching, modeling, iterative fitting, string comparison, optimization
Matching and Record Linkage
- Business Survey Methods, 1995
Cited by 77 (14 self)
INTRODUCTION Matching has a long history of uses in statistical surveys and administrative data development. A business register consisting of names, addresses, and other identifying information such as total financial receipts might be constructed from tax and employment data bases (see chapters by Colledge, Nijhowne, and Archer). A survey of retail establishments or agricultural establishments might combine results from an area frame and a list frame. To produce a combined estimator, units from the area frame would need to be identified in the list frame (see Vogel-Kott chapter). To estimate the size of a (sub)population via capture-recapture techniques, one needs to accurately determine units common to two or more independent listings (Sekar and Deming 1949; Scheuren 1983; Winkler 1989b). Samples must be drawn appropriately to estimate overlap (Deming and Gleser 1959). Rather than develop a special survey to collect data for policy decisions, it might be more appropriate t
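The capture-recapture idea mentioned in this abstract can be illustrated with a minimal sketch. It uses the classic two-list Lincoln-Petersen estimator, which is an assumption for illustration and not necessarily the estimator used in the chapter. The function name `lincoln_petersen` is hypothetical, and the sketch assumes units carry exact identifiers; in practice the overlap would come from a record linkage step, and linkage errors would bias the estimate.

```python
def lincoln_petersen(list_a, list_b):
    """Estimate population size from two independent listings.

    Illustrative assumption: units are identified exactly, so the
    overlap is a plain set intersection rather than a linkage result.
    """
    units_a = set(list_a)
    units_b = set(list_b)
    matched = len(units_a & units_b)  # units common to both listings
    if matched == 0:
        raise ValueError("no overlap between listings; estimate undefined")
    # N_hat = n_a * n_b / m
    return len(units_a) * len(units_b) / matched


if __name__ == "__main__":
    first = range(0, 60)      # 60 units found by the first listing
    second = range(40, 100)   # 60 units found by the second, 20 in common
    print(lincoln_petersen(first, second))
```

The estimate depends directly on the matched count, which is why the abstract stresses accurately determining the units common to the listings: missed matches inflate the estimate, and false matches deflate it.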
Overview of record linkage and current research directions
- BUREAU OF THE CENSUS, 2006
Cited by 55 (1 self)
This paper provides background on record linkage methods that can be used in combining data from a variety of sources such as person lists and business lists. It also gives some areas of current research.
Random Juror Selection from Multiple Lists
- 1975
We examine the selection of jurors' names from multiple source lists, using statistical and optimization methodology. Five plans for sampling at random from overlapping lists of names are analyzed for their probabilistic and cost properties. In each plan the probability of a name being selected is independent of which and how many lists it appears on. We consider the optimal ordering of the frames to minimize cost and develop a heuristic for solving this problem. Although the methods are discussed in terms of juror selection, the results apply to sampling from overlapping frames in any context. For instance, if lists of equipment are kept according to possible uses, with versatile equipment listed many times, the methods of this paper can be used to draw a random sample of equipment to check for readiness. The Federal Jury Selection and Service Act of 1968 (P.L. 90-274, 82 Stat. 53) provides methods for the selection of citizens to serve
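The property described in this abstract, a selection probability that does not depend on which or how many lists a name appears on, can be sketched as follows. This is one simple scheme chosen for illustration and is not claimed to be any of the paper's five plans: frames are processed in a fixed order, each name is "owned" by the first frame it appears on, and every owned name is selected independently with the same probability `p`. The function name `sample_from_overlapping_lists` and its parameters are hypothetical.

```python
import random


def sample_from_overlapping_lists(frames, p, rng=None):
    """Bernoulli-sample names so each distinct name is chosen with probability p.

    Illustrative assumption: a name counts only on the first frame that
    lists it, so duplicates on later frames never get a second chance.
    """
    rng = rng or random.Random()
    selected = []
    seen_on_earlier_frames = set()
    for frame in frames:
        # sorted() only makes the iteration order deterministic
        for name in sorted(set(frame)):
            if name in seen_on_earlier_frames:
                continue  # already had its single chance on an earlier frame
            if rng.random() < p:
                selected.append(name)
        seen_on_earlier_frames.update(frame)
    return selected


if __name__ == "__main__":
    voters = ["ann", "bob"]
    drivers = ["bob", "cy"]
    taxpayers = ["ann", "cy", "dee"]
    print(sample_from_overlapping_lists([voters, drivers, taxpayers], 0.5))
```

In this sketch, checking a name against earlier frames is the costly step, which suggests why the order of the frames would matter for cost; the paper's own cost model and ordering heuristic are not reproduced here.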

