Lund University Publications

A new strategy for linking U.S. historical censuses : A case study for the IPUMS multigenerational longitudinal panel

Helgertz, Jonas (LU); Price, Joseph; Wellington, Jacob; Thompson, Kelly J.; Ruggles, Steven and Fitch, Catherine A. (2022). In Historical Methods 55(1), p. 12-29
Abstract

This paper presents a probabilistic method of record linkage, developed using the U.S. full count censuses of 1900 and 1910 but applicable to many sources of digitized historical records. The method links records using a two-step approach, first establishing high confidence matches among men by exploiting a comprehensive set of individual and contextual characteristics. The method then proceeds to link both men and women by leveraging links between households established in the first step. While only the first stage links can be directly comparable to other popular methods in research on the U.S., our method yields both considerably higher linkage rates and greater accuracy while only performing negligibly worse than other algorithms in resembling the target population.
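
The two-step idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' model: the record fields, the hand-picked weights, the thresholds, and the helper names (`Person`, `score`, `best_unique`, `link_men`, `link_households`) are all assumptions made for illustration, and a simple weighted string similarity stands in for the probabilistic classifier. Step 1 accepts only high-scoring, unambiguous matches among men; step 2 then links remaining household members of both sexes, but only inside households that step 1 has already connected.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Person:
    pid: str         # record id within one census (hypothetical field)
    household: str   # household id within one census (hypothetical field)
    first: str
    last: str
    sex: str         # "M" or "F"
    birth_year: int
    birthplace: str


def name_sim(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def score(a: Person, b: Person, use_last: bool = True) -> float:
    """Toy similarity in [0, 1]; the weights are illustrative assumptions."""
    if a.sex != b.sex:
        return 0.0
    age = max(0.0, 1.0 - abs(a.birth_year - b.birth_year) / 3.0)
    place = 1.0 if a.birthplace == b.birthplace else 0.0
    first = name_sim(a.first, b.first)
    if use_last:
        return 0.35 * first + 0.35 * name_sim(a.last, b.last) + 0.2 * age + 0.1 * place
    return 0.6 * first + 0.3 * age + 0.1 * place


def best_unique(a, candidates, threshold, margin, use_last=True):
    """Return the top candidate only if it clears the threshold and
    beats the runner-up by at least `margin`; otherwise None."""
    ranked = sorted(((score(a, b, use_last), b) for b in candidates),
                    key=lambda t: t[0], reverse=True)
    if not ranked or ranked[0][0] < threshold:
        return None
    if len(ranked) > 1 and ranked[0][0] - ranked[1][0] < margin:
        return None
    return ranked[0][1]


def link_men(census_a, census_b, threshold=0.9, margin=0.05):
    """Step 1: high-confidence, one-to-one links among men only."""
    men_b = [p for p in census_b if p.sex == "M"]
    claimed = {}
    for a in census_a:
        if a.sex != "M":
            continue
        b = best_unique(a, men_b, threshold, margin)
        if b is not None:
            claimed.setdefault(b.pid, []).append(a.pid)
    # Drop any later record claimed by more than one earlier record.
    return {a_ids[0]: b_id for b_id, a_ids in claimed.items() if len(a_ids) == 1}


def link_households(census_a, census_b, men_links, threshold=0.7, margin=0.05):
    """Step 2: link remaining members (men and women) of households
    that are already connected by a step-1 link."""
    by_a = {p.pid: p for p in census_a}
    by_b = {p.pid: p for p in census_b}
    links = dict(men_links)
    linked_b = set(links.values())
    for a_id, b_id in men_links.items():
        hh_a, hh_b = by_a[a_id].household, by_b[b_id].household
        rest_a = [p for p in census_a if p.household == hh_a and p.pid not in links]
        rest_b = [p for p in census_b if p.household == hh_b and p.pid not in linked_b]
        for a in rest_a:
            b = best_unique(a, rest_b, threshold, margin, use_last=False)
            if b is not None:
                links[a.pid] = b.pid
                linked_b.add(b.pid)
                rest_b.remove(b)
    return links


# Invented sample records, for illustration only.
census_1900 = [
    Person("a1", "H1", "John", "Miller", "M", 1870, "Ohio"),
    Person("a2", "H1", "Mary", "Miller", "F", 1872, "Ohio"),
    Person("a3", "H1", "Anna", "Miller", "F", 1895, "Ohio"),
    Person("a4", "H2", "Peter", "Olson", "M", 1880, "Sweden"),
]
census_1910 = [
    Person("b1", "K1", "John", "Miller", "M", 1870, "Ohio"),
    Person("b2", "K1", "Mary", "Miller", "F", 1873, "Ohio"),
    Person("b3", "K1", "Annie", "Miller", "F", 1895, "Ohio"),
    Person("b4", "K2", "Mary", "Smith", "F", 1872, "Ohio"),
    Person("b5", "K3", "Pete", "Olsen", "M", 1882, "Sweden"),
]

men_links = link_men(census_1900, census_1910)
all_links = link_households(census_1900, census_1910, men_links)
```

In this toy data, the strict first step leaves the noisier male record unlinked, and the second step compares the women only against members of the household already connected through a linked man, using a looser threshold because the candidate set is so small.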

author
Helgertz, Jonas; Price, Joseph; Wellington, Jacob; Thompson, Kelly J.; Ruggles, Steven and Fitch, Catherine A.
publishing date
2022
type
Contribution to journal
publication status
published
subject
keywords
census data, machine learning, Record linkage, United States of America
in
Historical Methods
volume
55
issue
1
pages
12 - 29
publisher
Heldref Publications
external identifiers
  • scopus:85119255501
  • pmid:35846520
ISSN
0161-5440
DOI
10.1080/01615440.2021.1985027
language
English
LU publication?
yes
id
2d78057c-9133-4b4f-ab2c-2f1306f7f322
date added to LUP
2021-12-13 15:08:09
date last changed
2024-06-15 22:36:50
@article{2d78057c-9133-4b4f-ab2c-2f1306f7f322,
  abstract     = {{This paper presents a probabilistic method of record linkage, developed using the U.S. full count censuses of 1900 and 1910 but applicable to many sources of digitized historical records. The method links records using a two-step approach, first establishing high confidence matches among men by exploiting a comprehensive set of individual and contextual characteristics. The method then proceeds to link both men and women by leveraging links between households established in the first step. While only the first stage links can be directly comparable to other popular methods in research on the U.S., our method yields both considerably higher linkage rates and greater accuracy while only performing negligibly worse than other algorithms in resembling the target population.}},
  author       = {{Helgertz, Jonas and Price, Joseph and Wellington, Jacob and Thompson, Kelly J. and Ruggles, Steven and Fitch, Catherine A.}},
  issn         = {{0161-5440}},
  keywords     = {{census data; machine learning; Record linkage; United States of America}},
  language     = {{eng}},
  number       = {{1}},
  pages        = {{12--29}},
  publisher    = {{Heldref Publications}},
  journal      = {{Historical Methods}},
  title        = {{A new strategy for linking U.S. historical censuses : A case study for the IPUMS multigenerational longitudinal panel}},
  url          = {{http://dx.doi.org/10.1080/01615440.2021.1985027}},
  doi          = {{10.1080/01615440.2021.1985027}},
  volume       = {{55}},
  year         = {{2022}},
}