Comment: The Essential Role of Policy Evaluation for the 2020 Census Disclosure Avoidance System

Our response to boyd and Sarathy (2022) is now published in the HDSR!

Authors
Affiliations

Christopher T. Kenny

Department of Government, Harvard University

Department of Political Science, Yale University

Cory McCartan

Department of Statistics, Harvard University

Evan Rosenman

Harvard Data Science Initiative

Tyler Simko

Department of Government, Harvard University

Kosuke Imai

Departments of Government and Statistics, Harvard University

Published

January 31, 2023

We’re excited to share that “Comment: The Essential Role of Policy Evaluation for the 2020 Census Disclosure Avoidance System” is now available in the Harvard Data Science Review. We discuss boyd and Sarathy (2022), addressing both factual inaccuracies in their work and their contention that disagreements over privacy in the 2020 Census are primarily academic issues.

In “Differential Perspectives: Epistemic Disconnects Surrounding the U.S. Census Bureau’s Use of Differential Privacy,” boyd and Sarathy argue that empirical evaluations of the Census Disclosure Avoidance System (DAS), including our published analysis (Kenny et al., 2021b), failed to recognize that the benchmark data against which the 2020 DAS was evaluated is never a ground truth of population counts. In this commentary, we explain why policy evaluation, which was the main goal of our analysis, is still meaningful without access to a perfect ground truth. We also point out that our evaluation leveraged features specific to the decennial census and redistricting data, such as block-level population invariance under swapping and voter file racial identification, better approximating a comparison with the ground truth. Lastly, we show that accurate statistical predictions of individual race based on Bayesian Improved Surname Geocoding (BISG), while not a violation of differential privacy, substantially increase the disclosure risk of private information the Census Bureau sought to protect. We conclude by arguing that policymakers must confront a key trade-off between data utility and privacy protection, and that an epistemic disconnect alone is insufficient to explain disagreements over policy choices.