Private Genetic Genealogy Search
50 Pages. Posted: 8 Jul 2021. Last revised: 9 Nov 2021
Date Written: June 28, 2021
Genetic genealogy search is a powerful tool for identifying individuals within a wider population by using their genetic information in combination with a genealogical network. The current practice relies on searching within a pre-constructed genetic database, and as such exposes those in the database to substantial privacy risks. Motivated by these privacy concerns, we propose a framework for genealogy search that explicitly accounts for the amount of privacy exposure. Instead of collecting a large amount of genetic data beforehand, we advocate for a new search paradigm whereby genetic data are accessed sequentially. Our results show that carefully designed sequential search procedures can significantly outperform existing static approaches in terms of the trade-off between cost and privacy exposure. We further characterize the optimal trade-off and propose a family of search strategies that provably achieve it over path- and grid-like genealogy networks. Finally, we validate our findings via numerical experiments on both real and synthetic genealogical networks and discuss the policy implications of our results.
Keywords: genetic privacy, optimal stopping, graph search, long-range familial search.