Synthetic Health Data: Real Ethical Promise and Peril
Hastings Center Report, volume 54, issue 5, 2024 [10.1002/hast.4911]
6 Pages
Date Written: November 02, 2024
Abstract
Researchers and practitioners are increasingly using machine-generated synthetic data as a tool for advancing health science and practice, by expanding access to health data while—potentially—mitigating privacy and related ethical concerns around data sharing. While using synthetic data in this way holds promise, we argue that it also raises significant ethical, legal, and policy concerns, including persistent privacy and security problems, accuracy and reliability issues, worries about fairness and bias, and new regulatory challenges. The virtue of synthetic data is often understood to be its detachment from the data subjects whose measurement data is used to generate it. However, we argue that addressing the ethical issues synthetic data raises might require bringing data subjects back into the picture, finding ways that researchers and data subjects can be more meaningfully engaged in the construction and evaluation of datasets and in the creation of institutional safeguards that promote responsible use.
Keywords: synthetic data, privacy, health data, machine learning, privacy-enhancing technologies, PETs