Privacy Preserving Machine Learning
32 Pages, Posted: 6 Nov 2023
Date Written: September 30, 2023
Abstract
This tutorial introduces the evolving field of privacy-preserving machine learning. It discusses how to run models on data held by one or several data providers without any provider having to share unencrypted data with any other party. The topic is particularly relevant for use cases that require large amounts of sensitive personal data and that fall under strict regulations such as the GDPR in the European Union. We introduce the concept of (multiparty) homomorphic encryption and demonstrate the approach on two synthetic datasets of personal health information. The tutorial also introduces the most common (survival) models used to predict health outcomes.
Keywords: life & health, sensitive personal data, cryptography, security, multiparty homomorphic encryption, logistic regression, Cox proportional hazards model, neural network, hazard ratio, relative risk, odds ratio, federated machine learning
JEL Classification: G22, C45, C53, C55, D82, I13