Data Valuation for Vertical Federated Learning: A Model-free and Privacy-preserving Method

55 Pages, Posted: 30 Jan 2024

Xiao Han

Shanghai University of Finance and Economics - School of Information Management and Engineering

Leye Wang

Peking University

Junjie Wu

Beihang University (BUAA)

Xiao Fang

Lerner College of Business and Economics, University of Delaware

Date Written: January 3, 2024

Abstract

Vertical federated learning (VFL) is a promising paradigm for predictive analytics, empowering an organization (i.e., the task party) to enhance its predictive models through collaboration with multiple data suppliers (i.e., data parties) in a decentralized and privacy-preserving way. Despite the fast-growing interest in VFL, the lack of effective and secure tools for assessing the value of data owned by data parties hinders the application of VFL in business contexts. In response, we propose FedValue, a privacy-preserving, task-specific but model-free data valuation method for VFL, which consists of a data valuation metric and a federated computation method. Specifically, we first introduce a novel data valuation metric, namely MShapley-CMI. The metric evaluates a data party's contribution to a predictive analytics task without the need to execute a machine learning model, making it well-suited for real-world applications of VFL. Next, we develop a federated computation method that calculates the MShapley-CMI value for each data party in a privacy-preserving manner. Extensive experiments conducted on six public datasets validate the efficacy of FedValue for data valuation in the context of VFL. In addition, we illustrate the practical utility of FedValue with a case study involving federated movie recommendations.
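The abstract does not define MShapley-CMI, so the following is only a generic illustration of the idea its name suggests: a Shapley value whose marginal contributions are conditional mutual information (CMI) terms, computed without training any model. The sketch assumes discrete features, pools all columns in one place (so it is not privacy-preserving and does not reflect FedValue's federated computation), and uses hypothetical names (`mutual_information`, `shapley_values`, `parties`) that do not come from the paper. It relies on the chain rule I(Y; X_S, X_i) - I(Y; X_S) = I(Y; X_i | X_S).

```python
from collections import Counter
from itertools import combinations
from math import comb, log2


def mutual_information(rows, labels, features):
    """Empirical I(Y; X_S) in bits for the discrete columns listed in `features`."""
    if not features:
        return 0.0
    n = len(rows)
    keys = [tuple(row[j] for j in features) for row in rows]
    joint = Counter(zip(keys, labels))
    px = Counter(keys)
    py = Counter(labels)
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def shapley_values(rows, labels, parties):
    """Exact Shapley value per party (hypothetical helper, not the paper's method).

    `parties` maps a party name to the list of column indices it owns.
    The utility of a coalition S is I(Y; X_S), so each marginal gain is
    the conditional mutual information I(Y; X_i | X_S).
    """
    names = list(parties)
    m = len(names)
    values = {}
    for name in names:
        others = [p for p in names if p != name]
        total = 0.0
        for k in range(m):
            # Shapley weight |S|! (m - |S| - 1)! / m! for coalitions of size k.
            weight = 1.0 / (m * comb(m - 1, k))
            for subset in combinations(others, k):
                cols = [j for p in subset for j in parties[p]]
                gain = (
                    mutual_information(rows, labels, cols + parties[name])
                    - mutual_information(rows, labels, cols)
                )
                total += weight * gain
        values[name] = total
    return values
```

On a toy dataset where the label simply copies one party's binary column and the other party's column is independent of it, this sketch assigns the first party the full 1 bit and the second party 0. The exact enumeration is exponential in the number of parties, so it is practical only for a handful of them.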

Keywords: data valuation, predictive analytics, privacy, vertical federated learning, federated recommendation

Suggested Citation

Han, Xiao and Wang, Leye and Wu, Junjie and Fang, Xiao, Data Valuation for Vertical Federated Learning: A Model-free and Privacy-preserving Method (January 3, 2024). Available at SSRN: https://ssrn.com/abstract=4682439 or http://dx.doi.org/10.2139/ssrn.4682439

Xiao Han (Contact Author)

Shanghai University of Finance and Economics - School of Information Management and Engineering

No. 100 Wudong Road
Shanghai, Shanghai 200433
China

Leye Wang

Peking University

No. 38 Xueyuan Road
Haidian District
Beijing, Beijing 100871
China

Junjie Wu

Beihang University (BUAA)

37 Xue Yuan Road
Beijing 100083
China

Xiao Fang

Lerner College of Business and Economics, University of Delaware

Newark, DE 19716
United States
