Researcher Access to Social Media Data: Lessons from Clinical Trial Data Sharing
96 Pages
Posted: 6 Feb 2024. Last revised: 29 May 2024
Date Written: April 2024
Abstract
For years, social media companies have sparred with lawmakers over how much
independent access to platform data they should provide researchers. Sharing data with
researchers allows the public to better understand the risks and harms associated with social
media, including areas such as misinformation, child safety, and political polarization. Yet
researcher access is controversial. Privacy advocates and companies warn that researchers could
threaten user privacy by using such data irresponsibly. In addition, social media companies raise
concerns over trade secrecy: the data these companies hold and the algorithms powered by
that data are secretive sources of competitive advantage. This Article shows that one way to
navigate this difficult strait is by drawing on lessons from the successful governance program
that has emerged to regulate the sharing of clinical trial data. Like social media data, clinical
trial data implicates both individual privacy and trade secrecy concerns. Nonetheless, clinical
trial data’s governance regime was gradually legislated, regulated, and brokered into existence,
managing the interests of industry, academia, and other stakeholders. The result is a
functionally successful (albeit imperfect) clinical trial data-sharing ecosystem. Part II sketches
the status quo of researchers’ access to social media data and provides a novel taxonomy of
the problems that arise under this regime. Part III reviews the legal structures governing
sharing of clinical trial data and traces the history of scandals, investigations, industry protest,
and legislative response that gave rise to the mix of mandated sharing and experimental
programs we have today. Part IV applies lessons from clinical trial data sharing to social media
data and charts a strategic course forward. Three primary lessons emerge: first, the benefits of
research on otherwise secret data are cascading and unpredictable; second, law without
institutions to implement the law is insufficient; and, third, data access regimes must be tailored
to the different sorts of data they make available.