Recommender Algorithms Do No Harm ~90% But… An Exploratory Risk-Utility Meta-Analysis of Algorithmic Audits
32 Pages · Posted: 2 May 2023 · Last revised: 3 May 2023
Date Written: April 23, 2023
We obtain a quantitatively coarse-grained but wide-ranging evaluation of how frequently recommender algorithms provide ‘good’ and ‘bad’ recommendations. We identified 146 algorithmic audits from 32 studies that report fitting risk-utility statistics from YouTube, Google Search, Twitter, Facebook, TikTok, Amazon, and others. The vast majority of algorithmic recommendations do no harm (around 90%), while about a quarter of recommendations safeguard humans from self-induced harm (‘do good’). The frequency of ‘bad’ recommendations is around 7–10% on average, which poses a potential risk that is notably higher than the risks posed by other consumer products. This average is remarkably robust across the audits and depends neither on the platform nor on the kind of harm (bias/discrimination, mental health and child harm, misinformation, or political extremism). Algorithmic audits find negative feedback loops that lock users into spirals of ‘bad’ recommendations (being ‘dragged down the rabbit hole’), but they find an even larger probability of positive spirals of ‘good’ recommendations. Our analysis refrains from any qualitative judgment of the severity of different harms. As concerns for ‘AI alignment’ with human values grow, necessitating more algorithmic audits, our study offers preliminary figures for quantitative comparison with other contemporary risks of modern life.
Keywords: recommender algorithms, algorithmic auditing, machine behavior, meta-analysis, digital harms