Content and Conduct: How English Wikipedia Moderates Harmful Speech

In this study, we aim to assess the degree to which English-language Wikipedia is successful in addressing harmful speech with a particular focus on the removal of deleterious content. We have conducted qualitative interviews with Wikipedians and carried out a text analysis using machine learning classifiers trained to identify several variations of problematic speech. Overall, we conclude that Wikipedia is largely successful at identifying and quickly removing a vast majority of harmful content despite the large scale of the project. The evidence suggests that efforts to remove malicious content are faster and more effective on Wikipedia articles compared to removal efforts on article talk and user talk pages.

Over time, Wikipedia has developed a multi-layered system for discovering and addressing harmful speech. The system relies most heavily on active patrols of editors looking for damaging content and is complemented by automated tools. This combination of tools and human review of new text enables Wikipedia to quickly take down the most obvious and egregious cases of harmful speech. The Wikipedia approach is decentralized and defers much of the judgement about what is permissible or not to its many editors and administrators. Unlike some social media platforms, which strive to clearly articulate what speech is acceptable or not, Wikipedia has no concise summary of what is acceptable and not. Empowering individuals to make judgement about content, which extends to content and conduct that some would deem harmful, naturally leads to variation across editors in the way that Wikipedia’s guidelines and policies are interpreted and implemented. The general consensus among those editors interviewed for this project was that Wikipedia’s many editors make different judgement about addressing harmful speech. The interviewees also generally agreed that efforts to enforce greater uniformity in editorial choices would not be fruitful.

To understand the prevalence and modalities of harmful speech on Wikipedia, it is important to recognize that Wikipedia is comprised of several distinct, parallel regimes. The governance structures and standards that shape the decisions of Wikipedia articles are significantly different from the processes that govern behavior on article talk pages, user pages, and user talk pages.

Compared to talk pages, Wikipedia articles receive a majority of the negative attention from vandals and trolls. They are also the most carefully monitored. The very nature of the encyclopedia-building enterprise and the intensity of vigilance employed to fight vandals who target articles means that Wikipedia is very effective at removing harmful speech from articles. Talk pages, the publicly viewable but behind-the-scenes discussion pages where editors debate changes, are often the location of heated and bitter debates over what belongs in Wikipedia.

Removal of harmful content from articles, particularly when done quickly, is most likely a deterrent for bad behavior. It is also an effective remedy for the problem. If seen by very few readers, the fleeting presence of damaging content results in proportionately small harm. On the other hand, the personal attacks on other Wikipedians — frequently on talk pages — is not as easily mitigated. Taking down the offending content is helpful but does not entirely erase the negative impact on the individual and community. Governing discourse among Wikipedians continues to be a major challenge for Wikipedia and one not fixed by content removal alone.

This study, which builds upon prior research and tools, focuses on the removal of harmful content. A related and arguably more consequential question is the effectiveness of efforts to inhibit and prevent harmful speech from occurring on the platform in the first place by dissuading this behavior ex ante rather than mopping up after the fact. Future research could adopt similar tools and approaches to this report to track the incidence of harmful speech over time and to test the effectiveness of different interventions to foster pro-social behavior.

The sophisticated and multi-layer mechanisms for addressing harmful speech developed and employed by Wikipedia that we describe in this report arguably represent the most successful large-scale effort to moderate harmful content, despite the ongoing challenges and innumerable contentious debates that shape the system and outcomes we observe. The approaches and strategies employed by Wikipedia provide valuable lessons to other community-reliant online platforms. Broader applicability is limited as the decentralized decision-making and emergent norms of Wikipedia constrain the transferability of these approaches and lessons to commercial platforms.

Suggested Citation: Suggested Citation

Clark, Justin and Faris, Robert and Gasser, Urs and Holland, Adam and Ross, Hilary and Tilton, Casey, Content and Conduct: How English Wikipedia Moderates Harmful Speech (November 2019). The Social Science Research Network Electronic Paper Collection, Research Publication No. 2019-5, November, 2019, Available at SSRN: https://ssrn.com/abstract=3489176