AI Will Not Want to Self-Improve

25 Pages. Posted: 13 May 2023. Last revised: 26 Oct 2023.

Peter Salib

University of Houston Law Center

Date Written: May 11, 2023


Many accounts of risk from Artificial Intelligence (AI), including existential risk, involve self-improvement. The idea is that, if an AI gained the ability to improve itself, it would do so, since improved capabilities are useful for achieving essentially any goal. An initial round of self-improvement would produce an even more capable AI, which might then be able to improve itself further. And so on, until the resulting agents were superintelligent and impossible to control. Such AIs, if not aligned to promoting human flourishing, would seriously harm humanity in pursuit of their alien goals. To be sure, self-improvement is not a necessary condition for doom. Humans might create dangerous superintelligent AIs without any help from AIs themselves. But in most accounts of AI risk, the probability of self-improvement is a substantial contributing factor.

Here, I argue that AI self-improvement is substantially less likely than is currently assumed. This is not because self-improvement would be technically impossible, or even difficult. Rather, it is because most AIs that could self-improve would have very good reasons not to. What reasons? Surprisingly familiar ones: Improved AIs pose an existential threat to their unimproved originals in the same manner that smarter-than-human AIs pose an existential threat to humans.

Keywords: AI, Artificial Intelligence, Existential Risk, Alignment, Self-Improvement

Suggested Citation

Salib, Peter, AI Will Not Want to Self-Improve (May 11, 2023). U of Houston Law Center No. 2023-A-24. Available at SSRN.

Peter Salib (Contact Author)

University of Houston Law Center

4104 Martin Luther King Blvd.
Houston, TX 77204
United States
