Generative AI Without Guardrails Can Harm Learning: Evidence from High School Mathematics

The Wharton School Research Paper

Proceedings of the National Academy of Sciences, volume 122, issue 26, 2025 [10.1073/pnas.2422633122]

68 pages. Posted: 18 Jul 2024. Last revised: 1 Apr 2026.

Hamsa Bastani

University of Pennsylvania - The Wharton School

Osbert Bastani

University of Pennsylvania - Department of Computer and Information Science

Alp Sungu

University of Pennsylvania

Haosen Ge

University of Pennsylvania - The Wharton School

Özge Kabakcı

Budapest British International School

Rei Mariman

Independent

Date Written: July 15, 2024

Abstract

Generative AI is poised to revolutionize how humans work, and has already demonstrated promise in significantly improving human productivity. A key question is how generative AI affects learning, namely, how humans acquire new skills as they perform tasks. Learning is critical to long-term productivity, especially since generative AI is fallible and users must check its outputs. We study this question via a field experiment where we provide nearly a thousand high school math students with access to generative AI tutors. To understand the differential impact of tool design on learning, we deploy two generative AI tutors: one that mimics a standard ChatGPT interface (“GPT Base”) and one with prompts designed to safeguard learning (“GPT Tutor”). Consistent with prior work, our results show that having GPT-4 access while solving problems significantly improves performance (48% improvement in grades for GPT Base and 127% for GPT Tutor). However, we additionally find that when access is subsequently taken away, students actually perform worse than those who never had access (17% reduction in grades for GPT Base); that is, unfettered access to GPT-4 can harm educational outcomes. These negative learning effects are largely mitigated by the safeguards in GPT Tutor. Without guardrails, students attempt to use GPT-4 as a “crutch” during practice problem sessions, and subsequently perform worse on their own. Thus, decision-makers must be cautious about design choices underlying generative AI deployments to preserve skill learning and long-term productivity.
* HB, OB, and AS contributed equally

Keywords: Generative AI, Human Capital Development, Education, Human-AI Collaboration, Large Language Models

Suggested Citation

Bastani, Hamsa and Bastani, Osbert and Sungu, Alp and Ge, Haosen and Kabakcı, Özge and Mariman, Rei, Generative AI Without Guardrails Can Harm Learning: Evidence from High School Mathematics (July 15, 2024). The Wharton School Research Paper, Proceedings of the National Academy of Sciences, volume 122, issue 26, 2025[10.1073/pnas.2422633122], Available at SSRN: https://ssrn.com/abstract=4895486 or http://dx.doi.org/10.1073/pnas.2422633122

Hamsa Bastani

University of Pennsylvania - The Wharton School

3641 Locust Walk
Philadelphia, PA 19104-6365
United States

Osbert Bastani

University of Pennsylvania - Department of Computer and Information Science

3330 Walnut Street
Philadelphia, PA 19104
United States

Alp Sungu (Contact Author)

University of Pennsylvania

Philadelphia, PA 19104
United States

Haosen Ge

University of Pennsylvania - The Wharton School

Özge Kabakcı

Budapest British International School

Rei Mariman

Independent
