Physics of Language Models: Part 2.2, How to Learn From Mistakes on Grade-School Math Problems

ICLR 2025: International Conference on Learning Representations

23 Pages. Posted: 14 May 2025

Tian Ye

CMU

Zicheng Xu

Google Research

Yuanzhi Li

Mohamed bin Zayed University of Artificial Intelligence

Zeyuan Allen-Zhu

Meta Platforms Inc; Allen-Zhu Research

Date Written: August 28, 2024

Abstract

Language models have demonstrated remarkable performance in solving reasoning tasks; however, even the strongest models still occasionally make reasoning mistakes. Recently, there has been active research aimed at improving reasoning accuracy, particularly by using pretrained language models to "self-correct" their mistakes via multi-round prompting. In this paper, we follow this line of work but focus on understanding the usefulness of incorporating "error-correction" data directly into the pretraining stage. This data consists of erroneous solution steps immediately followed by their corrections. Using a synthetic math dataset, we show promising results: this type of pretraining data can help language models achieve higher reasoning accuracy directly (i.e., through simple auto-regression, without multi-round prompting) compared to pretraining on the same amount of error-free data. We also examine many details, such as (1) how this approach differs from beam search, (2) how such data can be prepared, (3) whether masking is needed on the erroneous tokens, (4) the amount of error required, (5) whether such data can be deferred to the fine-tuning stage, and many others.
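
As a rough illustration of what "erroneous solution steps immediately followed by their corrections" could look like as training data, the sketch below builds one such sequence together with an optional loss mask over the erroneous steps. Everything in it is an assumption made for illustration: the `[BACK]` marker string, the function names `make_retry_example` and `strip_errors`, the step-level granularity, and the way wrong steps are sampled are not taken from this abstract, which does not specify the paper's actual data format.

```python
import random

BACK = "[BACK]"  # hypothetical retraction marker, assumed for illustration


def make_retry_example(correct_steps, distractor_steps, error_rate, rng):
    """Insert wrong steps, each immediately followed by BACK, into a solution.

    Returns (steps, loss_mask). loss_mask[i] is 0 for an erroneous step and
    1 otherwise, so a trainer could optionally exclude wrong steps from the
    loss (the abstract lists whether this is needed as an open question).
    """
    steps, loss_mask = [], []
    for step in correct_steps:
        # With probability error_rate, emit one wrong step and retract it.
        if distractor_steps and rng.random() < error_rate:
            steps += [rng.choice(distractor_steps), BACK]
            loss_mask += [0, 1]
        steps.append(step)
        loss_mask.append(1)
    return steps, loss_mask


def strip_errors(steps):
    """Recover the error-free solution by dropping each retracted step."""
    clean = []
    for s in steps:
        if s == BACK:
            clean.pop()  # discard the step that BACK retracts
        else:
            clean.append(s)
    return clean


if __name__ == "__main__":
    rng = random.Random(0)
    solution = ["a = 3", "b = a + 2", "c = b * 2"]
    wrong = ["c = a * 2", "b = a + 3"]
    steps, mask = make_retry_example(solution, wrong, error_rate=0.5, rng=rng)
    for s, m in zip(steps, mask):
        print(m, s)
```

Setting `error_rate` to 0 reproduces the error-free solution, which gives a simple way to compare the two kinds of data at equal size under these assumptions.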

Keywords: pretraining, language model, error correction, error detection

Suggested Citation

Ye, Tian and Xu, Zicheng and Li, Yuanzhi and Allen-Zhu, Zeyuan, Physics of Language Models: Part 2.2, How to Learn From Mistakes on Grade-School Math Problems (August 28, 2024). ICLR 2025: International Conference on Learning Representations, Available at SSRN: https://ssrn.com/abstract=5250631 or http://dx.doi.org/10.2139/ssrn.5250631

Zeyuan Allen-Zhu (Contact Author)

Meta Platforms Inc

Menlo Park, CA
United States

Allen-Zhu Research

WA
United States

HOME PAGE: http://zeyuan.allen-zhu.com
