SPS-SQL: Enhancing Text-to-SQL Generation on Small-Scale LLMs with Pre-Synthesized Queries

Yan, Liang; Wan, Qichen; Liu, Chuanyi; Duan, Shaoming; Han, Peiyi; Xu, Yong

doi:10.2139/ssrn.5139784

6 Pages Posted: 16 Feb 2025

Large Language Models (LLMs) have demonstrated strong performance in Text-to-SQL generation, which converts natural language questions into SQL queries. While most research focuses on enhancing large LLMs such as OpenAI's GPT-4, small-scale open-source LLMs remain overlooked and underutilized. This paper introduces SPS-SQL, a novel lightweight approach designed to improve Text-to-SQL accuracy on small-scale open-source LLMs. SPS-SQL uses semantic information to extract templates from training data, then pre-synthesizes queries based solely on schema information; these queries serve as few-shot examples to guide subsequent SQL generation. With Llama 3.1 (8 billion parameters), SPS-SQL achieves an execution accuracy of 80.5% on the Spider development set, a 3.9% improvement over the baseline, significantly outperforming other methods on the same model. SPS-SQL also achieves competitive results on other LLMs, which underscores its flexibility and adaptability.
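The pipeline described in the abstract (templates, schema-only query synthesis, few-shot prompting) can be illustrated with a minimal sketch. This is not the authors' implementation: the template strings, the `synthesize_queries` and `build_prompt` helpers, the prompt layout, and the toy schema are all hypothetical, and the templates are hard-coded here rather than extracted from training data with semantic information as the paper describes.

```python
# Hypothetical SQL templates. In SPS-SQL these would be extracted from
# training data; here they are hard-coded purely for illustration.
TEMPLATES = [
    "SELECT {col} FROM {table}",
    "SELECT COUNT(*) FROM {table}",
    "SELECT {col} FROM {table} ORDER BY {col} DESC LIMIT 1",
]


def synthesize_queries(schema, templates=TEMPLATES):
    """Fill each template using only schema information (tables and columns)."""
    queries = []
    for table, columns in schema.items():
        for template in templates:
            if "{col}" in template:
                # One synthesized query per column for column-level templates.
                for col in columns:
                    queries.append(template.format(table=table, col=col))
            else:
                queries.append(template.format(table=table))
    return queries


def build_prompt(schema, question, queries, k=3):
    """Assemble a few-shot prompt from the first k pre-synthesized queries.

    The prompt format is an assumption, not the one used in the paper.
    """
    schema_text = "\n".join(
        f"{table}({', '.join(cols)})" for table, cols in schema.items()
    )
    examples = "\n".join(queries[:k])
    return (
        f"Schema:\n{schema_text}\n\n"
        f"Example queries:\n{examples}\n\n"
        f"Question: {question}\nSQL:"
    )


# Toy schema (hypothetical), mapping table names to column names.
schema = {"singer": ["name", "age"]}
queries = synthesize_queries(schema)
prompt = build_prompt(schema, "How many singers are there?", queries)
```

The resulting `prompt` string would then be sent to a small-scale LLM. A fuller version would presumably select the examples most relevant to the question rather than taking the first `k`.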

Keywords: Text-to-SQL, Large Language Model, In-context Learning, SQL Synthesis

Suggested Citation:

Yan, Liang and Wan, Qichen and Liu, Chuanyi and Duan, Shaoming and Han, Peiyi and Xu, Yong, SPS-SQL: Enhancing Text-to-SQL Generation on Small-Scale LLMs with Pre-Synthesized Queries. Available at SSRN: https://ssrn.com/abstract=5139784 or http://dx.doi.org/10.2139/ssrn.5139784