SPS-SQL: Enhancing Text-to-SQL Generation on Small-Scale LLMs with Pre-Synthesized Queries
6 Pages Posted: 16 Feb 2025
Abstract
Large Language Models (LLMs) have demonstrated strong performance in Text-to-SQL generation, the task of converting natural language questions into SQL queries. While most research focuses on enhancing large LLMs such as OpenAI's GPT-4, small-scale open-source LLMs remain overlooked and underutilized. This paper introduces SPS-SQL, a novel lightweight approach designed to boost Text-to-SQL accuracy on small-scale open-source LLMs. By leveraging semantic information to extract templates from training data, SPS-SQL pre-synthesizes queries based solely on schema information; these queries then serve as few-shot examples to guide SQL generation. With Llama 3.1 (8 billion parameters), SPS-SQL achieves an execution accuracy of 80.5% on the Spider development set, a 3.9% improvement over the baseline, significantly outperforming other methods on the same model. SPS-SQL also achieves competitive results on other LLMs, further underscoring its flexibility and adaptability.
Keywords: Text-to-SQL, Large Language Model, In-context Learning, SQL Synthesis