The Construction of Instruction-tuned LLMs for Finance without Instruction Data Using Continual Pretraining and Model Merging

9 Pages. Posted: 7 Nov 2024

Masanori Hirano

Preferred Networks, Inc.

Kentaro Imajo

Preferred Networks, Inc.

Date Written: September 30, 2024

Abstract

This paper proposes a novel method for constructing instruction-tuned large language models (LLMs) for finance without instruction data. Traditionally, developing such domain-specific LLMs has been resource-intensive, requiring a large dataset and significant computational power for continual pretraining and instruction tuning. Our study proposes a simpler approach that combines domain-specific continual pretraining with model merging. Because general-purpose pretrained LLMs and their instruction-tuned counterparts are often publicly available, they can be leveraged to obtain the necessary instruction task vector. By merging this vector with a domain-specific pretrained vector, we can create instruction-tuned LLMs for finance without additional instruction data. Our process involves two steps: first, we perform continual pretraining on financial data; second, we merge the instruction-tuned vector with the domain-specific pretrained vector. Our experiments demonstrate the successful construction of instruction-tuned LLMs for finance. One major advantage of our method is that the instruction-tuned and domain-specific pretrained vectors are nearly independent, which is what makes the approach effective. The Japanese financial instruction-tuned LLMs we developed in this study are available at https://huggingface.co/pfnet/nekomata-14b-pfn-qfin-inst-merge.
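
The two-step process described in the abstract can be illustrated with a minimal sketch of task-vector arithmetic. This is not the authors' implementation: the function names (`task_vector`, `merge`, `cosine_similarity`), the `scale` parameter, and the toy three-element weights are all assumptions introduced for illustration, and the abstract does not specify the exact merging procedure. The sketch assumes the instruction task vector is the element-wise difference between an instruction-tuned model and its base model, and that merging adds this vector to the continually pretrained (domain) model.

```python
import math
from typing import Dict, List

# A model is represented here as a mapping from parameter name to a flat
# list of weights. Real checkpoints hold tensors; this is a toy stand-in.
Weights = Dict[str, List[float]]


def task_vector(tuned: Weights, base: Weights) -> Weights:
    """Element-wise difference tuned - base for every parameter."""
    return {
        name: [t - b for t, b in zip(tuned[name], base[name])]
        for name in base
    }


def merge(domain_model: Weights, instruct_vec: Weights, scale: float = 1.0) -> Weights:
    """Add the instruction task vector to the domain-pretrained model.

    `scale` is a hypothetical knob, not something stated in the abstract.
    """
    return {
        name: [w + scale * d for w, d in zip(domain_model[name], instruct_vec[name])]
        for name in domain_model
    }


def cosine_similarity(u: Weights, v: Weights) -> float:
    """Cosine similarity of two task vectors, flattened across parameters."""
    flat_u = [x for name in u for x in u[name]]
    flat_v = [x for name in u for x in v[name]]
    dot = sum(a * b for a, b in zip(flat_u, flat_v))
    norm_u = math.sqrt(sum(a * a for a in flat_u))
    norm_v = math.sqrt(sum(b * b for b in flat_v))
    return dot / (norm_u * norm_v)


# Toy weights chosen so the two task vectors touch different coordinates.
base = {"layer.weight": [1.0, 2.0, 3.0]}
instruct = {"layer.weight": [1.5, 2.0, 3.0]}   # base + instruction tuning
domain = {"layer.weight": [1.0, 2.25, 3.0]}    # base + continual pretraining

instruct_vec = task_vector(instruct, base)
domain_vec = task_vector(domain, base)
merged = merge(domain, instruct_vec)
similarity = cosine_similarity(instruct_vec, domain_vec)
```

In this toy setup the two task vectors are exactly orthogonal, so `similarity` is zero and the merged weights carry both changes without interference. This is only an idealized picture of the "nearly independent" property the abstract reports; real task vectors would be expected to show a small but nonzero similarity.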

Keywords: finance, large language models, continual pretraining, model merging, instruction

Suggested Citation

Hirano, Masanori and Imajo, Kentaro, The Construction of Instruction-tuned LLMs for Finance without Instruction Data Using Continual Pretraining and Model Merging (September 30, 2024). Available at SSRN: https://ssrn.com/abstract=4971271 or http://dx.doi.org/10.2139/ssrn.4971271

Masanori Hirano (Contact Author)

Preferred Networks, Inc.

Otemachi Bldg., 1-6-1 Otemachi
Chiyoda-ku, Tokyo 1000004
Japan

Kentaro Imajo

Preferred Networks, Inc.

Otemachi Bldg., 1-6-1 Otemachi
Chiyoda-ku, Tokyo 1000004
Japan
