AlphaQuant: LLM-Driven Automated Robust Feature Engineering for Quantitative Finance

7 Pages. Posted: 25 Mar 2025. Last revised: 18 Apr 2025

Date Written: February 05, 2025

Abstract

Feature engineering is critical to predictive modeling: it transforms raw data into meaningful features that improve model performance. However, traditional feature engineering is labor-intensive and prone to bias, while automated methods often lack robustness and interpretability. This paper introduces a novel framework that combines large language models (LLMs) with evolutionary optimization to automate robust feature discovery. The framework uses LLMs for domain-specific feature generation and evaluates the generated features in a rigorous loop, in which machine-learning models are hyperparameter-tuned with time-series cross-validation on historical asset performance to ensure that the features are robust. The key contributions are: (1) an LLM-powered system for generating domain-relevant, interpretable feature extraction functions; (2) an evolutionary illumination process that iteratively refines the feature set based on importance scores from the tuned models; and (3) empirical validation on financial data demonstrating significant improvements in predictive accuracy and feature robustness. The results highlight the potential of LLMs to revolutionize feature engineering, paving the way for interpretable machine-learning models. The set of discovered features is open-sourced for reproducibility.
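The evaluation loop described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation, and it makes several assumptions: the three candidate features are hypothetical hand-written stand-ins for LLM-generated functions, the data is a synthetic autoregressive return series, and the score is a simple walk-forward correlation used in place of importance scores from tuned models. Only one selection round is shown; the step in which an LLM proposes new candidates is omitted.

```python
import random
import statistics

def walk_forward_splits(n, n_folds, min_train):
    """Yield (train, test) index ranges; every test index follows every train index."""
    fold = (n - min_train) // n_folds
    for k in range(n_folds):
        end_train = min_train + k * fold
        yield range(0, end_train), range(end_train, end_train + fold)

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / (sxx * syy) ** 0.5

# Hypothetical candidates: each uses only returns up to and including time t.
def momentum_5(r, t):
    return sum(r[t - 4:t + 1])

def last_return(r, t):
    return r[t]

def volatility_5(r, t):
    return statistics.pstdev(r[t - 4:t + 1])

CANDIDATES = {
    "momentum_5": momentum_5,
    "last_return": last_return,
    "volatility_5": volatility_5,
}

def robustness_score(feature, returns, n_folds=4, min_train=100, warmup=5):
    """Mean out-of-sample correlation with the next return, signed by the
    in-sample direction, so a feature scores well only if its relationship
    to the target persists in later data."""
    ts = list(range(warmup, len(returns) - 1))
    xs = [feature(returns, t) for t in ts]
    ys = [returns[t + 1] for t in ts]
    fold_scores = []
    for train, test in walk_forward_splits(len(ts), n_folds, min_train):
        in_sample = pearson([xs[i] for i in train], [ys[i] for i in train])
        sign = 1.0 if in_sample >= 0 else -1.0
        out_sample = pearson([xs[i] for i in test], [ys[i] for i in test])
        fold_scores.append(sign * out_sample)
    return statistics.fmean(fold_scores)

def select_top(candidates, returns, k):
    """Score every candidate and keep the k most robust (one selection round)."""
    scored = {name: robustness_score(f, returns) for name, f in candidates.items()}
    return sorted(scored, key=scored.get, reverse=True)[:k], scored

# Synthetic AR(1) returns (an assumption for illustration, not real asset data).
random.seed(0)
returns = [0.0]
for _ in range(499):
    returns.append(0.5 * returns[-1] + random.gauss(0.0, 0.01))

top, scores = select_top(CANDIDATES, returns, k=2)
print(top)
```

The walk-forward split is the design choice that matters here: each fold is scored only on observations that come after its training window, which avoids the look-ahead leakage that shuffled cross-validation would introduce on time-series data.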

Suggested Citation

Yuksel, Kamer Ali, AlphaQuant: LLM-Driven Automated Robust Feature Engineering for Quantitative Finance (February 05, 2025). Available at SSRN: https://ssrn.com/abstract=5124841 or http://dx.doi.org/10.2139/ssrn.5124841
