Prompt Provenance: Toward Traceable LLM Interactions
5 Pages. Posted: 14 Nov 2025
Date Written: October 07, 2025
Abstract
Large Language Models (LLMs) operate as black boxes with transient inputs and outputs: little trace remains of the context, agents, or data that shape their responses. Yet every LLM interaction is a provenance event, that is, an activity producing an artifact under the influence of agents and prior states. This paper introduces the Prompt Provenance Model (PPM), a conceptual model for representing the lineage of prompts, completions, and dialogue histories using the PROV framework of the World Wide Web Consortium (W3C). The PPM extends PROV-O to treat prompts as first-class entities, defining relations between user intent, retrieval sources, system messages, and generated artifacts. The paper posits that capturing prompt-level provenance is essential for auditability, explainability, and regulatory compliance in LLM ecosystems. Example applications illustrate its use for research reproducibility, model debugging, and forensic accountability. The paper contends that prompt provenance is foundational to the trustworthy deployment of Findable, Accessible, Interoperable, and Reusable (FAIR) generative AI.
Keywords: LLM, Prompt Engineering, Provenance, PROV-O, FAIR
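
The abstract's framing of an LLM call as a provenance event (an activity that uses prompts and sources, is associated with agents, and generates a completion) can be illustrated with a minimal sketch. The relation names below are standard PROV-O terms; everything else is an assumption made for illustration, since the abstract does not give the PPM vocabulary: the `ProvenanceGraph` class, the `record_interaction` helper, and identifiers such as `prompt:1` or `source:doc-a` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ProvenanceGraph:
    """Tiny in-memory store of PROV-style nodes and triples (illustrative only)."""
    nodes: dict = field(default_factory=dict)    # node id -> PROV type
    triples: list = field(default_factory=list)  # (subject, relation, object)

    def add_node(self, node_id, prov_type):
        self.nodes[node_id] = prov_type

    def relate(self, subject, relation, obj):
        # Refuse relations that point at nodes the graph does not know about.
        for node_id in (subject, obj):
            if node_id not in self.nodes:
                raise KeyError(f"unknown node: {node_id}")
        self.triples.append((subject, relation, obj))

    def lineage(self, entity_id):
        """Return all entities that the given entity transitively depends on."""
        follow = ("prov:wasGeneratedBy", "prov:used", "prov:wasDerivedFrom")
        seen, stack = set(), [entity_id]
        while stack:
            current = stack.pop()
            for subject, relation, obj in self.triples:
                if subject == current and relation in follow and obj not in seen:
                    seen.add(obj)
                    stack.append(obj)
        # Activities are traversed but only entities are reported.
        return {n for n in seen if self.nodes[n] == "prov:Entity"}


def record_interaction(graph, turn_id, user_id, model_id, input_ids):
    """Record one prompt/completion exchange (hypothetical helper, not the PPM API)."""
    prompt_id = f"prompt:{turn_id}"
    activity_id = f"generation:{turn_id}"
    completion_id = f"completion:{turn_id}"

    graph.add_node(user_id, "prov:Agent")
    graph.add_node(model_id, "prov:Agent")
    graph.add_node(prompt_id, "prov:Entity")        # the prompt is a first-class entity
    graph.add_node(activity_id, "prov:Activity")
    graph.add_node(completion_id, "prov:Entity")

    graph.relate(prompt_id, "prov:wasAttributedTo", user_id)
    graph.relate(activity_id, "prov:wasAssociatedWith", model_id)
    graph.relate(activity_id, "prov:used", prompt_id)
    for input_id in input_ids:
        # System messages, retrieved sources, or earlier completions.
        graph.nodes.setdefault(input_id, "prov:Entity")
        graph.relate(activity_id, "prov:used", input_id)
    graph.relate(completion_id, "prov:wasGeneratedBy", activity_id)
    graph.relate(completion_id, "prov:wasDerivedFrom", prompt_id)
    return completion_id


graph = ProvenanceGraph()
first = record_interaction(graph, 1, "agent:alice", "agent:model",
                           ["system:msg", "source:doc-a"])
# The second turn uses the first completion as dialogue history.
second = record_interaction(graph, 2, "agent:alice", "agent:model", [first])
print(sorted(graph.lineage(second)))
```

In this sketch, querying the lineage of the second completion walks back through the dialogue history to the first prompt, the system message, and the retrieved source, which is the kind of trace that auditing or debugging an interaction would rely on.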