Graph Retrieval-Augmented Generation for Large Language Models: A Survey

4 Pages Posted: 15 Aug 2024

Tyler Procko

Embry-Riddle Aeronautical University

Omar Ochoa

Embry-Riddle Aeronautical University

Date Written: July 13, 2024

Abstract

Large Language Models (LLMs) demonstrate broad general knowledge, but they perform poorly when the specific knowledge a task requires is absent from their training set. Two approaches to ameliorating this without retraining are 1) prompt engineering and 2) Retrieval-Augmented Generation (RAG). RAG is a form of prompt engineering, insofar as relevant lexical snippets retrieved from RAG corpora are vectorized and aggregated with prompts. However, RAG documents are often noisy: while relevant to a given prompt, they can contain much other information that obscures the desired snippet. The purpose of pre-training an LLM on massive, general corpora is to produce a generally applicable model. The purpose of RAG is different: it is a means of LLM optimization, and as such, RAG document selection must be precise, not general. For expert tasks, it is imperative that a RAG corpus be as noise-free as possible, in much the same way that a good prompt should be free of irrelevant text. Knowledge Graphs (KGs) provide a concise means of representing domain knowledge free of noisy information. This paper surveys work incorporating KGs with LLM RAG, intending to equip scientists with a better understanding of this novel research area for future work.
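
To make the idea of aggregating concise KG facts with a prompt concrete, the following is a minimal sketch and not a method taken from the surveyed work. The toy graph `TOY_KG`, the helper names `retrieve_triples` and `augment_prompt`, and the word-overlap matching rule are all assumptions made for illustration. A real system would more likely use entity linking or vector similarity for retrieval.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical toy knowledge graph, for illustration only.
TOY_KG: List[Triple] = [
    ("Paris", "capital_of", "France"),
    ("France", "member_of", "EU"),
    ("Berlin", "capital_of", "Germany"),
]


def retrieve_triples(prompt: str, kg: List[Triple]) -> List[Triple]:
    """Return triples whose subject or object occurs as a word in the prompt.

    Assumption: naive case-insensitive word matching stands in for a real
    retriever (entity linking, embeddings, or graph queries).
    """
    words = {w.strip(".,?!").lower() for w in prompt.split()}
    return [t for t in kg if t[0].lower() in words or t[2].lower() in words]


def augment_prompt(prompt: str, kg: List[Triple]) -> str:
    """Prepend the retrieved triples to the prompt as short factual lines."""
    facts = retrieve_triples(prompt, kg)
    if not facts:
        return prompt  # nothing relevant found; leave the prompt unchanged
    lines = [f"- {s} {p.replace('_', ' ')} {o}" for s, p, o in facts]
    return "Known facts:\n" + "\n".join(lines) + "\n\nQuestion: " + prompt


if __name__ == "__main__":
    print(augment_prompt("What is the capital of France?", TOY_KG))
```

Only the two triples that mention France reach the prompt; the unrelated Berlin triple is left out. This is the sense in which a graph can supply retrieval context with less noise than a full document.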

Keywords: LLM, GPT, fine-tuning, knowledge graphs, RAG

Suggested Citation

Procko, Tyler and Ochoa, Omar, Graph Retrieval-Augmented Generation for Large Language Models: A Survey (July 13, 2024). Available at SSRN: https://ssrn.com/abstract=4895062

Tyler Procko (Contact Author)

Embry-Riddle Aeronautical University

600 South Clyde Morris Blvd.
Daytona Beach, FL 32114
United States
