“How many electricians are needed to change a lightbulb?” Matching the large language model to the task at hand.
23 Pages Posted: 20 Jun 2023
Date Written: June 9, 2023
Large language models (LLMs), such as GPT-4 and Anthropic Claude, are increasingly used by individuals and by diverse actors from a wide range of industries. The largest LLMs have a substantial energy impact, both during training and through their continued use. Here, the potential of eight smaller-scale models, which can be run on high-spec personal machines, is swiftly assessed through eight tasks. Across several relatively simple tasks (realistic use cases), the 4-bit quantized versions of the LLMs Vicuna-7B and WizardML-7B show subjectively satisfactory capabilities, but they fail at other tasks. None of the assessed models is capable of detecting figurative language (one task) and responding appropriately. The findings are discussed in the context of minimizing the ecological impact of such technology.
Keywords: Large language models, Energy efficiency, Subjective performance, Swift assessment
JEL Classification: C6, C63