Lund University Publications

Memory-based Language Models : An Efficient, Explainable, and Eco-friendly Approach to Large Language Modeling

van den Bosch, Antal; Risco Patón, Ainhoa; Buijse, Teun; Berck, Peter and van Gompel, Maarten (2025)
Abstract
We present memory-based language modeling as an efficient, eco-friendly alternative to deep neural network-based language modeling. It offers log-linearly scalable next-token prediction performance and strong memorization capabilities. Implementing fast approximations of k-nearest neighbor classification, memory-based language modeling leaves a relatively small ecological footprint both in training and in inference mode, as it relies fully on CPUs and attains low token latencies. Its internal workings are simple and fully transparent. We compare our implementation of memory-based language modeling, OLIFANT, with GPT-2 and GPT-Neo on next-token prediction accuracy, estimated emissions and speeds, and offer some deeper analyses of the model.
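To illustrate the general idea described in the abstract — storing training contexts verbatim and predicting the next token via (approximate) k-nearest-neighbor lookup — here is a minimal Python sketch. It is not the OLIFANT implementation: the context size, the exact-match fast path, and the positional-overlap similarity with exponentially decaying weights are illustrative assumptions only.

```python
# Toy memory-based next-token predictor: stores (context, next-token) pairs at
# "training" time and predicts by exact match or k-nearest-neighbor overlap.
# Illustrative sketch only; not the OLIFANT implementation.
from collections import Counter, defaultdict


class MemoryBasedLM:
    def __init__(self, context_size=4, k=3):
        self.context_size = context_size
        self.k = k
        self.memory = []                    # all (context_tuple, next_token) pairs
        self.exact = defaultdict(Counter)   # exact-match index: context -> next-token counts

    def train(self, tokens):
        """Store every (context, next-token) pair; no gradient updates, CPU only."""
        tokens = ["<s>"] * self.context_size + list(tokens)
        for i in range(self.context_size, len(tokens)):
            ctx = tuple(tokens[i - self.context_size:i])
            nxt = tokens[i]
            self.memory.append((ctx, nxt))
            self.exact[ctx][nxt] += 1

    def predict(self, context):
        """Return the most frequent next token among the nearest stored contexts."""
        ctx = tuple((["<s>"] * self.context_size + list(context))[-self.context_size:])
        # Fast path: the context was seen verbatim during training.
        if ctx in self.exact:
            return self.exact[ctx].most_common(1)[0][0]
        # Fallback: k nearest neighbors by positional overlap, with the most
        # recent context positions weighted highest (an assumed weighting).
        weights = [2 ** p for p in range(self.context_size)]
        def similarity(stored):
            return sum(w for w, a, b in zip(weights, ctx, stored) if a == b)
        neighbors = sorted(self.memory, key=lambda m: similarity(m[0]), reverse=True)[:self.k]
        votes = Counter(nxt for _, nxt in neighbors)
        return votes.most_common(1)[0][0]


if __name__ == "__main__":
    lm = MemoryBasedLM(context_size=3)
    lm.train("the cat sat on the mat and the cat sat on the chair".split())
    print(lm.predict("the cat sat".split()))  # likely "on"
```

Because "training" is plain storage and indexing, and prediction is a hash lookup plus a bounded neighbor search, this style of model runs entirely on CPU, which is the basis of the efficiency claims made in the abstract.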
Please use this url to cite or link to this publication:
author
van den Bosch, Antal; Risco Patón, Ainhoa; Buijse, Teun; Berck, Peter and van Gompel, Maarten
organization
publishing date
2025-10
type
Working paper/Preprint
publication status
published
subject
keywords
AI, machine learning, language modelling
pages
15 pages
publisher
arXiv.org
language
English
LU publication?
yes
id
a6723008-96f5-414f-8ad3-7f4674c2e745
alternative location
https://arxiv.org/abs/2510.22317
date added to LUP
2025-11-17 12:45:16
date last changed
2025-12-01 16:23:17
@misc{a6723008-96f5-414f-8ad3-7f4674c2e745,
  abstract     = {{We present memory-based language modeling as an efficient, eco-friendly alternative to deep neural network-based language modeling. It offers log-linearly scalable next-token prediction performance and strong memorization capabilities. Implementing fast approximations of k-nearest neighbor classification, memory-based language modeling leaves a relatively small ecological footprint both in training and in inference mode, as it relies fully on CPUs and attains low token latencies. Its internal workings are simple and fully transparent. We compare our implementation of memory-based language modeling, OLIFANT, with GPT-2 and GPT-Neo on next-token prediction accuracy, estimated emissions and speeds, and offer some deeper analyses of the model.}},
  author       = {{van den Bosch, Antal and Risco Patón, Ainhoa and Buijse, Teun and Berck, Peter and van Gompel, Maarten}},
  keywords     = {{AI; machine learning; language modelling}},
  language     = {{eng}},
  month        = {{10}},
  note         = {{Preprint}},
  publisher    = {{arXiv.org}},
  title        = {{Memory-based Language Models : An Efficient, Explainable, and Eco-friendly Approach to Large Language Modeling}},
  url          = {{https://arxiv.org/abs/2510.22317}},
  year         = {{2025}},
}