Skip to main content

Lund University Publications

LUND UNIVERSITY LIBRARIES

Natural language processing (NLP)-based frameworks for cyber threat intelligence and early prediction of cyberattacks in Industry 4.0 : a systematic literature review

Albarrak, Majed ; Salonitis, Konstantinos and Jagtap, Sandeep LU orcid (2026) In Applied Sciences (Switzerland) 16(2).
Abstract
This study provides a systematic overview of Natural Language Processing (NLP)-based frameworks for Cyber Threat Intelligence (CTI) and the early prediction of cyberattacks in Industry 4.0. As digital transformation accelerates through the integration of IoT, SCADA, and cyber-physical systems, manufacturing environments face an expanding and complex cyber threat landscape. Following the PRISMA 2020 systematic review protocol, 80 peer-reviewed studies published between 2015 and 2025 were analyzed across IEEE Xplore, Scopus, and Web of Science to identify methods that employ NLP for CTI extraction, reasoning, and predictive modelling. The review finds that transformer-based architectures, knowledge graph reasoning, and social media mining... (More)
This study provides a systematic overview of Natural Language Processing (NLP)-based frameworks for Cyber Threat Intelligence (CTI) and the early prediction of cyberattacks in Industry 4.0. As digital transformation accelerates through the integration of IoT, SCADA, and cyber-physical systems, manufacturing environments face an expanding and complex cyber threat landscape. Following the PRISMA 2020 systematic review protocol, 80 peer-reviewed studies published between 2015 and 2025 were analyzed across IEEE Xplore, Scopus, and Web of Science to identify methods that employ NLP for CTI extraction, reasoning, and predictive modelling. The review finds that transformer-based architectures, knowledge graph reasoning, and social media mining are increasingly used to convert unstructured data into actionable intelligence, thereby enabling earlier detection and forecasting of cyber threats. Large Language Models (LLMs) demonstrate strong potential for anticipating attack sequences, while domain-specific models enhance industrial relevance. Persistent challenges include data scarcity, domain adaptation, explainability, and real-time scalability in operational-technology environments. The review concludes that NLP is reshaping Industry 4.0 cybersecurity from reactive defense toward predictive, adaptive, and intelligence-driven protection, and it highlights the need for interpretable, domain-specific, and resource-efficient frameworks to secure Industry 4.0 ecosystems. (Less)
Please use this url to cite or link to this publication:
author
; and
organization
publishing date
type
Contribution to journal
publication status
published
subject
in
Applied Sciences (Switzerland)
volume
16
issue
2
article number
619
publisher
MDPI AG
ISSN
2076-3417
DOI
10.3390/app16020619
language
English
LU publication?
yes
additional info
This article belongs to the special issue: Advances in Cyber Security / Georgi R. Tsochev et al., 2026
id
c624c04b-7076-4ffc-912f-b81fd7a731e4
date added to LUP
2026-01-08 09:01:54
date last changed
2026-01-08 10:58:25
@article{c624c04b-7076-4ffc-912f-b81fd7a731e4,
  abstract     = {{This study provides a systematic overview of Natural Language Processing (NLP)-based frameworks for Cyber Threat Intelligence (CTI) and the early prediction of cyberattacks in Industry 4.0. As digital transformation accelerates through the integration of IoT, SCADA, and cyber-physical systems, manufacturing environments face an expanding and complex cyber threat landscape. Following the PRISMA 2020 systematic review protocol, 80 peer-reviewed studies published between 2015 and 2025 were analyzed across IEEE Xplore, Scopus, and Web of Science to identify methods that employ NLP for CTI extraction, reasoning, and predictive modelling. The review finds that transformer-based architectures, knowledge graph reasoning, and social media mining are increasingly used to convert unstructured data into actionable intelligence, thereby enabling earlier detection and forecasting of cyber threats. Large Language Models (LLMs) demonstrate strong potential for anticipating attack sequences, while domain-specific models enhance industrial relevance. Persistent challenges include data scarcity, domain adaptation, explainability, and real-time scalability in operational-technology environments. The review concludes that NLP is reshaping Industry 4.0 cybersecurity from reactive defense toward predictive, adaptive, and intelligence-driven protection, and it highlights the need for interpretable, domain-specific, and resource-efficient frameworks to secure Industry 4.0 ecosystems.}},
  author       = {{Albarrak, Majed and Salonitis, Konstantinos and Jagtap, Sandeep}},
  issn         = {{2076-3417}},
  language     = {{eng}},
  month        = {{01}},
  number       = {{2}},
  publisher    = {{MDPI AG}},
  series       = {{Applied Sciences (Switzerland)}},
  title        = {{Natural language processing (NLP)-based frameworks for cyber threat intelligence and early prediction of cyberattacks in Industry 4.0 : a systematic literature review}},
  url          = {{https://lup.lub.lu.se/search/files/238360495/applsci-16-00619.pdf}},
  doi          = {{10.3390/app16020619}},
  volume       = {{16}},
  year         = {{2026}},
}