AC_MAPPER : a robust approach to ATT&CK technique classification using input augmentation and class rebalancing

Albarrak, Majed; Alqudhaibi, Adel; Jagtap, Sandeep

(2025) In International Journal of Information Security 24.

Abstract: The detection and classification of adversarial techniques from cyber threat intelligence (CTI) text is a critical task in threat analysis and mitigation. While recent transformer-based models have shown promise, their general-purpose nature often limits effectiveness on complex, domain-specific datasets. In this paper, we present a novel model designed to address the challenges of technique classification across heterogeneous CTI datasets. The proposed method is evaluated against several baselines, including CTI-specific models as well as general-purpose transformers like SciBERT and DistilBERT. The proposed approach, “AC_MAPPER”, consistently outperforms all baselines in both accuracy and F1 score across five benchmark datasets, achieving up to 93.59% accuracy and 93.78% macro F1 on the TRAM Bootstrap dataset. It also demonstrates superior robustness on highly imbalanced and sparse datasets such as HALdata and CAPEC, where baseline models struggle. Comprehensive performance comparisons highlight the effectiveness of the proposed approach. These results underscore the potential of integrating domain-specific design with transformer architectures to advance automated CTI analysis. Our findings contribute toward more accurate and reliable threat detection systems in real-world security applications.

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/43d9da2e-9c90-4485-b2e9-aef7774828df

author

Albarrak, Majed ; Alqudhaibi, Adel and Jagtap, Sandeep ^LU

organization

publishing date

2025-12

type

Contribution to journal

publication status

published

subject

Computer Sciences

keywords

Cyber threat intelligence (CTI), Data augmentation, Large language models (LLMs), MITRE ATT&CK framework, Natural language processing (NLP)

in

International Journal of Information Security

volume

24

article number

232

publisher

Springer

external identifiers

scopus:105021235563

ISSN

1615-5262

DOI

10.1007/s10207-025-01146-5

language

English

LU publication?

yes

id

43d9da2e-9c90-4485-b2e9-aef7774828df

date added to LUP

2025-10-31 08:38:27

date last changed

2025-12-09 09:25:06

@article{43d9da2e-9c90-4485-b2e9-aef7774828df,
  abstract     = {{The detection and classification of adversarial techniques from cyber threat intelligence (CTI) text is a critical task in threat analysis and mitigation. While recent transformer-based models have shown promise, their general-purpose nature often limits effectiveness on complex, domain-specific datasets. In this paper, we present a novel model designed to address the challenges of technique classification across heterogeneous CTI datasets. The proposed method is evaluated against several baselines, including CTI-specific models as well as general-purpose transformers like SciBERT and DistilBERT. The proposed approach, “AC_MAPPER”, consistently outperforms all baselines in both accuracy and F1 score across five benchmark datasets, achieving up to 93.59% accuracy and 93.78% macro F1 on the TRAM Bootstrap dataset. It also demonstrates superior robustness on highly imbalanced and sparse datasets such as HALdata and CAPEC, where baseline models struggle. Comprehensive performance comparisons highlight the effectiveness of the proposed approach. These results underscore the potential of integrating domain-specific design with transformer architectures to advance automated CTI analysis. Our findings contribute toward more accurate and reliable threat detection systems in real-world security applications.}},
  author       = {{Albarrak, Majed and Alqudhaibi, Adel and Jagtap, Sandeep}},
  issn         = {{1615-5262}},
  keywords     = {{Cyber threat intelligence (CTI); Data augmentation; Large language models (LLMs); MITRE ATT&CK framework; Natural language processing (NLP)}},
  language     = {{eng}},
  publisher    = {{Springer}},
  series       = {{International Journal of Information Security}},
  title        = {{AC_MAPPER : a robust approach to ATT&CK technique classification using input augmentation and class rebalancing}},
  url          = {{https://lup.lub.lu.se/search/files/232615778/s10207-025-01146-5.pdf}},
  doi          = {{10.1007/s10207-025-01146-5}},
  volume       = {{24}},
  year         = {{2025}},
}
