Knowledge Distillation and Transformer-Based Framework for Automatic Spine CT Report Generation
(2025) In IEEE Access 13, p. 42949-42964
- Abstract
Spine Computed Tomography (SCT) is essential for identifying fractures, tumors, and degenerative spine diseases, assisting medical practitioners in formulating an accurate diagnosis and treatment. One of the core elements of SCT is reporting. The effectiveness of spine reporting is often limited by challenges such as inadequate infrastructure and a lack of experts. Automated SCT analysis has the potential to revolutionize spinal healthcare and improve patient outcomes. To achieve this objective, we propose a framework for spine report generation that uses a transformer architecture trained on textual reports alongside visual features extracted from the sagittal slices of the SCT volume. A foundation model is used to perform Knowledge Distillation (KD) alongside an encoder to ensure optimal performance. The proposed framework is evaluated on the public VerSe20 dataset. Incorporating KD improved both the BERTScore and the BLEU-1 score on the dataset, from 0.7486 to 0.7522 and from 0.6361 to 0.7291, respectively. Additionally, the proposed framework is evaluated using four different types of reports: original radiologist reports, reports without spine-level annotations, rephrased reports, and reports generated by ChatGPT-4o (ChatGPT). The evaluation without spine-level annotations demonstrates superior performance across most metrics, achieving the highest BLEU-1 and ROUGE-L scores of 0.9293 and 0.9297, respectively. In contrast, the other report types achieved moderate scores across all metrics. Finally, experienced radiologists assessed the spine reports and rated the original reports higher than the generated reports across all three criteria (completeness, conciseness, and correctness). The study's findings suggest that omitting spine-level annotations can improve the quality of the generated text.
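
The abstract does not give implementation details, so the following is a minimal, hypothetical sketch of the distillation idea it describes: a frozen foundation-model image encoder acts as the teacher, a lightweight student encoder produces visual tokens from sagittal slices, and a transformer decoder generates the report tokens. The module sizes, the MSE distillation term, the teacher output shape, and the loss weighting are illustrative assumptions, not the authors' implementation.

# Sketch only: feature-level KD from a frozen foundation-model encoder into a
# lightweight student, plus a transformer decoder for report generation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class StudentEncoder(nn.Module):
    """Small CNN mapping a sagittal CT slice to a sequence of visual tokens."""
    def __init__(self, d_model=256):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(7),                  # 7x7 = 49 spatial tokens
        )
        self.proj = nn.Linear(64, d_model)

    def forward(self, x):                             # x: (B, 1, H, W)
        feats = self.backbone(x)                      # (B, 64, 7, 7)
        feats = feats.flatten(2).transpose(1, 2)      # (B, 49, 64)
        return self.proj(feats)                       # (B, 49, d_model)

class ReportGenerator(nn.Module):
    """Transformer decoder conditioned on visual tokens; predicts report tokens."""
    def __init__(self, vocab_size, d_model=256, n_layers=4, n_heads=8):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, n_layers)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, visual_tokens, report_ids):
        tgt = self.embed(report_ids)                  # (B, T, d_model)
        T = tgt.size(1)
        causal = torch.triu(torch.full((T, T), float("-inf"), device=tgt.device), 1)
        hidden = self.decoder(tgt, visual_tokens, tgt_mask=causal)
        return self.lm_head(hidden)                   # (B, T, vocab_size)

def training_step(student, teacher, generator, slices, teacher_in, report_ids, kd_weight=0.5):
    """One loss computation: token cross-entropy + feature-level distillation."""
    visual = student(slices)
    with torch.no_grad():                             # foundation-model teacher stays frozen
        target_feats = teacher(teacher_in)            # assumed to return (B, 49, d_model)
    kd_loss = F.mse_loss(visual, target_feats)        # pull student features toward the teacher
    logits = generator(visual, report_ids[:, :-1])    # teacher forcing on shifted report tokens
    ce_loss = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)), report_ids[:, 1:].reshape(-1)
    )
    return ce_loss + kd_weight * kd_loss

In this sketch the relative weight of the distillation term (kd_weight) and the choice of MSE over a KL-based objective are design choices that would need to be tuned; the paper itself should be consulted for the actual configuration.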
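For reference, the surface metrics cited in the abstract (BLEU-1 and ROUGE-L) can be computed at sentence level as sketched below. The whitespace tokenizer and the F1 formulation of ROUGE-L are illustrative choices, not necessarily those used in the paper, and BERTScore is omitted because it requires a pretrained model.

# Sentence-level BLEU-1 and ROUGE-L for one candidate/reference pair.
import math
from collections import Counter

def bleu1(candidate: str, reference: str) -> float:
    """Clipped unigram precision with brevity penalty."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    overlap = sum((Counter(cand) & Counter(ref)).values())   # clipped unigram matches
    precision = overlap / len(cand)
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * precision

def rouge_l(candidate: str, reference: str) -> float:
    """F1 over the longest common subsequence of candidate and reference tokens."""
    cand, ref = candidate.split(), reference.split()
    dp = [[0] * (len(ref) + 1) for _ in range(len(cand) + 1)]  # LCS dynamic program
    for i, c in enumerate(cand, 1):
        for j, r in enumerate(ref, 1):
            dp[i][j] = dp[i-1][j-1] + 1 if c == r else max(dp[i-1][j], dp[i][j-1])
    lcs = dp[-1][-1]
    if lcs == 0:
        return 0.0
    p, r = lcs / len(cand), lcs / len(ref)
    return 2 * p * r / (p + r)

# Example with made-up report snippets:
gen = "no acute fracture of the lumbar spine"
ref = "no acute fracture is seen in the lumbar spine"
print(f"BLEU-1: {bleu1(gen, ref):.4f}, ROUGE-L: {rouge_l(gen, ref):.4f}")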
- author
- Batool, Humaira; Mukhtar, Asmat; Gul Khawaja, Sajid; Alghamdi, Norah Saleh; Mansoor Khan, Asad; Qayyum, Adil; Adil, Ruqqayia; Khan, Zawar; Usman Akram, Muhammad; Usman Akbar, Muhammad; Eklund, Anders
- organization
- publishing date
- 2025
- type
- Contribution to journal
- publication status
- published
- subject
- keywords
- ChatGPT, foundation model, knowledge distillation, Spine report generation
- in
- IEEE Access
- volume
- 13
- pages
- 42949-42964 (16 pages)
- publisher
- IEEE - Institute of Electrical and Electronics Engineers Inc.
- external identifiers
- scopus:105001061916
- ISSN
- 2169-3536
- DOI
- 10.1109/ACCESS.2025.3546131
- language
- English
- LU publication?
- yes
- id
- 0676c43b-c5fb-476a-a11f-08b6ced6cb10
- date added to LUP
- 2025-09-04 09:40:17
- date last changed
- 2025-09-04 09:41:25
@article{0676c43b-c5fb-476a-a11f-08b6ced6cb10,
  abstract  = {{Spine Computed Tomography (SCT) is essential for identifying fractures, tumors, and degenerative spine diseases, assisting medical practitioners in formulating an accurate diagnosis and treatment. One of the core elements of SCT is reporting. The effectiveness of spine reporting is often limited by challenges such as inadequate infrastructure and a lack of experts. Automated SCT analysis has the potential to revolutionize spinal healthcare and improve patient outcomes. To achieve this objective, we propose a framework for spine report generation that uses a transformer architecture trained on textual reports alongside visual features extracted from the sagittal slices of the SCT volume. A foundation model is used to perform Knowledge Distillation (KD) alongside an encoder to ensure optimal performance. The proposed framework is evaluated on the public VerSe20 dataset. Incorporating KD improved both the BERTScore and the BLEU-1 score on the dataset, from 0.7486 to 0.7522 and from 0.6361 to 0.7291, respectively. Additionally, the proposed framework is evaluated using four different types of reports: original radiologist reports, reports without spine-level annotations, rephrased reports, and reports generated by ChatGPT-4o (ChatGPT). The evaluation without spine-level annotations demonstrates superior performance across most metrics, achieving the highest BLEU-1 and ROUGE-L scores of 0.9293 and 0.9297, respectively. In contrast, the other report types achieved moderate scores across all metrics. Finally, experienced radiologists assessed the spine reports and rated the original reports higher than the generated reports across all three criteria (completeness, conciseness, and correctness). The study's findings suggest that omitting spine-level annotations can improve the quality of the generated text.}},
  author    = {{Batool, Humaira and Mukhtar, Asmat and Gul Khawaja, Sajid and Alghamdi, Norah Saleh and Mansoor Khan, Asad and Qayyum, Adil and Adil, Ruqqayia and Khan, Zawar and Usman Akram, Muhammad and Usman Akbar, Muhammad and Eklund, Anders}},
  issn      = {{2169-3536}},
  keywords  = {{ChatGPT; foundation model; knowledge distillation; Spine report generation}},
  language  = {{eng}},
  pages     = {{42949--42964}},
  publisher = {{IEEE - Institute of Electrical and Electronics Engineers Inc.}},
  series    = {{IEEE Access}},
  title     = {{Knowledge Distillation and Transformer-Based Framework for Automatic Spine CT Report Generation}},
  url       = {{http://dx.doi.org/10.1109/ACCESS.2025.3546131}},
  doi       = {{10.1109/ACCESS.2025.3546131}},
  volume    = {{13}},
  year      = {{2025}},
}