Vision transformers for segmenting organs and tissues in CT scans of arbitrary imaging ranges

Thånell, Morris; Melander, Petter

Thånell, Morris (LU) and Melander, Petter (LU) (2024). In Master's Theses in Mathematical Sciences, FMAM05 20241
Mathematics (Faculty of Engineering)

Abstract: Automatic segmentation of organs and tissues in computed tomography (CT) images can aid clinicians in anatomical contextualization for planning surgery or dosimetry. CT scans can cover varying axial ranges of the body. This thesis aims to develop a neural network based on vision transformers for segmenting organs and tissues in CT images of arbitrary axial ranges. Two models are presented, one based on Swin UNETR, and a new model that uses axial slices for its patch embedding.

It is difficult to segment each rib and vertebra semantically in limited imaging ranges, and therefore, instance segmentation was implemented for those classes. Both models were trained to perform semantic and instance segmentation simultaneously. Sliding window inference was used to segment arbitrary axial ranges, and methods for ensuring continuity of the instances were developed. The instance segmentation was implemented in two ways, one using a discriminative loss function and one using connected component labeling.
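As a rough illustration of the connected component labeling mentioned above, the sketch below separates a binary 3D mask (for example, all voxels predicted as "rib") into individual instances. It is not the authors' implementation: the function name `label_components`, the choice of 6-connectivity, and the nested-list volume format are assumptions made for this example only.

```python
from collections import deque


def label_components(mask):
    """Label 6-connected foreground components in a 3D binary mask.

    `mask` is a nested list indexed as mask[z][y][x] (hypothetical layout).
    Returns (labels, count), where labels has the same shape as mask,
    background is 0 and each component gets an integer id starting at 1.
    """
    depth, height, width = len(mask), len(mask[0]), len(mask[0][0])
    labels = [[[0] * width for _ in range(height)] for _ in range(depth)]
    # Face neighbours only (6-connectivity); an assumption for this sketch.
    neighbours = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
                  (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    count = 0
    for z in range(depth):
        for y in range(height):
            for x in range(width):
                if not mask[z][y][x] or labels[z][y][x]:
                    continue  # background, or already assigned to a component
                count += 1
                labels[z][y][x] = count
                queue = deque([(z, y, x)])
                # Breadth-first flood fill from the seed voxel.
                while queue:
                    cz, cy, cx = queue.popleft()
                    for dz, dy, dx in neighbours:
                        nz, ny, nx = cz + dz, cy + dy, cx + dx
                        inside = (0 <= nz < depth and 0 <= ny < height
                                  and 0 <= nx < width)
                        if inside and mask[nz][ny][nx] and not labels[nz][ny][nx]:
                            labels[nz][ny][nx] = count
                            queue.append((nz, ny, nx))
    return labels, count


# Toy volume with two separate foreground blobs.
toy_mask = [
    [[1, 1, 0, 0, 1],
     [0, 0, 0, 0, 1]],
    [[1, 0, 0, 0, 0],
     [0, 0, 0, 0, 1]],
]
toy_labels, toy_count = label_components(toy_mask)
```

In this toy volume the two blobs never touch through a voxel face, so each receives its own instance id. Real rib masks may need extra handling, for example when neighbouring ribs touch or a rib is split across sliding windows, which this sketch does not attempt.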

The models presented can perform both semantic and instance segmentation simultaneously with a simple approach. Both models performed well on semantic segmentation of all organs except the ribs and vertebrae, with Dice scores above 0.8 for most organs, and our best model achieved a score of 0.847 on test data, averaged across all organs. Instance segmentation of ribs and vertebrae through discriminative loss worked well, with accurate segmentations and few false positives and false negatives. Separating ribs into instances through the use of connected component labeling gave even better results. Overall, the Swin-based model performed better than the slice model.
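For readers unfamiliar with the metric quoted above, the following minimal sketch computes a per-class Dice score, 2|A ∩ B| / (|A| + |B|), and its mean over classes. The helper names `dice_score` and `mean_dice`, the flat label-list input, and the choice to skip classes absent from both prediction and reference are assumptions for illustration; the record does not state which averaging convention the thesis used.

```python
def dice_score(pred, target, label):
    """Dice overlap for one class between two equal-length label sequences.

    Returns None when the class is absent from both sequences, since the
    score is undefined there (how to treat that case is an assumption).
    """
    pred_count = sum(1 for p in pred if p == label)
    target_count = sum(1 for t in target if t == label)
    overlap = sum(1 for p, t in zip(pred, target) if p == label and t == label)
    if pred_count + target_count == 0:
        return None
    return 2 * overlap / (pred_count + target_count)


def mean_dice(pred, target, labels):
    """Average the per-class Dice scores, ignoring undefined classes."""
    scores = [dice_score(pred, target, label) for label in labels]
    defined = [s for s in scores if s is not None]
    return sum(defined) / len(defined) if defined else None


# Toy example: flattened voxel labels (0 = background, 1 and 2 = two organs).
predicted = [0, 1, 1, 2, 2, 0]
reference = [0, 1, 2, 2, 2, 0]
dice_organ_1 = dice_score(predicted, reference, 1)  # 2*1 / (2+1)
dice_organ_2 = dice_score(predicted, reference, 2)  # 2*2 / (2+3)
average = mean_dice(predicted, reference, [1, 2])
```

A score of 1 means perfect overlap and 0 means none, so an average of 0.847 indicates that predicted and reference masks largely coincide across the evaluated organs.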

Please use this url to cite or link to this publication: http://lup.lub.lu.se/student-papers/record/9163056

author

Thånell, Morris (LU) and Melander, Petter (LU)

supervisor

organization

Mathematics (Faculty of Engineering)

alternative title

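Vision transformers för segmentering av organ och vävnader i CT-bilder av olika snitt av kroppen (Swedish; in English: "Vision transformers for segmenting organs and tissues in CT images of different sections of the body")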

course

FMAM05 20241

year

2024

type

H2 - Master's Degree (Two Years)

subject

Mathematics and Statistics

keywords

Semantic segmentation, Instance segmentation, CT, Vision transformer

publication/series

Master's Theses in Mathematical Sciences

report number

LUTFMA-3546-2024

ISSN

1404-6342

other publication id

2024:E46

language

English

id

9163056

date added to LUP

2024-06-14 14:46:04

date last changed

2024-06-14 14:46:04

@misc{9163056,
  abstract     = {{Automatic segmentation of organs and tissues in computed tomography (CT) images can aid clinicians in anatomical contextualization for planning surgery or dosimetry. CT scans can cover varying axial ranges of the body. This thesis aims to develop a neural network based on vision transformers for segmenting organs and tissues in CT images of arbitrary axial ranges. Two models are presented, one based on Swin UNETR, and a new model that uses axial slices for its patch embedding. 

It is difficult to segment each rib and vertebra semantically in limited imaging ranges, and therefore, instance segmentation was implemented for those classes. Both models were trained to perform semantic and instance segmentation simultaneously. Sliding window inference was used to segment arbitrary axial ranges, and methods for ensuring continuity of the instances were developed. The instance segmentation was implemented in two ways, one using a discriminative loss function and one using connected component labeling. 

The models presented can perform both semantic and instance segmentation simultaneously with a simple approach. Both models performed well on semantic segmentation of all organs except the ribs and vertebrae, with Dice scores above 0.8 for most organs, and our best model achieved a score of 0.847 on test data, averaged across all organs. Instance segmentation of ribs and vertebrae through discriminative loss worked well, with accurate segmentations and few false positives and false negatives. Separating ribs into instances through the use of connected component labeling gave even better results. Overall, the Swin-based model performed better than the slice model.}},
  author       = {{Thånell, Morris and Melander, Petter}},
  issn         = {{1404-6342}},
  language     = {{eng}},
  note         = {{Student Paper}},
  series       = {{Master's Theses in Mathematical Sciences}},
  title        = {{Vision transformers for segmenting organs and tissues in CT scans of arbitrary imaging ranges}},
  year         = {{2024}},
}

LUP Student Papers

LUND UNIVERSITY LIBRARIES
