Vision transformers for segmenting organs and tissues in CT scans of arbitrary imaging ranges
(2024) In Master's Theses in Mathematical Sciences FMAM05 20241
Mathematics (Faculty of Engineering)
- Abstract
- Automatic segmentation of organs and tissues in computed tomography (CT) images can aid clinicians in anatomical contextualization for planning surgery or dosimetry. CT scans can cover varying axial ranges of the body. This thesis aims to develop a neural network based on vision transformers for segmenting organs and tissues in CT images of arbitrary axial ranges. Two models are presented, one based on Swin UNETR, and a new model that uses axial slices for its patch embedding.
It is difficult to segment each rib and vertebra semantically in limited imaging ranges, and therefore, instance segmentation was implemented for those classes. Both models were trained to perform semantic and instance segmentation simultaneously. Sliding window inference was used to segment arbitrary axial ranges, and methods for ensuring continuity of the instances were developed. The instance segmentation was implemented in two ways, one using a discriminative loss function and one using connected component labeling.
The models presented can perform both semantic and instance segmentation simultaneously with a simple approach. Both models performed well on semantic segmentation of all organs except the ribs and vertebrae, with Dice scores above 0.8 for most organs, and our best model achieved a score of 0.847 on test data, averaged across all organs. Instance segmentation of ribs and vertebrae through discriminative loss worked well, with accurate segmentations and few false positives and false negatives. Separating ribs into instances through the use of connected component labeling gave even better results. Overall, the Swin-based model performed better than the slice model.
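For readers unfamiliar with the Dice score quoted in the abstract, the following is a minimal sketch of how a per-class Dice coefficient and its mean across classes could be computed. It is not taken from the thesis: the function names `dice_score` and `mean_dice`, the toy label values, and the choice to skip classes absent from both volumes when averaging are all assumptions made for illustration. The record does not state how the authors averaged across organs.

```python
def dice_score(pred, target, label):
    """Dice coefficient for one class over two flattened label volumes.

    Dice = 2 * |P ∩ T| / (|P| + |T|), where P and T are the sets of voxels
    assigned `label` in the prediction and the ground truth.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must contain the same number of voxels")
    pred_count = sum(1 for p in pred if p == label)
    target_count = sum(1 for t in target if t == label)
    overlap = sum(1 for p, t in zip(pred, target) if p == label and t == label)
    if pred_count + target_count == 0:
        # Class absent from both volumes: undefined, so signal "skip".
        return None
    return 2.0 * overlap / (pred_count + target_count)


def mean_dice(pred, target, labels):
    """Average Dice over the given labels, ignoring undefined classes (assumption)."""
    scores = [dice_score(pred, target, c) for c in labels]
    scores = [s for s in scores if s is not None]
    return sum(scores) / len(scores) if scores else None


# Toy flattened volumes: 0 = background, 1 and 2 = hypothetical organ labels.
target = [0, 1, 1, 1, 2, 2, 0, 0]
pred = [0, 1, 1, 0, 2, 2, 2, 0]

per_class = {c: dice_score(pred, target, c) for c in (1, 2)}
average = mean_dice(pred, target, labels=(1, 2))
```

In practice the volumes would be 3D arrays and an array library would replace the plain Python loops; flattened lists are used here only to keep the sketch dependency-free.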
Please use this url to cite or link to this publication:
http://lup.lub.lu.se/student-papers/record/9163056
- author
- Thånell, Morris LU and Melander, Petter LU
- supervisor
- organization
- alternative title
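- Vision transformers för segmentering av organ och vävnader i CT-bilder av olika snitt av kroppen (English: Vision transformers for segmenting organs and tissues in CT images of different sections of the body)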
- course
- FMAM05 20241
- year
- 2024
- type
- H2 - Master's Degree (Two Years)
- subject
- keywords
- Semantic segmentation, Instance segmentation, CT, Vision transformer
- publication/series
- Master's Theses in Mathematical Sciences
- report number
- LUTFMA-3546-2024
- ISSN
- 1404-6342
- other publication id
- 2024:E46
- language
- English
- id
- 9163056
- date added to LUP
- 2024-06-14 14:46:04
- date last changed
- 2024-06-14 14:46:04
@misc{9163056,
  abstract     = {{Automatic segmentation of organs and tissues in computed tomography (CT) images can aid clinicians in anatomical contextualization for planning surgery or dosimetry. CT scans can cover varying axial ranges of the body. This thesis aims to develop a neural network based on vision transformers for segmenting organs and tissues in CT images of arbitrary axial ranges. Two models are presented, one based on Swin UNETR, and a new model that uses axial slices for its patch embedding. It is difficult to segment each rib and vertebra semantically in limited imaging ranges, and therefore, instance segmentation was implemented for those classes. Both models were trained to perform semantic and instance segmentation simultaneously. Sliding window inference was used to segment arbitrary axial ranges, and methods for ensuring continuity of the instances were developed. The instance segmentation was implemented in two ways, one using a discriminative loss function and one using connected component labeling. The models presented can perform both semantic and instance segmentation simultaneously with a simple approach. Both models performed well on semantic segmentation of all organs except the ribs and vertebrae, with Dice scores above 0.8 for most organs, and our best model achieved a score of 0.847 on test data, averaged across all organs. Instance segmentation of ribs and vertebrae through discriminative loss worked well, with accurate segmentations and few false positives and false negatives. Separating ribs into instances through the use of connected component labeling gave even better results. Overall, the Swin-based model performed better than the slice model.}},
  author       = {{Thånell, Morris and Melander, Petter}},
  issn         = {{1404-6342}},
  language     = {{eng}},
  note         = {{Student Paper}},
  series       = {{Master's Theses in Mathematical Sciences}},
  title        = {{Vision transformers for segmenting organs and tissues in CT scans of arbitrary imaging ranges}},
  year         = {{2024}},
}