Automatic Gleason Classification of Prostate Cancer - Classification of Small Regions

Tall, Kasper

Tall, Kasper (LU), 2018. In Master's Theses in Mathematical Sciences, FMAM05 20181
Mathematics (Faculty of Engineering)

Abstract: Purpose: To classify the severity of a case of prostate cancer, physicians use the 10-grade Gleason score. The purpose of this Master’s thesis is to study how small dimensions of image crops affect the Gleason 5-classification capability of a machine learning system. In this thesis, two aspects of dimensionality have been taken into account when creating image crops, the image crop size and the degree of magnification.

Methodology: 70 x 70 and 128 x 128 pixel images, both with a 40X magnification, were cropped from larger tissue images annotated at Skåne University Hospital (SUS), creating one data set for each image crop size. The networks trained on these data sets were as follows: a CNN-architecture, a CNN-architecture with an Inception-v4-module at the end, a ResNet-architecture, and a CNN-architecture with an Inception-ResNet-v1-module at the end.
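The cropping step described in the methodology can be sketched roughly as follows. This is a minimal illustration, not the thesis's code: the record does not say how crop positions were chosen, so the non-overlapping grid (stride equal to crop size), the grayscale pixel values, and the 256 x 256 `region` are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple

Image = Sequence[Sequence[int]]  # rows of pixel values (grayscale for brevity)


def crop_origins(height: int, width: int, size: int, stride: int) -> List[Tuple[int, int]]:
    """Top-left corners of all size x size crops that fit fully inside the image."""
    return [
        (top, left)
        for top in range(0, height - size + 1, stride)
        for left in range(0, width - size + 1, stride)
    ]


def crop(image: Image, top: int, left: int, size: int) -> List[List[int]]:
    """Copy one size x size crop out of the image."""
    return [list(row[left:left + size]) for row in image[top:top + size]]


# Hypothetical 256 x 256 annotated region; real tissue images would be far larger.
region = [[(r * 256 + c) % 251 for c in range(256)] for r in range(256)]

# One data set per crop size, as in the methodology.
# Using stride == size (no overlap) is an assumption of this sketch.
datasets = {
    size: [
        crop(region, top, left, size)
        for top, left in crop_origins(256, 256, size, stride=size)
    ]
    for size in (70, 128)
}

for size, crops in datasets.items():
    print(f"{size} x {size}: {len(crops)} crops")
```

Crops that would extend past the image border are simply skipped here; whether the thesis padded, overlapped, or discarded such regions is not stated in this record.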

Results: The ResNet-architectures performed best on the created data sets, achieving mean 5-fold cross-validation accuracies of 91.9% and 96.5% for the 70 x 70 and 128 x 128 pixel images, respectively. However, these architectures experienced temporary drops in accuracy. Furthermore, the modified CNN-networks could not be shown to definitively outperform the base CNN-networks.
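For readers unfamiliar with the metric, a mean 5-fold cross-validation accuracy can be computed along the following lines. This is only a sketch: the interleaved fold assignment, the one-dimensional toy `scores`, and the `fit_threshold` stand-in for a trained network are assumptions for illustration and do not reproduce the thesis's models or figures.

```python
from statistics import mean
from typing import Callable, List, Sequence, Tuple


def k_fold_splits(n_samples: int, k: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Return (train_indices, test_indices) pairs; fold i tests on every k-th sample."""
    indices = list(range(n_samples))
    splits = []
    for fold in range(k):
        test = indices[fold::k]
        held_out = set(test)
        train = [i for i in indices if i not in held_out]
        splits.append((train, test))
    return splits


def mean_cv_accuracy(features: Sequence, labels: Sequence[int],
                     fit: Callable, k: int = 5) -> float:
    """Train on k-1 folds, test on the held-out fold, and average the k accuracies."""
    accuracies = []
    for train, test in k_fold_splits(len(labels), k):
        predict = fit([features[i] for i in train], [labels[i] for i in train])
        correct = sum(predict(features[i]) == labels[i] for i in test)
        accuracies.append(correct / len(test))
    return mean(accuracies)


def fit_threshold(xs: Sequence[float], ys: Sequence[int]) -> Callable[[float], int]:
    """Toy stand-in for a network: threshold halfway between the two class means."""
    positive = [x for x, y in zip(xs, ys) if y == 1]
    negative = [x for x, y in zip(xs, ys) if y == 0]
    cut = (mean(positive) + mean(negative)) / 2
    return lambda x: int(x > cut)


# Hypothetical data: label 1 stands for Gleason 5, label 0 for non-Gleason 5.
scores = list(range(10)) + [100 + i for i in range(10)]
labels = [0] * 10 + [1] * 10

print(f"mean 5-fold accuracy: {mean_cv_accuracy(scores, labels, fit_threshold):.3f}")
```

Each sample is used for testing exactly once, so the reported mean reflects performance on data the model did not see during the corresponding training run.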

Conclusion: The results indicated that image crops larger than 70 x 70 pixels at 40X magnification were preferable for PCa classification. However, the effect of different architecture designs on classification was inconclusive.
Popular Abstract (translated from Swedish): In medical image analysis, images of, among other things, cell samples are used to train programs to recognize certain patterns, a process known as machine learning. The programs, which can vary in design, can then be used to automatically determine whether samples indicate the presence of serious diseases, which is known as classification. The program design used is also called an architecture. Prostate cancer, one of the most common forms of cancer in men, has previously been studied using many different machine learning techniques. But how large do the images actually need to be to train such programs? This question is worth answering in order to reduce the memory consumption and training time of machine learning programs.

In this Master's thesis, automatic pattern recognition in prostate cancer images of different sizes at 40 times magnification has been studied. The images used consisted of stained sections of biopsy samples. Prostate cancer samples extracted by needle biopsy are assessed on a five-grade scale. As a first step, smaller crops were therefore created from larger prostate cancer images with varying degrees of cancer spread. These images were divided into two categories: Gleason 5 (the most malignant type of prostate cancer) and non-Gleason 5. To train machine learning programs of different designs on different kinds of images, rotated and mirrored variants of the crops were also created. To investigate the effect of size on automatic prostate cancer grading, sets of images of size 70 x 70 and 128 x 128 pixels were created.

Machine learning designs can take different forms depending on what their creators want the programs to focus on in the images. In this thesis, architectures based on several different techniques were created. Modules based on other techniques were also added to some of these architectures, thereby combining different techniques. The methods used include detection of image patterns of different extents within the same image, as well as techniques for simplifying the pattern recognition process. In total, seven different kinds of architectures were created, based on four kinds of machine learning methodologies. Each architecture was specially designed for images of a particular size, except for one architecture that could be used for both 70 x 70 and 128 x 128 pixel crops.

The test results for the programs showed that architectures with larger training and test images achieved higher classification accuracy. However, it could not be proven that any architecture was guaranteed to perform better than the others. The reason was that the modified architectures showed large fluctuations in accuracy during the time they were trained (the test period). The architecture with the highest classification accuracy was also unstable, as it suffered from sharp temporary drops in classification accuracy. Examination of misclassified images for the best architectures showed that these architectures had difficulty classifying images with low contrast or with closely adjacent cells.

In medical images, there can be variation in how malignant the cellular patterns are. When small images are created, they may come to reflect patterns other than those the original images are classified as. Architectures trained on smaller images may therefore come to associate the wrong patterns with a given classification. The results of this thesis indicate that images larger than 70 x 70 may be better suited for prostate cancer classification. As for how architecture design is best adapted to small images, however, the results were unclear. How small the images used to train architectures can be while still achieving good classification results, and which architecture design gives the best classification results, remain for future researchers to determine.
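The popular abstract mentions that rotated and mirrored variants of the crops were created for training. A minimal sketch of such an augmentation is given below. The record does not state which rotation angles were used, so restricting the rotations to multiples of 90 degrees (which keeps square crops square and needs no interpolation) is an assumption of this example, as is the tiny 2 x 2 `patch`.

```python
from typing import List

Patch = List[List[int]]


def rotate90(patch: Patch) -> Patch:
    """Rotate a patch 90 degrees clockwise."""
    return [list(row) for row in zip(*patch[::-1])]


def mirror(patch: Patch) -> Patch:
    """Flip a patch left to right."""
    return [row[::-1] for row in patch]


def augment(patch: Patch) -> List[Patch]:
    """Return the four rotations of the patch and of its mirror image."""
    variants = []
    for base in (patch, mirror(patch)):
        current = base
        for _ in range(4):
            variants.append(current)
            current = rotate90(current)
    return variants


# Hypothetical 2 x 2 patch standing in for a 70 x 70 or 128 x 128 crop.
patch = [[1, 2],
         [3, 4]]

for variant in augment(patch):
    print(variant)
```

For a crop with no internal symmetry this yields eight distinct variants that all share the label of the original crop.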

Please use this url to cite or link to this publication: http://lup.lub.lu.se/student-papers/record/8956950

author

Tall, Kasper (LU)

supervisor

Anders Heyden (LU)
Ida Arvidsson (LU)

organization

Mathematics (Faculty of Engineering)

course

FMAM05 20181

year

2018

type

H2 - Master's Degree (Two Years)

subject

Technology and Engineering

keywords

Prostate cancer, Gleason grading, CNN, Inception, ResNet, Inception-ResNet

publication/series

Master's Theses in Mathematical Sciences

report number

LUTFMA-3363-2018

ISSN

1404-6342

other publication id

2018:E63

language

English

id

8956950

date added to LUP

2018-09-05 14:48:52

date last changed

2018-09-05 14:48:52

@misc{8956950,
  abstract     = {{Purpose: To classify the severity of a case of prostate cancer, physicians use the 10-grade Gleason score. The purpose of this Master’s thesis is to study how small dimensions of image crops affect the Gleason 5-classification capability of a machine learning system. In this thesis, two aspects of dimensionality have been taken into account when creating image crops, the image crop size and the degree of magnification.

Methodology: 70 x 70 and 128 x 128 pixel images, both with a 40X magnification, were cropped from larger tissue images annotated at Skåne University Hospital (SUS), creating one data set for each image crop size. The networks trained on these data sets were as follows: a CNN-architecture, a CNN-architecture with an Inception-v4-module at the end, a ResNet-architecture, and a CNN-architecture with an Inception-ResNet-v1-module at the end.

Results: The ResNet-architectures performed the best on the created data sets, achieving mean 5-fold cross-validation accuracies of 91.9% and 96.5% for the 70 x 70 and 128 x 128 pixel images respectively. However, these architectures experienced temporary drops in accuracy. Furthermore, the modified CNN-networks could not be determined to definitely outperform the base CNN-networks.

Conclusion: The results indicated that image crops of sizes larger than 70 x 70 when using a magnification of 40X were preferable for PCa-classification purposes. However, the classification effects of using different architecture designs were inconclusive.}},
  author       = {{Tall, Kasper}},
  issn         = {{1404-6342}},
  language     = {{eng}},
  note         = {{Student Paper}},
  series       = {{Master's Theses in Mathematical Sciences}},
  title        = {{Automatic Gleason Classification of Prostate Cancer - Classification of Small Regions}},
  year         = {{2018}},
}

LUP Student Papers

LUND UNIVERSITY LIBRARIES
