Two subtypes of lung cancer classification from histopathology images based on deep learning

Shen, Xiao Chen

Shen, Xiao Chen ^LU (2019) FYTM03 20182
Computational Biology and Biological Physics - Has been reorganised

Abstract: Adenocarcinoma (LUAD) and squamous cell carcinoma (LUSC) are the most common forms of lung cancer. In clinical practice, these two subtypes are usually distinguished by visual inspection. Our aim is to design an effective strategy based on convolutional neural networks to classify histopathology slides of these two types of lung cancer. The histopathology slides were augmented, different classifiers were trained, and three ensemble learning methods were compared. Finally, we determined which training strategy produced the best result.

With limited samples, we find that combining transfer learning with ensemble learning greatly improves the classification accuracy for whole-slide images of lung tissue. The optimal strategy achieved 94.2% accuracy on 120 training cases and 86.3% accuracy on 80 independent test cases. We consider this combined strategy remarkably effective for classifying the different kinds of lung cancer.
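
As a rough illustration of what combining several classifiers can look like, the sketch below shows two common ensemble rules: averaging predicted probabilities (soft voting) and counting one vote per model (hard voting). This record does not name the three ensemble methods compared in the thesis, so the function names, the 0.5 threshold, and the example probabilities are all assumptions made for illustration only.

```python
from collections import Counter
from statistics import mean

LABELS = ("LUAD", "LUSC")


def soft_vote(prob_luad_per_model):
    """Average each model's predicted probability of LUAD, then threshold at 0.5."""
    avg = mean(prob_luad_per_model)
    return LABELS[0] if avg >= 0.5 else LABELS[1]


def hard_vote(prob_luad_per_model):
    """Let each model cast one vote for its own most likely label."""
    votes = Counter(
        LABELS[0] if p >= 0.5 else LABELS[1] for p in prob_luad_per_model
    )
    return votes.most_common(1)[0][0]


# Hypothetical outputs of three classifiers for one slide (probability of LUAD).
example = [0.95, 0.45, 0.40]
print(soft_vote(example))  # the averaged probability is 0.6
print(hard_vote(example))  # two of the three models vote LUSC
```

The two rules can disagree, as they do for this example: one very confident model outweighs two mildly dissenting ones under soft voting but not under hard voting.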
Popular Abstract: An artificial neural network (ANN), as the name suggests, is a mathematical model intended to simulate the brain, and its development has been inspired by research in brain science. In the 1960s, while studying the locally sensitive and direction-selective neurons in the cat's primary visual cortex, Hubel and Wiesel found a unique network structure that can effectively reduce the complexity of a traditional ANN. Subsequently, many researchers were inspired by this work and built the basic structure of the convolutional neural network (CNN). Thanks to massive reductions in both the amount of computation and the number of parameters, CNNs have achieved unprecedented success in computer vision. CNNs have now become a research hotspot in many fields, such as autonomous driving and natural language processing. Meanwhile, as ANNs have developed, techniques such as transfer learning and ensemble learning have proved effective in many areas of computer vision.

Since visual inspection of histopathology slides is one of the main methods pathologists use to assess the stage, type and subtype of tumors, it is natural to introduce CNNs into medical image classification. Machine learning in medicine can not only improve the accuracy of medical image classification but also reduce the workload of pathologists. This study aims to build a powerful deep CNN when the number of medical images is limited, so that the model achieves better classification accuracy for the two main subtypes of lung cancer, LUAD and LUSC. In a large tissue section image, the cancer cells usually cover only part of the image. So when the large image is segmented into tiles for training, we need to select the tiles that contain important tumor information to train the neural network. We therefore adopted a multi-level training process to filter out the unimportant tiles, allowing our model to capture the important cancer features from the informative images. With this approach, only a small part of the images in the database is needed to reach a classification accuracy higher than 85%. In addition, because many medical images are similar to some degree, we hope to extend this strategy to other related medical image classification tasks.
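
The idea of cutting a large image into tiles and keeping only the informative ones can be sketched as follows. Note that this is not the thesis's method: the thesis filters tiles through a multi-level training process, whereas this sketch swaps in a much simpler brightness heuristic (keep tiles that are mostly darker than a near-white background). The function names, the `background_level` of 220, and the `min_tissue` threshold are hypothetical values chosen for the toy example.

```python
def split_into_tiles(image, tile_size):
    """Yield (top, left, tile) for each full tile_size x tile_size block
    of a 2D grayscale image given as a list of rows."""
    height, width = len(image), len(image[0])
    for top in range(0, height - tile_size + 1, tile_size):
        for left in range(0, width - tile_size + 1, tile_size):
            tile = [row[left:left + tile_size]
                    for row in image[top:top + tile_size]]
            yield top, left, tile


def tissue_fraction(tile, background_level=220):
    """Fraction of pixels darker than the assumed near-white background."""
    pixels = [p for row in tile for p in row]
    return sum(p < background_level for p in pixels) / len(pixels)


def select_informative_tiles(image, tile_size, min_tissue=0.5):
    """Return the (top, left) corners of tiles that are mostly tissue."""
    return [(top, left)
            for top, left, tile in split_into_tiles(image, tile_size)
            if tissue_fraction(tile) >= min_tissue]


# Toy 4x4 "slide": left half stained tissue (dark), right half empty (bright).
toy = [[100, 100, 255, 255] for _ in range(4)]
print(select_informative_tiles(toy, tile_size=2))  # [(0, 0), (2, 0)]
```

In a learned variant of this filter, `tissue_fraction` would be replaced by the confidence score of a first-level classifier, which is closer in spirit to the multi-level process described above.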

Please use this url to cite or link to this publication: http://lup.lub.lu.se/student-papers/record/8991237

author

Shen, Xiao Chen ^LU

supervisor

Mattias Ohlsson ^LU

organization

Computational Biology and Biological Physics - Has been reorganised

course

FYTM03 20182

year

2019

type

H2 - Master's Degree (Two Years)

subject

Physics and Astronomy

keywords

deep learning, lung cancer, transfer learning, ensemble learning

language

English

id

8991237

date added to LUP

2019-08-05 10:47:51

date last changed

2019-08-05 10:47:51

@misc{8991237,
  abstract     = {{Adenocarcinoma (LUAD) and squamous cell carcinoma (LUSC) are the most common forms of lung cancer. These two subtypes of lung cancer are usually classified by visual inspection clinically. Our aim is to design an effective strategy based on convolutional neural networks to classify histopathology slides of these two types of lung cancer. With augmentation of the histopathology slides, different classifiers were trained and three ensemble learning methods were compared in this project. Finally, we determined which training strategy produced the best result.

In the case of limited samples, we find that the combination of transfer learning and ensemble learning greatly improves the classification accuracy for whole-slide images of lung tissue. The optimal strategy achieved 94.2% accuracy with 120 training cases and 86.3% accuracy with 80 independent test cases. We consider this comprehensive strategy remarkable in solving the classification problem with the different kinds of lung cancer.}},
  author       = {{Shen, Xiao Chen}},
  language     = {{eng}},
  note         = {{Student Paper}},
  title        = {{Two subtypes of lung cancer classification from histopathology images based on deep learning}},
  year         = {{2019}},
}

LUP Student Papers

LUND UNIVERSITY LIBRARIES
