Simultaneous Classification of Sets of Images Using Deep Learning and Clustering
(2020) In Master’s Theses in Mathematical Sciences FMAM05 20201
Mathematics (Faculty of Engineering)
- Abstract
- Classification of cell images is conventionally done manually in hematology laboratories by medical technologists. CellaVision aims to automate this work in order to make the analysis process faster, better and more flexible. The automatic classification is currently done by processing each individual cell image through a Convolutional Neural Network. This methodology does not exploit any correlations that might exist between cells from the same blood sample.
We suggest a method to first compress the images of a whole sample using a Convolutional Neural Network and a Variational Autoencoder, then cluster these compressed data points using DBSCAN clustering and Bayesian Optimization, and finally assign a cell class to each cluster using statistical tools such as Earth Mover's Distance. We used data from CellaVision's system DC-1 to train a Convolutional Neural Network with 90.68% accuracy on training data and 82.85% accuracy on test data. This was used both as a benchmark and as the foundation to our method. We managed to enhance the accuracies to 90.90% on training data and 83.13% on test data by applying our method.
We explored the feasibility of using our method on mixed cell data from different systems, but the results were not as good as on DC-1 data. Applying our method on images of handwritten digits from the MNIST dataset could be made advantageous by forming customized subsets of images. This indicates that our method is versatile enough to use on general image data, provided that correlations within the subsets exist.
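The cluster-then-label idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the images of one sample have already been compressed to low-dimensional points, runs a plain textbook DBSCAN over them, and then relabels every cluster by a simple majority vote over per-image CNN predictions. The majority vote is a stand-in for the statistical tools (such as Earth Mover's Distance) that the abstract mentions but does not detail, and the Bayesian Optimization of the DBSCAN parameters is left out. The sample points, class names, `eps` and `min_samples` values, and the function names `dbscan` and `relabel_by_cluster` are all hypothetical.

```python
from collections import Counter
from math import dist

NOISE = -1  # label used for points that belong to no cluster


def dbscan(points, eps, min_samples):
    """Textbook DBSCAN; returns one cluster label per point (-1 = noise)."""
    labels = [None] * len(points)

    def neighbours(i):
        # All points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_samples:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nearby = neighbours(j)
            if len(nearby) >= min_samples:  # core point: keep growing
                queue.extend(nearby)
        cluster += 1
    return labels


def relabel_by_cluster(cluster_labels, cnn_predictions):
    """Give every clustered image its cluster's most common CNN prediction.

    Assumption: majority vote replaces the thesis's assignment step.
    Noise points keep their individual CNN prediction.
    """
    votes = {}
    for c, pred in zip(cluster_labels, cnn_predictions):
        if c != NOISE:
            votes.setdefault(c, Counter())[pred] += 1
    winners = {c: counts.most_common(1)[0][0] for c, counts in votes.items()}
    return [winners.get(c, pred) for c, pred in zip(cluster_labels, cnn_predictions)]


# Hypothetical 2-D embeddings of eight cell images from one sample.
points = [(0, 0), (0, 0.1), (0.1, 0), (0.1, 0.1),
          (5, 5), (5, 5.1), (5.1, 5), (9, 0)]
# Hypothetical per-image CNN predictions; two of them disagree with their group.
cnn_predictions = ["neutrophil", "neutrophil", "lymphocyte", "neutrophil",
                   "lymphocyte", "lymphocyte", "neutrophil", "monocyte"]

cluster_labels = dbscan(points, eps=0.5, min_samples=3)
refined = relabel_by_cluster(cluster_labels, cnn_predictions)
print(cluster_labels)
print(refined)
```

In this toy sample the two predictions that disagree with their neighbours are overruled by their clusters, while the isolated point is treated as noise and keeps its own prediction. This mirrors the abstract's premise that cells from the same sample carry shared information that a per-image classifier does not use.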
Please use this url to cite or link to this publication:
http://lup.lub.lu.se/student-papers/record/9011189
- author
- Carleke, Nellie LU and Sellerberg, Hugo LU
- supervisor
- Karl Åström LU
- organization
- course
- FMAM05 20201
- year
- 2020
- type
- H2 - Master's Degree (Two Years)
- subject
- publication/series
- Master’s Theses in Mathematical Sciences
- report number
- LUTFMA-3410-2020
- ISSN
- 1404-6342
- other publication id
- 2020:E29
- language
- English
- id
- 9011189
- date added to LUP
- 2020-06-24 13:46:30
- date last changed
- 2020-06-24 13:46:30
@misc{9011189,
  abstract = {{Classification of cell images is conventionally done manually in hematology laboratories by medical technologists. CellaVision aims to automate this work in order to make the analysis process faster, better and more flexible. The automatic classification is currently done by processing each individual cell image through a Convolutional Neural Network. This methodology does not exploit any correlations that might exist between cells from the same blood sample. We suggest a method to first compress the images of a whole sample using a Convolutional Neural Network and a Variational Autoencoder, then cluster these compressed data points using DBSCAN clustering and Bayesian Optimization, and finally assign a cell class to each cluster using statistical tools such as Earth Mover's Distance. We used data from CellaVision's system DC-1 to train a Convolutional Neural Network with 90.68\% accuracy on training data and 82.85\% accuracy on test data. This was used both as a benchmark and as the foundation to our method. We managed to enhance the accuracies to 90.90\% on training data and 83.13\% on test data by applying our method. We explored the feasibility of using our method on mixed cell data from different systems, but the results were not as good as on DC-1 data. Applying our method on images of handwritten digits from the MNIST dataset could be made advantageous by forming customized subsets of images. This indicates that our method is versatile enough to use on general image data, provided that correlations within the subsets exist.}},
  author = {{Carleke, Nellie and Sellerberg, Hugo}},
  issn = {{1404-6342}},
  language = {{eng}},
  note = {{Student Paper}},
  series = {{Master’s Theses in Mathematical Sciences}},
  title = {{Simultaneous Classification of Sets of Images Using Deep Learning and Clustering}},
  year = {{2020}},
}