
LUP Student Papers

LUND UNIVERSITY LIBRARIES

Cell Identification from Microscopy Images using Deep Learning on Automatically Labeled Data

Salomon-Sörensen, Fredrik LU (2023) EITM01 20222
Department of Electrical and Information Technology
Abstract
In biology, cell counting provides a fundamental metric for live-cell experiments. Unfortunately, most researchers are constrained to using tedious and invasive methods for counting cells. Automatic identification of cells in microscopy images would therefore be a valuable tool for such researchers.

In recent years, deep learning-based image segmentation methods such as the U-Net have been explored for this task. However, deep learning models often require large amounts of labeled data for training. For identifying cells in microscopy images, this type of labeled data is commonly generated through manual pixel-wise annotations of hundreds of cells. To address this problem, we explore an approach for automatically generating large numbers of labeled examples by imaging cells that were stained with a fluorescent dye. By using fluorescence microscopy alongside non-invasive microscopy, we obtain visualizations of the positions of nuclei in each cell image. We transform the fluorescence images into binary masks with a pipeline based on classical segmentation techniques: histogram equalization through CLAHE and thresholding using Otsu's method. We then use these masks as labels for the cell images, so that each image is accompanied by pixel-wise annotations of the nuclei. We generate datasets for three different cell types, and use them to train U-Net models for automatic cell identification. The trained models show excellent performance (~2% false positives, <1% false negatives), on par with expert annotation.
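
To make the labeling step concrete, the sketch below shows one way such a mask-generation step could look. It is a minimal sketch assuming 8-bit grayscale fluorescence images and the OpenCV library; the CLAHE parameters and file names are hypothetical, and the thesis's actual implementation may differ.

    # Minimal sketch of the fluorescence-to-label step: CLAHE followed by Otsu
    # thresholding. Parameters and file names below are hypothetical.
    import cv2

    def fluorescence_to_mask(fluorescence_img):
        """Turn a grayscale fluorescence image of stained nuclei into a binary label mask."""
        # CLAHE evens out uneven illumination so that dim nuclei survive thresholding.
        clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))  # assumed parameters
        equalized = clahe.apply(fluorescence_img)

        # Otsu's method picks a global threshold separating nuclei from background.
        _, mask = cv2.threshold(equalized, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

        # Nuclei pixels are 255, background 0; this mask becomes the pixel-wise
        # label for the paired non-invasive (e.g. phase-contrast) image.
        return mask

    if __name__ == "__main__":
        fluo = cv2.imread("fluorescence_example.png", cv2.IMREAD_GRAYSCALE)  # hypothetical file
        label_mask = fluorescence_to_mask(fluo)
        cv2.imwrite("label_mask.png", label_mask)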

This method therefore shows great promise as a tool for biologists to perform automatic cell identification and counting. The trained U-Nets can potentially also be used for tracking cells in time-lapse imaging. These new data extraction methods could assist researchers in deepening their understanding of the phenomena that they are studying.
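
For readers unfamiliar with how such labeled pairs are consumed during training, the following is a minimal sketch of fitting a U-Net to image/mask pairs, assuming PyTorch and the segmentation_models_pytorch package; the dataset layout, paths and hyperparameters are hypothetical and not taken from the thesis.

    # Minimal sketch: train a U-Net on automatically labeled image/mask pairs.
    # Directory layout (images/, masks/ with matching file names) is hypothetical.
    import glob
    import os
    import cv2
    import numpy as np
    import torch
    from torch.utils.data import Dataset, DataLoader
    import segmentation_models_pytorch as smp

    class CellDataset(Dataset):
        """Pairs of non-invasive cell images and their fluorescence-derived binary masks."""
        def __init__(self, image_dir, mask_dir):
            self.image_paths = sorted(glob.glob(os.path.join(image_dir, "*.png")))
            self.mask_dir = mask_dir

        def __len__(self):
            return len(self.image_paths)

        def __getitem__(self, idx):
            img_path = self.image_paths[idx]
            mask_path = os.path.join(self.mask_dir, os.path.basename(img_path))
            # Images are assumed to be cropped to dimensions divisible by 32
            # (a requirement of the encoder backbone).
            img = cv2.imread(img_path, cv2.IMREAD_GRAYSCALE).astype(np.float32) / 255.0
            mask = (cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE) > 0).astype(np.float32)
            # Add a channel dimension: (1, H, W).
            return torch.from_numpy(img)[None], torch.from_numpy(mask)[None]

    # Single-channel input, single-channel nucleus/background output.
    model = smp.Unet(encoder_name="resnet34", encoder_weights=None, in_channels=1, classes=1)
    loader = DataLoader(CellDataset("images/", "masks/"), batch_size=4, shuffle=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    loss_fn = torch.nn.BCEWithLogitsLoss()

    model.train()
    for epoch in range(10):  # assumed number of epochs
        for images, masks in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(images), masks)
            loss.backward()
            optimizer.step()
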
Popular Abstract
Starving cancer cells of the amino acid methionine has proven to be a way of
stopping them from growing and dividing. While details behind this phenomenon
are blurry, researchers are investigating this so that it might one day be exploited to fight cancer. This is known as the methionine dependency problem.

A problem that many researchers in biomedicine face is that even though they can conduct cell experiments by giving cells some treatment (in this case, by removing their access to methionine), they do not have access to many methods for extracting experimental data. Researchers are usually interested in measuring cell counts in order to see how the number of cells changes over time when subjected to their chosen treatment. Unfortunately, in order to count cells they often have to remove the cells from their nutrient solution and pass them through a measurement tool. This not only takes a lot of time (counting cells at a single time point can require hours of manual work) but can also have unexpected effects on the experiment. In addition, by exclusively focusing on cell counts, valuable information such as cell movements, divisions and deaths is lost. This information can be found by imaging cells through microscopy, but manually reviewing microscopy images is difficult and time-consuming. Furthermore, a single experiment may yield tens of thousands of images, and analyzing this copious amount of data manually is simply not feasible.

Recent developments in artificial intelligence (AI) and machine learning have opened the door to new ways of tackling this issue. AI techniques have been used to detect specific objects of interest in images for years, but some problems are more difficult than others. For example, while AI-based software in smartphones easily detects faces in camera images, identifying cells in microscopy images has proven to be a difficult task.

If we could train AI to detect cells in microscopy images, we could automatically analyze these tens of thousands of images in no time at all. If all cells were found, we could of course count them with ease. We could also track events such as
movement, division and death, which would be very valuable for the researchers.
There is only one problem with training AI: you need training data. Usually, lots
of it.

For this kind of task, training data would need to consist of two things: cell
microscopy images and pixel-wise locations of the cells in said images. These pixel-wise annotations are called labels. Because microscopy images of cells tend to be hard to analyze even for humans, accurate labels are not easy to obtain. In addition, the images that specific research groups obtain might have unique properties because of experimental setups and chosen cell types. So, even if someone trained AI for this, it might not be easy to share it.

When working with small amounts of data, experts can generate labels by manually marking the location of each cell. This process is difficult and time-consuming, and it becomes an enormous project for larger datasets. However, if there were an automatic way to do this, researchers could easily use their own data to train their own AI for cell counting and tracking purposes.

To solve this labeling problem, we set out to explore automatic labeling methods. We tried a new approach that involved a form of microscopy that separates cells from their background much more clearly. This technique is toxic to the cells and cannot be used for actual cell experiments, but it can be used together with non-toxic microscopy techniques to generate pairs of images: one normal, non-toxic microscopy image of the cells, and one image in which the same cells are clearly separated from the background, revealing their pixel-wise locations. This is exactly the kind of labeled data that you can use to train AI!

To start, we built a pipeline that automatically transformed the raw cell microscopy data into training data that could be used to train AI for cell identification. Then, with state-of-the-art deep learning techniques and our own automatically labeled data, we trained AI to identify cells. We applied our method
to several "different-looking" cell types and obtained excellent results. Experts
evaluated the work of the AI and found that its accuracy was at least on par with expert annotation. We then explored ways to couple the AI with existing cell tracking algorithms, which have been shown to perform well as long as cell locations are provided. We ensured that the entire process would be fully
automatic, so that researchers could make use of this software without having any
experience in AI development.

With our method, it is our hope that we can assist all kinds of biologists in extracting valuable data from their cell experiments, and take one step closer to solving the methionine dependency problem once and for all.
author: Salomon-Sörensen, Fredrik LU
course: EITM01 20222
year: 2023
type: H2 - Master's Degree (Two Years)
keywords: Deep Learning, Cell Identification, Image Segmentation, Nuclei Segmentation, Convolutional Neural Networks, UNet, Image Analysis, Microscopy, Noisy Labels, Phase-Contrast, Automatic Labeling
report number: LU/LTH-EIT 2023-939
language: English
id: 9132460
date added to LUP: 2023-08-15 13:54:49
date last changed: 2023-08-15 13:54:49
@misc{9132460,
  abstract     = {{In biology, cell counting provides a fundamental metric for live-cell experiments. Unfortunately, most researchers are constrained to using tedious and invasive methods for counting cells. Automatic identification of cells in microscopy images would therefore be a valuable tool for such researchers. 

In recent years, deep learning-based image segmentation methods such as the U-Net have been explored for this task. However, deep learning models often require large amounts of labeled data for training. For identifying cells in microscopy images, this type of labeled data is commonly generated through manual pixel-wise annotations of hundreds of cells. To address this problem, we explore an approach for automatically generating large numbers of labeled examples by imaging cells that were stained with a fluorescent dye. By using fluorescence microscopy alongside non-invasive microscopy, we obtain visualizations of the positions of nuclei in each cell image. We transform the fluorescence images into binary masks with a pipeline based on classical segmentation techniques: histogram equalization through CLAHE and thresholding using Otsu's method. We then use these masks as labels for the cell images, so that each image is accompanied by pixel-wise annotations of the nuclei. We generate datasets for three different cell types, and use them to train U-Net models for automatic cell identification. The trained models show excellent performance (~2% false positives, <1% false negatives), on par with expert annotation.

This method therefore shows great promise as a tool for biologists to perform automatic cell identification and counting. The trained U-Nets can potentially also be used for tracking cells in time-lapse imaging. These new data extraction methods could assist researchers in deepening their understanding of the phenomena that they are studying.}},
  author       = {{Salomon-Sörensen, Fredrik}},
  language     = {{eng}},
  note         = {{Student Paper}},
  title        = {{Cell Identification from Microscopy Images using Deep Learning on Automatically Labeled Data}},
  year         = {{2023}},
}