An annotated high-content fluorescence microscopy dataset with 2 EGFP-Galectin-3-stained cells and manually labelled outlines
(2024)
- Abstract
- Many forms of bioimage analysis involve the detection of objects and their outlines. In microscopy-based high-throughput drug and genomic screening, and even in smaller-scale microscopy experiments, the objects that most often need to be detected are cells. To develop and benchmark algorithms and neural networks that can perform this task, high-quality datasets with annotated cell outlines are needed.
We have created a dataset, named Aitslab_bioimaging2, consisting of 60 fluorescence microscopy images with EGFP-Galectin-3 labelled cells and their hand-labelled outlines. Images were acquired on a Thermo Fisher CX7 high-content imaging system at 20x magnification as part of an RNA interference screen with a modified U2OS osteosarcoma cell line. Outlines were labelled by three annotators, who had high inter-annotator agreement with each other and with a biomedical expert. The expert labelled some of the objects for comparison and reviewed a subset of the labels, making minor corrections as needed.
The dataset comprises over 2200 annotated cell objects in total, making it sufficient in size to train high-performing neural networks for instance or semantic segmentation. Labels can also easily be converted to boxes for object detection tasks. The dataset is already pre-divided into training, development, and test sets. Matching nuclear staining and outlines are available for part of the dataset from a previous publication (dataset Aitslab_bioimaging1) [https://zenodo.org/records/6677913].
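The conversion of labels to boxes mentioned in the abstract can be sketched as follows. This is a minimal, dependency-free illustration and not the authors' code. It assumes the outlines have already been rasterised into an integer instance mask in which 0 is background and every cell carries its own positive label. The dataset's actual label file format is not described in this record, and the function name `masks_to_boxes` and the box layout are hypothetical choices made for this example.

```python
from typing import Dict, List, Tuple

# Box layout assumed for this sketch: (row_min, col_min, row_max, col_max), inclusive.
Box = Tuple[int, int, int, int]


def masks_to_boxes(label_mask: List[List[int]]) -> Dict[int, Box]:
    """Return one axis-aligned bounding box per non-zero label in an instance mask."""
    boxes: Dict[int, Box] = {}
    for r, row in enumerate(label_mask):
        for c, label in enumerate(row):
            if label == 0:  # assumption: 0 marks background pixels
                continue
            if label not in boxes:
                # First pixel seen for this cell: the box is a single pixel.
                boxes[label] = (r, c, r, c)
            else:
                # Grow the existing box so that it also covers this pixel.
                r0, c0, r1, c1 = boxes[label]
                boxes[label] = (min(r0, r), min(c0, c), max(r1, r), max(c1, c))
    return boxes


# Toy mask with two labelled cells (labels 1 and 2).
example_mask = [
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 2],
    [0, 0, 0, 2, 2],
]
example_boxes = masks_to_boxes(example_mask)
```

For full-size images, an array library would likely be preferable to nested Python lists; the loop form is used here only to keep the sketch self-contained.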
Please use this URL to cite or link to this publication:
https://lup.lub.lu.se/record/0c0ece82-601a-4c64-bffc-38f15bea3b7c
- author
- Kazemi Rashed, Salma LU; Arvidsson, Malou; Ahmed, Rafsan LU and Aits, Sonja LU
- organization
- publishing date
- 2024-06-24
- type
- Working paper/Preprint
- publication status
- published
- subject
- keywords
- deep learning, computer vision, high-content imaging, open data, galectin, supervised learning, instance segmentation
- publisher
- Zenodo
- DOI
- 10.5281/zenodo.12512172
- project
- Lund University AI Research
- Lysosomes in cell death - from molecular mechanisms to new treatment strategies
- Revealing drivers of cell death disruption across species and their links to biodiversity loss and human disease
- Analysing large-scale microscopy datasets with AI-based approaches
- language
- English
- LU publication?
- yes
- id
- 0c0ece82-601a-4c64-bffc-38f15bea3b7c
- date added to LUP
- 2024-12-13 11:05:09
- date last changed
- 2025-04-04 14:25:17
@misc{0c0ece82-601a-4c64-bffc-38f15bea3b7c,
  abstract = {{Many forms of bioimage analysis involve the detection of objects and their outlines. In the context of microscopy-based high-throughput drug and genomic screening and even in smaller scale microscopy experiments, the objects that most often need to be detected are cells. In order to develop and benchmark algorithms and neural networks that can perform this task, high-quality datasets with annotated cell outlines are needed.
We have created a dataset, named Aitslab_bioimaging2, consisting of 60 fluorescence microscopy images with EGFP-Galectin-3 labelled cells and their hand-labelled outlines. Images were acquired on a Thermo Fisher CX7 high-content imaging system at 20x magnification as part of an RNA interference screen with a modified U2OS osteosarcoma cell line. Outlines were labelled by three annotators, who had high inter-annotator agreement between them and with a biomedical expert, who labelled some of the objects for comparison and reviewed a subset of the labels, making minor corrections as needed.
The dataset comprises over 2200 annotated cell objects in total, making it sufficient in size to train high-performing neural networks for instance or semantic segmentation. Labels can also easily be converted to boxes for object detection tasks. The dataset is already pre-divided into training, development, and test sets. Matching nuclear staining and outlines are available for part of the dataset from a previous publication (dataset Aitslab_bioimaging1) [https://zenodo.org/records/6677913].}},
  author = {{Kazemi Rashed, Salma and Arvidsson, Malou and Ahmed, Rafsan and Aits, Sonja}},
  keywords = {{deep learning; computer vision; high-content imaging; open data; galectin; supervised learning; instance segmentation}},
  language = {{eng}},
  month = {{06}},
  note = {{Preprint}},
  publisher = {{Zenodo}},
  title = {{An annotated high-content fluorescence microscopy dataset with 2 EGFP-Galectin-3-stained cells and manually labelled outlines}},
  url = {{http://dx.doi.org/10.5281/zenodo.12512172}},
  doi = {{10.5281/zenodo.12512172}},
  year = {{2024}},
}