An annotated high-content fluorescence microscopy dataset with EGFP-Galectin-3-stained cells and manually labelled outlines
(2025) In Data in Brief 58. p.1-9
- Abstract
Many forms of bioimage analysis involve the detection of objects and their outlines. In microscopy-based high-throughput drug and genomic screening, and even in smaller-scale microscopy experiments, the objects that most often need to be detected are cells. To develop and benchmark algorithms and neural networks that can perform this task, high-quality datasets with annotated cell outlines are needed. We have created a dataset, named Aitslab_bioimaging2, consisting of 60 fluorescence microscopy images with EGFP-Galectin-3 labelled cells and their hand-labelled outlines. Images were acquired on a Thermo Fisher CX7 high-content imaging system at 20x magnification as part of an RNA interference screen with a modified U2OS osteosarcoma cell line. Outlines were labelled by three annotators, who had high inter-annotator agreement among themselves and with a biomedical expert, who labelled some of the objects for comparison and reviewed a subset of the labels, making minor corrections as needed. The dataset comprises over 2200 annotated cell objects in total, making it sufficient in size to train high-performing neural networks for instance or semantic segmentation. Labels can also easily be converted to boxes for object detection tasks. The dataset is already pre-divided into training, development, and test sets. Matching nuclear staining and outlines are available for part of the dataset from a previous publication (dataset Aitslab_bioimaging1) [1].
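The abstract notes that the outline labels can be converted to boxes for object detection. The following is a minimal sketch of one way to do that, assuming the labels have been loaded as a 2D integer instance mask in which 0 is background and each cell carries its own positive ID. That mask layout and the function name `instance_mask_to_boxes` are illustrative assumptions; this record does not describe the dataset's actual file format.

```python
def instance_mask_to_boxes(mask):
    """Compute one bounding box per labelled object.

    `mask` is a 2D grid (list of rows) of integer labels.
    Assumption: 0 means background, every other value is one cell.
    Returns {label_id: (x_min, y_min, x_max, y_max)} with inclusive pixel indices.
    """
    boxes = {}
    for y, row in enumerate(mask):
        for x, label_id in enumerate(row):
            if label_id == 0:  # skip background pixels
                continue
            if label_id not in boxes:
                # First pixel seen for this object: box is a single pixel.
                boxes[label_id] = (x, y, x, y)
            else:
                # Grow the box so it also covers the current pixel.
                x0, y0, x1, y1 = boxes[label_id]
                boxes[label_id] = (min(x0, x), min(y0, y), max(x1, x), max(y1, y))
    return boxes


# Toy 5x6 mask with two hypothetical cells (not taken from the dataset).
toy_mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 2, 2],
    [0, 0, 0, 0, 2, 2],
]
boxes = instance_mask_to_boxes(toy_mask)
print(boxes)
```

The boxes use inclusive pixel coordinates; detection frameworks that expect exclusive maxima or width/height would need a small adjustment.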
- author
- Rashed, Salma Kazemi LU; Arvidsson, Malou; Ahmed, Rafsan LU and Aits, Sonja LU
- organization
- LTH Profile Area: Engineering Health
- LU Profile Area: Natural and Artificial Cognition
- Cell Death, Lysosomes and Artificial Intelligence (research group)
- LU Profile Area: Proactive Ageing
- LU Profile Area: Nature-based future solutions
- LTH Profile Area: AI and Digitalization
- EpiHealth: Epidemiology for Health
- LUCC: Lund University Cancer Centre
- eSSENCE: The e-Science Collaboration
- publishing date
- 2025-02
- type
- Contribution to journal
- publication status
- published
- subject
- keywords
- Biomedical image analysis, Computer vision, Deep learning training and evaluation, Fluorescence microscopy, High-content screening, Instance segmentation
- in
- Data in Brief
- volume
- 58
- article number
- 111148
- pages
- 1 - 9
- publisher
- Elsevier
- external identifiers
- pmid:39845150
- scopus:85213574238
- ISSN
- 2352-3409
- DOI
- 10.1016/j.dib.2024.111148
- language
- English
- LU publication?
- yes
- additional info
- Publisher Copyright: © 2024
- id
- 8130bb44-6d36-479e-a0b8-42d25cb346b0
- date added to LUP
- 2025-02-14 02:05:31
- date last changed
- 2025-07-04 14:13:24
@article{8130bb44-6d36-479e-a0b8-42d25cb346b0,
  abstract  = {{Many forms of bioimage analysis involve the detection of objects and their outlines. In microscopy-based high-throughput drug and genomic screening, and even in smaller-scale microscopy experiments, the objects that most often need to be detected are cells. To develop and benchmark algorithms and neural networks that can perform this task, high-quality datasets with annotated cell outlines are needed. We have created a dataset, named Aitslab_bioimaging2, consisting of 60 fluorescence microscopy images with EGFP-Galectin-3 labelled cells and their hand-labelled outlines. Images were acquired on a Thermo Fisher CX7 high-content imaging system at 20x magnification as part of an RNA interference screen with a modified U2OS osteosarcoma cell line. Outlines were labelled by three annotators, who had high inter-annotator agreement among themselves and with a biomedical expert, who labelled some of the objects for comparison and reviewed a subset of the labels, making minor corrections as needed. The dataset comprises over 2200 annotated cell objects in total, making it sufficient in size to train high-performing neural networks for instance or semantic segmentation. Labels can also easily be converted to boxes for object detection tasks. The dataset is already pre-divided into training, development, and test sets. Matching nuclear staining and outlines are available for part of the dataset from a previous publication (dataset Aitslab_bioimaging1) [1].}},
  author    = {{Rashed, Salma Kazemi and Arvidsson, Malou and Ahmed, Rafsan and Aits, Sonja}},
  issn      = {{2352-3409}},
  keywords  = {{Biomedical image analysis; Computer vision; Deep learning training and evaluation; Fluorescence microscopy; High-content screening; Instance segmentation}},
  language  = {{eng}},
  pages     = {{1--9}},
  publisher = {{Elsevier}},
  series    = {{Data in Brief}},
  title     = {{An annotated high-content fluorescence microscopy dataset with EGFP-Galectin-3-stained cells and manually labelled outlines}},
  url       = {{http://dx.doi.org/10.1016/j.dib.2024.111148}},
  doi       = {{10.1016/j.dib.2024.111148}},
  volume    = {{58}},
  year      = {{2025}},
}