Efficient, automated and robust pollen analysis using deep learning

Olsson, Ola; Karlsson, Melanie; Persson, Anna S.; Smith, Henrik G.; Varadarajan, Vidula; Yourstone, Johanna; Stjernman, Martin

(2021) In Methods in Ecology and Evolution 12(5). p.850-862

Abstract: Pollen analysis is an important tool in many fields, including pollination ecology, paleoclimatology, paleoecology, honey quality control, and even medicine and forensics. However, labour‐intensive manual pollen analysis often constrains the number of samples processed or the number of pollen analysed per sample. Thus, there is a desire to develop reliable, high‐throughput, automated systems.
We present an automated method for pollen analysis, based on deep learning convolutional neural networks (CNN). We scanned microscope slides with fuchsine stained, fresh pollen and automatically extracted images of all individual pollen grains. CNN models were trained on reference samples (122,000 pollen grains, from 347 flowers of 83 species of 17 families). The models were used to classify images of different pollen grains in a series of experiments. We also propose an adjustment to reduce overestimation of sample diversity in cases where samples are likely to contain few species.
Accuracy of a model for 83 species was 0.98 when all samples of each species were first pooled, and then split into a training and a validation set (splitting experiment). However, accuracy was much lower (0.41) when individual reference samples from different flowers were kept separate, and one such sample was used for validation of models trained on remaining samples of the species (leave‐one‐out experiment). We therefore combined species into 28 pollen types where a new leave‐one‐out experiment revealed an overall accuracy of 0.68, and recall rates >0.90 in most pollen types. When validating against 63,650 manually identified pollen grains from 370 bumblebee samples, we obtained an accuracy of 0.79, but our adjustment procedure increased this to 0.85.
Validation through splitting experiments may overestimate robustness of CNN pollen analysis in new contexts (samples). Nevertheless, our method has the potential to allow large quantities of real pollen data to be analysed with reasonable accuracy. Although compiling pollen reference libraries is time‐consuming, this is simplified by our method, and can lead to widely accessible and shareable resources for pollen analysis.
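
The gap between the splitting result (0.98) and the leave‐one‐out result (0.41) comes down to how validation images are held out. The following is a minimal sketch of the two hold-out schemes, not the authors' code. It assumes each pollen image is a record carrying a species label and the ID of the reference sample (flower) it came from. The function names, field names, and toy data are hypothetical, and the leave‐one‐out variant here trains on all remaining samples, which may differ in detail from the paper's procedure.

```python
import random
from collections import defaultdict


def pooled_split(records, val_fraction=0.2, seed=0):
    """Splitting experiment: pool all images of a species, then split at random.

    Images from the same flower can end up in both training and validation.
    """
    rng = random.Random(seed)
    by_species = defaultdict(list)
    for rec in records:
        by_species[rec["species"]].append(rec)

    train, val = [], []
    for recs in by_species.values():
        recs = recs[:]  # copy so the caller's list order is untouched
        rng.shuffle(recs)
        n_val = max(1, int(len(recs) * val_fraction))
        val.extend(recs[:n_val])
        train.extend(recs[n_val:])
    return train, val


def leave_one_sample_out(records):
    """Leave-one-out experiment: hold out one whole reference sample at a time.

    Yields (held_out_sample_id, train, val); no flower appears in both sets.
    """
    sample_ids = sorted({rec["sample"] for rec in records})
    for held_out in sample_ids:
        train = [r for r in records if r["sample"] != held_out]
        val = [r for r in records if r["sample"] == held_out]
        yield held_out, train, val


# Hypothetical toy data: 2 species, 3 flowers each, 10 grains per flower.
records = [
    {"species": sp, "sample": f"{sp}-flower{f}", "grain": g}
    for sp in ("species_a", "species_b")
    for f in range(3)
    for g in range(10)
]

train, val = pooled_split(records)
# Flowers that contribute images to both sets under pooled splitting.
shared = {r["sample"] for r in train} & {r["sample"] for r in val}
folds = list(leave_one_sample_out(records))
```

In the pooled split, `shared` is non-empty, so the model is validated on grains from flowers it has already seen. In the leave-one-sample-out folds each validation set comes from an unseen flower, which is closer to the situation of classifying a new field sample.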

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/b86e2226-8a67-40ca-86a3-4241ac6a2b98

author: Olsson, Ola (LU); Karlsson, Melanie (LU); Persson, Anna S. (LU); Smith, Henrik G. (LU); Varadarajan, Vidula; Yourstone, Johanna (LU) and Stjernman, Martin (LU)
publishing date: 2021
type: Contribution to journal
publication status: published
subject: Ecology (including Biodiversity Conservation)
in: Methods in Ecology and Evolution
volume: 12
issue: 5
pages: 13 pages
publisher: John Wiley & Sons Inc.
external identifiers: scopus:85101856554
ISSN: 2041-210X
DOI: 10.1111/2041-210X.13575
project: Scale-dependence of mitigation of pollinator loss
language: English
LU publication?: yes
id: b86e2226-8a67-40ca-86a3-4241ac6a2b98
date added to LUP: 2021-03-18 15:42:50
date last changed: 2025-10-14 09:46:17

@article{b86e2226-8a67-40ca-86a3-4241ac6a2b98,
  abstract     = {{Pollen analysis is an important tool in many fields, including pollination ecology, paleoclimatology, paleoecology, honey quality control, and even medicine and forensics. However, labour‐intensive manual pollen analysis often constrains the number of samples processed or the number of pollen analysed per sample. Thus, there is a desire to develop reliable, high‐throughput, automated systems.
We present an automated method for pollen analysis, based on deep learning convolutional neural networks (CNN). We scanned microscope slides with fuchsine stained, fresh pollen and automatically extracted images of all individual pollen grains. CNN models were trained on reference samples (122,000 pollen grains, from 347 flowers of 83 species of 17 families). The models were used to classify images of different pollen grains in a series of experiments. We also propose an adjustment to reduce overestimation of sample diversity in cases where samples are likely to contain few species.
Accuracy of a model for 83 species was 0.98 when all samples of each species were first pooled, and then split into a training and a validation set (splitting experiment). However, accuracy was much lower (0.41) when individual reference samples from different flowers were kept separate, and one such sample was used for validation of models trained on remaining samples of the species (leave‐one‐out experiment). We therefore combined species into 28 pollen types where a new leave‐one‐out experiment revealed an overall accuracy of 0.68, and recall rates >0.90 in most pollen types. When validating against 63,650 manually identified pollen grains from 370 bumblebee samples, we obtained an accuracy of 0.79, but our adjustment procedure increased this to 0.85.
Validation through splitting experiments may overestimate robustness of CNN pollen analysis in new contexts (samples). Nevertheless, our method has the potential to allow large quantities of real pollen data to be analysed with reasonable accuracy. Although compiling pollen reference libraries is time‐consuming, this is simplified by our method, and can lead to widely accessible and shareable resources for pollen analysis.}},
  author       = {{Olsson, Ola and Karlsson, Melanie and Persson, Anna S. and Smith, Henrik G. and Varadarajan, Vidula and Yourstone, Johanna and Stjernman, Martin}},
  issn         = {{2041-210X}},
  language     = {{eng}},
  number       = {{5}},
  pages        = {{850--862}},
  publisher    = {{John Wiley \& Sons Inc.}},
  journal      = {{Methods in Ecology and Evolution}},
  title        = {{Efficient, automated and robust pollen analysis using deep learning}},
  url          = {{http://dx.doi.org/10.1111/2041-210X.13575}},
  doi          = {{10.1111/2041-210X.13575}},
  volume       = {{12}},
  year         = {{2021}},
}

Lund University Publications
