Lund University Publications

Development of Training Materials for Pathologists to Provide Machine Learning Validation Data of Tumor-Infiltrating Lymphocytes in Breast Cancer

Garcia, Victor ; Elfer, Katherine ; Peeters, Dieter J.E. ; Ehinger, Anna ; Werness, Bruce ; Ly, Amy ; Li, Xiaoxian ; Hanna, Matthew G. ; Blenman, Kim R.M. and Salgado, Roberto, et al. (2022) In Cancers 14(10). p.1-14
Abstract

The High Throughput Truthing project aims to develop a dataset for validating artificial intelligence and machine learning models (AI/ML) fit for regulatory purposes. The context of this AI/ML validation dataset is the reporting of stromal tumor-infiltrating lymphocytes (sTILs) density evaluations in hematoxylin and eosin-stained invasive breast cancer biopsy specimens. After completing the pilot study, we found notable variability in the sTILs estimates as well as inconsistencies and gaps in the provided training to pathologists. Using the pilot study data and an expert panel, we created custom training materials to improve pathologist annotation quality for the pivotal study. We categorized regions of interest (ROIs) based on their mean sTILs density and selected ROIs with the highest and lowest sTILs variability. In a series of eight one-hour sessions, the expert panel reviewed each ROI and provided verbal density estimates and comments on features that confounded the sTILs evaluation. We aggregated and shaped the comments to identify pitfalls and instructions to improve our training materials. From these selected ROIs, we created a training set and proficiency test set to improve pathologist training with the goal to improve data collection for the pivotal study. We are not exploring AI/ML performance in this paper. Instead, we are creating materials that will train crowd-sourced pathologists to be the reference standard in a pivotal study to create an AI/ML model validation dataset. The issues discussed here are also important for clinicians to understand about the evaluation of sTILs in clinical practice and can provide insight to developers of AI/ML models.
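The ROI selection step described in the abstract (grouping ROIs by mean sTILs density, then picking those with the highest and lowest variability) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the bin cut-offs in `BINS`, the use of the population standard deviation as the variability measure, the function names, and the example estimates are all hypothetical and chosen only for illustration.

```python
from statistics import mean, pstdev

# Hypothetical density bins as (label, upper bound in percent sTILs).
# These cut-offs are illustrative assumptions, not values taken from the paper.
BINS = [("low", 10), ("moderate", 40), ("high", 100)]


def density_bin(mean_density):
    """Return the label of the first bin whose upper bound covers mean_density."""
    for label, upper in BINS:
        if mean_density <= upper:
            return label
    raise ValueError(f"density out of range: {mean_density}")


def select_rois(estimates, k=1):
    """Pick the k least and k most variable ROIs within each density bin.

    estimates maps an ROI id to the list of pathologist sTILs estimates (percent).
    Variability is measured here by the population standard deviation (an assumption).
    """
    by_bin = {}
    for roi_id, values in estimates.items():
        label = density_bin(mean(values))
        by_bin.setdefault(label, []).append((pstdev(values), roi_id))

    selected = {}
    for label, rois in by_bin.items():
        rois.sort()  # ascending by standard deviation
        selected[label] = {
            "lowest": [roi_id for _, roi_id in rois[:k]],
            "highest": [roi_id for _, roi_id in rois[-k:]],
        }
    return selected


# Made-up estimates from four hypothetical pathologists per ROI.
example = {
    "roi_a": [5, 5, 6, 4],
    "roi_b": [1, 10, 2, 15],
    "roi_c": [20, 30, 25, 25],
    "roi_d": [10, 60, 20, 30],
    "roi_e": [70, 80, 75, 75],
}
result = select_rois(example)
print(result)
```

With these made-up numbers, `roi_a` and `roi_b` fall in the low bin and `roi_c` and `roi_d` in the moderate bin. In a bin that holds a single ROI, such as `roi_e` in the high bin, that ROI is returned as both the lowest and the highest variability case.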

author
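Garcia, Victor ; Elfer, Katherine ; Peeters, Dieter J.E. ; Ehinger, Anna ; Werness, Bruce ; Ly, Amy ; Li, Xiaoxian ; Hanna, Matthew G. ; Blenman, Kim R.M. ; Salgado, Roberto and Gallas, Brandon D.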
organization
publishing date
type
Contribution to journal
publication status
published
subject
keywords
biomarker, expert panel, pathologist training/education, tumor-infiltrating lymphocytes, validation dataset
in
Cancers
volume
14
issue
10
article number
2467
pages
1 - 14
publisher
MDPI AG
external identifiers
  • scopus:85130864152
  • pmid:35626070
ISSN
2072-6694
DOI
10.3390/cancers14102467
language
English
LU publication?
yes
additional info
Funding Information: This work was supported by the FDA Office of Women’s Health (FDA-OWH-2021-Gallas). This project was supported in part by an appointment (V.G.) to the ORISE Research Participation Program at the CDRH, U.S. Food and Drug Administration, administered by the Oak Ridge Institute for Science and Education through an interagency agreement between the U.S. Department of Energy and FDA/Center (FDA-ORISE-DIDSR 2022).
Publisher Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland.
id
f857aedf-2b61-4237-b41d-b6b2bc064e1d
date added to LUP
2022-08-09 16:40:04
date last changed
2024-06-13 18:11:05
@article{f857aedf-2b61-4237-b41d-b6b2bc064e1d,
  abstract     = {{The High Throughput Truthing project aims to develop a dataset for validating artificial intelligence and machine learning models (AI/ML) fit for regulatory purposes. The context of this AI/ML validation dataset is the reporting of stromal tumor-infiltrating lymphocytes (sTILs) density evaluations in hematoxylin and eosin-stained invasive breast cancer biopsy specimens. After completing the pilot study, we found notable variability in the sTILs estimates as well as inconsistencies and gaps in the provided training to pathologists. Using the pilot study data and an expert panel, we created custom training materials to improve pathologist annotation quality for the pivotal study. We categorized regions of interest (ROIs) based on their mean sTILs density and selected ROIs with the highest and lowest sTILs variability. In a series of eight one-hour sessions, the expert panel reviewed each ROI and provided verbal density estimates and comments on features that confounded the sTILs evaluation. We aggregated and shaped the comments to identify pitfalls and instructions to improve our training materials. From these selected ROIs, we created a training set and proficiency test set to improve pathologist training with the goal to improve data collection for the pivotal study. We are not exploring AI/ML performance in this paper. Instead, we are creating materials that will train crowd-sourced pathologists to be the reference standard in a pivotal study to create an AI/ML model validation dataset. The issues discussed here are also important for clinicians to understand about the evaluation of sTILs in clinical practice and can provide insight to developers of AI/ML models.}},
  author       = {{Garcia, Victor and Elfer, Katherine and Peeters, Dieter J.E. and Ehinger, Anna and Werness, Bruce and Ly, Amy and Li, Xiaoxian and Hanna, Matthew G. and Blenman, Kim R.M. and Salgado, Roberto and Gallas, Brandon D.}},
  issn         = {{2072-6694}},
  keywords     = {{biomarker; expert panel; pathologist training/education; tumor-infiltrating lymphocytes; validation dataset}},
  language     = {{eng}},
  number       = {{10}},
  pages        = {{1--14}},
  publisher    = {{MDPI AG}},
  journal      = {{Cancers}},
  title        = {{Development of Training Materials for Pathologists to Provide Machine Learning Validation Data of Tumor-Infiltrating Lymphocytes in Breast Cancer}},
  url          = {{http://dx.doi.org/10.3390/cancers14102467}},
  doi          = {{10.3390/cancers14102467}},
  volume       = {{14}},
  year         = {{2022}},
}