Lund University Publications

Artificial Intelligence Evaluation of 122 969 Mammography Examinations from a Population-based Screening Program

Larsen, Marthe; Aglen, Camilla F.; Lee, Christoph I.; Hoff, Solveig R; Lund-Hanssen, Håkon; Lång, Kristina (LU); Nygård, Jan F; Ursin, Giske and Hofvind, Solveig (2022). In Radiology 303(3), p. 502-511.
Abstract

Background Artificial intelligence (AI) has shown promising results for cancer detection with mammographic screening. However, evidence related to the use of AI in real screening settings remains sparse.
Purpose To compare the performance of a commercially available AI system with routine, independent double reading with consensus as performed in a population-based screening program. Furthermore, the histopathologic characteristics of tumors with different AI scores were explored.
Materials and Methods In this retrospective study, 122 969 screening examinations from 47 877 women performed at four screening units in BreastScreen Norway from October 2009 to December 2018 were included. The data set included 752 screen-detected cancers (6.1 per 1000 examinations) and 205 interval cancers (1.7 per 1000 examinations). Each examination had an AI score between 1 and 10, where 1 indicated low risk of breast cancer and 10 indicated high risk. Threshold 1, threshold 2, and threshold 3 were used to assess the performance of the AI system as a binary decision tool (selected vs not selected). Threshold 1 was set at an AI score of 10, threshold 2 was set to yield a selection rate similar to the consensus rate (8.8%), and threshold 3 was set to yield a selection rate similar to that of an average individual radiologist (5.8%). Descriptive statistics were used to summarize screening outcomes.
Results A total of 653 of 752 screen-detected cancers (86.8%) and 92 of 205 interval cancers (44.9%) were given a score of 10 by the AI system (threshold 1). Using threshold 3, 80.1% of the screen-detected cancers (602 of 752) and 30.7% of the interval cancers (63 of 205) were selected. Screen-detected cancers not selected at the thresholds had favorable histopathologic characteristics compared with those selected; the opposite was observed for interval cancers.
Conclusion The proportion of screen-detected cancers not selected by the artificial intelligence (AI) system at the three evaluated thresholds was less than 20%. The overall performance of the AI system was promising in terms of cancer detection.
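
The rates and percentages quoted in the abstract can be reproduced from the counts it reports. The following is a minimal sketch in Python for readers who want to check the arithmetic; the helper names `per_1000` and `percent` are illustrative and do not come from the study.

```python
# Counts as quoted in the abstract above.
EXAMINATIONS = 122_969
SCREEN_DETECTED = 752
INTERVAL = 205


def per_1000(cases: int, examinations: int = EXAMINATIONS) -> float:
    """Cases per 1000 examinations, rounded to one decimal."""
    return round(1000 * cases / examinations, 1)


def percent(selected: int, total: int) -> float:
    """Share of `total` that was selected, in percent, rounded to one decimal."""
    return round(100 * selected / total, 1)


if __name__ == "__main__":
    print("screen-detected per 1000:", per_1000(SCREEN_DETECTED))
    print("interval per 1000:", per_1000(INTERVAL))
    # Threshold 1 (AI score of 10)
    print("threshold 1, screen-detected %:", percent(653, SCREEN_DETECTED))
    print("threshold 1, interval %:", percent(92, INTERVAL))
    # Threshold 3 (selection rate similar to an average radiologist)
    print("threshold 3, screen-detected %:", percent(602, SCREEN_DETECTED))
    print("threshold 3, interval %:", percent(63, INTERVAL))
    # Screen-detected cancers not selected at threshold 3
    print("threshold 3, not selected %:", percent(SCREEN_DETECTED - 602, SCREEN_DETECTED))
```

The last line shows that about 19.9% of screen-detected cancers were not selected at threshold 3, consistent with the "less than 20%" stated in the conclusion.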

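author
Larsen, Marthe; Aglen, Camilla F.; Lee, Christoph I.; Hoff, Solveig R; Lund-Hanssen, Håkon; Lång, Kristina; Nygård, Jan F; Ursin, Giske and Hofvind, Solveig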
publishing date
2022-06
type
Contribution to journal
publication status
published
in
Radiology
volume
303
issue
3
pages
10 pages
publisher
Radiological Society of North America
external identifiers
  • scopus:85130862001
  • pmid:35348377
ISSN
1527-1315
DOI
10.1148/radiol.212381
language
English
LU publication?
yes
id
1ffd245d-a843-42cd-bd19-ffa8b943051c
date added to LUP
2022-04-20 15:28:19
date last changed
2024-04-18 08:36:45
@article{1ffd245d-a843-42cd-bd19-ffa8b943051c,
  abstract     = {{Background Artificial intelligence (AI) has shown promising results for cancer detection with mammographic screening. However, evidence related to the use of AI in real screening settings remains sparse. Purpose To compare the performance of a commercially available AI system with routine, independent double reading with consensus as performed in a population-based screening program. Furthermore, the histopathologic characteristics of tumors with different AI scores were explored. Materials and Methods In this retrospective study, 122 969 screening examinations from 47 877 women performed at four screening units in BreastScreen Norway from October 2009 to December 2018 were included. The data set included 752 screen-detected cancers (6.1 per 1000 examinations) and 205 interval cancers (1.7 per 1000 examinations). Each examination had an AI score between 1 and 10, where 1 indicated low risk of breast cancer and 10 indicated high risk. Threshold 1, threshold 2, and threshold 3 were used to assess the performance of the AI system as a binary decision tool (selected vs not selected). Threshold 1 was set at an AI score of 10, threshold 2 was set to yield a selection rate similar to the consensus rate (8.8%), and threshold 3 was set to yield a selection rate similar to that of an average individual radiologist (5.8%). Descriptive statistics were used to summarize screening outcomes. Results A total of 653 of 752 screen-detected cancers (86.8%) and 92 of 205 interval cancers (44.9%) were given a score of 10 by the AI system (threshold 1). Using threshold 3, 80.1% of the screen-detected cancers (602 of 752) and 30.7% of the interval cancers (63 of 205) were selected. Screen-detected cancers not selected at the thresholds had favorable histopathologic characteristics compared with those selected; the opposite was observed for interval cancers.
Conclusion The proportion of screen-detected cancers not selected by the artificial intelligence (AI) system at the three evaluated thresholds was less than 20%. The overall performance of the AI system was promising in terms of cancer detection.}},
  author       = {{Larsen, Marthe and Aglen, Camilla F. and Lee, Christoph I. and Hoff, Solveig R and Lund-Hanssen, Håkon and Lång, Kristina and Nygård, Jan F and Ursin, Giske and Hofvind, Solveig}},
  issn         = {{1527-1315}},
  language     = {{eng}},
  month        = {{06}},
  number       = {{3}},
  pages        = {{502--511}},
  publisher    = {{Radiological Society of North America}},
  journal      = {{Radiology}},
  title        = {{Artificial Intelligence Evaluation of 122 969 Mammography Examinations from a Population-based Screening Program}},
  url          = {{http://dx.doi.org/10.1148/radiol.212381}},
  doi          = {{10.1148/radiol.212381}},
  volume       = {{303}},
  year         = {{2022}},
}