Artificial intelligence-supported screen reading versus standard double reading in the Mammography Screening with Artificial Intelligence trial (MASAI) : a clinical safety analysis of a randomised, controlled, non-inferiority, single-blinded, screening accuracy study

Lång, Kristina; Josefsson, Viktoria; Larsson, Anna-Maria; Larsson, Stefan; Högberg, Charlotte; Sartor, Hanna; Hofvind, Solveig; Andersson, Ingvar; Rosso, Aldana

(2023) In The Lancet. Oncology 24(8). p.936-944

Abstract

BACKGROUND: Retrospective studies have shown promising results using artificial intelligence (AI) to improve mammography screening accuracy and reduce screen-reading workload; however, to our knowledge, a randomised trial has not yet been conducted. We aimed to assess the clinical safety of an AI-supported screen-reading protocol compared with standard screen reading by radiologists following mammography.

METHODS: In this randomised, controlled, population-based trial, women aged 40-80 years eligible for mammography screening (including general screening with 1·5-2-year intervals and annual screening for those with moderate hereditary risk of breast cancer or a history of breast cancer) at four screening sites in Sweden were informed about the study as part of the screening invitation. Those who did not opt out were randomly allocated (1:1) to AI-supported screening (intervention group) or standard double reading without AI (control group). Screening examinations were automatically randomised by the Picture Archive and Communications System with a pseudo-random number generator after image acquisition. The participants and the radiographers acquiring the screening examinations, but not the radiologists reading the screening examinations, were masked to study group allocation. The AI system (Transpara version 1.7.0) provided an examination-based malignancy risk score on a 10-level scale that was used to triage screening examinations to single reading (score 1-9) or double reading (score 10), with AI risk scores (for all examinations) and computer-aided detection marks (for examinations with risk score 8-10) available to the radiologists doing the screen reading. Here we report the prespecified clinical safety analysis, to be done after 80 000 women were enrolled, to assess the secondary outcome measures of early screening performance (cancer detection rate, recall rate, false positive rate, positive predictive value [PPV] of recall, and type of cancer detected [invasive or in situ]) and screen-reading workload. Analyses were done in the modified intention-to-treat population (ie, all women randomly assigned to a group with one complete screening examination, excluding women recalled due to enlarged lymph nodes diagnosed with lymphoma). The lowest acceptable limit for safety in the intervention group was a cancer detection rate of more than 3 per 1000 participants screened. The trial is registered with ClinicalTrials.gov, NCT04838756, and is closed to accrual; follow-up is ongoing to assess the primary endpoint of the trial, interval cancer rate.
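
The triage rule described in METHODS can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the abstract (single reading for risk scores 1-9, double reading for score 10, computer-aided detection marks shown for scores 8-10). The function names are hypothetical and are not part of Transpara or of the trial's software.

```python
def readers_required(risk_score: int) -> int:
    """Number of radiologist reads for an examination in the AI-supported arm.

    Hypothetical helper: encodes only the rule stated in the abstract
    (scores 1-9 -> single reading, score 10 -> double reading).
    """
    if not 1 <= risk_score <= 10:
        raise ValueError("risk score must be on the 10-level scale (1-10)")
    return 2 if risk_score == 10 else 1


def cad_marks_shown(risk_score: int) -> bool:
    """Whether computer-aided detection marks are shown (scores 8-10)."""
    if not 1 <= risk_score <= 10:
        raise ValueError("risk score must be on the 10-level scale (1-10)")
    return risk_score >= 8


if __name__ == "__main__":
    for score in range(1, 11):
        print(score, readers_required(score), cad_marks_shown(score))
```

Under this rule, only examinations in the highest risk category receive two reads, which is the source of the workload difference between the two groups.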

FINDINGS: Between April 12, 2021, and July 28, 2022, 80 033 women were randomly assigned to AI-supported screening (n=40 003) or double reading without AI (n=40 030). 13 women were excluded from the analysis. The median age was 54·0 years (IQR 46·7-63·9). Race and ethnicity data were not collected. AI-supported screening among 39 996 participants resulted in 244 screen-detected cancers, 861 recalls, and a total of 46 345 screen readings. Standard screening among 40 024 participants resulted in 203 screen-detected cancers, 817 recalls, and a total of 83 231 screen readings. Cancer detection rates were 6·1 (95% CI 5·4-6·9) per 1000 screened participants in the intervention group, above the lowest acceptable limit for safety, and 5·1 (4·4-5·8) per 1000 in the control group, a ratio of 1·2 (95% CI 1·0-1·5; p=0·052). Recall rates were 2·2% (95% CI 2·0-2·3) in the intervention group and 2·0% (1·9-2·2) in the control group. The false positive rate was 1·5% (95% CI 1·4-1·7) in both groups. The PPV of recall was 28·3% (95% CI 25·3-31·5) in the intervention group and 24·8% (21·9-28·0) in the control group. In the intervention group, 184 (75%) of 244 cancers detected were invasive and 60 (25%) were in situ; in the control group, 165 (81%) of 203 cancers were invasive and 38 (19%) were in situ. The screen-reading workload was reduced by 44·3% using AI.
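
The point estimates in FINDINGS can be reproduced from the reported counts. The sketch below is an arithmetic check, not the authors' analysis code: the class and variable names are hypothetical, the false positive rate is assumed to be recalls without a screen-detected cancer divided by participants screened (the abstract does not define it), and confidence intervals and the p value are not recomputed because the abstract does not state how they were derived.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arm:
    """Counts for one study group, taken from the abstract."""
    participants: int
    cancers: int
    recalls: int
    readings: int

    def detection_per_1000(self) -> float:
        return 1000 * self.cancers / self.participants

    def recall_rate_pct(self) -> float:
        return 100 * self.recalls / self.participants

    def false_positive_rate_pct(self) -> float:
        # Assumed definition: recalled without screen-detected cancer.
        return 100 * (self.recalls - self.cancers) / self.participants

    def ppv_of_recall_pct(self) -> float:
        return 100 * self.cancers / self.recalls


ai = Arm(participants=39_996, cancers=244, recalls=861, readings=46_345)
control = Arm(participants=40_024, cancers=203, recalls=817, readings=83_231)

# Ratio of cancer detection rates (intervention / control).
detection_ratio = ai.detection_per_1000() / control.detection_per_1000()

# Relative reduction in screen readings with AI support.
workload_reduction_pct = 100 * (control.readings - ai.readings) / control.readings

if __name__ == "__main__":
    for name, arm in (("AI-supported", ai), ("Double reading", control)):
        print(
            f"{name}: {arm.detection_per_1000():.1f} per 1000, "
            f"recall {arm.recall_rate_pct():.1f}%, "
            f"false positive {arm.false_positive_rate_pct():.1f}%, "
            f"PPV {arm.ppv_of_recall_pct():.1f}%"
        )
    print(f"Detection ratio: {detection_ratio:.1f}")
    print(f"Workload reduction: {workload_reduction_pct:.1f}%")
```

Rounded to one decimal place, these values agree with the figures reported in the abstract.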

INTERPRETATION: AI-supported mammography screening resulted in a similar cancer detection rate compared with standard double reading, with a substantially lower screen-reading workload, indicating that the use of AI in mammography screening is safe. The trial was thus not halted and the primary endpoint of interval cancer rate will be assessed in 100 000 enrolled participants after 2 years of follow-up.

FUNDING: Swedish Cancer Society, Confederation of Regional Cancer Centres, and the Swedish governmental funding for clinical research (ALF).

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/be033994-ed89-4a99-904a-f6e97b3d2b21

author

Lång, Kristina (LU); Josefsson, Viktoria (LU); Larsson, Anna-Maria (LU); Larsson, Stefan (LU); Högberg, Charlotte (LU); Sartor, Hanna (LU); Hofvind, Solveig; Andersson, Ingvar (LU) and Rosso, Aldana (LU)

organization

publishing date

2023-08

type

Contribution to journal

publication status

published

subject

Cancer and Oncology

in

The Lancet. Oncology

volume

24

issue

8

pages

936 - 944

publisher

Elsevier

external identifiers

pmid:37541274
scopus:85166551153

ISSN

1474-5488

DOI

10.1016/S1470-2045(23)00298-X

project

Mammography Screening with Artificial Intelligence

Artificial intelligence in mammography screening

language

English

LU publication?

yes

additional info

id

be033994-ed89-4a99-904a-f6e97b3d2b21

date added to LUP

2023-08-08 14:22:53

date last changed

2025-07-29 10:07:59

@article{be033994-ed89-4a99-904a-f6e97b3d2b21,
  author       = {{Lång, Kristina and Josefsson, Viktoria and Larsson, Anna-Maria and Larsson, Stefan and Högberg, Charlotte and Sartor, Hanna and Hofvind, Solveig and Andersson, Ingvar and Rosso, Aldana}},
  issn         = {{1474-5488}},
  language     = {{eng}},
  number       = {{8}},
  pages        = {{936--944}},
  publisher    = {{Elsevier}},
  journal      = {{The Lancet. Oncology}},
  title        = {{Artificial intelligence-supported screen reading versus standard double reading in the Mammography Screening with Artificial Intelligence trial (MASAI) : a clinical safety analysis of a randomised, controlled, non-inferiority, single-blinded, screening accuracy study}},
  url          = {{http://dx.doi.org/10.1016/S1470-2045(23)00298-X}},
  doi          = {{10.1016/S1470-2045(23)00298-X}},
  volume       = {{24}},
  year         = {{2023}},
}

Lund University Publications
