Lund University Publications

Intra- and inter-observer agreement when describing adnexal masses using the International Ovarian Tumour Analysis (IOTA) terms and definitions: a study on three-dimensional (3D) ultrasound volumes.

Sladkevicius, Povilas and Valentin, Lil (2013) In Ultrasound in Obstetrics & Gynecology 41(3). p.318-327
Abstract
Objectives:

To estimate intra-observer repeatability and inter-observer agreement in 1) describing adnexal masses using the International Ovarian Tumor Analysis (IOTA) terms and definitions, 2) the risk of malignancy calculated using IOTA logistic regression models LR1 and LR2, 3) the diagnosis made on the basis of subjective assessment of ultrasound images.



Methods:

One hundred and three adnexal masses were examined with transvaginal gray scale and power Doppler ultrasound using a GE Voluson 730 Expert system. Three-dimensional ultrasound volumes of the mass were saved. After 12-18 months the volumes were analyzed twice 1-6 months apart by each of two independent experienced sonologists who used the IOTA terms and definitions to describe the masses. The risk of malignancy was calculated using LR1 and LR2. The sonologists also classified the masses as benign or malignant using subjective assessment.



Results:

Eighty-four masses were benign, eight borderline and 11 invasively malignant. There was substantial variability within and between observers in the results of measurements included in LR1 and LR2 and some variability also when assessing categorical variables included in the models (agreement 51-100%, Kappa 0.42-1.00). This resulted in substantial variability in the calculated risk of malignancy, the limits of agreement indicating that the calculated risk of malignancy could vary by a factor of five to twenty within and between observers. The reliability of the calculated risk of malignancy was moderate (LR1) or poor (LR2) when the calculated risk of malignancy was > 10% (intra-class correlation coefficients varying from 0.21 to 0.73). Inter-observer agreement when classifying tumors as benign or malignant using the predetermined risk of malignancy cut-off of 10% was fair to good (agreement 85%, Kappa 0.61 for LR1; agreement 81%, Kappa 0.52 for LR2). Intra- and inter-observer agreements for subjective assessment were 96%, 96% and 96% with Kappa values of 0.89, 0.87 and 0.88.
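
As an illustration of how agreement percentages and Kappa values like those reported above are typically obtained, the following minimal Python sketch computes percent agreement and Cohen's kappa for two sets of benign/malignant classifications. The function names and the ten example ratings are hypothetical and are not taken from the study. The abstract does not state which Kappa variant or software the authors used, so unweighted Cohen's kappa is assumed here.

```python
from collections import Counter


def percent_agreement(ratings_a, ratings_b):
    """Fraction of cases on which the two raters give the same label."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(ratings_a)
    p_o = percent_agreement(ratings_a, ratings_b)
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    # Agreement expected by chance, from each rater's marginal label frequencies.
    labels = counts_a.keys() | counts_b.keys()
    p_e = sum(counts_a[k] * counts_b[k] for k in labels) / (n * n)
    if p_e == 1.0:
        # Both raters used one identical label throughout; treat as perfect agreement.
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical ratings for ten masses (not data from the study).
observer_1 = ["benign"] * 8 + ["malignant"] * 2
observer_2 = ["benign"] * 7 + ["malignant"] * 3

print(f"agreement = {percent_agreement(observer_1, observer_2):.0%}")
print(f"kappa     = {cohen_kappa(observer_1, observer_2):.2f}")
```

Because kappa discounts the agreement expected by chance, it is lower than the raw agreement percentage whenever one category dominates, which is why the abstract reports both figures side by side.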



Conclusions:

Intra- and inter-observer agreement in classifying tumors as benign or malignant using the risk of malignancy cut off of 10% for LR1 and LR2 was fair or good, while the reproducibility of subjective assessment was excellent. The reliability of calculated risks > 10% was poor, and calculated risk > 10% cannot be used to discriminate between individuals at different risk. These results cannot be extrapolated to real-time ultrasound examinations. Copyright © 2012 ISUOG. Published by John Wiley & Sons, Ltd.
type
Contribution to journal
publication status
published
in
Ultrasound in Obstetrics & Gynecology
volume
41
issue
3
pages
318 - 327
publisher
John Wiley & Sons Inc.
external identifiers
  • wos:000315742400014
  • pmid:22915506
  • scopus:84874678100
ISSN
1469-0705
DOI
10.1002/uog.12289
language
English
LU publication?
yes
id
d8e7337f-1860-4a3d-96e3-589f7453344e (old id 3047284)
alternative location
http://www.ncbi.nlm.nih.gov/pubmed/22915506?dopt=Abstract
date added to LUP
2016-04-01 10:10:17
date last changed
2022-02-09 23:24:23
@article{d8e7337f-1860-4a3d-96e3-589f7453344e,
  abstract     = {{Objectives:
To estimate intra-observer repeatability and inter-observer agreement in 1) describing adnexal masses using the International Ovarian Tumor Analysis (IOTA) terms and definitions, 2) the risk of malignancy calculated using IOTA logistic regression models LR1 and LR2, 3) the diagnosis made on the basis of subjective assessment of ultrasound images.
Methods:
One hundred and three adnexal masses were examined with transvaginal gray scale and power Doppler ultrasound using a GE Voluson 730 Expert system. Three-dimensional ultrasound volumes of the mass were saved. After 12-18 months the volumes were analyzed twice 1-6 months apart by each of two independent experienced sonologists who used the IOTA terms and definitions to describe the masses. The risk of malignancy was calculated using LR1 and LR2. The sonologists also classified the masses as benign or malignant using subjective assessment.
Results:
Eighty-four masses were benign, eight borderline and 11 invasively malignant. There was substantial variability within and between observers in the results of measurements included in LR1 and LR2 and some variability also when assessing categorical variables included in the models (agreement 51-100%, Kappa 0.42-1.00). This resulted in substantial variability in the calculated risk of malignancy, the limits of agreement indicating that the calculated risk of malignancy could vary by a factor of five to twenty within and between observers. The reliability of the calculated risk of malignancy was moderate (LR1) or poor (LR2) when the calculated risk of malignancy was > 10% (intra-class correlation coefficients varying from 0.21 to 0.73). Inter-observer agreement when classifying tumors as benign or malignant using the predetermined risk of malignancy cut-off of 10% was fair to good (agreement 85%, Kappa 0.61 for LR1; agreement 81%, Kappa 0.52 for LR2). Intra- and inter-observer agreements for subjective assessment were 96%, 96% and 96% with Kappa values of 0.89, 0.87 and 0.88.
Conclusions:
Intra- and inter-observer agreement in classifying tumors as benign or malignant using the risk of malignancy cut off of 10% for LR1 and LR2 was fair or good, while the reproducibility of subjective assessment was excellent. The reliability of calculated risks > 10% was poor, and calculated risk > 10% cannot be used to discriminate between individuals at different risk. These results cannot be extrapolated to real-time ultrasound examinations. Copyright © 2012 ISUOG. Published by John Wiley & Sons, Ltd.}},
  author       = {{Sladkevicius, Povilas and Valentin, Lil}},
  issn         = {{1469-0705}},
  language     = {{eng}},
  number       = {{3}},
  pages        = {{318--327}},
  publisher    = {{John Wiley & Sons Inc.}},
  journal      = {{Ultrasound in Obstetrics & Gynecology}},
  title        = {{Intra- and inter-observer agreement when describing adnexal masses using the International Ovarian Tumour Analysis (IOTA) terms and definitions: a study on three-dimensional (3D) ultrasound volumes.}},
  url          = {{http://dx.doi.org/10.1002/uog.12289}},
  doi          = {{10.1002/uog.12289}},
  volume       = {{41}},
  year         = {{2013}},
}