Advanced

Polytomous diagnosis of ovarian tumors as benign, borderline, primary invasive or metastatic: development and validation of standard and kernel-based risk prediction models

Van Calster, Ben; Valentin, Lil LU ; Van Holsbeke, Caroline; Testa, Antonia C.; Bourne, Tom; Van Huffel, Sabine and Timmerman, Dirk (2010) In BMC Medical Research Methodology 10.
Abstract
Background: Hitherto, risk prediction models for preoperative ultrasound-based diagnosis of ovarian tumors were dichotomous (benign versus malignant). We develop and validate polytomous models (models that predict more than two events) to diagnose ovarian tumors as benign, borderline, primary invasive or metastatic invasive. The main focus is on how different types of models perform and compare. Methods: A multi-center dataset containing 1066 women was used for model development and internal validation, whilst another multi-center dataset of 1938 women was used for temporal and external validation. Models were based on standard logistic regression and on penalized kernel-based algorithms (least squares support vector machines and kernel... (More)
Background: Hitherto, risk prediction models for preoperative ultrasound-based diagnosis of ovarian tumors were dichotomous (benign versus malignant). We develop and validate polytomous models (models that predict more than two events) to diagnose ovarian tumors as benign, borderline, primary invasive or metastatic invasive. The main focus is on how different types of models perform and compare. Methods: A multi-center dataset containing 1066 women was used for model development and internal validation, whilst another multi-center dataset of 1938 women was used for temporal and external validation. Models were based on standard logistic regression and on penalized kernel-based algorithms (least squares support vector machines and kernel logistic regression). We used true polytomous models as well as combinations of dichotomous models based on the 'pairwise coupling' technique to produce polytomous risk estimates. Careful variable selection was performed, based largely on cross-validated c-index estimates. Model performance was assessed with the dichotomous c-index (i.e. the area under the ROC curve) and a polytomous extension, and with calibration graphs. Results: For all models, between 9 and 11 predictors were selected. Internal validation was successful with polytomous c-indexes between 0.64 and 0.69. For the best model dichotomous c-indexes were between 0.73 (primary invasive vs metastatic) and 0.96 (borderline vs metastatic). On temporal and external validation, overall discrimination performance was good with polytomous c-indexes between 0.57 and 0.64. However, discrimination between primary and metastatic invasive tumors decreased to near random levels. Standard logistic regression performed well in comparison with advanced algorithms, and combining dichotomous models performed well in comparison with true polytomous models. The best model was a combination of dichotomous logistic regression models. This model is available online. Conclusions: We have developed models that successfully discriminate between benign, borderline, and invasive ovarian tumors. Methodologically, the combination of dichotomous models was an interesting approach to tackle the polytomous problem. Standard logistic regression models were not outperformed by regularized kernel-based alternatives, a finding to which the careful variable selection procedure will have contributed. The random discrimination between primary and metastatic invasive tumors on temporal/external validation demonstrated once more the necessity of validation studies. (Less)
Please use this url to cite or link to this publication:
author
organization
publishing date
type
Contribution to journal
publication status
published
subject
in
BMC Medical Research Methodology
volume
10
publisher
BioMed Central
external identifiers
  • wos:000284323200001
  • scopus:77957988487
ISSN
1471-2288
DOI
10.1186/1471-2288-10-96
language
English
LU publication?
yes
id
c193d9c7-b8d7-4d01-8555-ed8af68f83e8 (old id 1752594)
date added to LUP
2011-01-04 07:36:27
date last changed
2018-05-29 10:29:22
@article{c193d9c7-b8d7-4d01-8555-ed8af68f83e8,
  abstract     = {Background: Hitherto, risk prediction models for preoperative ultrasound-based diagnosis of ovarian tumors were dichotomous (benign versus malignant). We develop and validate polytomous models (models that predict more than two events) to diagnose ovarian tumors as benign, borderline, primary invasive or metastatic invasive. The main focus is on how different types of models perform and compare. Methods: A multi-center dataset containing 1066 women was used for model development and internal validation, whilst another multi-center dataset of 1938 women was used for temporal and external validation. Models were based on standard logistic regression and on penalized kernel-based algorithms (least squares support vector machines and kernel logistic regression). We used true polytomous models as well as combinations of dichotomous models based on the 'pairwise coupling' technique to produce polytomous risk estimates. Careful variable selection was performed, based largely on cross-validated c-index estimates. Model performance was assessed with the dichotomous c-index (i.e. the area under the ROC curve) and a polytomous extension, and with calibration graphs. Results: For all models, between 9 and 11 predictors were selected. Internal validation was successful with polytomous c-indexes between 0.64 and 0.69. For the best model dichotomous c-indexes were between 0.73 (primary invasive vs metastatic) and 0.96 (borderline vs metastatic). On temporal and external validation, overall discrimination performance was good with polytomous c-indexes between 0.57 and 0.64. However, discrimination between primary and metastatic invasive tumors decreased to near random levels. Standard logistic regression performed well in comparison with advanced algorithms, and combining dichotomous models performed well in comparison with true polytomous models. The best model was a combination of dichotomous logistic regression models. This model is available online. Conclusions: We have developed models that successfully discriminate between benign, borderline, and invasive ovarian tumors. Methodologically, the combination of dichotomous models was an interesting approach to tackle the polytomous problem. Standard logistic regression models were not outperformed by regularized kernel-based alternatives, a finding to which the careful variable selection procedure will have contributed. The random discrimination between primary and metastatic invasive tumors on temporal/external validation demonstrated once more the necessity of validation studies.},
  author       = {Van Calster, Ben and Valentin, Lil and Van Holsbeke, Caroline and Testa, Antonia C. and Bourne, Tom and Van Huffel, Sabine and Timmerman, Dirk},
  issn         = {1471-2288},
  language     = {eng},
  publisher    = {BioMed Central},
  series       = {BMC Medical Research Methodology},
  title        = {Polytomous diagnosis of ovarian tumors as benign, borderline, primary invasive or metastatic: development and validation of standard and kernel-based risk prediction models},
  url          = {http://dx.doi.org/10.1186/1471-2288-10-96},
  volume       = {10},
  year         = {2010},
}