Training artificial neural networks directly on the concordance index for censored data using genetic algorithms.

Kalderstam, Jonas; Edén, Patrik; Bendahl, Pär-Ola; Forsare, Carina; Fernö, Mårten; Ohlsson, Mattias

(2013) In Artificial Intelligence in Medicine 58(2). p.125-132

Abstract: OBJECTIVE: The concordance index (c-index) is the standard way of evaluating the performance of prognostic models in the presence of censored data. Constructing prognostic models using artificial neural networks (ANNs) is commonly done by training on error functions which are modified versions of the c-index. Our objective was to demonstrate the capability of training directly on the c-index and to evaluate our approach compared to the Cox proportional hazards model.

METHOD: We constructed a prognostic model using an ensemble of ANNs which were trained using a genetic algorithm. The individual networks were trained on a non-linear artificial data set divided into a training and test set both of size 2000, where 50% of the data was censored. The ANNs were also trained on a data set consisting of 4042 patients treated for breast cancer spread over five different medical studies, 2/3 used for training and 1/3 used as a test set. A Cox model was also constructed on the same data in both cases. The two models' c-indices on the test sets were then compared. The ranking performance of the models is additionally presented visually using modified scatter plots.

RESULTS: Cross validation on the cancer training set did not indicate any non-linear effects between the covariates. An ensemble of 30 ANNs with one hidden neuron was therefore used. The ANN model had almost the same c-index score as the Cox model (c-index=0.70 and 0.71, respectively) on the cancer test set. Both models identified similarly sized low risk groups with at most 10% false positives, 49 for the ANN model and 60 for the Cox model, but repeated bootstrap runs indicate that the difference was not significant. A significant difference could however be seen when applied on the non-linear synthetic data set. In that case the ANN ensemble managed to achieve a c-index score of 0.90 whereas the Cox model failed to distinguish itself from the random case (c-index=0.49).

CONCLUSIONS: We have found empirical evidence that ensembles of ANN models can be optimized directly on the c-index. Comparison with a Cox model indicates that near identical performance is achieved on a real cancer data set while on a non-linear data set the ANN model is clearly superior.
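As an illustration of the quantity the abstract refers to, the following is a minimal sketch of a Harrell-style c-index for right-censored data. It is not taken from the paper: the function name `concordance_index`, the convention that a higher prediction means longer predicted survival, and the choice to skip pairs with tied times are all assumptions made for this example.

```python
from itertools import combinations


def concordance_index(times, events, predictions):
    """Fraction of comparable pairs that the predictions rank correctly.

    times:       observed time for each subject (event or censoring time)
    events:      1/True if the event was observed, 0/False if censored
    predictions: model output; ASSUMPTION: higher means longer survival
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # tied times are skipped in this simplified sketch
        # Let `a` be the subject with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the true order is unknown
        comparable += 1
        if predictions[a] < predictions[b]:
            concordant += 1.0  # predicted order matches observed order
        elif predictions[a] == predictions[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    times = [2, 4, 6, 8]
    events = [1, 0, 1, 1]  # the second subject is censored
    print(concordance_index(times, events, [1, 2, 3, 4]))  # perfect ranking
    print(concordance_index(times, events, [4, 3, 2, 1]))  # reversed ranking
```

Because a score of this form changes only when the ordering of two predictions changes, it is piecewise constant in the model parameters, which is one plausible reason a derivative-free optimizer such as a genetic algorithm could use it directly as a fitness function.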

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/3733806

author

Kalderstam, Jonas (LU); Edén, Patrik (LU); Bendahl, Pär-Ola (LU); Forsare, Carina (LU); Fernö, Mårten (LU) and Ohlsson, Mattias (LU)

publishing date

2013

type

Contribution to journal

publication status

published

in

Artificial Intelligence in Medicine

volume

58

issue

2

pages

125 - 132

publisher

Elsevier

external identifiers

wos:000320351800006
pmid:23582884
scopus:84878114697

ISSN

1873-2860

DOI

10.1016/j.artmed.2013.03.001

language

English

LU publication?

yes

id

4b52eea0-c8a6-4e6f-a065-44346677234d (old id 3733806)

alternative location

http://www.ncbi.nlm.nih.gov/pubmed/23582884?dopt=Abstract

date added to LUP

2016-04-01 10:05:24

date last changed

2025-10-14 12:01:19

@article{4b52eea0-c8a6-4e6f-a065-44346677234d,
  abstract     = {{OBJECTIVE: The concordance index (c-index) is the standard way of evaluating the performance of prognostic models in the presence of censored data. Constructing prognostic models using artificial neural networks (ANNs) is commonly done by training on error functions which are modified versions of the c-index. Our objective was to demonstrate the capability of training directly on the c-index and to evaluate our approach compared to the Cox proportional hazards model. METHOD: We constructed a prognostic model using an ensemble of ANNs which were trained using a genetic algorithm. The individual networks were trained on a non-linear artificial data set divided into a training and test set both of size 2000, where 50% of the data was censored. The ANNs were also trained on a data set consisting of 4042 patients treated for breast cancer spread over five different medical studies, 2/3 used for training and 1/3 used as a test set. A Cox model was also constructed on the same data in both cases. The two models' c-indices on the test sets were then compared. The ranking performance of the models is additionally presented visually using modified scatter plots. RESULTS: Cross validation on the cancer training set did not indicate any non-linear effects between the covariates. An ensemble of 30 ANNs with one hidden neuron was therefore used. The ANN model had almost the same c-index score as the Cox model (c-index=0.70 and 0.71, respectively) on the cancer test set. Both models identified similarly sized low risk groups with at most 10% false positives, 49 for the ANN model and 60 for the Cox model, but repeated bootstrap runs indicate that the difference was not significant. A significant difference could however be seen when applied on the non-linear synthetic data set. In that case the ANN ensemble managed to achieve a c-index score of 0.90 whereas the Cox model failed to distinguish itself from the random case (c-index=0.49). 
CONCLUSIONS: We have found empirical evidence that ensembles of ANN models can be optimized directly on the c-index. Comparison with a Cox model indicates that near identical performance is achieved on a real cancer data set while on a non-linear data set the ANN model is clearly superior.}},
  author       = {{Kalderstam, Jonas and Edén, Patrik and Bendahl, Pär-Ola and Forsare, Carina and Fernö, Mårten and Ohlsson, Mattias}},
  issn         = {{1873-2860}},
  language     = {{eng}},
  number       = {{2}},
  pages        = {{125--132}},
  publisher    = {{Elsevier}},
  journal      = {{Artificial Intelligence in Medicine}},
  title        = {{Training artificial neural networks directly on the concordance index for censored data using genetic algorithms.}},
  url          = {{https://lup.lub.lu.se/search/files/1552988/4023341.pdf}},
  doi          = {{10.1016/j.artmed.2013.03.001}},
  volume       = {{58}},
  year         = {{2013}},
}

Lund University Publications
