LUP Student Papers

LUND UNIVERSITY LIBRARIES

Self-supervised representation learning from electrocardiogram data for medical applications

Andersson, Matilda LU (2023) In Master’s Theses in Mathematical Sciences FMAM05 20221
Mathematics (Faculty of Engineering)
Abstract
Cardiovascular diseases are the leading cause of death worldwide, and the number of deaths is increasing yearly. However, many abnormalities in heart cycles can be discovered and treated years before the onset of disease. But in most societies, regular health checkups are a concept reserved for cars, not humans. To save lives, our healthcare systems must adopt a preventative rather than a reactive approach. To that end, there have been several attempts over the last few decades to produce automated ECG-based heartbeat classification methods. But their performance is hindered by limited access to high-quality labeled data, restricting their usage to secondary diagnostic purposes.

In this regard, a self-supervised learning framework could provide a viable solution, as it decouples deep learning progress from the dependence on large volumes of annotated data, and instead uses unlabelled samples. In this thesis, we present an assessment of self-supervised representation learning on 12-lead clinical ECG data to examine whether self-supervised learning methods can be applied to electrocardiogram signals to produce meaningful feature representations from only unlabelled data.

We implement the self-supervised learning methods SimCLR, BYOL, and VICReg and compare their performance to that of a supervised learning method. In doing so, we find that self-supervised learning produces meaningful representations of ECG signals. When following each method’s recommended implementation protocol, their performance equals that of a conventional supervised model, initially suggesting that self-supervised pre-training offers no additional benefits to downstream tasks. However, by increasing the length of the ECG signal and adjusting the data augmentation strategy, self-supervised pre-trained models outperformed their supervised counterparts in all evaluation settings. In light of our experiments, we find that a suitable augmentation protocol is crucial for high downstream classification performance.
author: Andersson, Matilda LU
course: FMAM05 20221
year: 2023
type: H2 - Master's Degree (Two Years)
keywords: Self-supervised learning, Deep learning, Cardiovascular disease, Electrocardiogram, ECG, SimCLR, BYOL, VICReg
publication/series: Master’s Theses in Mathematical Sciences
report number: LUTFMA-3494-2023
ISSN: 1404-6342
other publication id: 2023:E2
language: English
id: 9106593
date added to LUP: 2023-05-10 16:34:17
date last changed: 2023-05-10 16:34:17
@misc{9106593,
  abstract     = {{Cardiovascular diseases are the leading cause of death worldwide, and the number of deaths is increasing yearly. However, many abnormalities in heart cycles can be discovered and treated years before the onset of disease. But in most societies, regular health checkups are a concept reserved for cars, not humans. To save lives, our healthcare systems must adopt a preventative rather than a reactive approach. To that end, there have been several attempts over the last few decades to produce automated ECG-based heartbeat classification methods. But their performance is hindered by limited access to high-quality labeled data, restricting their usage to secondary diagnostic purposes.

In this regard, a self-supervised learning framework could provide a viable solution, as it decouples deep learning progress from the dependence on large volumes of annotated data, and instead uses unlabelled samples. In this thesis, we present an assessment of self-supervised representation learning on 12-lead clinical ECG data to examine whether self-supervised learning methods can be applied to electrocardiogram signals to produce meaningful feature representations from only unlabelled data. 

We implement the self-supervised learning methods SimCLR, BYOL, and VICReg and compare their performance to that of a supervised learning method. In doing so, we find that self-supervised learning produces meaningful representations of ECG signals. When following each method’s recommended implementation protocol, their performance equals that of a conventional supervised model, initially suggesting that self-supervised pre-training offers no additional benefits to downstream tasks. However, by increasing the length of the ECG signal and adjusting the data augmentation strategy, self-supervised pre-trained models outperformed their supervised counterparts in all evaluation settings. In light of our experiments, we find that a suitable augmentation protocol is crucial for high downstream classification performance.}},
  author       = {{Andersson, Matilda}},
  issn         = {{1404-6342}},
  language     = {{eng}},
  note         = {{Student Paper}},
  series       = {{Master’s Theses in Mathematical Sciences}},
  title        = {{Self-supervised representation learning from electrocardiogram data for medical applications}},
  year         = {{2023}},
}