LUP Student Papers

LUND UNIVERSITY LIBRARIES

Variational Dual-SimCLR: Probabilistic Self-Supervised Contrastive Learning for Satellite Data

Sterner, Vilma LU (2025) In Master's Thesis in Mathematical Sciences FMAM05 20252
Mathematics (Faculty of Engineering)
Abstract
Self-supervised learning has become an important approach for machine learning in remote sensing, where large amounts of unlabeled satellite data exist but only limited labeled datasets are available. Contrastive methods such as SimCLR have been adapted to multimodal Sentinel-1 and Sentinel-2 imagery, but current SimCLR-based approaches rely on deterministic embeddings and therefore lack the ability to model uncertainty. In this thesis, a multimodal contrastive learning approach, Dual-SimCLR, is implemented for Sentinel-1 and Sentinel-2 data, and the model is extended with a probabilistic component that represents the embeddings as latent distributions rather than deterministic vectors. Building on this formulation, a probabilistic multimodal contrastive learning model, VDual-SimCLR, is proposed and compared against its deterministic counterpart. The results show that the probabilistic model performs on a similar level as the deterministic baseline on a downstream classification task, while additionally providing uncertainty estimates. These findings demonstrate that the proposed variational framework can be integrated into multimodal Sentinel-1/2 contrastive learning without reducing downstream performance, offering an alternative to deterministic contrastive methods.
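The idea of contrasting latent distributions instead of fixed vectors can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes diagonal Gaussian embeddings, a reparameterised sample per patch, a symmetric cross-modal InfoNCE loss (a simpler relative of SimCLR's NT-Xent loss), and a KL term towards a standard normal prior. All function names, the temperature, and the toy numbers are hypothetical.

```python
import math
import random


def sample_embedding(mu, log_var, rng):
    """Reparameterised draw z = mu + sigma * eps from a diagonal Gaussian."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def cross_modal_info_nce(z_s1, z_s2, temperature=0.5):
    """Symmetric InfoNCE loss: z_s1[i] and z_s2[i] are the positive pair,
    every other embedding of the opposite modality is a negative."""
    n = len(z_s1)
    sim = [[cosine(a, b) / temperature for b in z_s2] for a in z_s1]
    loss = 0.0
    for i in range(n):
        s1_to_s2 = sim[i]                          # S1 patch i vs all S2
        s2_to_s1 = [sim[j][i] for j in range(n)]   # S2 patch i vs all S1
        for logits in (s1_to_s2, s2_to_s1):
            top = max(logits)  # subtract the max for numerical stability
            log_denom = top + math.log(sum(math.exp(x - top) for x in logits))
            loss += log_denom - logits[i]
    return loss / (2 * n)


def kl_to_standard_normal(mu, log_var):
    """KL divergence from N(mu, diag(exp(log_var))) to N(0, I).
    Assumed regulariser; the thesis may use a different prior or weighting."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical encoder outputs (mean, log-variance) for two patches.
    mu_s1, lv_s1 = [[1.0, 0.1], [0.0, 1.0]], [[-4.0, -4.0], [-1.0, -1.0]]
    mu_s2, lv_s2 = [[0.9, 0.0], [0.1, 1.1]], [[-4.0, -4.0], [-1.0, -1.0]]
    z_s1 = [sample_embedding(m, lv, rng) for m, lv in zip(mu_s1, lv_s1)]
    z_s2 = [sample_embedding(m, lv, rng) for m, lv in zip(mu_s2, lv_s2)]
    print("contrastive loss:", cross_modal_info_nce(z_s1, z_s2))
    # The predicted variance acts as a per-patch uncertainty estimate.
    for i, lv in enumerate(lv_s1):
        print("patch", i, "mean variance:", sum(math.exp(v) for v in lv) / len(lv))
```

In this sketch the predicted variance is what a deterministic encoder cannot provide: a patch with a larger variance yields noisier samples, which is one way an embedding can signal low confidence.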
Popular Abstract
Climate change is one of the biggest challenges we are facing today. To deal with it, we need to understand the current state of the Earth and how it changes over time. This requires continuous monitoring on a global scale. Doing this manually is not possible, so automatic and systematic methods are needed.

Satellite data play an important role in this process. Satellites collect large amounts of data from all over the world, and these data are often free to use. This makes satellite imagery a valuable source of information about the Earth. However, the amount of data is so large that it is difficult for humans to analyze it directly. Instead, computer models such as machine learning models are needed to help extract useful information from the data.

One major problem is that most satellite data do not have labels. Labels are annotations that describe what an image shows, for example different types of land cover. Creating such labels requires a lot of time and expert knowledge, which makes it very expensive. Because of this, many traditional machine learning methods cannot be used, since they rely on labeled data.

To handle this problem, self-supervised learning can be used. In this approach, models first learn general patterns directly from the data without using manually created labels. These learned patterns can then be used to train classifiers for tasks such as land cover classification using only a small amount of labeled data. In this work, we study how this can be applied to satellite imagery and how uncertainty can be represented to make the results more reliable.

The results show that the proposed method learns useful representations from satellite data without the need for labels. The model performs on a similar level to existing methods on a classification task, while also providing information about uncertainty. This means that the method can be used not only to make predictions, but also to estimate how confident the model is in its results.
author
Sterner, Vilma LU
supervisor
organization
course
FMAM05 20252
year
2025
type
H2 - Master's Degree (Two Years)
subject
publication/series
Master's Thesis in Mathematical Sciences
report number
2025:E108
ISSN
1404-6342
other publication id
LUTFMA-3601-2025
language
English
id
9216314
date added to LUP
2026-01-30 14:31:18
date last changed
2026-01-30 14:31:18
@misc{9216314,
  abstract     = {{Self-supervised learning has become an important approach for machine learning in remote sensing, where large amounts of unlabeled satellite data exist but only limited labeled datasets are available. Contrastive methods such as SimCLR have been adapted to multimodal Sentinel-1 and Sentinel-2 imagery, but current SimCLR-based approaches rely on deterministic embeddings and therefore lack the ability to model uncertainty. In this thesis, a multimodal contrastive learning approach, Dual-SimCLR, is implemented for Sentinel-1 and Sentinel-2 data, and the model is extended with a probabilistic component that represents the embeddings as latent distributions rather than deterministic vectors. Building on this formulation, a probabilistic multimodal contrastive learning model, VDual-SimCLR, is proposed and compared against its deterministic counterpart. The results show that the probabilistic model performs on a similar level as the deterministic baseline on a downstream classification task, while additionally providing uncertainty estimates. These findings demonstrate that the proposed variational framework can be integrated into multimodal Sentinel-1/2 contrastive learning without reducing downstream performance, offering an alternative to deterministic contrastive methods.}},
  author       = {{Sterner, Vilma}},
  issn         = {{1404-6342}},
  language     = {{eng}},
  note         = {{Student Paper}},
  series       = {{Master's Thesis in Mathematical Sciences}},
  title        = {{Variational Dual-SimCLR: Probabilistic Self-Supervised Contrastive Learning for Satellite Data}},
  year         = {{2025}},
}