Variational Dual-SimCLR: Probabilistic Self-Supervised Contrastive Learning for Satellite Data

Sterner, Vilma

Sterner, Vilma (LU) (2025) In Master's Thesis in Mathematical Sciences FMAM05 20252
Mathematics (Faculty of Engineering)

Abstract: Self-supervised learning has become an important approach for machine learning in remote sensing, where large amounts of unlabeled satellite data exist but only limited labeled datasets are available. Contrastive methods such as SimCLR have been adapted to multimodal Sentinel-1 and Sentinel-2 imagery, but current SimCLR-based approaches rely on deterministic embeddings and therefore lack the ability to model uncertainty. In this thesis, a multimodal contrastive learning approach, Dual-SimCLR, is implemented for Sentinel-1 and Sentinel-2 data, and the model is extended with a probabilistic component that represents the embeddings as latent distributions rather than deterministic vectors. Building on this formulation, a probabilistic multimodal contrastive learning model, VDual-SimCLR, is proposed and compared against its deterministic counterpart. The results show that the probabilistic model performs on a similar level as the deterministic baseline on a downstream classification task, while additionally providing uncertainty estimates. These findings demonstrate that the proposed variational framework can be integrated into multimodal Sentinel-1/2 contrastive learning without reducing downstream performance, offering an alternative to deterministic contrastive methods.
Popular Abstract: Climate change is one of the biggest challenges we are facing today. To deal with it, we need to understand the current state of the Earth and how it changes over time. This requires continuous monitoring on a global scale. Doing this manually is not possible, so automatic and systematic methods are needed.

Satellite data play an important role in this process. Satellites collect large amounts of data from all over the world and it is often free to use. This makes satellite imagery a valuable source of information about the Earth. However, the amount of data is so large that it is difficult for humans to analyze it directly. Instead, computer models such as machine learning models are needed to help extract useful information from the data.

One major problem is that most satellite data do not have labels. Labels are annotations that describe what an image shows, for example different types of land cover. Creating such labels requires a lot of time and expert knowledge, which makes it very expensive. Because of this, many traditional machine learning methods cannot be used, since they rely on labeled data.

To handle this problem, self-supervised learning can be used. In this approach, models first learn general patterns directly from the data without using manually created labels. These learned patterns can then be used to train classifiers for tasks such as land cover classification using only a small amount of labeled data. In this work, we study how this can be applied to satellite imagery and how uncertainty can be represented to make the results more reliable.

The results show that the proposed method learns useful representations from satellite data without the need for labels. The model performs on a similar level to existing methods on a classification task, while also providing information about uncertainty. This means that the method can be used not only to make predictions, but also to estimate how confident the model is in its results.


Please use this url to cite or link to this publication: http://lup.lub.lu.se/student-papers/record/9216314

author: Sterner, Vilma (LU)
supervisor: Alexandros Sopasakis
organization: Mathematics (Faculty of Engineering)
course: FMAM05 20252
year: 2025
type: H2 - Master's Degree (Two Years)
subject: Mathematics and Statistics
publication/series: Master's Thesis in Mathematical Sciences
report number: 2025:E108
ISSN: 1404-6342
other publication id: LUTFMA-3601-2025
language: English
id: 9216314
date added to LUP: 2026-01-30 14:31:18
date last changed: 2026-01-30 14:31:18

@misc{9216314,
  abstract     = {{Self-supervised learning has become an important approach for machine learning in remote sensing, where large amounts of unlabeled satellite data exist but only limited labeled datasets are available. Contrastive methods such as SimCLR have been adapted to multimodal Sentinel-1 and Sentinel-2 imagery, but current SimCLR-based approaches rely on deterministic embeddings and therefore lack the ability to model uncertainty. In this thesis, a multimodal contrastive learning approach, Dual-SimCLR, is implemented for Sentinel-1 and Sentinel-2 data, and the model is extended with a probabilistic component that represents the embeddings as latent distributions rather than deterministic vectors. Building on this formulation, a probabilistic multimodal contrastive learning model, VDual-SimCLR, is proposed and compared against its deterministic counterpart. The results show that the probabilistic model performs on a similar level as the deterministic baseline on a downstream classification task, while additionally providing uncertainty estimates. These findings demonstrate that the proposed variational framework can be integrated into multimodal Sentinel-1/2 contrastive learning without reducing downstream performance, offering an alternative to deterministic contrastive methods.}},
  author       = {{Sterner, Vilma}},
  issn         = {{1404-6342}},
  language     = {{eng}},
  note         = {{Student Paper}},
  series       = {{Master's Thesis in Mathematical Sciences}},
  title        = {{Variational Dual-SimCLR: Probabilistic Self-Supervised Contrastive Learning for Satellite Data}},
  year         = {{2025}},
}

LUP Student Papers

LUND UNIVERSITY LIBRARIES
