
LUP Student Papers

LUND UNIVERSITY LIBRARIES

Relative depth estimation from dense matches using Graph Neural Networks

Salnikov, Vladimir LU (2025) NUMK11 20251
Mathematics (Faculty of Sciences)
Centre for Mathematical Sciences
Abstract
This thesis explores the application of Graph Neural Networks (GNNs) to relative depth estimation from dense matches between images. The main goals were to investigate whether dense matches alone are sufficient for this task, to assess the performance improvement offered by GNNs over classical methods, and to identify the most suitable features. The methodology used matches produced by the Dense Kernelized Feature Matching (DKM) algorithm on the ScanNet-1500 and MegaDepth-1500 datasets. A multi-scale Graph Attention Network (MultiScaleGAT) model was developed and evaluated against Uniform-One and K-Nearest Neighbour (KNN) baselines using the Mean Squared Error (MSE) and Success Rate (SR) metrics. Experimental results demonstrate that dense correspondences provide sufficient information for relative depth estimation, and that the MultiScaleGAT model significantly outperforms the baseline methods on both indoor and outdoor datasets. A feature ablation study showed that the KNN ratio feature is crucial for cross-domain accuracy. The findings confirm the considerable advantage of using a learnable graph-based architecture for this task.
Popular Abstract
Computer vision is a field of artificial intelligence that teaches computers to interpret and understand visual information from images and videos. In particular, 3D computer vision aims to understand the three-dimensional structure of the world from these 2D images. Think about how our brains, receiving slightly different images from each eye, can perceive depth. 3D computer vision tries to replicate this ability computationally.
Relative depth estimation is a fundamental task in 3D computer vision. Instead of measuring the exact distance from the camera to every point in a scene, we capture two images from different viewpoints and calculate how much closer or farther each object appears in the second image relative to the first.
My thesis explores how we can use a powerful type of artificial intelligence called Graph Neural Networks (GNNs) to improve this relative depth estimation process. To understand GNNs, let's first think about graphs. Imagine a social network where people are "nodes" and their friendships are "edges" connecting them. A graph is simply a collection of nodes and edges showing how they relate to each other.
Graph Neural Networks are designed specifically to work with this kind of connected data. Unlike typical AI methods that primarily look at individual data points, GNNs consider the relationships and connections between points. They do this by passing information, or "messages," along the edges of the graph, allowing each node to learn from its neighbors and from the broader structure of the network.
My research shows that using GNNs results in much better estimates of relative depth than simpler methods.
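To make the idea of passing "messages" along edges concrete, here is a minimal sketch in plain Python. It is not the MultiScaleGAT model from the thesis: it replaces learned attention weights with a fixed average, and the function name `message_passing_round`, the three-node graph, and the feature values are all illustrative assumptions.

```python
from collections import defaultdict


def message_passing_round(features, edges):
    """One round of mean-aggregation message passing (illustrative only).

    features: dict mapping node id -> list of floats (the node's feature vector)
    edges:    iterable of (u, v) pairs, treated as undirected
    Each node's new feature is the average of its own feature and the
    features of its direct neighbours. A real GNN would use learned
    weights here instead of a plain average.
    """
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)

    updated = {}
    for node, own in features.items():
        # Collect the node's own vector plus one "message" per neighbour.
        group = [own] + [features[n] for n in sorted(neighbours[node])]
        # Average the vectors component by component.
        updated[node] = [sum(vals) / len(group) for vals in zip(*group)]
    return updated


# Hypothetical path graph 0 - 1 - 2 with one-dimensional features.
features = {0: [0.0], 1: [3.0], 2: [6.0]}
edges = [(0, 1), (1, 2)]

after_one = message_passing_round(features, edges)
after_two = message_passing_round(after_one, edges)
```

After one round, node 0 has only heard from node 1. After the second round its value also reflects node 2, two edges away, which is how repeated rounds let a node learn from the broader structure of the graph.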
author
Salnikov, Vladimir LU
supervisor
organization
course
NUMK11 20251
year
type
M2 - Bachelor Degree
subject
keywords
Relative depth estimation, Relative Scale estimation, Graph Neural Networks, Computer Vision
report number
LUNFNA-4061-2025
ISSN
1654-6229
other publication id
2025:K5
language
English
id
9189402
date added to LUP
2025-06-12 14:59:53
date last changed
2025-06-12 14:59:53
@misc{9189402,
  abstract     = {{This thesis explores the application of Graph Neural Networks (GNNs) for relative depth estimation from dense matches between images. The main goals were to investigate the feasibility of using dense matches alone for this task, assess the performance improvement offered by GNNs over classical methods, and identify the most suitable features. The methodology involved utilizing the matches from Dense Kernelized Feature Matching (DKM) algorithm applied to the ScanNet-1500 and MegaDepth-1500 datasets. A multi-scale Graph Attention Network (MultiScaleGAT) model was developed and evaluated against Uniform-One and K-Nearest Neighbour baselines using Mean Squared Error (MSE) and Success Rates (SR) metrics. Experimental results demonstrate that dense correspondences provide sufficient information for relative depth estimation, and the MultiScaleGAT model significantly outperforms baseline methods on both indoor and outdoor datasets. A feature ablation study showed that the KNN ratio feature is crucial for cross-domain accuracy. The findings confirm the considerable advantage of using a learnable graph-based architecture for this task.}},
  author       = {{Salnikov, Vladimir}},
  issn         = {{1654-6229}},
  language     = {{eng}},
  note         = {{Student Paper}},
  title        = {{Relative depth estimation from dense matches using Graph Neural Networks}},
  year         = {{2025}},
}