Efficient Monocular 3D Localisation Using Machine Learning : with Additional Studies on Pose Estimation and Shape Reconstruction
(2026)
- Abstract
- This thesis concerns localisation, shape reconstruction and pose estimation from monocular images. While multiple cameras allow triangulation to recover depth in a scene, monocular settings require relative features, reasonable real-world assumptions, regularisation, Machine Learning (ML) or a combination of these. The articles accompanying this compilation thesis treat several topics, with monocular localisation as a common theme. First, 3D pose estimation in conjunction with shape reconstruction is considered in traffic settings. The next part concerns sports analytics and the localisation of players on a pitch. Here, the creation of a Bird's-Eye View (BEV) is an important component, as it allows detection and localisation without prior explicit object detection in pixel space. This work is aimed at future deployment on embedded devices, which is why tiling and quantisation are also considered. Tiling allows memory-constrained devices to run inference while maintaining performance, while quantisation allows faster inference on embedded devices and also enables inference on devices without floating-point operations.
The lack of information in monocular images is treated in different ways throughout this thesis. The articles concerning traffic consider scale as one important component, together with a pre-computed ground plane on which traffic was assumed to be located. In the sports analytics problem, a flat ground plane was assumed in order to find the locations. This was emphasised by the choice of calculating localisation from a BEV. Another important part of monocular methods is the choice of feature extraction method, as different methods have different abilities. The articles have used different methods, and the thesis continues the discussion.
Monocular localisation is feasible today, and the introduction of tiling and quantisation shows promise for continued research as well as for use cases in industry.
Please use this URL to cite or link to this publication:
https://lup.lub.lu.se/record/513f650c-ecf5-423e-87f3-c43f0e5bc36f
- author
- Persson, Ivar LU
- supervisor
- Mikael Nilsson LU
- Kalle Åström LU
- Magnus Oskarsson LU
- opponent
- Assoc. Prof. Mashhadi, Peyman, Halmstad University, Sweden.
- organization
- publishing date
- 2026-02-27
- type
- Thesis
- publication status
- published
- subject
- keywords
- Localisation, Pose Estimation, 3D Reconstruction, Monocular Problems, Sports analytics, Traffic Analytics
- publisher
- Centre of Mathematical Sciences
- defense location
- Lecture Hall MH:Gårding, Centre of Mathematical Sciences, Märkesbacken 4, Faculty of Engineering LTH, Lund University, Lund.
- defense date
- 2026-02-27 09:15:00
- ISBN
- 978-91-8104-783-7
- 978-91-8104-782-0
- language
- English
- LU publication?
- yes
- id
- 513f650c-ecf5-423e-87f3-c43f0e5bc36f
- date added to LUP
- 2026-01-27 14:24:34
- date last changed
- 2026-01-28 11:21:07
@phdthesis{513f650c-ecf5-423e-87f3-c43f0e5bc36f,
abstract = {{This thesis concerns localisation, shape reconstruction and pose estimation from monocular images. While multiple cameras allow triangulation to recover depth in a scene, monocular settings require relative features, reasonable real-world assumptions, regularisation, Machine Learning (ML) or a combination of these. The articles accompanying this compilation thesis treat several topics, with monocular localisation as a common theme. First, 3D pose estimation in conjunction with shape reconstruction is considered in traffic settings. The next part concerns sports analytics and the localisation of players on a pitch. Here, the creation of a Bird's-Eye View (BEV) is an important component, as it allows detection and localisation without prior explicit object detection in pixel space. This work is aimed at future deployment on embedded devices, which is why tiling and quantisation are also considered. Tiling allows memory-constrained devices to run inference while maintaining performance, while quantisation allows faster inference on embedded devices and also enables inference on devices without floating-point operations.

The lack of information in monocular images is treated in different ways throughout this thesis. The articles concerning traffic consider scale as one important component, together with a pre-computed ground plane on which traffic was assumed to be located. In the sports analytics problem, a flat ground plane was assumed in order to find the locations. This was emphasised by the choice of calculating localisation from a BEV. Another important part of monocular methods is the choice of feature extraction method, as different methods have different abilities. The articles have used different methods, and the thesis continues the discussion.

Monocular localisation is feasible today, and the introduction of tiling and quantisation shows promise for continued research as well as for use cases in industry.}},
author = {{Persson, Ivar}},
isbn = {{978-91-8104-783-7}},
keywords = {{Localisation; Pose Estimation; 3D Reconstruction; Monocular Problems; Sports analytics; Traffic Analytics}},
language = {{eng}},
month = {{02}},
publisher = {{Centre of Mathematical Sciences}},
school = {{Lund University}},
title = {{Efficient Monocular 3D Localisation Using Machine Learning : with Additional Studies on Pose Estimation and Shape Reconstruction}},
year = {{2026}},
}