Lund University Publications

Efficient Monocular 3D Localisation Using Machine Learning : with Additional Studies on Pose Estimation and Shape Reconstruction

Persson, Ivar LU (2026)
Abstract
This thesis concerns localisation, shape reconstruction and pose estimation from monocular images. While multiple cameras allow triangulation to capture depth in scenes, monocular settings require relative features, reasonable real-world assumptions, regularisation, Machine Learning (ML), or a combination of these. The articles accompanying this compilation thesis treat several topics, with monocular localisation as a common theme. First, 3D pose estimation in conjunction with shape reconstruction is considered in traffic settings. The next part concerns sports analytics and the localisation of players on a pitch. Here, the creation of a Bird's-Eye View (BEV) is an important component, as it allows detection and localisation without prior explicit object detection in pixel space. This work is aimed at future deployment on embedded devices, which is why tiling and quantisation are also considered. Tiling allows memory-constrained devices to run inference while maintaining performance, while quantisation allows quicker inference on embedded devices and also enables inference on devices without floating-point operations.

The lack of information in monocular images is treated in different ways throughout this thesis. The articles concerning traffic consider scale as one important component, together with a pre-computed ground plane on which traffic is assumed to be located. In the sports analytics problem, a flat ground plane is assumed in order to find player locations. This is emphasised by the choice of computing localisation from a BEV. Another important aspect of monocular methods is the choice of feature extraction method and the differing capabilities of these methods. The articles use different methods, and the thesis continues the discussion.

Monocular localisation is feasible today, and the introduction of tiling and quantisation shows promise for continued research as well as for use cases in industry.
opponent
  • Assoc. Prof. Mashhadi, Peyman, Halmstad University, Sweden.
type
Thesis
publication status
published
keywords
Localisation, Pose Estimation, 3D Reconstruction, Monocular Problems, Sports analytics, Traffic Analytics
publisher
Centre of Mathematical Sciences
defense location
Lecture Hall MH:Gårding, Centre of Mathematical Sciences, Märkesbacken 4, Faculty of Engineering LTH, Lund University, Lund.
defense date
2026-02-27 09:15:00
ISBN
978-91-8104-783-7
978-91-8104-782-0
language
English
LU publication?
yes
id
513f650c-ecf5-423e-87f3-c43f0e5bc36f
date added to LUP
2026-01-27 14:24:34
date last changed
2026-01-28 11:21:07
@phdthesis{513f650c-ecf5-423e-87f3-c43f0e5bc36f,
  abstract     = {{This thesis concerns localisation, shape reconstruction and pose estimation from monocular images. While multiple cameras allow triangulation to capture depth in scenes, monocular settings require relative features, reasonable real-world assumptions, regularisation, Machine Learning (ML), or a combination of these. The articles accompanying this compilation thesis treat several topics, with monocular localisation as a common theme. First, 3D pose estimation in conjunction with shape reconstruction is considered in traffic settings. The next part concerns sports analytics and the localisation of players on a pitch. Here, the creation of a Bird's-Eye View (BEV) is an important component, as it allows detection and localisation without prior explicit object detection in pixel space. This work is aimed at future deployment on embedded devices, which is why tiling and quantisation are also considered. Tiling allows memory-constrained devices to run inference while maintaining performance, while quantisation allows quicker inference on embedded devices and also enables inference on devices without floating-point operations. The lack of information in monocular images is treated in different ways throughout this thesis. The articles concerning traffic consider scale as one important component, together with a pre-computed ground plane on which traffic is assumed to be located. In the sports analytics problem, a flat ground plane is assumed in order to find player locations. This is emphasised by the choice of computing localisation from a BEV. Another important aspect of monocular methods is the choice of feature extraction method and the differing capabilities of these methods. The articles use different methods, and the thesis continues the discussion. Monocular localisation is feasible today, and the introduction of tiling and quantisation shows promise for continued research as well as for use cases in industry.}},
  author       = {{Persson, Ivar}},
  isbn         = {{978-91-8104-783-7}},
  keywords     = {{Localisation; Pose Estimation; 3D Reconstruction; Monocular Problems; Sports analytics; Traffic Analytics}},
  language     = {{eng}},
  month        = {{02}},
  publisher    = {{Centre of Mathematical Sciences}},
  school       = {{Lund University}},
  title        = {{Efficient Monocular 3D Localisation Using Machine Learning : with Additional Studies on Pose Estimation and Shape Reconstruction}},
  year         = {{2026}},
}