Individual Layer Scaling and Redundancy Reduction of Singleshot Multiplane Images for View Synthesis
(2023) In Master's Theses in Mathematical Sciences FMAM05 20222 Mathematics (Faculty of Engineering)
- Abstract
- Image-based rendering for view synthesis is a field within computer vision that has seen a growing amount of research activity in the past few years, largely due to emerging deep learning techniques that allow researchers to reformulate view synthesis as a learning problem. The multiplane image (MPI) is a recently proposed learning-based layered 3D representation, created from one or a few input images for the purpose of rendering novel views. When rendering with an MPI created from a single image, scaling becomes an inherent problem due to the lack of a distance reference point. The scaling issue prevents smooth view transitions between different MPIs, and combined with the inherently large file size of the representation, it makes multi-image scene applications difficult with single-shot multiplane images.
In this thesis, a method is proposed to individually scale the layers of an MPI to improve view synthesis results and view transitions between different MPIs in multi-image scenes. The method consists of an optimization algorithm that adjusts the individual layers of an MPI using their alpha values together with extrinsic depth information in the form of a sparse or dense depth map. For testing, an MPI viewer program was created that allows viewing of multi-MPI scenes as well as free transitioning and blending between different MPIs. The algorithm was evaluated on images from both a synthetic and a real-life SLAM dataset, and the method scored higher than the baseline across the three most commonly used image similarity evaluation metrics. Additionally, the method allows redundant layers to be removed from each MPI, reducing file size by over 73% for the synthetic dataset and 78% for the SLAM dataset.
- Popular Abstract (translated from Swedish)
- Multiplane images are used to render images of 3D scenes from new viewpoints. In this work we take a deep dive into multiplane images to investigate how we can adjust the distance between the layers, as well as their number, in order to render better images and reduce their file size.
Please use this URL to cite or link to this publication:
http://lup.lub.lu.se/student-papers/record/9119659
- author
- Bergfelt, Max LU
- supervisor
- organization
- course
- FMAM05 20222
- year
- 2023
- type
- H2 - Master's Degree (Two Years)
- subject
- keywords
- MPI, multiplane images, view synthesis, rendering, computer vision, unity
- publication/series
- Master's Theses in Mathematical Sciences
- report number
- LUTFMA-3500-2023
- ISSN
- 1404-6342
- other publication id
- 2023:E19
- language
- English
- id
- 9119659
- date added to LUP
- 2023-06-27 09:50:39
- date last changed
- 2023-06-27 09:50:39
@misc{9119659,
  abstract = {{Image-based rendering for view synthesis is a field within computer vision that has seen a growing amount of research activity in the past few years, largely due to emerging deep learning techniques that allow researchers to reformulate view synthesis as a learning problem. The multiplane image (MPI) is a recently proposed learning-based layered 3D representation, created from one or a few input images for the purpose of rendering novel views. When rendering with an MPI created from a single image, scaling becomes an inherent problem due to the lack of a distance reference point. The scaling issue prevents smooth view transitions between different MPIs, and combined with the inherently large file size of the representation, it makes multi-image scene applications difficult with single-shot multiplane images. In this thesis, a method is proposed to individually scale the layers of an MPI to improve view synthesis results and view transitions between different MPIs in multi-image scenes. The method consists of an optimization algorithm that adjusts the individual layers of an MPI using their alpha values together with extrinsic depth information in the form of a sparse or dense depth map. For testing, an MPI viewer program was created that allows viewing of multi-MPI scenes as well as free transitioning and blending between different MPIs. The algorithm was evaluated on images from both a synthetic and a real-life SLAM dataset, and the method scored higher than the baseline across the three most commonly used image similarity evaluation metrics. Additionally, the method allows redundant layers to be removed from each MPI, reducing file size by over 73\% for the synthetic dataset and 78\% for the SLAM dataset.}},
  author = {{Bergfelt, Max}},
  issn = {{1404-6342}},
  language = {{eng}},
  note = {{Student Paper}},
  series = {{Master's Theses in Mathematical Sciences}},
  title = {{Individual Layer Scaling and Redundancy Reduction of Singleshot Multiplane Images for View Synthesis}},
  year = {{2023}},
}