Low Rank Matrix Factorization and Relative Pose Problems in Computer Vision

Jiang, Fangyuan ^LU (2015)

Abstract: This thesis is focused on geometric computer vision problems. The first part of the thesis aims at solving one fundamental problem, namely low-rank matrix factorization. We provide several novel insights into the problem. In brief, we characterize, generate, parametrize and solve the minimal problems associated with low-rank matrix factorization. Beyond that, we give several new algorithms based on the minimal solvers when the measurement matrix is either sparse, noisy or with outliers. The cost function and the algorithm can easily be adapted to several robust norms, for example, the L1-norm and the truncated L1-norm. We demonstrate our approach on several geometric computer vision problems. Another application is in sensor network calibration, which is also explored.

The second part of the thesis deals with the relative pose problem. We solve the minimal problem of estimating the relative pose with unknown focal length and radial distortion. Beyond that, we also propose a brute force approach, which does not suffer from common algorithmic degeneracies. Further, the algorithm achieves a globally optimal solution up to a discretization error and it is easily parallelizable. Finally, we look into the problem of object detection with unknown pose.
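
For readers unfamiliar with the robust norms mentioned in the abstract, one common way to write a robust low-rank factorization cost is sketched below. The notation is an assumption made here for illustration and is not taken from the thesis: m_ij are the observed entries of the measurement matrix, Omega is the set of observed index pairs, u_i and v_j are the rank-k factors of row i and column j, and epsilon is a truncation threshold.

```latex
\min_{U,\,V}\; \sum_{(i,j)\in\Omega} \rho\bigl(m_{ij} - \mathbf{u}_i^{\top}\mathbf{v}_j\bigr),
\qquad
\rho_{L_1}(r) = |r|,
\qquad
\rho_{\mathrm{trunc}}(r) = \min\bigl(|r|,\ \epsilon\bigr)
```

Under the truncated penalty, any single entry contributes at most epsilon to the cost, which is why gross outliers have limited influence on the minimizer.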
Popular Abstract in English

The ultimate goal of computer vision is to make computers "see" as humans do. Toward this goal, one essential step is to enable computers to perceive three-dimensional (3D) space as in the real world. In this thesis, we investigate the problem of reconstructing a 3D scene model from ordinary two-dimensional (2D) images. More specifically, given a set of images of the same scene from varied perspectives, we are interested in developing a computer program that can automatically build a 3D model of the scene, usually in the form of a 3D point cloud that describes the geometry of the scene. In addition, the program should determine the pose of each camera, that is, where each camera is located relative to the scene and in which direction it points. Provided one can solve this problem, there is a wide range of possible applications, for example, building a 3D map of a city or reconstructing a 3D model of an object, which in turn can be fed into a 3D printer.

The first part of the thesis contributes to a family of methods that simultaneously find the 3D model of the scene and the camera poses. We start with feature matching, that is, finding the 2D image points that are projections of the same 3D point in different images. All the 2D points that correspond to a certain scene point form a so-called point track. After collecting all the point tracks and putting them in a matrix, it is well known that the camera poses and the 3D scene points can be retrieved by decomposing this matrix into two smaller matrices. This decomposition is often referred to as a low-rank factorization problem. We contribute to the factorization problem by developing a new method that is capable of handling missing elements in the matrix, which correspond to missing or occluded point tracks. For example, consider a set of cameras that surrounds an object: points on the back side of the object are invisible to the cameras in the front. The proposed method is also very robust in the presence of outliers, which means it can still correctly recover the 3D scene points and the camera poses when some of the point tracks are wrong due to inaccurate feature matching.
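
The decomposition into two smaller matrices can be illustrated with the classical complete-data case. The sketch below uses a truncated SVD on a synthetic matrix; it is not the thesis's method, which targets missing entries and outliers, and the names `low_rank_factorize`, `cameras` and `points` are hypothetical choices made for this example.

```python
import numpy as np

def low_rank_factorize(M, rank):
    """Split M into A (m x rank) and B (rank x n) so that A @ B is the
    best rank-`rank` approximation of M in the Frobenius norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    root = np.sqrt(s[:rank])
    A = U[:, :rank] * root          # scale columns of U
    B = root[:, None] * Vt[:rank]   # scale rows of Vt
    return A, B

# Synthetic, fully observed measurement matrix of exact rank 3
# (assumption: 8 rows standing in for stacked camera rows, 20 point tracks).
rng = np.random.default_rng(0)
cameras = rng.standard_normal((8, 3))
points = rng.standard_normal((3, 20))
M = cameras @ points

A, B = low_rank_factorize(M, rank=3)
residual = np.linalg.norm(M - A @ B)
print(A.shape, B.shape, residual)
```

The factors are only determined up to an invertible 3 x 3 transformation, so `A` and `B` generally differ from `cameras` and `points` even though their product reproduces `M`. This SVD approach needs every entry of `M`, which is exactly the limitation that methods for missing data address.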

In the second part of the thesis, we focus on the problem of estimating the camera pose. We are especially interested in the relative pose problem of two views, that is, given two images, estimating how one camera rotates and translates relative to the other. Solving the two-view problem is fundamental for building a large-scale reconstruction system with more cameras. For cameras with wide-angle lenses, such as the GoPro series, which allow more of the scene to be included in the photograph, or cameras with fisheye lenses, which are very common in surveillance, images are distorted in the sense that the projection of a straight line in a 3D scene is no longer straight in the 2D image. This effect, known as radial distortion, is more significant near the border of an image than at its centre. If the distortion is not appropriately modeled, the reconstructed 3D model will look skewed. We explicitly model radial distortion when estimating the relative pose between two cameras and propose an efficient way of solving the problem. Beyond that, we also present a so-called brute-force algorithm to solve the relative pose problem. It works by systematically enumerating all possible candidates for the solution, which guarantees that it finds the optimal solution up to the discretization of the search space. The algorithm is efficient and can run on a graphics card using a parallelized version that simultaneously evaluates different candidate solutions. It is also robust to outliers and can easily be adapted to restricted camera motions, for example when the cameras move within a plane.
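
The idea of enumerating candidate poses can be sketched for the restricted case of planar motion. This is a simplified illustration under assumptions made here, not the algorithm from the thesis: rotation is taken about the y-axis by an angle `theta`, the translation direction lies in the xz-plane at an angle `phi`, and each candidate is scored by a truncated absolute epipolar residual. All function names are hypothetical.

```python
import numpy as np

def skew(t):
    """Cross-product matrix, so that skew(t) @ v equals np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def planar_pose(theta, phi):
    """Rotation about the y-axis and unit translation in the xz-plane."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
    t = np.array([np.sin(phi), 0.0, np.cos(phi)])
    return R, t

def brute_force_planar_pose(x1, x2, steps=90, tol=1e-3):
    """Evaluate every (theta, phi) on a grid and keep the lowest cost.
    Cost: sum of |x2^T E x1| truncated at `tol`, with E = [t]x R."""
    thetas = np.linspace(-np.pi, np.pi, steps, endpoint=False)
    phis = np.linspace(0.0, np.pi, steps, endpoint=False)  # E is defined up to sign
    best = (np.inf, None, None)
    for theta in thetas:
        for phi in phis:
            R, t = planar_pose(theta, phi)
            E = skew(t) @ R
            residuals = np.abs(np.einsum("ij,jk,ik->i", x2, E, x1))
            cost = float(np.sum(np.minimum(residuals, tol)))
            if cost < best[0]:
                best = (cost, theta, phi)
    return best

# Synthetic demo: noise-free matches plus 8 gross outliers.
rng = np.random.default_rng(1)
true_theta = np.linspace(-np.pi, np.pi, 90, endpoint=False)[50]  # on the grid
true_phi = np.linspace(0.0, np.pi, 90, endpoint=False)[20]       # on the grid
R_true, t_true = planar_pose(true_theta, true_phi)
X = rng.uniform(-1.0, 1.0, (40, 3)) + np.array([0.0, 0.0, 5.0])  # points in front of camera 1
X2 = X @ R_true.T + t_true                                       # same points in camera 2
x1 = X / X[:, 2:3]
x2 = X2 / X2[:, 2:3]
x2[:8, :2] = rng.uniform(-1.0, 1.0, (8, 2))                      # corrupt 8 matches

best_cost, best_theta, best_phi = brute_force_planar_pose(x1, x2)
print(best_cost, best_theta, best_phi)
```

Each grid cell is evaluated independently of the others, which is what makes this kind of search straightforward to parallelize. The accuracy of the result is bounded by the grid spacing, so a finer grid trades run time for precision.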

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/5368358

author

Jiang, Fangyuan ^LU

supervisor

Fredrik Kahl ^LU

opponent

Professor Heikkilä, Janne, University of Oulu, Finland

publishing date

2015

type

Thesis

publication status

published

subject

keywords

Geometric Computer Vision, Low-rank Matrix Factorization, Relative Pose

defense location

Lecture hall MH:C, Centre for Mathematical Sciences, Sölvegatan 18, Lund University, Faculty of Engineering, LTH.

defense date

2015-06-04 13:15:00

language

English

LU publication?

yes

id

e3086f32-1dbd-4af0-89f1-13235fa2aff4 (old id 5368358)

date added to LUP

2016-04-04 09:38:52

date last changed

2025-04-04 14:53:55

@phdthesis{e3086f32-1dbd-4af0-89f1-13235fa2aff4,
  abstract     = {{This thesis is focused on geometric computer vision problems. The first part of the thesis aims at solving one fundamental problem, namely low-rank matrix factorization. We provide several novel insights into the problem. In brief, we characterize, generate, parametrize and solve the minimal problems associated with low-rank matrix factorization. Beyond that, we give several new algorithms based on the minimal solvers when the measurement matrix is either sparse, noisy or with outliers. The cost function and the algorithm can easily be adapted to several robust norms, for example, the L1-norm and the truncated L1-norm. We demonstrate our approach on several geometric computer vision problems. Another application is in sensor network calibration, which is also explored.

The second part of the thesis deals with the relative pose problem. We solve the minimal problem of estimating the relative pose with unknown focal length and radial distortion. Beyond that, we also propose a brute force approach, which does not suffer from common algorithmic degeneracies. Further, the algorithm achieves a globally optimal solution up to a discretization error and it is easily parallelizable. Finally, we look into the problem of object detection with unknown pose.}},
  author       = {{Jiang, Fangyuan}},
  keywords     = {{Geometric Computer Vision; Low-rank Matrix Factorization; Relative Pose}},
  language     = {{eng}},
  school       = {{Lund University}},
  title        = {{Low Rank Matrix Factorization and Relative Pose Problems in Computer Vision}},
  url          = {{https://lup.lub.lu.se/search/files/5379996/5368400.pdf}},
  year         = {{2015}},
}
