ROBUST ROTATION AND TRANSLATION ESTIMATION IN STRUCTURE FROM MOTION
(2016) Abstract
 Due to advances in technology the amount of images from portable cameras has increased tremendously in recent years. Nowadays, most new mobile phones and cars have multiple cameras. Drones and other robots are often equipped with cameras as well. To be able to capture an image is not the same as to understand the content of it. Computer vision deals with the task of giving machines the ability to make sense of images like humans do.
Animals have no trouble of navigating in the world using visual information but this is still a problem for robots. The subfield of computer vision that builds 3D models from 2D images is called structure from motion.
Structure from motion is a very complex problem, and the best known techniques... (More)  Due to advances in technology the amount of images from portable cameras has increased tremendously in recent years. Nowadays, most new mobile phones and cars have multiple cameras. Drones and other robots are often equipped with cameras as well. To be able to capture an image is not the same as to understand the content of it. Computer vision deals with the task of giving machines the ability to make sense of images like humans do.
Animals have no trouble of navigating in the world using visual information but this is still a problem for robots. The subfield of computer vision that builds 3D models from 2D images is called structure from motion.
Structure from motion is a very complex problem, and the best known techniques use a pipeline structure that solves many subproblems in a sequence. Two of the papers in this thesis focus on the relative pose problem which is solved in most pipelines. In the relative pose step, the camera pose is estimated between two images at a time and the results are then used to construct a complete camera trajectory.
In the first paper we address the problem of merging relative poses.
Each relative pose has information about the relative position and rotation between two images. The paper focuses on the so called multiple rotation averaging problem, where the optimal set of camera rotations is calculated from relative ones. This is an overdetermined problem that often contains bad estimates of some of the pairwise rotations. We construct an algorithm based on Lagrangian duality that also can be used to verify if a solution is globally optimal.
We show by experiments that the algorithm finds the optimal solution for both real and synthetic data, if the amount of noise is reasonable.
In the second paper we focus on the special case of relative pose, where the orientation of the two cameras is known beforehand, by other sensors or previous calculations. The problem is to calculate the relative location of the cameras and we provide two different algorithms that have optimality guarantees. The first algorithm has a low order polynomial time complexity independent of the number of outliers among the set of corresponding 2D points. The second algorithm is a branch and bound approach that is orders of magnitudes faster in practice. We evaluate the algorithms on real and synthetic data and compare it to standard RANSAC, which in some hard instances deteriorates compared to our optimal methods.
The third paper is similar to the second paper, but differs in one fundamental way. We solve the pairwise relative pose with unknown correspondences between the images. This means that we estimate the epipolar geometry and the corresponding points in one step instead of two. The method has optimality guarantees and is built on branch and bound together with bipartite matching. In experiments we show that this procedure can solve harder problems and that more inlier correspondences can be obtained by optimizing over pose and correspondences simultaneously instead of in two steps.
The fourth paper is about matching images to a given 3D model (absolute pose). In this paper we use a novel outlier rejection scheme for the case of known rotation. Our method can identify outlier correspondences between the 3D and 2D points in a fast and efficient way. When most of the outliers are removed the remaining problems are easier to solve. In the paper we evaluate the method on real and synthetic data and the results show improved performance compared to RANSAC.
(Less)
Please use this url to cite or link to this publication:
http://lup.lub.lu.se/record/cab5fafcb4ea4b9991e209f3001a9e8d
 author
 Fredriksson, Johan ^{LU}
 supervisor

 Fredrik Kahl ^{LU}
 Carl Olsson ^{LU}
 opponent

 Professor Pajdla, Tomás, Czech Technical University, Czech Republic
 organization
 publishing date
 2016
 type
 Thesis
 publication status
 published
 subject
 keywords
 computer vision, image analysis
 pages
 151 pages
 publisher
 Lund University, Faculty of Science, Centre for Mathematical Sciences, Mathematics
 defense location
 Lecture hall MH:C, Centre for Mathematical Sciences, Sölvegatan 18, Lund University, Faculty of Engineering LTH, Lund
 defense date
 20161209 13:15
 ISBN
 9789177530510
 9789177530503
 language
 English
 LU publication?
 yes
 id
 cab5fafcb4ea4b9991e209f3001a9e8d
 date added to LUP
 20161107 10:31:20
 date last changed
 20161116 11:16:51
@phdthesis{cab5fafcb4ea4b9991e209f3001a9e8d, abstract = {Due to advances in technology the amount of images from portable cameras has increased tremendously in recent years. Nowadays, most new mobile phones and cars have multiple cameras. Drones and other robots are often equipped with cameras as well. To be able to capture an image is not the same as to understand the content of it. Computer vision deals with the task of giving machines the ability to make sense of images like humans do. <br/>Animals have no trouble of navigating in the world using visual information but this is still a problem for robots. The subfield of computer vision that builds 3D models from 2D images is called structure from motion. <br/><br/>Structure from motion is a very complex problem, and the best known techniques use a pipeline structure that solves many subproblems in a sequence. Two of the papers in this thesis focus on the relative pose problem which is solved in most pipelines. In the relative pose step, the camera pose is estimated between two images at a time and the results are then used to construct a complete camera trajectory.<br/><br/>In the first paper we address the problem of merging relative poses. <br/>Each relative pose has information about the relative position and rotation between two images. The paper focuses on the so called multiple rotation averaging problem, where the optimal set of camera rotations is calculated from relative ones. This is an overdetermined problem that often contains bad estimates of some of the pairwise rotations. We construct an algorithm based on Lagrangian duality that also can be used to verify if a solution is globally optimal.<br/>We show by experiments that the algorithm finds the optimal solution for both real and synthetic data, if the amount of noise is reasonable.<br/><br/>In the second paper we focus on the special case of relative pose, where the orientation of the two cameras is known beforehand, by other sensors or previous calculations. The problem is to calculate the relative location of the cameras and we provide two different algorithms that have optimality guarantees. The first algorithm has a low order polynomial time complexity independent of the number of outliers among the set of corresponding 2D points. The second algorithm is a branch and bound approach that is orders of magnitudes faster in practice. We evaluate the algorithms on real and synthetic data and compare it to standard RANSAC, which in some hard instances deteriorates compared to our optimal methods. <br/><br/>The third paper is similar to the second paper, but differs in one fundamental way. We solve the pairwise relative pose with unknown correspondences between the images. This means that we estimate the epipolar geometry and the corresponding points in one step instead of two. The method has optimality guarantees and is built on branch and bound together with bipartite matching. In experiments we show that this procedure can solve harder problems and that more inlier correspondences can be obtained by optimizing over pose and correspondences simultaneously instead of in two steps.<br/><br/>The fourth paper is about matching images to a given 3D model (absolute pose). In this paper we use a novel outlier rejection scheme for the case of known rotation. Our method can identify outlier correspondences between the 3D and 2D points in a fast and efficient way. When most of the outliers are removed the remaining problems are easier to solve. In the paper we evaluate the method on real and synthetic data and the results show improved performance compared to RANSAC.<br/>}, author = {Fredriksson, Johan}, isbn = {9789177530510}, keyword = {computer vision,image analysis}, language = {eng}, pages = {151}, publisher = {Lund University, Faculty of Science, Centre for Mathematical Sciences, Mathematics}, school = {Lund University}, title = {ROBUST ROTATION AND TRANSLATION ESTIMATION IN STRUCTURE FROM MOTION}, year = {2016}, }