Lund University Publications

LUND UNIVERSITY LIBRARIES

Pixel-Perfect Structure-From-Motion With Featuremetric Refinement

Sarlin, Paul Edouard ; Lindenberger, Philipp ; Larsson, Viktor LU and Pollefeys, Marc (2025) In IEEE Transactions on Pattern Analysis and Machine Intelligence 47(5). p.3298-3309
Abstract

Finding local features that are repeatable across multiple views is a cornerstone of sparse 3D reconstruction. The classical image matching paradigm detects keypoints per-image once and for all, which can yield poorly-localized features and propagate large errors to the final geometry. In this article, we refine two key steps of structure-from-motion by a direct alignment of low-level image information from multiple views: we first adjust the initial keypoint locations prior to any geometric estimation, and subsequently refine points and camera poses as a post-processing. This refinement is robust to large detection noise and appearance changes, as it optimizes a featuremetric error based on dense features predicted by a neural network. This significantly improves the accuracy of camera poses and scene geometry for a wide range of keypoint detectors, challenging viewing conditions, and off-the-shelf deep features. Our system easily scales to large image collections, enabling pixel-perfect crowd-sourced localization at scale.
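
The keypoint adjustment described in the abstract can be illustrated with a deliberately simplified toy, which is not the authors' implementation. It assumes a single image whose dense feature map is synthetic (sinusoids rather than the output of a neural network), a fixed reference descriptor, and a plain Gauss-Newton solver on a squared feature difference. All function names (`sample_bilinear`, `refine_keypoint`, `make_synthetic_features`) and all parameter values are hypothetical and chosen only for this sketch.

```python
import numpy as np


def sample_bilinear(feats, p):
    """Interpolate feats (H, W, C) at p = (x, y); return value and (C, 2) Jacobian."""
    h, w, _ = feats.shape
    # Clamp so that the 2x2 neighbourhood always lies inside the map.
    x = float(np.clip(p[0], 0.0, w - 1.001))
    y = float(np.clip(p[1], 0.0, h - 1.001))
    x0, y0 = int(x), int(y)
    wx, wy = x - x0, y - y0
    f00, f01 = feats[y0, x0], feats[y0, x0 + 1]
    f10, f11 = feats[y0 + 1, x0], feats[y0 + 1, x0 + 1]
    top = (1 - wx) * f00 + wx * f01
    bottom = (1 - wx) * f10 + wx * f11
    value = (1 - wy) * top + wy * bottom
    d_dx = (1 - wy) * (f01 - f00) + wy * (f11 - f10)
    d_dy = bottom - top
    return value, np.stack([d_dx, d_dy], axis=1)


def refine_keypoint(feats, f_ref, p_init, num_iters=20, damping=1e-9):
    """Gauss-Newton minimisation of ||feats(p) - f_ref||^2 over the 2D location p."""
    p = np.asarray(p_init, dtype=float).copy()
    for _ in range(num_iters):
        value, jac = sample_bilinear(feats, p)
        residual = value - f_ref
        # Tiny damping keeps the 2x2 system solvable in flat regions.
        hessian = jac.T @ jac + damping * np.eye(2)
        step = -np.linalg.solve(hessian, jac.T @ residual)
        p += step
        if np.linalg.norm(step) < 1e-8:
            break
    return p


def make_synthetic_features(size=32, freq=0.15):
    """Assumption: smooth sinusoidal channels stand in for learned dense features."""
    ys, xs = np.mgrid[0:size, 0:size].astype(float)
    return np.stack(
        [np.sin(freq * xs), np.cos(freq * xs), np.sin(freq * ys), np.cos(freq * ys)],
        axis=-1,
    )


feats = make_synthetic_features()
true_p = np.array([12.3, 17.6])             # location the detector should have found
f_ref, _ = sample_bilinear(feats, true_p)   # reference descriptor (e.g. from another view)
noisy_p = true_p + np.array([1.5, -1.0])    # simulated detection noise
refined_p = refine_keypoint(feats, f_ref, noisy_p)
print("initial error:", np.linalg.norm(noisy_p - true_p))
print("refined error:", np.linalg.norm(refined_p - true_p))
```

In this toy the reference descriptor is sampled from the same map, so the cost has an exact zero at the true location. A real pipeline would compare features across several images with differing appearance and would typically need a robust loss and a multi-view formulation.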

author
Sarlin, Paul Edouard ; Lindenberger, Philipp ; Larsson, Viktor and Pollefeys, Marc
organization
publishing date
type
Contribution to journal
publication status
published
subject
keywords
Bundle adjustment, feature matching, featuremetric optimization, structure-from-motion, visual localization
in
IEEE Transactions on Pattern Analysis and Machine Intelligence
volume
47
issue
5
pages
12 pages
publisher
IEEE - Institute of Electrical and Electronics Engineers Inc.
external identifiers
  • scopus:105002984488
  • pmid:37021895
ISSN
0162-8828
DOI
10.1109/TPAMI.2023.3237269
language
English
LU publication?
yes
id
3336ceae-3da4-4656-a4cc-221aa5f62130
date added to LUP
2025-08-29 14:20:14
date last changed
2025-09-26 19:49:48
@article{3336ceae-3da4-4656-a4cc-221aa5f62130,
  abstract     = {{Finding local features that are repeatable across multiple views is a cornerstone of sparse 3D reconstruction. The classical image matching paradigm detects keypoints per-image once and for all, which can yield poorly-localized features and propagate large errors to the final geometry. In this article, we refine two key steps of structure-from-motion by a direct alignment of low-level image information from multiple views: we first adjust the initial keypoint locations prior to any geometric estimation, and subsequently refine points and camera poses as a post-processing. This refinement is robust to large detection noise and appearance changes, as it optimizes a featuremetric error based on dense features predicted by a neural network. This significantly improves the accuracy of camera poses and scene geometry for a wide range of keypoint detectors, challenging viewing conditions, and off-the-shelf deep features. Our system easily scales to large image collections, enabling pixel-perfect crowd-sourced localization at scale.}},
  author       = {{Sarlin, Paul Edouard and Lindenberger, Philipp and Larsson, Viktor and Pollefeys, Marc}},
  issn         = {{0162-8828}},
  keywords     = {{Bundle adjustment; feature matching; featuremetric optimization; structure-from-motion; visual localization}},
  language     = {{eng}},
  number       = {{5}},
  pages        = {{3298--3309}},
  publisher    = {{IEEE - Institute of Electrical and Electronics Engineers Inc.}},
  series       = {{IEEE Transactions on Pattern Analysis and Machine Intelligence}},
  title        = {{Pixel-Perfect Structure-From-Motion With Featuremetric Refinement}},
  url          = {{http://dx.doi.org/10.1109/TPAMI.2023.3237269}},
  doi          = {{10.1109/TPAMI.2023.3237269}},
  volume       = {{47}},
  year         = {{2025}},
}