Pixel-Perfect Structure-From-Motion With Featuremetric Refinement
(2025) In IEEE Transactions on Pattern Analysis and Machine Intelligence 47(5). p.3298-3309
- Abstract
Finding local features that are repeatable across multiple views is a cornerstone of sparse 3D reconstruction. The classical image matching paradigm detects keypoints per-image once and for all, which can yield poorly-localized features and propagate large errors to the final geometry. In this article, we refine two key steps of structure-from-motion by a direct alignment of low-level image information from multiple views: we first adjust the initial keypoint locations prior to any geometric estimation, and subsequently refine points and camera poses as a post-processing. This refinement is robust to large detection noise and appearance changes, as it optimizes a featuremetric error based on dense features predicted by a neural network. This significantly improves the accuracy of camera poses and scene geometry for a wide range of keypoint detectors, challenging viewing conditions, and off-the-shelf deep features. Our system easily scales to large image collections, enabling pixel-perfect crowd-sourced localization at scale.
- author
- Sarlin, Paul Edouard; Lindenberger, Philipp; Larsson, Viktor LU and Pollefeys, Marc
- organization
- publishing date
- 2025
- type
- Contribution to journal
- publication status
- published
- subject
- keywords
- Bundle adjustment, feature matching, featuremetric optimization, structure-from-motion, visual localization
- in
- IEEE Transactions on Pattern Analysis and Machine Intelligence
- volume
- 47
- issue
- 5
- pages
- 12 pages
- publisher
- IEEE - Institute of Electrical and Electronics Engineers Inc.
- external identifiers
- scopus:105002984488
- pmid:37021895
- ISSN
- 0162-8828
- DOI
- 10.1109/TPAMI.2023.3237269
- language
- English
- LU publication?
- yes
- id
- 3336ceae-3da4-4656-a4cc-221aa5f62130
- date added to LUP
- 2025-08-29 14:20:14
- date last changed
- 2025-09-26 19:49:48
@article{3336ceae-3da4-4656-a4cc-221aa5f62130,
  abstract  = {{Finding local features that are repeatable across multiple views is a cornerstone of sparse 3D reconstruction. The classical image matching paradigm detects keypoints per-image once and for all, which can yield poorly-localized features and propagate large errors to the final geometry. In this article, we refine two key steps of structure-from-motion by a direct alignment of low-level image information from multiple views: we first adjust the initial keypoint locations prior to any geometric estimation, and subsequently refine points and camera poses as a post-processing. This refinement is robust to large detection noise and appearance changes, as it optimizes a featuremetric error based on dense features predicted by a neural network. This significantly improves the accuracy of camera poses and scene geometry for a wide range of keypoint detectors, challenging viewing conditions, and off-the-shelf deep features. Our system easily scales to large image collections, enabling pixel-perfect crowd-sourced localization at scale.}},
  author    = {Sarlin, Paul Edouard and Lindenberger, Philipp and Larsson, Viktor and Pollefeys, Marc},
  issn      = {{0162-8828}},
  journal   = {{IEEE Transactions on Pattern Analysis and Machine Intelligence}},
  keywords  = {{Bundle adjustment; feature matching; featuremetric optimization; structure-from-motion; visual localization}},
  language  = {{eng}},
  number    = {{5}},
  pages     = {{3298--3309}},
  publisher = {{IEEE - Institute of Electrical and Electronics Engineers Inc.}},
  title     = {{Pixel-Perfect Structure-From-Motion With Featuremetric Refinement}},
  url       = {{http://dx.doi.org/10.1109/TPAMI.2023.3237269}},
  doi       = {{10.1109/TPAMI.2023.3237269}},
  volume    = {{47}},
  year      = {{2025}},
}