Lund University Publications

Design of an Application-specific VLIW Vector Processor for ORB Feature Extraction

Ferreira, Lucas LU; Malkowsky, Steffen LU; Persson, Patrik LU; Karlsson, Sven; Åström, Kalle LU and Liu, Liang LU (2023) In Journal of Signal Processing Systems 95(7). p.863-875
Abstract

In computer vision, feature extraction algorithms compress an image into a sparse set of trackable keypoints, enabling navigation-critical systems such as Simultaneous Localization And Mapping (SLAM) in autonomous robots, as well as other applications such as augmented reality and 3D reconstruction. Most of these applications run on battery-powered devices that share a very stringent power budget. Near-to-sensor computing of feature extraction algorithms allows for several design optimizations: first, the overall on-chip memory requirements can be reduced, and second, the internal data movement can be minimized. This work explores the use of an Application Specific Instruction Set Processor (ASIP) dedicated to performing feature extraction in a real-time and energy-efficient manner. The ASIP features a Very Long Instruction Word (VLIW) architecture comprising one RV32I RISC-V slot and three vector slots. The on-chip memory sub-system implements parallel multi-bank memories with near-memory data shuffling to enable single-cycle multi-pattern vector access. Oriented FAST and Rotated BRIEF (ORB) is thoroughly explored to validate the proposed architecture, achieving a throughput of 140 Frames-Per-Second (FPS) for VGA images for one scale, while reducing the number of memory accesses by 2 orders of magnitude compared to other embedded general-purpose architectures.
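
The abstract's "multi-bank memories with near-memory data shuffling" can be illustrated with a generic, textbook-style skewed bank mapping. The sketch below is not taken from the paper: the bank count, the `(row + col) mod N` skew, and all function names are assumptions chosen only to show why a skewed layout lets both a row vector and a column vector be read in one cycle, and what ordering a shuffle network would then have to undo.

```python
# Hypothetical illustration of conflict-free multi-bank vector access.
# N_BANKS and the skew function are assumptions, not the paper's design.
N_BANKS = 8


def bank_linear(row, col, n_banks=N_BANKS):
    """Naive mapping: the bank depends on the column only."""
    return col % n_banks


def bank_skewed(row, col, n_banks=N_BANKS):
    """Skewed mapping: each row is rotated by one bank relative to the previous row."""
    return (row + col) % n_banks


def row_pattern(row, col, n=N_BANKS):
    """Coordinates of n horizontally adjacent pixels starting at (row, col)."""
    return [(row, col + i) for i in range(n)]


def col_pattern(row, col, n=N_BANKS):
    """Coordinates of n vertically adjacent pixels starting at (row, col)."""
    return [(row + i, col) for i in range(n)]


def is_conflict_free(pattern, bank_fn):
    """True if every element of the pattern lives in a different bank,
    so all elements could be fetched in the same cycle."""
    banks = [bank_fn(r, c) for r, c in pattern]
    return len(set(banks)) == len(banks)


def lane_to_bank(pattern, bank_fn):
    """For each vector lane, the bank that supplies it. A shuffle stage
    next to the memory would use this permutation to reorder bank outputs."""
    return [bank_fn(r, c) for r, c in pattern]


if __name__ == "__main__":
    for name, fn in (("linear", bank_linear), ("skewed", bank_skewed)):
        print(name,
              "row ok:", is_conflict_free(row_pattern(3, 5), fn),
              "col ok:", is_conflict_free(col_pattern(3, 5), fn))
    print("lane -> bank:", lane_to_bank(col_pattern(3, 5), bank_skewed))
```

With the naive mapping a column access lands entirely in one bank and would need one cycle per element; with the skewed mapping both patterns touch every bank exactly once, at the cost of a rotation that the shuffle stage must reverse.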

type
Contribution to journal
publication status
published
subject
keywords
ASIP, Feature extraction, ORB, Vision-based SLAM
in
Journal of Signal Processing Systems
volume
95
issue
7
pages
863 - 875
publisher
Springer
external identifiers
  • scopus:85147007188
ISSN
1939-8018
DOI
10.1007/s11265-022-01833-9
language
English
LU publication?
yes
id
1d39bd4e-ab21-47a3-be19-fd60a5609995
date added to LUP
2023-02-13 12:39:56
date last changed
2024-03-21 17:54:31
@article{1d39bd4e-ab21-47a3-be19-fd60a5609995,
  abstract     = {{In computer-vision feature extraction algorithms, compressing the image into a sparse set of trackable keypoints, empowers navigation-critical systems such as Simultaneous Localization And Mapping (SLAM) in autonomous robots, and also other applications such as augmented reality and 3D reconstruction. Most of those applications are performed in battery-powered gadgets featuring in common a very stringent power-budget. Near-to-sensor computing of feature extraction algorithms allows for several design optimizations. First, the overall on-chip memory requirements can be lessened, and second, the internal data movement can be minimized. This work explores the usage of an Application Specific Instruction Set Processor (ASIP) dedicated to perform feature extraction in a real-time and energy-efficient manner. The ASIP features a Very Long Instruction Word (VLIW) architecture comprising one RV32I RISC-V and three vector slots. The on-chip memory sub-system implements parallel multi-bank memories with near-memory data shuffling to enable single-cycle multi-pattern vector access. Oriented FAST and Rotated BRIEF (ORB) are thoroughly explored to validate the proposed architecture, achieving a throughput of 140 Frames-Per-Second (FPS) for VGA images for one scale, while reducing the number of memory accesses by 2 orders of magnitude as compared to other embedded general-purpose architectures.}},
  author       = {{Ferreira, Lucas and Malkowsky, Steffen and Persson, Patrik and Karlsson, Sven and Åström, Kalle and Liu, Liang}},
  issn         = {{1939-8018}},
  keywords     = {{ASIP; Feature extraction; ORB; Vision-based SLAM}},
  language     = {{eng}},
  number       = {{7}},
  pages        = {{863--875}},
  publisher    = {{Springer}},
  journal      = {{Journal of Signal Processing Systems}},
  title        = {{Design of an Application-specific VLIW Vector Processor for ORB Feature Extraction}},
  url          = {{http://dx.doi.org/10.1007/s11265-022-01833-9}},
  doi          = {{10.1007/s11265-022-01833-9}},
  volume       = {{95}},
  year         = {{2023}},
}