Design of an Application-specific VLIW Vector Processor for ORB Feature Extraction
(2023) In Journal of Signal Processing Systems 95(7). p.863-875
- Abstract
In computer-vision feature extraction algorithms, compressing the image into a sparse set of trackable keypoints, empowers navigation-critical systems such as Simultaneous Localization And Mapping (SLAM) in autonomous robots, and also other applications such as augmented reality and 3D reconstruction. Most of those applications are performed in battery-powered gadgets featuring in common a very stringent power-budget. Near-to-sensor computing of feature extraction algorithms allows for several design optimizations. First, the overall on-chip memory requirements can be lessened, and second, the internal data movement can be minimized. This work explores the usage of an Application Specific Instruction Set Processor (ASIP) dedicated to perform feature extraction in a real-time and energy-efficient manner. The ASIP features a Very Long Instruction Word (VLIW) architecture comprising one RV32I RISC-V and three vector slots. The on-chip memory sub-system implements parallel multi-bank memories with near-memory data shuffling to enable single-cycle multi-pattern vector access. Oriented FAST and Rotated BRIEF (ORB) are thoroughly explored to validate the proposed architecture, achieving a throughput of 140 Frames-Per-Second (FPS) for VGA images for one scale, while reducing the number of memory accesses by 2 orders of magnitude as compared to other embedded general-purpose architectures.
- author
- Ferreira, Lucas LU ; Malkowsky, Steffen LU ; Persson, Patrik LU ; Karlsson, Sven ; Åström, Kalle LU and Liu, Liang LU
- organization
- LTH Profile Area: AI and Digitalization
- Integrated Electronic Systems (research group)
- LTH Profile Area: Nanoscience and Semiconductor Technology
- ELLIIT: the Linköping-Lund initiative on IT and mobile communication
- Mathematics (Faculty of Engineering)
- LTH Profile Area: Engineering Health
- eSSENCE: The e-Science Collaboration
- Mathematical Imaging Group (research group)
- publishing date
- 2023
- type
- Contribution to journal
- publication status
- published
- subject
- keywords
- ASIP, Feature extraction, ORB, Vision-based SLAM
- in
- Journal of Signal Processing Systems
- volume
- 95
- issue
- 7
- pages
- 863 - 875
- publisher
- Springer
- external identifiers
- scopus:85147007188
- ISSN
- 1939-8018
- DOI
- 10.1007/s11265-022-01833-9
- language
- English
- LU publication?
- yes
- id
- 1d39bd4e-ab21-47a3-be19-fd60a5609995
- date added to LUP
- 2023-02-13 12:39:56
- date last changed
- 2024-03-21 17:54:31
@article{1d39bd4e-ab21-47a3-be19-fd60a5609995,
  abstract  = {{In computer-vision feature extraction algorithms, compressing the image into a sparse set of trackable keypoints, empowers navigation-critical systems such as Simultaneous Localization And Mapping (SLAM) in autonomous robots, and also other applications such as augmented reality and 3D reconstruction. Most of those applications are performed in battery-powered gadgets featuring in common a very stringent power-budget. Near-to-sensor computing of feature extraction algorithms allows for several design optimizations. First, the overall on-chip memory requirements can be lessened, and second, the internal data movement can be minimized. This work explores the usage of an Application Specific Instruction Set Processor (ASIP) dedicated to perform feature extraction in a real-time and energy-efficient manner. The ASIP features a Very Long Instruction Word (VLIW) architecture comprising one RV32I RISC-V and three vector slots. The on-chip memory sub-system implements parallel multi-bank memories with near-memory data shuffling to enable single-cycle multi-pattern vector access. Oriented FAST and Rotated BRIEF (ORB) are thoroughly explored to validate the proposed architecture, achieving a throughput of 140 Frames-Per-Second (FPS) for VGA images for one scale, while reducing the number of memory accesses by 2 orders of magnitude as compared to other embedded general-purpose architectures.}},
  author    = {{Ferreira, Lucas and Malkowsky, Steffen and Persson, Patrik and Karlsson, Sven and Åström, Kalle and Liu, Liang}},
  issn      = {{1939-8018}},
  keywords  = {{ASIP; Feature extraction; ORB; Vision-based SLAM}},
  language  = {{eng}},
  number    = {{7}},
  pages     = {{863--875}},
  publisher = {{Springer}},
  journal   = {{Journal of Signal Processing Systems}},
  title     = {{Design of an Application-specific VLIW Vector Processor for ORB Feature Extraction}},
  url       = {{http://dx.doi.org/10.1007/s11265-022-01833-9}},
  doi       = {{10.1007/s11265-022-01833-9}},
  volume    = {{95}},
  year      = {{2023}},
}