Lund University Publications

NMCu-CNN : a scalable near-memory computing co-processor on a RISC-V MCU in 22 nm FDSOI

Prieto, Arturo LU ; Nouripayam, Masoud LU ; Westring, Kristoffer LU ; Castillo Mohedano, Sergio LU ; Allfjord, Alex LU ; Svensson, Linus LU ; Andersson, Per LU and Rodrigues, Joachim LU (2025)
Abstract
This paper presents NMCu-CNN, a near-memory computing (NMC) co-processor for the hardware acceleration of convolutional neural networks (CNNs) on a low-power, flexible MCU platform. The scalable architecture and the selected platform enable adaptability to the rapidly evolving Edge AIoT landscape, where energy and performance requirements are constrained. Application-tailored NMC units, equipped with an optimized CNN dataflow, are integrated into the shared memory address space of a RISC-V-based MCU. The proposed architecture supports highly flexible runtime configurability, achieving 94% computational efficiency. Fabricated in 22 nm FDSOI technology, NMCu-CNN delivers a performance of 203 GOPS and an energy efficiency of 1716 GOPS/W (1.7 TOPS/W) benchmarked on convolutional layers of a CNN model, outperforming the processing capabilities of other state-of-the-art techniques.
author
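Prieto, Arturo; Nouripayam, Masoud; Westring, Kristoffer; Castillo Mohedano, Sergio; Allfjord, Alex; Svensson, Linus; Andersson, Per and Rodrigues, Joachim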
publishing date
2025
type
Contribution to conference
publication status
published
pages
5 pages
DOI
10.1109/SiPS66314.2025.11261229
language
English
LU publication?
yes
id
65fa6acd-ab06-4bd0-8405-89863f1d4fe8
date added to LUP
2025-10-28 19:11:07
date last changed
2025-12-09 09:59:48
@misc{65fa6acd-ab06-4bd0-8405-89863f1d4fe8,
  abstract     = {{This paper presents NMCu-CNN, a near-memory computing (NMC) co-processor for the hardware acceleration of convolutional neural networks (CNNs) on a low-power, flexible MCU platform. The scalable architecture and the selected platform enable adaptability to the rapidly evolving Edge AIoT landscape, where energy and performance requirements are constrained. Application-tailored NMC units, equipped with an optimized CNN dataflow, are integrated into the shared memory address space of a RISC-V-based MCU. The proposed architecture supports highly flexible runtime configurability, achieving $94 \%$ computational efficiency. Fabricated in 22 nm FDSOI technology, NMCu-CNN delivers a performance of 203 GOPS and an energy efficiency of 1716 GOPS/W (1.7 TOPS/W) benchmarked on convolutional layers of a CNN model, outperforming the processing capabilities of other state-of-the-art techniques.}},
  author       = {{Prieto, Arturo and Nouripayam, Masoud and Westring, Kristoffer and Castillo Mohedano, Sergio and Allfjord, Alex and Svensson, Linus and Andersson, Per and Rodrigues, Joachim}},
  language     = {{eng}},
  month        = {{11}},
  title        = {{NMCu-CNN : a scalable near-memory computing co-processor on a RISC-V MCU in 22 nm FDSOI}},
  url          = {{http://dx.doi.org/10.1109/SiPS66314.2025.11261229}},
  doi          = {{10.1109/SiPS66314.2025.11261229}},
  year         = {{2025}},
}