
LUP Student Papers

LUND UNIVERSITY LIBRARIES

Low-power Acceleration of Convolutional Neural Networks using Near Memory Computing on a RISC-V SoC

Westring, Kristoffer and Svensson, Linus (2023) EITM01 20231
Department of Electrical and Information Technology
Abstract
The recent surge of interest in artificial intelligence, partly fueled by language models such as ChatGPT, is pushing the demand for machine learning and data processing in everyday applications, such as self-driving cars, where low latency is crucial and typically achieved through edge computing. The vast amount of data processing required intensifies the existing performance bottleneck of data movement. As a result, reducing data movement and allowing for better data reuse can significantly improve efficiency.
Processing data as closely to the memory as possible, commonly known as near-memory computing, increases power efficiency and can significantly reduce the data-movement bottleneck. However, maintaining low power consumption while processing large amounts of data is a challenge. The RISC-V Instruction Set Architecture (ISA) was designed for efficient and dense instruction encoding, enabling lower power consumption and faster execution. Extending the simple RISC-V ISA with instructions specific to applications like image recognition can make a processor energy-efficient but less versatile than a conventional CISC processor. Codasip, a company specializing in RISC-V processors, offers a toolset for exploring and customizing processor architectures through its proprietary C-based hardware description language, CodAl, which is used to generate the SDK, HDL, and UVM environment within Codasip Studio. Codasip provides a selection of fully configurable RISC-V cores, tailored for either low-power or high-performance applications.
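As an illustrative sketch of how such an ISA extension is typically exposed to software (not the instructions developed in this thesis), a custom R-type instruction can be invoked from C through the binutils `.insn` directive; the opcode, funct fields, and "dot4" semantics below are assumptions chosen for illustration:

#include <stdint.h>

/* Hypothetical "dot4" instruction: multiplies four packed int8 lanes
 * of rs1 and rs2 and returns the sum of products in rd. The encoding
 * uses the custom-0 opcode space (0x0B) reserved by the RISC-V spec;
 * both encoding and semantics are illustrative assumptions. */
static inline int32_t dot4(uint32_t a4, uint32_t b4)
{
    int32_t result;
    /* ".insn r opcode, funct3, funct7, rd, rs1, rs2" lets GCC/Clang
     * emit the custom instruction without a dedicated mnemonic. */
    __asm__ volatile(".insn r 0x0B, 0x0, 0x00, %0, %1, %2"
                     : "=r"(result)
                     : "r"(a4), "r"(b4));
    return result;
}

/* Accumulating a convolution partial sum four weights at a time. */
int32_t conv_row(const uint32_t *pixels4, const uint32_t *weights4, int n)
{
    int32_t acc = 0;
    for (int i = 0; i < n; i++)
        acc += dot4(pixels4[i], weights4[i]);
    return acc;
}

A fused multiply-accumulate of packed lanes like this replaces four loads, four multiplies, and three adds per cycle, which is where the energy-efficiency gain for image-recognition kernels would come from.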
In this thesis we use a combination of high-level synthesis tools and EDA software to simplify design space exploration of accelerators, allowing the accelerators to be integrated as Near Memory Computing (NMC) accelerators on a customized RISC-V System on Chip (SoC), for both Application Specific Integrated Circuits (ASIC) and Field Programmable Gate Arrays (FPGA). The flow includes the implementation of custom instructions as well as a generic flow from Register Transfer Level (RTL) to GDSII for reuse in future work.
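To make the design-space-exploration step concrete, the kind of kernel handed to a high-level synthesis tool could look like the following plain-C 2D convolution; the function name, the fixed 32x32 input and 3x3 kernel sizes are illustrative assumptions, not the accelerator implemented in the thesis:

#include <stdint.h>

#define K 3  /* kernel width/height, fixed so an HLS tool can fully unroll */

/* Hypothetical HLS-style single-channel 2D convolution. Constant loop
 * bounds and simple array indexing let the tool explore unrolling,
 * pipelining, and memory-partitioning trade-offs automatically. */
void conv2d(const int8_t in[32][32], const int8_t w[K][K],
            int32_t out[30][30])
{
    for (int r = 0; r < 30; r++) {
        for (int c = 0; c < 30; c++) {
            int32_t acc = 0;
            /* Inner loops have a constant trip count (K*K = 9),
             * a natural candidate for full unrolling in HLS. */
            for (int i = 0; i < K; i++)
                for (int j = 0; j < K; j++)
                    acc += in[r + i][c + j] * w[i][j];
            out[r][c] = acc;
        }
    }
}

Because the C source stays fixed while unroll and pipeline directives vary, each HLS configuration yields a different RTL candidate that the generic RTL-to-GDSII flow can then evaluate for area and power.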
Popular Abstract
In recent years, the development and adoption of artificial intelligence (AI) have increased rapidly. As the usage and accuracy of these models increase, so does the computation needed for inference. This creates a need for architectural innovation to reduce both power consumption and latency.
As machine learning becomes an increasingly integral part of our daily lives, the volume of data being processed in our devices is skyrocketing. Through the use of cloud computing, our devices can transmit and offload computationally heavy data to servers. However, this approach is not without its drawbacks.
In many instances, transferring data over the internet for computation at a remote server introduces latency. For real-time applications such as self-driving cars this delay can be problematic. Being able to perform the computation locally, commonly known as edge computing, is essential to many critical applications but introduces a new set of hurdles to overcome, namely performance and power efficiency. Computing large volumes of data in a battery-powered unit poses immense challenges compared to a server hall with a virtually limitless supply of power. Recent AI models, especially large language models like ChatGPT, consist of billions of parameters that must be moved between memory and processing units. Moving all of this data creates a bottleneck for efficient AI inference.

By shifting towards a more data-centric architecture like NMC, which involves moving data-intensive calculations closer to where the data originates, latency and power consumption can be reduced. Through strategic optimizations like these, NMC promises a more efficient and responsive computing architecture for data-intensive applications. Improving energy efficiency and processing power is crucial for the continued evolution of technology, and near-memory computing is one solution that has shown promising results in recent studies.
author
Westring, Kristoffer and Svensson, Linus
supervisor
organization
alternative title
Energieffektiv Acceleration av Neurala Nätverk genom Optimerad Minneshantering i RISC-V SoC (Energy-efficient Acceleration of Neural Networks through Optimized Memory Management in a RISC-V SoC)
course
EITM01 20231
year
type
H2 - Master's Degree (Two Years)
subject
keywords
FPGA, ASIC, Near Memory Computing, RISC-V, Convolutional Neural Network
report number
LU/LTH-EIT 2023-955
language
English
id
9140333
date added to LUP
2023-10-24 13:21:36
date last changed
2023-10-24 13:21:36