
LUP Student Papers

LUND UNIVERSITY LIBRARIES

ASIC implementation exploration for EfficientNet optimizations

Fondelius, Filip LU and Davidsson, Albin LU (2024) EITM01 20241
Department of Electrical and Information Technology
Abstract
Neural networks are effective at solving many complex human-level tasks. One type of neural network often used in imaging tasks is the convolutional neural network (CNN). CNNs can effectively solve problems in areas like image recognition, object detection, and classification. Using a CNN for one of these tasks carries a huge computational cost relative to a typical algorithm. While the processors running the CNNs can handle the computational load, they get extremely hot and draw a lot of power.
This thesis explores different hardware techniques for optimizing a fixed convolutional neural network to decrease its power consumption while keeping an acceptable output result. One technique is precision scaling, where the precision of weights and tensors is reduced to a lower number of bits. This reduces the number of bits used both in calculations and in data transfer. Another technique is approximate computing: approximate circuits are designed with fewer gates and therefore draw less power.
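To make precision scaling concrete, the sketch below shows symmetric uniform quantization of a tensor to a chosen bit width. This is a minimal illustration under our own assumptions (the function name and the use of NumPy are ours), not the implementation used in the thesis.

    import numpy as np

    def quantize(x, bits):
        # Symmetric uniform quantization to 'bits' bits: map values to
        # integers in [-(2^(bits-1) - 1), 2^(bits-1) - 1] and back, so the
        # returned tensor carries the rounding error that a
        # reduced-precision datapath would introduce.
        qmax = 2 ** (bits - 1) - 1
        scale = np.max(np.abs(x)) / qmax
        if scale == 0:
            return x.copy()
        q = np.clip(np.round(x / scale), -qmax, qmax)
        return q * scale

    w = np.array([0.02, -0.51, 0.77, -0.13])
    print(quantize(w, 8))  # close to the original values
    print(quantize(w, 4))  # visibly coarser after dropping 4 bits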
In this thesis, an algorithm and a simulator were created to choose at which locations bits should be removed. By introducing several error metrics, such as mean squared error and F1 score, the results could be compared across metrics and a near-optimal solution could be found.
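One plausible shape for such an algorithm is a greedy search over per-layer bit widths, as sketched below. The layer list, the evaluate_f1 simulator callback, and the 0.8 threshold (taken from the result reported later in this abstract) are illustrative assumptions, not the thesis code.

    def greedy_bit_reduction(layers, evaluate_f1, min_f1=0.8, min_bits=2):
        # Start every layer at 8 bits and repeatedly remove one bit from
        # the layer whose reduction hurts the F1 score the least, stopping
        # when no single further reduction keeps the score acceptable.
        bits = {layer: 8 for layer in layers}
        while True:
            best = None
            for layer in layers:
                if bits[layer] <= min_bits:
                    continue
                trial = dict(bits, **{layer: bits[layer] - 1})
                f1 = evaluate_f1(trial)  # one simulator run per candidate
                if f1 >= min_f1 and (best is None or f1 > best[1]):
                    best = (layer, f1)
            if best is None:
                return bits  # per-layer bit widths of the found solution
            bits[best[0]] -= 1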
With our precision scaling method, we were able to save almost half the energy while maintaining an acceptable F1 score of 0.8. When evaluating approximate multipliers for the same task, the results were more difficult to interpret: the approximate multipliers work well in some cases and significantly worse in others.
Our results show that hardware techniques such as precision scaling and approximate circuits can be used to reduce power consumption.
Popular Abstract
In recent years, AI has become a powerful tool for solving complex tasks, with many different use cases. This thesis uses a convolutional neural network that takes an input image and outputs a class for each pixel in the image. Each pixel is assigned to one of three classes, based on which class it most likely belongs to. This is called semantic segmentation. The problem with using neural networks is that they are computationally heavy, which leads to high power consumption.
There are many ways to reduce the power consumption of neural networks. This thesis focuses on precision scaling, where the normally 8-bit weights and input tensors are reduced to fewer bits. Since fewer bits have to be transported to memory and used in computations, less power is consumed. Lowering the precision of the weights and input tensors affects the accuracy of the result. We have developed a method that non-uniformly reduces the bit precision while maintaining an acceptable result. Our method was evaluated with a simulation program, written in Python and C++, which was also developed during this thesis.
The thesis also discusses the possibility of replacing the multipliers in parts of the neural network with approximate multipliers. Approximate multipliers are circuits that behave like normal multipliers but consume less power at the cost of inexact outputs. There are many different approximate multipliers with different levels of error and power consumption.
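As a toy illustration of this trade-off, the function below approximates an unsigned multiply by zeroing the low-order bits of each operand before multiplying, a crude stand-in for the partial-product truncation used in real approximate multiplier designs. It is our own example, not one of the multipliers evaluated in the thesis.

    def truncated_multiply(a, b, drop_bits=2):
        # Zero the low 'drop_bits' bits of each operand. In hardware this
        # removes the partial-product rows for those bits, saving gates
        # and power at the cost of a bounded output error.
        mask = ~((1 << drop_bits) - 1)
        return (a & mask) * (b & mask)

    print(truncated_multiply(200, 173))  # 34400, approximate
    print(200 * 173)                     # 34600, exact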
Using only precision scaling, we were able to cut the power consumption in half.
author: Fondelius, Filip LU and Davidsson, Albin LU
course: EITM01 20241
year: 2024
type: H2 - Master's Degree (Two Years)
report number: LU/LTH-EIT 2024-1007
language: English
id: 9170114
date added to LUP: 2024-07-04 15:12:36
date last changed: 2024-07-04 15:12:36
@misc{9170114,
  abstract     = {{Neural networks are effective at solving many complex human-level tasks. One type of neural network often used in imaging tasks is the convolutional neural network (CNN). CNNs can effectively solve problems in areas like image recognition, object detection, and classification. Using a CNN for one of these tasks carries a huge computational cost relative to a typical algorithm. While the processors running the CNNs can handle the computational load, they get extremely hot and draw a lot of power.
This thesis explores different hardware techniques for optimizing a fixed convolutional neural network to decrease its power consumption while keeping an acceptable output result. One technique is precision scaling, where the precision of weights and tensors is reduced to a lower number of bits. This reduces the number of bits used both in calculations and in data transfer. Another technique is approximate computing: approximate circuits are designed with fewer gates and therefore draw less power.
In this thesis, an algorithm and a simulator were created to choose at which locations bits should be removed. By introducing several error metrics, such as mean squared error and F1 score, the results could be compared across metrics and a near-optimal solution could be found.
With our precision scaling method, we were able to save almost half the energy while maintaining an acceptable F1 score of 0.8. When evaluating approximate multipliers for the same task, the results were more difficult to interpret: the approximate multipliers work well in some cases and significantly worse in others.
Our results show that hardware techniques such as precision scaling and approximate circuits can be used to reduce power consumption.}},
  author       = {{Fondelius, Filip and Davidsson, Albin}},
  language     = {{eng}},
  note         = {{Student Paper}},
  title        = {{ASIC implementation exploration for EfficientNet optimizations}},
  year         = {{2024}},
}