Doubly-Block Circulant Kernel Matrix Exploitation in Convolutional Accelerators
(2023) 2023 IEEE 66th International Midwest Symposium on Circuits and Systems, MWSCAS 2023, p. 236-240
- Abstract
In this paper, we present a novel algorithmic and hardware co-design approach specifically tailored for efficient 2D convolution implementations, a crucial operation in convolutional neural networks (CNNs). Our method addresses the limitations of existing software-based solutions and hardware-based architectures, delivering significant improvements in asymptotic behavior for generic convolution cases. By leveraging the distinctive geometry of doubly block circulant unrolled kernel matrices, our approach eliminates the need for input and weight buffers, optimizes output memory usage, and minimizes redundant memory accesses. A comprehensive comparative analysis with state-of-the-art techniques showcases the key advantages and superior performance of our proposed method, achieving substantial reductions in memory requirements and high throughput.
- author
- Ferreira, Lucas LU ; Malkowsky, Steffen LU ; Persson, Patrik LU ; Astrom, Karl LU and Liu, Liang LU
- organization
- Department of Electrical and Information Technology
- Integrated Electronic Systems (research group)
- LTH Profile Area: AI and Digitalization
- ELLIIT: the Linköping-Lund initiative on IT and mobile communication
- LTH Profile Area: Nanoscience and Semiconductor Technology
- Mathematics (Faculty of Engineering)
- LU Profile Area: Light and Materials
- LU Profile Area: Natural and Artificial Cognition
- LTH Profile Area: Engineering Health
- Mathematical Imaging Group (research group)
- eSSENCE: The e-Science Collaboration
- publishing date
- 2023
- type
- Chapter in Book/Report/Conference proceeding
- publication status
- published
- subject
- keywords
- 2D Convolution, Doubly-Blocked Circulant Matrix, Systolic Array, Unrolled Kernel Matrix
- host publication
- 2023 IEEE 66th International Midwest Symposium on Circuits and Systems, MWSCAS 2023
- pages
- 5 pages
- publisher
- IEEE - Institute of Electrical and Electronics Engineers Inc.
- conference name
- 2023 IEEE 66th International Midwest Symposium on Circuits and Systems, MWSCAS 2023
- conference location
- Tempe, United States
- conference dates
- 2023-08-06 - 2023-08-09
- external identifiers
- scopus:85185371236
- ISBN
- 9798350302103
- DOI
- 10.1109/MWSCAS57524.2023.10406059
- language
- English
- LU publication?
- yes
- id
- 75f5ede4-3210-4e97-8eeb-b97f5ffdc838
- date added to LUP
- 2024-03-18 16:09:25
- date last changed
- 2025-04-04 14:01:11
@inproceedings{75f5ede4-3210-4e97-8eeb-b97f5ffdc838,
  abstract  = {{In this paper, we present a novel algorithmic and hardware co-design approach specifically tailored for efficient 2D convolution implementations, a crucial operation in convolutional neural networks (CNNs). Our method addresses the limitations of existing software-based solutions and hardware-based architectures, delivering significant improvements in asymptotic behavior for generic convolution cases. By leveraging the distinctive geometry of doubly block circulant unrolled kernel matrices, our approach eliminates the need for input and weight buffers, optimizes output memory usage, and minimizes redundant memory accesses. A comprehensive comparative analysis with state-of-the-art techniques showcases the key advantages and superior performance of our proposed method, achieving substantial reductions in memory requirements and high throughput.}},
  author    = {{Ferreira, Lucas and Malkowsky, Steffen and Persson, Patrik and Astrom, Karl and Liu, Liang}},
  booktitle = {{2023 IEEE 66th International Midwest Symposium on Circuits and Systems, MWSCAS 2023}},
  isbn      = {{9798350302103}},
  keywords  = {{2D Convolution; Doubly-Blocked Circulant Matrix; Systolic Array; Unrolled Kernel Matrix}},
  language  = {{eng}},
  pages     = {{236--240}},
  publisher = {{IEEE - Institute of Electrical and Electronics Engineers Inc.}},
  title     = {{Doubly-Block Circulant Kernel Matrix Exploitation in Convolutional Accelerators}},
  url       = {{http://dx.doi.org/10.1109/MWSCAS57524.2023.10406059}},
  doi       = {{10.1109/MWSCAS57524.2023.10406059}},
  year      = {{2023}},
}