
Lund University Publications

LUND UNIVERSITY LIBRARIES

A CNN-Specific Integrated Processor

Malki, Suleyman (LU) and Spaanenburg, Lambert (LU) (2009). In Eurasip Journal on Advances in Signal Processing
Abstract
Integrated Processors (IP) are meant to supply algorithm-specific cores to a micro-electronic system. They are usually developed separately and, upon success, turned into a part that can be re-used within many systems, either by programming or by configuration. This paper looks at architectures for Cellular Neural Networks (CNN) to be realized as IP. First, current digital implementations are reviewed, and the memory-processor bandwidth issues are analyzed. Then a generic view is taken on the structure of the network, and a new intra-communication protocol based on rotating wheels is proposed. It is shown that this provides guaranteed high performance with minimal network interfaces. The resulting node is small and supports multi-level CNN designs, giving the system a 30-fold increase in capacity compared to classical designs. As it facilitates multiple operations on a single image, and single operations on multiple images, with minimal access to the external image memory, balancing the internal and external data transfer requirements can optimize the system operation. Special consideration is given to the treatment of the boundary nodes. In conventional digital CNN designs, this requires additional logic to handle the CNN value propagation scheme. In the new architecture, only a slight modification of the existing cells is necessary to model the boundary effect. A typical prototype for visual pattern recognition will house 4096 CNN cells with a 2% overhead for making it an IP.
type
Contribution to journal
publication status
published
subject
keywords
Cellular Neural Network, Integrated Processors, Memory Bandwidth, Design Exploration, Tiled Architecture
in
Eurasip Journal on Advances in Signal Processing
article number
854241
publisher
Hindawi Limited
external identifiers
  • wos:000269045100001
  • scopus:68949119543
ISSN
1687-6172
DOI
10.1155/2009/854241
language
English
LU publication?
yes
id
bede7568-f7da-43fd-a776-e8dc8f67f714 (old id 1301218)
date added to LUP
2016-04-01 13:43:07
date last changed
2022-02-26 22:42:26
@article{bede7568-f7da-43fd-a776-e8dc8f67f714,
  abstract     = {{Integrated Processors (IP) are meant to supply algorithm-specific cores to a micro-electronic system. They are usually developed separately and, upon success, turned into a part that can be re-used within many systems, either by programming or by configuration. This paper looks at architectures for Cellular Neural Networks (CNN) to be realized as IP. First, current digital implementations are reviewed, and the memory-processor bandwidth issues are analyzed. Then a generic view is taken on the structure of the network, and a new intra-communication protocol based on rotating wheels is proposed. It is shown that this provides guaranteed high performance with minimal network interfaces. The resulting node is small and supports multi-level CNN designs, giving the system a 30-fold increase in capacity compared to classical designs. As it facilitates multiple operations on a single image, and single operations on multiple images, with minimal access to the external image memory, balancing the internal and external data transfer requirements can optimize the system operation. Special consideration is given to the treatment of the boundary nodes. In conventional digital CNN designs, this requires additional logic to handle the CNN value propagation scheme. In the new architecture, only a slight modification of the existing cells is necessary to model the boundary effect. A typical prototype for visual pattern recognition will house 4096 CNN cells with a 2% overhead for making it an IP.}},
  author       = {{Malki, Suleyman and Spaanenburg, Lambert}},
  issn         = {{1687-6172}},
  keywords     = {{Cellular Neural Network; Integrated Processors; Memory Bandwidth; Design Exploration; Tiled Architecture}},
  language     = {{eng}},
  publisher    = {{Hindawi Limited}},
  journal      = {{Eurasip Journal on Advances in Signal Processing}},
  title        = {{A CNN-Specific Integrated Processor}},
  url          = {{http://dx.doi.org/10.1155/2009/854241}},
  doi          = {{10.1155/2009/854241}},
  year         = {{2009}},
}