3D-design exploration of CNN algorithms
(2011) Conference on VLSI Circuits and Systems V 8067.
- Abstract
- Multi-dimensional algorithms are hard to implement on classical platforms. Pipelining may exploit instruction-level parallelism, but not in the presence of simultaneous data; threads optimize only within the given restrictions. Tiled architectures do add a dimension to the solution space. With locally a large register store, data parallelism is handled, but only to a dimension. 3-D technologies are meant to add a dimension in the realization. Applied on the device level, it makes each computational node smaller. The interconnections become shorter and hence the network will be condensed. Such advantages will be easily lost at higher implementation levels unless 3-D technologies as multi-cores or chip stacking are also introduced. 3-D technologies scale in space, where (partial) reconfiguration scales in time. The optimal selection over the various implementation levels is algorithm dependent. The paper discusses such principles while applied on the scaling of cellular neural networks (CNN). It illustrates how stacking of reconfigurable chips supports many algorithmic requirements in a defect-insensitive manner. Further the paper explores the potential of chip stacking for multi-modal implementations in a reconfigurable approach to heterogeneous architectures for algorithm domains.
Please use this url to cite or link to this publication:
https://lup.lub.lu.se/record/2160611
- author
- Spaanenburg, Lambert LU and Malki, Suleyman LU
- organization
- publishing date
- 2011
- type
- Chapter in Book/Report/Conference proceeding
- publication status
- published
- subject
- keywords
- Behavioral Synthesis, Algorithm-specific Architecture, Cellular Neural, Network, Field-Programmable Gate-Array, multi-core benchmark
- host publication
- VLSI Circuits and Systems V
- volume
- 8067
- publisher
- SPIE
- conference name
- Conference on VLSI Circuits and Systems V
- conference location
- Prague, Czech Republic
- conference dates
- 2011-04-18 - 2011-04-20
- external identifiers
- wos:000292763900002
- scopus:79958027360
- ISSN
- 1996-756X
- 0277-786X
- DOI
- 10.1117/12.887440
- language
- English
- LU publication?
- yes
- id
- 72245677-4f3d-4902-aee0-5f86f0697fc3 (old id 2160611)
- date added to LUP
- 2016-04-01 10:56:01
- date last changed
- 2024-01-07 04:33:41
@inproceedings{72245677-4f3d-4902-aee0-5f86f0697fc3,
  abstract     = {{Multi-dimensional algorithms are hard to implement on classical platforms. Pipelining may exploit instruction-level parallelism, but not in the presence of simultaneous data; threads optimize only within the given restrictions. Tiled architectures do add a dimension to the solution space. With locally a large register store, data parallelism is handled, but only to a dimension. 3-D technologies are meant to add a dimension in the realization. Applied on the device level, it makes each computational node smaller. The interconnections become shorter and hence the network will be condensed. Such advantages will be easily lost at higher implementation levels unless 3-D technologies as multi-cores or chip stacking are also introduced. 3-D technologies scale in space, where (partial) reconfiguration scales in time. The optimal selection over the various implementation levels is algorithm dependent. The paper discusses such principles while applied on the scaling of cellular neural networks (CNN). It illustrates how stacking of reconfigurable chips supports many algorithmic requirements in a defect-insensitive manner. Further the paper explores the potential of chip stacking for multi-modal implementations in a reconfigurable approach to heterogeneous architectures for algorithm domains.}},
  author       = {{Spaanenburg, Lambert and Malki, Suleyman}},
  booktitle    = {{VLSI Circuits and Systems V}},
  issn         = {{1996-756X}},
  keywords     = {{Behavioral Synthesis; Algorithm-specific Architecture; Cellular Neural; Network; Field-Programmable Gate-Array; multi-core benchmark}},
  language     = {{eng}},
  publisher    = {{SPIE}},
  title        = {{3D-design exploration of CNN algorithms}},
  url          = {{http://dx.doi.org/10.1117/12.887440}},
  doi          = {{10.1117/12.887440}},
  volume       = {{8067}},
  year         = {{2011}},
}