Lund University Publications

3D Integration Technology and Near-Memory Computing for Edge AI

Prieto, Arturo LU (2025) In Arturo Prieto
Abstract
In the era of artificial intelligence of things (AIoT), distributed processing on local devices is growing in popularity. This stems from the need to reduce data transfer to and from central servers, with concerns about privacy and latency encouraging the processing of AI applications on edge devices. However, the computational demands of these applications require an increase in the processing capabilities of resource-constrained edge devices with reduced memory and energy deployment. In this thesis, solutions for improving edge AI implementations are evaluated considering two approaches: technology integration and hardware architecture design.

Higher performance through increased technology integration has focused on scaling transistor dimensions. However, the manufacturing process is increasingly expensive and faces technical challenges in the development of new breakthroughs. Evaluation of the third dimension has emerged as a promising alternative to scaling, which enables stacking of semiconductor components with 3D interconnections. Different technologies present different integration strategies, where 3D sequential integration (3DSI) enables small pitch for 3D contacts, allowing for high-integration circuits. A library of standard cells has been designed and characterized according to 3DSI, enhancing the high-integration capabilities of the technology for digital designs. This library compiles the required predefined logic cells that can be used in the design of a digital integrated circuit (IC).

The design of ICs as a foundation for edge AI is focused on enhancing memory and computing resources to improve the processing capabilities of such platforms. Computing architectures are traditionally based on the concept of von Neumann architecture, which distinguishes computing and memory units as two independent entities. However, near-memory computing (NMC) is presented as a viable alternative to the von Neumann architecture that brings computation closer to memory. NMC is non-intrusive to the conventional low-level structure of SRAM and enhances memory bandwidth for hardware acceleration. The integration of accelerators into resource-constrained platforms has been evaluated, expanding the functionality with custom hardware tailored for computation-intensive AI workloads. Furthermore, flexibility has been achieved by providing modularity to the design architecture.

The proposed architectures are evaluated by programs that highlight the performance of integrated AI hardware accelerators into edge devices, emphasizing the importance of software and hardware co-design. The contributions of this thesis focus on 3DSI technology circuit design and NMC architectures evaluating performance, energy and area efficiency.
author
  • Prieto, Arturo LU
supervisor
opponent
  • Prof. Fey, Dietmar, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany.
organization
publishing date
type
Thesis
publication status
published
subject
keywords
3D Integration Technology, Near-Memory Computing, Edge Computing, Artificial Intelligence, More-than-Moore, Convolutional Neural Network, Hardware Acceleration, SRAM
in
Arturo Prieto
issue
190
pages
180 pages
publisher
Electrical and Information Technology, Lund University
defense location
Lecture Hall E:1406, building E, Ole Römers väg 3, Faculty of Engineering LTH, Lund University, Lund. The dissertation will be live streamed, but part of the premises is to be excluded from the live stream.
defense date
2025-12-12 09:15:00
ISSN
1654-790X
ISBN
978-91-8104-745-5
978-91-8104-746-2
language
English
LU publication?
yes
id
02fe7be3-5bf0-40db-8423-cb82fe47d4e4
date added to LUP
2025-11-18 17:38:56
date last changed
2025-11-19 11:08:01
@phdthesis{02fe7be3-5bf0-40db-8423-cb82fe47d4e4,
  abstract     = {{In the era of artificial intelligence of things (AIoT), distributed processing on local devices is growing in popularity. This stems from the need to reduce data transfer to and from central servers, with concerns about privacy and latency encouraging the processing of AI applications on edge devices. However, the computational demands of these applications require an increase in the processing capabilities of resource-constrained edge devices with reduced memory and energy deployment. In this thesis, solutions for improving edge AI implementations are evaluated considering two approaches: technology integration and hardware architecture design.

Higher performance through increased technology integration has focused on scaling transistor dimensions. However, the manufacturing process is increasingly expensive and faces technical challenges in the development of new breakthroughs. Evaluation of the third dimension has emerged as a promising alternative to scaling, which enables stacking of semiconductor components with 3D interconnections. Different technologies present different integration strategies, where 3D sequential integration (3DSI) enables small pitch for 3D contacts, allowing for high-integration circuits. A library of standard cells has been designed and characterized according to 3DSI, enhancing the high-integration capabilities of the technology for digital designs. This library compiles the required predefined logic cells that can be used in the design of a digital integrated circuit (IC).

The design of ICs as a foundation for edge AI is focused on enhancing memory and computing resources to improve the processing capabilities of such platforms. Computing architectures are traditionally based on the concept of von Neumann architecture, which distinguishes computing and memory units as two independent entities. However, near-memory computing (NMC) is presented as a viable alternative to the von Neumann architecture that brings computation closer to memory. NMC is non-intrusive to the conventional low-level structure of SRAM and enhances memory bandwidth for hardware acceleration. The integration of accelerators into resource-constrained platforms has been evaluated, expanding the functionality with custom hardware tailored for computation-intensive AI workloads. Furthermore, flexibility has been achieved by providing modularity to the design architecture.

The proposed architectures are evaluated by programs that highlight the performance of integrated AI hardware accelerators into edge devices, emphasizing the importance of software and hardware co-design. The contributions of this thesis focus on 3DSI technology circuit design and NMC architectures evaluating performance, energy and area efficiency.}},
  author       = {{Prieto, Arturo}},
  isbn         = {{978-91-8104-745-5}},
  issn         = {{1654-790X}},
  keywords     = {{3D Integration Technology; Near-Memory Computing; Edge Computing; Artificial Intelligence; More-than-Moore; Convolutional Neural Network; Hardware Acceleration; SRAM}},
  language     = {{eng}},
  month        = {{11}},
  number       = {{190}},
  publisher    = {{Electrical and Information Technology, Lund University}},
  school       = {{Lund University}},
  series       = {{Arturo Prieto}},
  title        = {{3D Integration Technology and Near-Memory Computing for Edge AI}},
  url          = {{https://lup.lub.lu.se/search/files/233355086/PhD_Thesis_Arturo_Prieto.pdf}},
  year         = {{2025}},
}