
LUP Student Papers

LUND UNIVERSITY LIBRARIES

Evaluation of LLMs for Hardware Test Generation

Lidbäck, Albin LU (2024) EITM01 20242
Department of Electrical and Information Technology
Abstract
Testing electrical units and systems is an important part of the production process.
Developing and generating hardware tests is a labor-intensive process that requires
expertise and a deep understanding of the system's behavior. This thesis explores
the application of Large Language Models (LLMs) to generating hardware test steps.

The research employs a comparative analysis of several LLMs, including GPT-3.5-turbo,
GPT-4-turbo, Meta-Llama-3.1-8B-Instruct, and Mixtral-8x7B-Instruct-v0.1, assessing
their performance in terms of accuracy and usability. The LLMs are evaluated in three
electrical cases of increasing PCB complexity. In addition, the thesis assesses the
importance of prompt engineering and how the structure of the data impacts the
generated test steps.

The results indicate that the LLMs perform differently across the cases: accuracy
decreases drastically as the complexity of the test cases increases. The results also
indicate that the structure of the prompts and data strongly influences the quality
of the generated test steps.

This thesis contributes to the field of hardware test generation by providing an
initial study of how Artificial Intelligence (AI) and LLMs may be used to automate
and ease the development of hardware tests.
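The abstract's point that the structure of the input data shapes the generated test steps can be illustrated with a minimal sketch. All names, the data schema, and the prompt wording below are hypothetical and not taken from the thesis; the idea is simply that PCB data (components and nets) is rendered into a structured prompt, which a model such as GPT-4-turbo could then be asked to complete with test steps.

```python
# Hypothetical sketch: render structured PCB data into an LLM prompt for
# hardware test-step generation. The field names and prompt wording are
# illustrative assumptions, not the thesis's actual format.

def build_test_prompt(board_name, components, nets):
    """Turn a component list and net map into a structured prompt that
    asks an LLM for numbered hardware test steps."""
    lines = [f"Board: {board_name}", "", "Components:"]
    for ref, kind, value in components:
        lines.append(f"- {ref}: {kind}, {value}")
    lines.append("")
    lines.append("Nets:")
    for net, pins in nets.items():
        lines.append(f"- {net}: connects {', '.join(pins)}")
    lines.append("")
    lines.append(
        "Generate numbered hardware test steps that verify each net and "
        "component, stating the expected measurement for each step."
    )
    return "\n".join(lines)

prompt = build_test_prompt(
    "demo-board",
    [("R1", "resistor", "10 kOhm"), ("C1", "capacitor", "100 nF")],
    {"VCC": ["R1.1", "C1.1"], "GND": ["C1.2"]},
)
print(prompt)
```

The prompt string would then be sent as the user message of a chat-completion request; keeping the data in a consistent, labeled layout like this (rather than pasting raw netlist text) is one way the data-structure effect described in the abstract could be probed.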
author: Lidbäck, Albin LU
supervisor:
organization:
course: EITM01 20242
year: 2024
type: H2 - Master's Degree (Two Years)
subject:
report number: LU/LTH-EIT 2024-1033
language: English
id: 9178235
date added to LUP: 2024-11-28 09:28:32
date last changed: 2024-11-28 09:28:32
@misc{9178235,
  abstract     = {{Testing electrical units and systems is an important part of the production process.
Developing and generating hardware tests is a labor-intensive process that requires
expertise and a deep understanding of the system's behavior. This thesis explores
the application of Large Language Models (LLMs) to generating hardware test steps.

The research employs a comparative analysis of several LLMs, including GPT-3.5-turbo,
GPT-4-turbo, Meta-Llama-3.1-8B-Instruct, and Mixtral-8x7B-Instruct-v0.1, assessing
their performance in terms of accuracy and usability. The LLMs are evaluated in three
electrical cases of increasing PCB complexity. In addition, the thesis assesses the
importance of prompt engineering and how the structure of the data impacts the
generated test steps.

The results indicate that the LLMs perform differently across the cases: accuracy
decreases drastically as the complexity of the test cases increases. The results also
indicate that the structure of the prompts and data strongly influences the quality
of the generated test steps.

This thesis contributes to the field of hardware test generation by providing an
initial study of how Artificial Intelligence (AI) and LLMs may be used to automate
and ease the development of hardware tests.}},
  author       = {{Lidbäck, Albin}},
  language     = {{eng}},
  note         = {{Student Paper}},
  title        = {{Evaluation of LLMs for Hardware Test Generation}},
  year         = {{2024}},
}