Evaluation of LLMs for Hardware Test Generation
- Abstract
- Testing electrical units and systems is an important part of the production process.
The development and generation of hardware tests is a labor-intensive process
that requires expertise and a deep understanding of the system's behavior. This
thesis explores the application of Large Language Models (LLMs) to the
generation of hardware test steps.
The research employs a comparative analysis of several LLMs, including
GPT-3.5-turbo, GPT-4-turbo, Meta-Llama-3.1-8B-Instruct, and
Mixtral-8x7B-Instruct-v0.1, assessing their performance in terms of accuracy
and usability. The LLMs are evaluated on three electrical cases of increasing
PCB complexity. In addition, the thesis assesses the importance of prompt
engineering and how the structure of the input data affects the generated test steps.
The results indicate that the LLMs' performance varies across the cases:
accuracy decreases drastically as the complexity of the test cases increases.
The results also indicate that the structure of the prompts and data is
important to the quality of the generated test steps.
This thesis contributes to the field of hardware test generation by providing an
initial study of how Artificial Intelligence (AI) and LLMs may be used to automate
and ease the development of hardware tests.
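
For illustration only (this sketch is not from the thesis): a minimal example of how one might prompt one of the compared models, GPT-3.5-turbo, to generate hardware test steps from a board description. It assumes the OpenAI Python SDK and an OPENAI_API_KEY environment variable; the prompt wording and the PCB description below are hypothetical.

from openai import OpenAI

# The client reads OPENAI_API_KEY from the environment.
client = OpenAI()

# Hypothetical structured description of the board under test; the thesis
# finds that the structure of such input data affects output quality.
pcb_description = (
    "Board: 5 V linear regulator demo board\n"
    "Components: U1 (LM7805 regulator), C1 (10 uF, input), "
    "C2 (1 uF, output), J1 (power in), J2 (power out)\n"
)

prompt = (
    "You are a hardware test engineer. Given the PCB description below, "
    "generate numbered test steps that verify correct operation. For each "
    "step, state the instrument, the measurement point, and the expected "
    "value.\n\n" + pcb_description
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # one of the models compared in the thesis
    messages=[{"role": "user", "content": prompt}],
    temperature=0.2,  # low temperature for more repeatable test steps
)

print(response.choices[0].message.content)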
Please use this url to cite or link to this publication:
http://lup.lub.lu.se/student-papers/record/9178235
- author
- Lidbäck, Albin
- supervisor
- Erik Larsson
- organization
- Department of Electrical and Information Technology
- course
- EITM01 20242
- year
- 2024
- type
- H2 - Master's Degree (Two Years)
- subject
- report number
- LU/LTH-EIT 2024-1033
- language
- English
- id
- 9178235
- date added to LUP
- 2024-11-28 09:28:32
- date last changed
- 2024-11-28 09:28:32
@misc{9178235,
  abstract = {{Testing electrical units and systems is an important part of the production process. The development and generation of hardware tests is a labor-intensive process that requires expertise and a deep understanding of the system's behavior. This thesis explores the application of Large Language Models (LLMs) to the generation of hardware test steps. The research employs a comparative analysis of several LLMs, including GPT-3.5-turbo, GPT-4-turbo, Meta-Llama-3.1-8B-Instruct, and Mixtral-8x7B-Instruct-v0.1, assessing their performance in terms of accuracy and usability. The LLMs are evaluated on three electrical cases of increasing PCB complexity. In addition, the thesis assesses the importance of prompt engineering and how the structure of the input data affects the generated test steps. The results indicate that the LLMs' performance varies across the cases: accuracy decreases drastically as the complexity of the test cases increases. The results also indicate that the structure of the prompts and data is important to the quality of the generated test steps. This thesis contributes to the field of hardware test generation by providing an initial study of how Artificial Intelligence (AI) and LLMs may be used to automate and ease the development of hardware tests.}},
  author = {{Lidbäck, Albin}},
  language = {{eng}},
  note = {{Student Paper}},
  title = {{Evaluation of LLMs for Hardware Test Generation}},
  year = {{2024}},
}