LUP Student Papers

LUND UNIVERSITY LIBRARIES

Generative AI for Synthetic Data

Belfrage, Esaias and Borna, August (2023)
Department of Automatic Control
Abstract
Synthetic data generation has emerged as a valuable technique for addressing data scarcity and privacy concerns and improving machine learning algorithms. This thesis focuses on progressing the field of synthetic data generation, which may play a crucial role in AI-heavy industries such as telecommunications. Generative Adversarial Networks successfully generate various types of synthetic data but fall short when modelling the temporal patterns and conditional distributions of time series data. State-of-the-art TimeGAN has shown promise, but there is potential for refinement. We propose T2GAN, utilising TimeGAN’s novel framework of combining unsupervised and supervised training and extending it using state-of-the-art machine learning techniques, such as Transformers. Through experimental evaluation, we quantify the effectiveness of T2GAN using various benchmark data sets and find that the T2GAN model significantly surpasses the TimeGAN in both discriminative and predictive capacities. Our results demonstrate a 38% enhancement in similarity measures and a 55% reduction in relative prediction error when using synthetic training data. Furthermore, the thesis presents a comprehensive literature study and analysis of generative models, detailing the potential of T2GAN in various domains by enabling privacy-preserving data analysis, facilitating research and development, and enhancing machine learning algorithms.
author
Belfrage, Esaias and Borna, August
organization
Department of Automatic Control
year
2023
type
H3 - Professional qualifications (4 Years - )
report number
TFRT-6200
other publication id
0280-5316
language
English
id
9136467
date added to LUP
2023-09-06 14:13:55
date last changed
2023-09-06 14:13:55
@misc{9136467,
  abstract     = {{Synthetic data generation has emerged as a valuable technique for addressing data scarcity and privacy concerns and improving machine learning algorithms. This thesis focuses on progressing the field of synthetic data generation, which may play a crucial role in AI-heavy industries such as telecommunications. Generative Adversarial Networks successfully generate various types of synthetic data but fall short when modelling the temporal patterns and conditional distributions of time series data. State-of-the-art TimeGAN has shown promise, but there is potential for refinement. We propose T2GAN, utilising TimeGAN’s novel framework of combining unsupervised and supervised training and extending it using state-of-the-art machine learning techniques, such as Transformers. Through experimental evaluation, we quantify the effectiveness of T2GAN using various benchmark data sets and find that the T2GAN model significantly surpasses the TimeGAN in both discriminative and predictive capacities. Our results demonstrate a 38\% enhancement in similarity measures and a 55\% reduction in relative prediction error when using synthetic training data. Furthermore, the thesis presents a comprehensive literature study and analysis of generative models, detailing the potential of T2GAN in various domains by enabling privacy-preserving data analysis, facilitating research and development, and enhancing machine learning algorithms.}},
  author       = {{Belfrage, Esaias and Borna, August}},
  language     = {{eng}},
  note         = {{Student Paper}},
  title        = {{Generative AI for Synthetic Data}},
  year         = {{2023}},
}