LUP Student Papers

LUND UNIVERSITY LIBRARIES

Synthesizing Training Data for Object Detection Using Generative Adversarial Networks

Astermark, Jonathan LU (2018) In Master's Theses in Mathematical Sciences FMAM05 20182
Mathematics (Faculty of Engineering)
Abstract
Object detection is an important tool in computer vision and a popular application of machine learning. One of the main challenges in object detection, and machine learning in general, is acquiring sufficient training data. Many types of data can be hard or expensive to collect and label, or be subject to privacy concerns and regulations such as the General Data Protection Regulation (GDPR). This is particularly true for many object detection tasks, such as face detection, where the training data consists of images depicting faces. Using synthetic data for training has been attempted before, but no consensus exists on how best to utilize it. This work focuses on using a priori trained Generative Adversarial Networks (GANs) to produce synthetic images of faces, and using them to train detectors based on Haar-like features. Experiments were conducted both on replacing real images with synthetic ones and on introducing synthetic variance by augmenting real images using image-to-image translating GANs. It was found that GAN-generated images can indeed be useful for detector training. Although real images consistently perform better, the amount of data plays a role as well, and a priori trained GANs can easily produce a large amount of synthetic data with good variation. If real data is hard to collect, synthetic data produced by a GAN could be a viable option. It was also found that image-to-image translating GANs can be useful for data augmentation, especially when real data is scarce. Future work should focus on the impact of variance and bias in the synthetic data and how they can be controlled for optimal performance.
author
Astermark, Jonathan LU
supervisor
organization
course
FMAM05 20182
year
2018
type
H2 - Master's Degree (Two Years)
subject
keywords
Generative Adversarial Networks, Object Detection, Synthetic Training Data, Data Augmentation, Deep Learning
publication/series
Master's Theses in Mathematical Sciences
report number
LUTFMA-3370-2018
ISSN
1404-6342
other publication id
2018:E75
language
English
id
8966020
date added to LUP
2019-02-28 10:53:05
date last changed
2019-02-28 10:53:05
@misc{8966020,
  abstract     = {{Object detection is an important tool in computer vision and a popular application of machine learning. One of the main challenges in object detection, and machine learning in general, is acquiring sufficient training data. Many types of data can be hard or expensive to collect and label, or be subject to privacy concerns and regulations such as the General Data Protection Regulation (GDPR). This is particularly true for many object detection tasks, such as face detection, where the training data consists of images depicting faces. Using synthetic data for training has been attempted before, but no consensus exists on how best to utilize it. This work focuses on using a priori trained Generative Adversarial Networks (GANs) to produce synthetic images of faces, and using them to train detectors based on Haar-like features. Experiments were conducted both on replacing real images with synthetic ones and on introducing synthetic variance by augmenting real images using image-to-image translating GANs. It was found that GAN-generated images can indeed be useful for detector training. Although real images consistently perform better, the amount of data plays a role as well, and a priori trained GANs can easily produce a large amount of synthetic data with good variation. If real data is hard to collect, synthetic data produced by a GAN could be a viable option. It was also found that image-to-image translating GANs can be useful for data augmentation, especially when real data is scarce. Future work should focus on the impact of variance and bias in the synthetic data and how they can be controlled for optimal performance.}},
  author       = {{Astermark, Jonathan}},
  issn         = {{1404-6342}},
  language     = {{eng}},
  note         = {{Student Paper}},
  series       = {{Master's Theses in Mathematical Sciences}},
  title        = {{Synthesizing Training Data for Object Detection Using Generative Adversarial Networks}},
  year         = {{2018}},
}