Skip to main content

LUP Student Papers

LUND UNIVERSITY LIBRARIES

Social Distancing AI: Using super-resolution to train an object detection model on low resolution images

Hein, Dennis LU (2020) In Bachelor's Theses in Mathematical Sciences MASK11 20202
Mathematical Statistics
Abstract
This paper addresses the following hypothetical situation: suppose that we want to train an object detection algorithm but that only low resolution data is available. As a tentative solution to this problem, this paper suggest to first super-resolve the low resolution data to obtain higher resolution data that is then fed to the object detection algorithm. Super-resolution is a technique that takes a lower resolution image and “paints” in extra details in a very convincing way to produce a higher resolution output. To evaluate this approach, this paper trains the object detection model on to separate datasets: 4x down-sampled and subsequently super-resolved images and the original images. For super-resolution we use ESRGAN which is based... (More)
This paper addresses the following hypothetical situation: suppose that we want to train an object detection algorithm but that only low resolution data is available. As a tentative solution to this problem, this paper suggest to first super-resolve the low resolution data to obtain higher resolution data that is then fed to the object detection algorithm. Super-resolution is a technique that takes a lower resolution image and “paints” in extra details in a very convincing way to produce a higher resolution output. To evaluate this approach, this paper trains the object detection model on to separate datasets: 4x down-sampled and subsequently super-resolved images and the original images. For super-resolution we use ESRGAN which is based on RaGAN. Quantitative comparison suggests that there is little difference between the two networks and thus provides some support for the approach explored in this paper. Qualitative comparison indicates that this technique only works in certain cases–when super-resolution results in images that can properly be annotated. If the contrast between the objects of interest and the background is great enough, then super-resolution will result in much clearer images that are easily annotated. In this particular scenario, using super-resolution to train object detection algorithms represents a feasible solution to the hypothetical situation assessed in this paper. This approach is tested in the context of detecting social distancing violations in restaurants using areal footage. This is of high relevance in 2020 due to the ongoing Coronavirus pandemic. (Less)
Popular Abstract
This paper addresses the following situation: suppose that we want to train an object detection model but we only have access to low resolution data. This situation can emerge if we wish to use satellite data and detect “smaller,” for instance table sized, objects. The particular application of interest is to use satellite data to monitor violations of social distancing guideline imposed during the COVID-19 pandemic. As a tentative solution to the above problem, we suggest to first super-resolve the low resolution data to obtain a higher resolution dataset. This dataset is subsequently fed to an object detection model. Super-resolving these low resolution images might allow for the human eye to detect the presence of the objects of... (More)
This paper addresses the following situation: suppose that we want to train an object detection model but we only have access to low resolution data. This situation can emerge if we wish to use satellite data and detect “smaller,” for instance table sized, objects. The particular application of interest is to use satellite data to monitor violations of social distancing guideline imposed during the COVID-19 pandemic. As a tentative solution to the above problem, we suggest to first super-resolve the low resolution data to obtain a higher resolution dataset. This dataset is subsequently fed to an object detection model. Super-resolving these low resolution images might allow for the human eye to detect the presence of the objects of interest. If this is the case then the images can be annotated and used as training data for an object detection model. To assess the suggested approach, this paper uses drone footage shot over Lund, Sweden. The drone footage is of very high resolution compared to satellite imagery. To simulate the hypothetical scenario stated above these images are down-sampled to produce low quality images. Down-sampling will reduce the amount of pixels in the images and thus make it more blurry. To assess the merit of the suggested approach we train an object detection model on two separate datasets: the original images and images that have been down-sampled and subsequently super-resolved. The relative performance is the assessed both qualitatively, by visually observing the out, and quantitatively, by using some commonly used performance metrics from computer vision. In this paper we also show that it is possible to use the down-sampled and then super-resolved images the train an object detection algorithm to detect tables in the outdoor seating area of restaurants. To further show the utility of this application we create a program that will detect the distance between the center of the detected tables. The purpose of this additional feature is to detect violations of social distancing regulations imposed as a measure to curb the spread of COVID-19.
The key mathematical component in this paper is the Generative Adversarial Network (GAN). The original GAN was introduced in 2014 and have since received a lot of attention. GANs are essentially trying to solve the problem: given that we have a dataset of objects with a degree of consistency, can we generate similar objects artificially? The GAN trains a generator which generates the artificial objects by pitting it against a discriminator in a minimax game. Perhaps the best way to build some intuition is with the following analogy. The generator can be thought of as a team of counterfeiters trying to pro- duce some fake product, for instance currency. The discriminator can be thought of as the police, that is trying to catch the counterfeit currency but at the same time allowing genuine currency to be circulated. Competition between the counterfeiters and police will result in iterative improvement until the counterfeit currency is indistinguishable from the genuine currency. GANs have several interesting applications and in this paper we are concerned with super-resolution. Super-resolution is an image transformation task which involves estimating a higher resolution image from it’s low resolution counterpart. This is a fundamentally ill-posed problem as there are myriad of high resolution images that could be associated with the low resolution counterpart. GANs are useful in super-resolution as the generator can produce a higher resolution image by “painting” details into the lower resolution image in a perceptually convincing way.
This project shows that it is possible to use the suggested approach to train a model that is able to detect violations of social distancing regulations using down-sampled drone footage. For this method to work it is crucial that the super-resolved images can be properly annotated. Annotating involves marking bounding boxes around the objects of interest and giving them a label. If these objects are not identifiable, then they cannot be annotated and the object detection model will be penalized during training for correctly predicting objects at these locations. For super-resolution to produce clearer images that are easily annotated it is important that the contrast between the object of interest and the background is suciently high. (Less)
Please use this url to cite or link to this publication:
author
Hein, Dennis LU
supervisor
organization
course
MASK11 20202
year
type
M2 - Bachelor Degree
subject
publication/series
Bachelor's Theses in Mathematical Sciences
report number
LUNFMS-4047-2020
ISSN
1654-6229
other publication id
2020:K23
language
English
id
9030612
date added to LUP
2020-10-07 13:44:47
date last changed
2021-06-03 15:47:26
@misc{9030612,
  abstract     = {{This paper addresses the following hypothetical situation: suppose that we want to train an object detection algorithm but that only low resolution data is available. As a tentative solution to this problem, this paper suggest to first super-resolve the low resolution data to obtain higher resolution data that is then fed to the object detection algorithm. Super-resolution is a technique that takes a lower resolution image and “paints” in extra details in a very convincing way to produce a higher resolution output. To evaluate this approach, this paper trains the object detection model on to separate datasets: 4x down-sampled and subsequently super-resolved images and the original images. For super-resolution we use ESRGAN which is based on RaGAN. Quantitative comparison suggests that there is little difference between the two networks and thus provides some support for the approach explored in this paper. Qualitative comparison indicates that this technique only works in certain cases–when super-resolution results in images that can properly be annotated. If the contrast between the objects of interest and the background is great enough, then super-resolution will result in much clearer images that are easily annotated. In this particular scenario, using super-resolution to train object detection algorithms represents a feasible solution to the hypothetical situation assessed in this paper. This approach is tested in the context of detecting social distancing violations in restaurants using areal footage. This is of high relevance in 2020 due to the ongoing Coronavirus pandemic.}},
  author       = {{Hein, Dennis}},
  issn         = {{1654-6229}},
  language     = {{eng}},
  note         = {{Student Paper}},
  series       = {{Bachelor's Theses in Mathematical Sciences}},
  title        = {{Social Distancing AI: Using super-resolution to train an object detection model on low resolution images}},
  year         = {{2020}},
}