Non-destructive anonymization of training data for object detection
(2025) In Master's thesis in Mathematical Sciences, FMAM05 20251, Mathematics (Faculty of Engineering)
- Abstract
- The rapid advancement of computer vision, powered by large-scale visual datasets and deep learning, has raised pressing concerns about privacy, particularly when human faces are involved. This work explores how facial anonymization affects the performance of human detection models, aiming to balance identity protection with model utility. A range of anonymization techniques, including Gaussian blurring, black-boxing, and diffusion-based inpainting, is applied to both a COCO subset and a dataset tailored to surveillance-related use cases. An EfficientNet-based object detector is used to measure detection performance, serving as a benchmark for model utility. To evaluate the effectiveness of anonymization independently, a similarity-based machine learning method is used alongside human evaluation to assess how much identity remains visible after anonymization. This enables a quantitative measure of the trade-off between privacy preservation and detection performance. By combining technical evaluation of model accuracy with both automated and human assessments of identity concealment, this work provides a comprehensive analysis of privacy-preserving strategies in computer vision, with implications for the development of ethical and responsible AI systems. The results show that classic anonymization techniques, such as black-boxing and Gaussian blurring, have minimal impact on human detection performance, achieving over 98% relative AP50, while significantly degrading face detection capabilities. This indicates that object detectors may rely largely on non-facial cues. Diffusion-based inpainting methods offer more nuanced trade-offs: full-mask inpainting preserves strong detection performance and enhances privacy, while partial-mask inpainting retains more facial detail, resulting in higher face detection scores but weaker anonymization. These findings highlight the importance of selecting a method according to the privacy-utility balance required by a given application, and show that facial anonymization of training data is feasible without significant drawbacks.
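For concreteness, below is a minimal sketch of the two classic anonymization techniques the abstract compares, Gaussian blurring and black-boxing of face regions. This is not the authors' implementation; it assumes face bounding boxes are already available (e.g., from a separate face detector) and uses OpenCV for illustration.

```python
import cv2


def gaussian_blur_faces(image, face_boxes, ksize=(51, 51)):
    """Blur each face region with a Gaussian kernel; returns a copy.

    face_boxes: iterable of (x, y, w, h) pixel boxes, assumed to come
    from a separate face detector (not shown here).
    """
    out = image.copy()
    for x, y, w, h in face_boxes:
        roi = out[y:y + h, x:x + w]
        out[y:y + h, x:x + w] = cv2.GaussianBlur(roi, ksize, 0)
    return out


def black_box_faces(image, face_boxes):
    """Replace each face region with a solid black rectangle."""
    out = image.copy()
    for x, y, w, h in face_boxes:
        out[y:y + h, x:x + w] = 0
    return out
```

Likewise, the reported privacy-utility numbers can be read through two simple quantities: relative AP50 (assuming the term denotes detection AP50 on anonymized data as a fraction of the un-anonymized baseline) and an embedding-similarity score of the kind a similarity-based identity check would produce. The sketch below is an illustration under those assumptions, not the thesis' evaluation code.

```python
import numpy as np


def relative_ap50(ap50_anonymized, ap50_baseline):
    """Utility metric: AP50 on anonymized data relative to the
    un-anonymized baseline (e.g. 0.98 reads as '98% relative AP50')."""
    return ap50_anonymized / ap50_baseline


def identity_similarity(embedding_before, embedding_after):
    """Cosine similarity between face embeddings of the same person
    before and after anonymization. The embeddings are hypothetical
    outputs of any face-recognition model; lower similarity suggests
    stronger identity concealment."""
    a = np.asarray(embedding_before, dtype=float)
    b = np.asarray(embedding_after, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```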
Please use this URL to cite or link to this publication:
http://lup.lub.lu.se/student-papers/record/9188081
- author
- Tufvesson, Sebastian LU and Josefsson, Kalle LU
- supervisor
- Karl Åström LU
- organization
- course
- FMAM05 20251
- year
- 2025
- type
- H2 - Master's Degree (Two Years)
- subject
- keywords
- Anonymization, Object Detection, Machine Learning, Computer Vision, Diffusion Models.
- publication/series
- Master's thesis in Mathematical Sciences
- report number
- LUTFMA-3573-2025
- ISSN
- 1404-6342
- other publication id
- 2025:E17
- language
- English
- id
- 9188081
- date added to LUP
- 2025-05-22 13:04:04
- date last changed
- 2025-05-22 13:04:04
@misc{9188081,
  abstract = {{The rapid advancement of computer vision, powered by large-scale visual datasets and deep learning, has raised pressing concerns about privacy, particularly when human faces are involved. This work explores how facial anonymization affects the performance of human detection models, aiming to balance identity protection with model utility. A range of anonymization techniques are applied, including Gaussian blurring, black-boxing, and diffusion-based inpainting, on both a COCO subset and a dataset tailored towards surveillance related use cases. An EfficientNet-based object detector is used to measure detection performance, serving as a benchmark for model utility. To evaluate the effectiveness of anonymization independently, a similarity-based machine learning method is used along with human evaluation to assess how much identity remains visible after anonymization. This enables a quantified measure of the trade-off between privacy preservation and detection performance. By combining technical evaluation of model accuracy with both automated and human assessments of identity concealment, this work provides a comprehensive analysis of privacy-preserving strategies in computer vision, with implications for the development of ethical and responsible AI systems. The results show that classic anonymization techniques, such as black-boxing and Gaussian blurring, have minimal impact on human detection performance—achieving over 98\% relative AP50—while significantly degrading face detection capabilities. This indicates that object detectors may rely largely on non-facial cues. Diffusion-based inpainting methods offer more nuanced trade-offs: while full mask inpainting preserves strong detection performance and enhances privacy, partial mask inpainting retains more facial detail, resulting in higher face detection scores but weaker anonymization. These findings highlight the importance of method selection depending on the privacy-utility balance required by a given application, but that facial anonymization on training data is a possibility without significant drawback.}},
  author = {{Tufvesson, Sebastian and Josefsson, Kalle}},
  issn = {{1404-6342}},
  language = {{eng}},
  note = {{Student Paper}},
  series = {{Master's thesis in Mathematical Sciences}},
  title = {{Non-destructive anonymization of training data for object detection}},
  year = {{2025}},
}