LUP Student Papers

LUND UNIVERSITY LIBRARIES

Is it possible to reconcile a complete data set with data minimisation? - Examining Legal Tensions Between the AIA’s Data Requirements and the GDPR’s Data Minimisation Principle

Hansson, Johanna LU (2025) HARN63 20251
Department of Business Law
Abstract
The current, seemingly uninhibited growth of AI technologies presents one of the most critical modern-day quandaries. The magnitude of AI's potential to develop necessitates a deeper look into the legal implications associated with this technology. This thesis aims to describe and analyse the obligations concerning complete data sets in Article 10 AIA and their relationship with Article 5(1)(c) GDPR, using the EU legal method.
When developing an AI system, large quantities of data are processed to ensure the proper functioning of the system. The data may be labelled or unlabelled, and may contain non-personal or personal data. It is important that the data is of high quality, to counteract biases and discrimination. Therefore, the AIA requires the data to be relevant, sufficiently representative and, to the best extent possible, free of errors and complete in view of the intended purpose. To meet that requirement, processing of personal data could be vital. However, this seemingly conflicts with the principle of data minimisation in the GDPR, which requires the data to be adequate, relevant and limited to what is necessary.
The EDPB has established that personal data can be used in AI training, provided the principle of data minimisation is followed. When the stated purpose makes clear that processing of personal data is needed to reduce inherent biases, and this cannot be achieved in any other way, the processing of personal data is allowed. This necessitates a legitimate, specific and explicit purpose for the processing. Furthermore, additional data that is beneficial relative to the purpose of the processing should be allowed, provided the additional risk to the data subject does not outweigh this benefit. In conclusion, the demand for a complete data set could be seen as the limit of what is considered necessary for the purpose of the processing, making the two regulations compatible. However, this needs to be settled by the CJEU, to establish clear boundaries.
author
Hansson, Johanna LU
supervisor
organization
course
HARN63 20251
year
type
H1 - Master's Degree (One Year)
subject
keywords
AI Act, GDPR, Data minimisation, Complete data sets, AI-training, High-risk AI systems, Bias
language
English
id
9192049
date added to LUP
2025-06-03 12:41:03
date last changed
2025-06-03 12:41:03
@misc{9192049,
  abstract     = {{The current, seemingly uninhibited growth of AI technologies presents one of the most critical modern-day quandaries. The magnitude of AI's potential to develop necessitates a deeper look into the legal implications associated with this technology. This thesis aims to describe and analyse the obligations concerning complete data sets in Article 10 AIA and their relationship with Article 5(1)(c) GDPR, using the EU legal method.
When developing an AI system, large quantities of data are processed to ensure the proper functioning of the system. The data may be labelled or unlabelled, and may contain non-personal or personal data. It is important that the data is of high quality, to counteract biases and discrimination. Therefore, the AIA requires the data to be relevant, sufficiently representative and, to the best extent possible, free of errors and complete in view of the intended purpose. To meet that requirement, processing of personal data could be vital. However, this seemingly conflicts with the principle of data minimisation in the GDPR, which requires the data to be adequate, relevant and limited to what is necessary.
The EDPB has established that personal data can be used in AI training, provided the principle of data minimisation is followed. When the stated purpose makes clear that processing of personal data is needed to reduce inherent biases, and this cannot be achieved in any other way, the processing of personal data is allowed. This necessitates a legitimate, specific and explicit purpose for the processing. Furthermore, additional data that is beneficial relative to the purpose of the processing should be allowed, provided the additional risk to the data subject does not outweigh this benefit. In conclusion, the demand for a complete data set could be seen as the limit of what is considered necessary for the purpose of the processing, making the two regulations compatible. However, this needs to be settled by the CJEU, to establish clear boundaries.}},
  author       = {{Hansson, Johanna}},
  language     = {{eng}},
  note         = {{Student Paper}},
  title        = {{Is it possible to reconcile a complete data set with data minimisation? - Examining Legal Tensions Between the AIA’s Data Requirements and the GDPR’s Data Minimisation Principle}},
  year         = {{2025}},
}