This article explores one of the ongoing criticisms made of the GDPR and seeks to explain why one specific emerging technology – Artificial Intelligence (AI) – is causing so much fuss in the field of data protection. Is regulatory reform the only answer to the never-ending technological progress?
When the father begins to doubt his own son
Axel Voss, one of the fathers of the GDPR, believes the GDPR is already out of date and needs to be revised. Voss suggested to the Financial Times the GDPR needs “some type of surgery” to consider the emergence of new technologies, such as AI, blockchain and facial or voice recognition. As a member of the European Parliament, he is not confident that the GDPR is ready to effectively address many of the challenges associated with these evolving technologies.
Let us focus on AI and look for the meaning of this concept and the challenges it presents from a GDPR perspective.
AI Conceptual framework
Although we have been hearing about it more regularly, the field of AI is not new. It arose in the 1950s with the mathematician and computer scientist Alan Turing, who speculated about the possibility of machines that could “think”. To pass the Turing test, a computer would have to demonstrate behaviour indistinguishable from that of a human. More recently, the field of AI was reinvigorated by the growth of the internet and by the enormous computing power and relatively low cost of hyperscale public cloud computing, resulting in an increased amount of data that was readily available for analysis. Companies such as Amazon and Google had access to data from billions of consumers and needed modern techniques to interpret the enormous volume of information. Companies like Microsoft are also making AI and Machine Learning (ML) APIs available to customers large and small, meaning AI is no longer accessible only to governments or large multinationals, but also to small and medium-sized enterprises.
The European Commission, in its 2018 Communication on Artificial Intelligence, defined AI as “systems that display intelligent behaviour by analysing their environment and taking actions – with some degree of autonomy – to achieve specific goals.” Examples include AI algorithms capable of optimising inventory levels, predicting customer churn or identifying fraud.
While AI has many subfields (such as robotics, vision recognition and speech processing), the most powerful advances are happening in ML.
ML is a subfield of AI based on the idea that computers can learn from data without being explicitly programmed. ML algorithms apply various statistical techniques to the data they are fed in order to draw conclusions about it. These algorithms improve as the amount of data increases and as the conclusions they generate are either confirmed or disconfirmed. For example, an ML algorithm for detecting fraudulent transactions becomes more accurate as it is fed more transaction data and its predictions are confirmed as correct or incorrect.
ML can use supervised or unsupervised learning. Supervised learning requires training data in the form of labelled inputs and outputs. Once sufficiently trained, the algorithm can be fed new input data it has not seen before and generate answers by applying the inference function it has learned to the new inputs. By contrast, unsupervised techniques operate without labels, meaning that they are not trying to predict specific outcomes. Instead, their purpose is to find patterns within data sets and group them in meaningful ways (e.g. customers who are similar and therefore may represent new segments for marketing campaigns) or to identify irregular patterns, such as banking transaction behaviour that could reveal money laundering.
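To make the distinction concrete, here is a minimal sketch in Python. It is an illustration only: the transaction amounts, the labels and the function names (train_centroids, predict, find_outliers) are hypothetical, and real fraud or money-laundering detection uses far richer models. The supervised part learns from labelled examples and then classifies an input it has not seen; the unsupervised part receives no labels and simply flags values that deviate from the overall pattern.

```python
from statistics import mean, pstdev

# Hypothetical labelled training data: (transaction amount, label).
labelled = [(12.0, "legit"), (15.5, "legit"), (9.9, "legit"),
            (950.0, "fraud"), (1200.0, "fraud")]


def train_centroids(examples):
    """Supervised: learn one average amount per label from labelled examples."""
    by_label = {}
    for amount, label in examples:
        by_label.setdefault(label, []).append(amount)
    return {label: mean(values) for label, values in by_label.items()}


def predict(centroids, amount):
    """Apply the learned function to a new input: choose the nearest average."""
    return min(centroids, key=lambda label: abs(centroids[label] - amount))


def find_outliers(amounts, z_threshold=2.0):
    """Unsupervised: no labels, flag amounts far from the rest of the data."""
    mu, sigma = mean(amounts), pstdev(amounts)
    if sigma == 0:
        return []
    return [a for a in amounts if abs(a - mu) / sigma > z_threshold]


centroids = train_centroids(labelled)
prediction = predict(centroids, 800.0)  # an amount not present in training

# Hypothetical unlabelled amounts; one of them is clearly irregular.
unlabelled = [10.0, 11.0, 9.5, 10.5, 12.0, 10.2, 9.8, 11.3, 10.7, 500.0]
outliers = find_outliers(unlabelled)
```

In both cases the behaviour is derived from the data supplied rather than from hand-written rules, which is why the data protection questions discussed below arise.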
Data protection hurdles for AI
As you might have noticed, the above explanation does not focus on the technical attributes of AI. The idea it seeks to convey is that regardless of the designation adopted, there is an element that always prevails: AI’s heavy reliance on data – including personal data as companies seek to create added value and loyalty with personalised services.
As stated by the European Data Protection Board (EDPB), any “processing of personal data through an algorithm falls within the scope of the GDPR”. Thus, whenever an AI system uses personal data, all the standard provisions of the GDPR apply.
The GDPR sets high standards for collecting, sharing, and using data in the EU and creates specific rules for how individuals may access, rectify, transfer, and delete personal data held by third parties.
The transparency principle
According to this principle, the controller must provide certain information to the data subject “in a concise, transparent, intelligible and easily accessible form, using clear and plain language”. Furthermore, the controller must notify the data subject of the use of automated decision-making and provide information “about the logic involved, as well as the significance and the envisaged consequences of such processing for the data subject”. Although the EU’s guidelines explain that the above requirements do not necessarily require a complete disclosure of the algorithm, the information provided should be “sufficiently comprehensive for the data subject to understand the reasons for the decision.”
The technical side of ML technologies makes it hard for controllers to provide transparent information concerning the logic behind these systems. In its report “Big Data, Artificial Intelligence, Machine Learning and Data Protection”, the UK supervisory authority mentions that the complexity of big data analytics can mean that the processing is opaque to citizens and consumers whose data is being used.
Indeed, it is still not possible to explain why some AI systems behave in a particular way. Some algorithms are very accurate but completely opaque, and are often referred to as black boxes. Even the experts may not understand exactly how they work; they “only know that they do.”
The right to erasure
The GDPR provides data subjects with the right to require the data controller to erase personal data without undue delay. Complying with this right is not too difficult when personal data is used in a structured and traceable way, for example by an online store that keeps client information and uses it to send marketing promotions. In this case, if the person requests erasure, the provider simply needs to purge all data relating to this individual. The situation is less clear when the personal data concerned has been used for more advanced data processing activities.
Remember ML and how it learns and improves. To achieve any initial results and then more accurate results, ML needs to be trained: the input data are analysed by the system to find patterns enabling it to output results relevant to the system’s purpose – the algorithms improve as the amount of data increases and the conclusions they generate are either confirmed or improved upon. The initial data are no longer accessed by the system but the patterns found due to the use of that data persist and “are used to make better predictions on new data sets.” This is a problem from an erasure perspective, as this right implies that after erasure the use of the data ends and the data subject is forgotten.
When the ML system is operating, it makes use of all the data originally fed into it, including data that have since been erased from the original source system. Data that have already been processed and analysed by the ML system, and from which it has drawn conclusions, will continue to influence its decision-making after they are deleted from the original source. Unless the ML system is fully “retrained” using a new, clean data set, deleted master data will continue to cast a shadow over any ML system that had access to them.
The automated decision-making restriction
The GDPR restricts how organisations use personal data to make automated decisions about individuals by establishing a right for individuals “not to be subject to a decision based solely on automated processing, including profiling” which produces legal or similar effects. This way, whenever companies use AI to make decisions about individuals, the data subject has the right to have a human review that decision. This assisted AI requirement can make it impractical for many companies to use AI to automate processes, as it requires them to create a manual process – where humans review the automated decisions – for data subjects who opt out of the automated one. This can be a financial burden for companies and a deterrent to the implementation of AI-driven solutions, and consequently to innovation.
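The shadow effect can be shown with a deliberately trivial sketch in Python. Everything here is assumed for illustration: the “model” is just an average, and the customer names and spend figures are invented. The point is only that deleting a record from the source system does not change a parameter that was already learned from it.

```python
from statistics import mean


def train(records):
    """'Train' a trivial model: the average spend across all training records."""
    return mean(r["spend"] for r in records.values())


# Hypothetical source system keyed by customer id.
source = {
    "alice": {"spend": 100.0},
    "bob": {"spend": 200.0},
    "carol": {"spend": 900.0},
}

model = train(source)  # learned from all three customers

# Carol exercises her right to erasure: her source record is deleted...
del source["carol"]
stale_model = model  # ...but the already-learned parameter still reflects her data

# Only retraining on the cleaned data set removes her influence.
retrained_model = train(source)
```

In this toy case retraining is instant. For a large production model it can be slow and costly, which is what makes the erasure obligation hard to honour in practice.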
The EU’s guidelines also state that organisations “must ensure that any oversight of the decision is meaningful, rather than just a token gesture. It should be carried out by someone who has the authority and competence to change the decision.” This condition limits the feasibility of using AI to automate many processes involving personal data.
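As a rough sketch of what such a manual process might look like in software, the Python below routes decisions through a human reviewer. It assumes a simplified internal policy in which every decision with legal or similar effect must be confirmed or overturned by a named reviewer; the Decision class, the finalise function and the outcome values are hypothetical and are not taken from the GDPR or from any guideline.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Decision:
    subject_id: str
    automated_outcome: str  # e.g. "approve" or "reject"
    final_outcome: Optional[str] = None
    reviewed_by: Optional[str] = None


def finalise(decision, has_legal_effect, reviewer=None, reviewer_outcome=None):
    """Let the automated outcome stand only when it has no legal or similar
    effect; otherwise require a named reviewer who is free to change it."""
    if not has_legal_effect:
        decision.final_outcome = decision.automated_outcome
        return decision
    if reviewer is None or reviewer_outcome is None:
        raise ValueError("human review required before this decision is final")
    decision.final_outcome = reviewer_outcome  # may confirm or overturn
    decision.reviewed_by = reviewer
    return decision
```

The reviewer’s outcome replaces the automated one rather than merely being logged next to it, reflecting the requirement that the reviewer has the authority to change the decision. Whether the review is meaningful in practice depends on the organisation, not on the code.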
The fairness principle
This principle requires the processing of personal information to be done with respect for the data subject’s interests and to be used in accordance with what data subjects reasonably expect. It obliges the data controller to implement measures to prevent the arbitrary discriminatory treatment of data subjects. The Norwegian supervisory authority emphasises that algorithms and models are no more objective than the people who devise and build them, and the personal data that is used for training. “The model’s result may be incorrect or discriminatory if the training data renders a biased picture [of] reality, or if it has no relevance to the area in question. Such use of personal data would be in contravention of the fairness principle.”
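One simple check a controller might run on a model’s outputs is to compare outcome rates across groups. The Python sketch below is an assumption-laden illustration: the groups “A” and “B”, the decisions and the function names are invented, and a gap in approval rates is a signal to investigate rather than proof of unlawful discrimination.

```python
from collections import defaultdict


def approval_rates(decisions):
    """Return the share of positive outcomes for each group."""
    totals, approved = defaultdict(int), defaultdict(int)
    for group, outcome in decisions:
        totals[group] += 1
        if outcome:
            approved[group] += 1
    return {group: approved[group] / totals[group] for group in totals}


def parity_gap(decisions):
    """Difference between the highest and lowest group approval rates."""
    rates = approval_rates(decisions)
    return max(rates.values()) - min(rates.values())


# Hypothetical model outputs: (group, approved?).
decisions = [("A", True), ("A", True), ("A", True), ("A", False),
             ("B", True), ("B", False), ("B", False), ("B", False)]

rates = approval_rates(decisions)
gap = parity_gap(decisions)
```

Here group A is approved three times out of four and group B once out of four, so the gap is large enough to warrant a closer look at the training data.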
The purpose limitation principle
According to this GDPR principle, data subjects must be informed about the purpose of data collection and processing. This can be an impediment for AI-driven solutions. As mentioned, ML uses data to analyse and identify patterns and to gain new insights, and these uses are not necessarily compatible with the purpose of the original data collection (e.g. use of social media data for calculating a user’s insurance rate). The GDPR mandates that personal data may be processed further only if the new purpose is compatible with the original one, unless the data controller obtains further consent from the data subject.
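In engineering terms, one way to support this principle is to record the purposes attached to each record and filter on them before any new processing. The Python sketch below assumes such purpose tags exist; the field names, purposes and the usable_for function are hypothetical. It only shows the gating step: deciding whether a new purpose is compatible with the original one remains a legal assessment, not a set lookup.

```python
# Hypothetical records, each tagged with the purposes the data subject
# was informed about or consented to.
records = [
    {"user": "u1", "posts": 42, "purposes": {"social_feed"}},
    {"user": "u2", "posts": 7, "purposes": {"social_feed", "insurance_pricing"}},
]


def usable_for(records, purpose):
    """Keep only records whose recorded purposes include the requested one."""
    return [r for r in records if purpose in r["purposes"]]


# Only u2's data may feed the insurance pricing model.
pricing_inputs = usable_for(records, "insurance_pricing")
```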
Business benefits of AI
AI technologies deliver real business and societal benefits across multiple sectors. The European enterprise survey on the use of technologies based on artificial intelligence has found that 42% of enterprises currently use at least one AI technology, a quarter of them use at least two types, and 18% have plans to adopt AI technologies in 2021 and 2022.
Some of the most established applications for AI delivering concrete business solutions are in product and service recommendations, advertising placement and online search, but AI goes far beyond tech companies like Google, Amazon and Microsoft. The banking and telecom industries have already started using AI to enhance customer experience and to improve network reliability and predictive maintenance.
Plenty of opportunities exist for the healthcare industry to use ML to support the development of preventative medicine, patient care and new drug discovery.
Industrial and manufacturing companies are also using AI for predictive maintenance and advanced optimisation across entire supply chains. Energy companies have transformed operations using AI. Other cases include recruiting and personnel management. The list goes on.
What’s next for AI now?
The growing importance of AI in modern societies is undeniable. Microsoft CEO Satya Nadella has even described AI as “the defining technology of our times.” However, in a world where technologies are evolving at a breakneck pace, is the correct response to continually revise the GDPR?
Contrary to Axel Voss, Sophie in’t Veld (a Dutch Member of the European Parliament) states that although no law is ever perfect for every individual and union, the “GDPR is also a very general piece of legislation that leaves lots of flexibility for implementation”, and the EDPB argues that “the GDPR is built in a technologically neutral manner in order to be able to face any technological change or revolution.”
In this sense, instead of turning the GDPR into a “Frankenstein” Regulation that is changed with every technological innovation, a more suitable alternative may be to rely on the supervisory authorities of the different Member States, together with the EDPB, to guide how the existing rules apply to new technologies.
There have been several guidelines produced to support the implementation of AI-driven solutions in accordance with GDPR principles. One such example is the comprehensive guidelines from the French supervisory authority, examining how the GDPR already regulates AI systems and detailing how it applies generally to AI systems in the same manner as any other processing of personal data.
Another example is the third pillar of the EDPB Work Programme 2021/2022, which is a fundamental rights approach to new technologies. The EDPB undertakes to monitor new and emerging technologies and their potential impact on the fundamental rights and daily lives of individuals. To achieve this, the EDPB will issue guidelines on the legal implications of AI, but also on blockchain, cloud computing and the use of facial recognition technology.
If you would like to hear how WLC can help you assess the data protection implications of deploying artificial intelligence and machine learning in the processing of personal data, please reach out to us.