Jakub Lewandowski, Global Data Governance Officer at Commvault, discusses the impact of AI and large language models on our existing frameworks.
The past year has seen a phenomenal evolution in the capabilities and impact of AI, in particular large language models (LLMs) such as ChatGPT. ChatGPT was launched just last November and has sparked countless anxious articles, not to mention concerns that it could take over skilled white-collar jobs.
You might even wonder whether an article like this was written by a person or an AI – and you’d be right to do so (but I can confirm that a human wrote this).
LLMs are AI algorithms that use deep learning techniques and draw on vast datasets to interpret prompts and generate corresponding content. For some, this is merely ‘auto-predict on steroids’; for others, the results are genuinely breathtaking, potentially transforming how we work and create. When it comes to risk and regulation, however, what are the implications?
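To make this concrete, here is a minimal, illustrative sketch of the prompt-in, content-out loop. It assumes the open-source Hugging Face transformers library and the small GPT-2 model, chosen purely as convenient stand-ins rather than as any of the systems discussed here:

```python
# A minimal sketch of prompt-driven generation, using the Hugging Face
# "transformers" library and the small open GPT-2 model as illustrative
# stand-ins for any LLM (assumed dependencies: transformers, torch).
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = "Large language models raise new questions for data governance because"
# The model extends the prompt one token at a time, choosing words it
# judges likely to come next based on patterns in its training data.
output = generator(prompt, max_new_tokens=40, num_return_sequences=1)
print(output[0]["generated_text"])
```

Note that the ‘vast datasets’ are not consulted at generation time; they are baked into the model’s weights during training, which is precisely why the provenance of that training data matters so much.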
The impact of AI on risk
The most pressing area of focus concerns the data being fed into the LLM itself. These models can gather, retain, and analyse data, including personal and confidential information, on an unprecedented scale and with remarkable granularity. Who is responsible for ensuring the authenticity of the data used to train generative AI products?
How can we ensure that AI models are exclusively trained on data that meets certain quality standards, preventing potential litigation or penalties resulting from using proprietary or personal data without the necessary legal basis for processing? The adage ‘garbage in, garbage out’ holds particularly true in this context.
A secondary consideration about the impact of AI concerns what actually happens within the ‘black box’ of the LLM. As humans, can we truly understand the complexity of these operations? The recurring process of tokenisation, embedding, attention weighting, and completion plays a crucial role in generating specific outcomes.
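For intuition, the toy sketch below walks through those four steps with made-up numbers: a four-word vocabulary, random embeddings, and a single attention pass. Nothing here reflects any production model, but the shape of the computation is the same:

```python
# A toy walk-through of tokenisation, embedding, attention weighting,
# and completion. All values are invented for illustration; real LLMs
# use learned parameters over vocabularies of tens of thousands of tokens.
import numpy as np

vocab = {"the": 0, "data": 1, "is": 2, "private": 3}  # toy tokeniser
tokens = [vocab[w] for w in "the data is".split()]    # 1. tokenisation

rng = np.random.default_rng(0)
embed = rng.normal(size=(len(vocab), 4))              # toy embedding table
x = embed[tokens]                                     # 2. embedding

# 3. Attention weighting: each token scores every other token, and the
# scores are normalised (softmax) into a weighted mix of the sequence.
scores = x @ x.T / np.sqrt(x.shape[1])
weights = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)
context = weights @ x

# 4. Completion: score every vocabulary entry against the final context
# vector and emit the most likely next token.
logits = context[-1] @ embed.T
print("predicted next token:", list(vocab)[int(np.argmax(logits))])
```

Even in this toy form, the governance point is visible: the output is a statistical echo of whatever went into the embeddings, which is why questions about training data quality are so hard to separate from questions about the model itself.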
However, how can we proactively address the emergence of problematic content, such as AI hallucinations, observed biases, the proliferation of deepfakes, or instances of blatant discrimination? Are we limited to merely reacting to such incidents, or is there a way to consistently categorise and prevent them? Can LLMs alone guard against these issues, or do we need additional measures?
Finally, we must look at what roles LLMs will fulfil and ask whether there are any existing restrictions on using LLMs, AI and automated decision-making. How will the data be used? Who defines the purpose? In terms of data flows, who or what will be granted access to the results generated by AI? Should we believe companies that insist the AI embedded in their product will minimise our attack surface or defend against cybersecurity threats more effectively? And how will your company meet privacy-related demands, such as requests for data access or deletion, when using AI and LLMs?
The existing regulatory environment
A complicated data protection landscape already exists across Europe and beyond, with more regulation expected in the wake of the rise of LLMs.
General Data Protection Regulation (GDPR)
Celebrating its fifth anniversary earlier this year, GDPR already feels like part of the furniture, but that’s not to say it is outdated. Fast-moving technological developments are showing that GDPR can stand up to new challenges, even if it was not designed with AI specifically in mind.
One of GDPR’s key principles is the right of individuals not to be subject to decisions based solely on automated processing where those decisions have legal consequences or similarly significant effects. This is an important safeguard, ensuring that while LLMs may support decisions, the final determination, with certain exceptions, should involve human judgment.
GDPR has also introduced a valuable and influential mechanism for data protection impact assessments. These assessments require organisations to evaluate the risks posed to the rights and freedoms of individuals affected by the use of LLMs. Additionally, organisations can seek guidance from supervisory authorities, although few currently do.
As a result, some supervisory authorities, such as France’s CNIL, are positioning themselves as regulators and enforcers within the framework of the anticipated AI legislation. At the same time, businesses are positioning themselves in much the same way. Privacy professionals who have cut their teeth on GDPR are now naturally well placed to tackle the challenges thrown up by the impact of AI.
GDPR already includes built-in mechanisms for assessing its own effectiveness, and a scheduled review of its performance will include a special report due this year. There is also a newly launched, dedicated EU taskforce on ChatGPT, as well as revised drafts of the future AI Act, which the European Parliament voted overwhelmingly in favour of in June of this year.
UK Data Protection and Digital Information Bill (DPDI)
The bill is now being reviewed by the House of Commons, and it aims to minimise the administrative burden on organisations, bolster international trade, and reduce red tape. The DPDI Bill is markedly more comprehensive when it comes to automated decision-making and, specifically, the safeguards that apply, including the right to be informed about such decisions and to have a human involved in reviewing the outcome.
Earlier this year, the UK government also published a white paper on the impact of AI, which seeks to guide the use of the technology within the UK. It recognises that businesses can be put off using AI to its full potential because inconsistent legal guidelines cause confusion and add financial and administrative burdens for organisations trying to play by the rules.
Following on from this, in autumn 2023, the UK government is holding an AI summit at Downing Street to grapple with the new technology, learning from experts across the world how to harness the benefits and control the risks.
Preparing for an AI-enabled future
Clearly, AI regulation and data privacy overlap to a great degree, and organisations that deploy AI must steer a course through an increasingly complicated legislative and regulatory environment. In the coming months, businesses will need to stay up to date with new developments in both technology and legislation.
For now, however, we should focus on deploying the right compliance instruments to minimise the risks AI poses, ensuring that all data is gathered and processed in line with our existing legal obligations.