General-purpose AI models and copyright concerns under the AI Act
by Lex Keukens and Nicky Willemsen
The European Union's Artificial Intelligence Act (AI Act) includes copyright obligations for general-purpose AI (GPAI) models. These obligations are intended to complement the text and data mining (TDM) exceptions set out in the Copyright in the Digital Single Market Directive (CDSMD). This article discusses the interplay between the copyright obligations of the AI Act and the TDM exceptions of the CDSMD.
Copyright infringement through TDM
GPAI models use TDM to train on large datasets from websites and publications, raising copyright concerns. Recital 105 of the AI Act addresses this, stating that reproducing copyrighted materials during training without permission constitutes a copyright infringement, unless exceptions apply.
The AI Act imposes two copyright-related obligations on GPAI model providers in Article 53(1)(c) and (d). For the purpose of this article, only Article 53(1)(c) will be discussed. Article 53(1)(c) requires GPAI model providers to respect EU copyright law, particularly to identify and comply with, including through state-of-the-art technologies, the reservations of rights (“opt-out”) under Article 4(3) of the CDSMD. These opt-outs, which must be machine-readable, allow rights holders to prevent their works from being used for TDM without explicit permission.
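To make the notion of a machine-readable opt-out more concrete, the sketch below shows how a crawler might check a reservation expressed in a website's robots.txt file before collecting content. This is an illustration only: neither the AI Act nor the CDSMD prescribes robots.txt as the required format, and the crawler name `ExampleTDMBot`, the example URL and the helper `is_opted_out` are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def is_opted_out(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules disallow this crawler for this URL.

    Assumption: a Disallow rule is treated here as a reservation of rights.
    Whether it legally qualifies as an opt-out under Article 4(3) CDSMD
    is a separate question that this sketch does not answer.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Hypothetical robots.txt: blocks one named crawler, allows all others.
EXAMPLE_ROBOTS_TXT = """\
User-agent: ExampleTDMBot
Disallow: /

User-agent: *
Allow: /
"""

if __name__ == "__main__":
    page = "https://example.com/articles/1"
    for agent in ("ExampleTDMBot", "OtherBot"):
        print(agent, "opted out:", is_opted_out(EXAMPLE_ROBOTS_TXT, agent, page))
```

A check of this kind only covers the crawler that runs it, which is relevant to the third-party scraping problem discussed below.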
Legislative issues
At first glance, the obligations laid down in Article 53(1)(c) appear to be aligned with EU copyright law. However, there are grey areas that this provision does not cover.
Firstly, where the relevant TDM acts (the scraping) take place outside EU territory, the copyright law applicable to those acts is the law of the state where the reproductions and extractions take place. If the GPAI model is placed on the EU market after the TDM acts have been completed, placing it on the market cannot in itself result in an infringement of Article 4(3) CDSMD.
Secondly, if the entity carrying out the scraping is not a GPAI model provider but a third party, it is not bound by Article 4(3) of the CDSMD or Article 53(1)(c) of the AI Act.
It is therefore difficult to see how GPAI model providers can ensure an effective opt-out when the data scraping is carried out by third parties. However, it is argued that the AI Act should still apply to these acts where the scraping is essential to the GPAI model that is made available in the EU.
Possible solution and way forward
In November 2024, the first draft of the General-Purpose AI Code of Practice (Code) was published. The Code suggests that GPAI model providers should commit to making available on the EU market only models that comply with EU copyright law throughout the value chain or lifecycle.
It remains to be seen whether this suggestion will make it into the final version of the Code and what role this Code (as a soft law instrument) will play in the overall framework of the new European legislation surrounding AI.
Lex Keukens is an experienced tech lawyer (partner), co-head of the tech team, and a tech enthusiast.
Nicky Willemsen specialises in advising and litigating in the field of intellectual property law, including matters at the intersection of intellectual property law and AI.