Web scraping and Generative Models training in the Directive 790/19

Authors

DOI:

https://doi.org/10.6092/issn.1825-1927/18871

Keywords:

web scraping, AI training, copyright law, intellectual property, generative models, text mining, data mining

Abstract

With the rapid development of large generative models, the lack of clarity over whether the web may be lawfully scraped and the resulting data set used to train AI models, generative ones in particular, has become an urgent issue. Although web scraping is regulated in the EU by Directive (EU) 2019/790, AI training is not explicitly mentioned in the text of the law.
While web scraping for scientific research and teaching is permitted without exceptions, for other purposes it is allowed only if the data is lawfully acquired and the copyright holder has not prohibited it. The Directive allows web scraping for text and data mining aimed at gaining new knowledge from the data, but it is not clear whether AI training falls within this definition. This article analyzes the legal dilemma surrounding this topic.

Published

2024-01-12

How to Cite

Gallese, C. (2023) “Web scraping and Generative Models training in the Directive 790/19”, i-lex. Bologna, Italy, 16(2), pp. 1–16. doi: 10.6092/issn.1825-1927/18871.

Issue

Section

Articles