Web scraping and Generative Models training in the Directive 790/19
DOI:
https://doi.org/10.6092/issn.1825-1927/18871
Keywords:
web scraping, AI training, copyright law, intellectual property, generative models, text mining, data mining
Abstract
With the rapid development of large generative models, the lack of clarity about whether the web can be lawfully scraped and the resulting data sets used to train AI models, in particular generative ones, has become an urgent issue. Although web scraping in the EU is regulated by Directive 790/2019, AI training is not explicitly mentioned in the text of the law.
While web scraping is permitted without exceptions for scientific research and teaching, for other purposes it is allowed provided that the data is lawfully acquired and the copyright owner has not prohibited it. The Directive allows web scraping for text and data mining for the purpose of gaining new knowledge from the data, but it is not clear whether AI training falls within this definition. This article analyzes the legal dilemma surrounding this topic.
License
Copyright (c) 2023 Chiara Gallese
This work is licensed under a Creative Commons Attribution 4.0 International License.