Web scraping and Generative Models training in the Directive 790/19

Chiara Gallese

doi:10.6092/issn.1825-1927/18871

Authors

Chiara Gallese University of Turin, Department of Law https://orcid.org/0000-0002-1825-0097

DOI:

https://doi.org/10.6092/issn.1825-1927/18871

Keywords:

web scraping, AI training, copyright law, intellectual property, generative models, text mining, data mining

Abstract

With the rapid development of large generative models, the lack of clarity over whether the web can be lawfully scraped and the resulting data sets used to train AI models, generative ones in particular, has become an urgent issue. Although in the EU web scraping is regulated by Directive 790/2019, the text of the law does not explicitly mention AI training.
While web scraping for scientific research and teaching is permitted without exceptions, for other purposes it is allowed if the data is lawfully acquired and the copyright owner has not prohibited it. The Directive allows web scraping for text and data mining aimed at gaining new knowledge from the data, but it is not clear whether AI training falls within this definition. This article analyzes the legal dilemma surrounding this topic.
