Generative AI Copyright Concerns & 3 Best Practices in 2023

Generative AI, Copyright and the AI Act

The courts could ultimately decide, for example, that the scanning and digitization of books for the purpose of online search is fundamentally different from a machine ingesting those same books in order to refine its capabilities. “There’s a universe in which there’s a difference [and] it may seem subtle, but I think it’s a subtlety with a potentially significant variation,” says Balganesh. The June lawsuits against OpenAI and Meta are just two examples from what is already becoming a long string of such cases brought against the companies that are building generative AI. As generative AI becomes increasingly powerful and accessible, a growing number of voices are protesting what they view as the technology’s flagrant disregard for copyright law.

Where a work mixes human-authored and AI-generated material, copyright protection applies only to the human-authored aspects of the work. AI developers, for one, should ensure that they comply with the law in how they acquire the data used to train their models. This should involve compensating the individuals who own the IP that developers seek to add to their training data, whether by licensing it or by sharing the revenue generated by the AI tool. There has been a flood of media coverage of the intersection of copyright and generative artificial intelligence, a subset of the broader discussion of the challenges posed by AI generally. This coverage has been stimulated in part by lawsuits brought by authors, including Sarah Silverman, and open letters signed by artists.


The Copyright Office has launched an initiative to study generative AI and copyright, and today issued a notice of inquiry to solicit input on the issues involved. The Senate Judiciary Committee has also held multiple hearings on IP rights in AI-generated works, including one last month focused on copyright. And of course there are numerous lawsuits pending over the legality of generative AI, based on theories ranging from copyright infringement to privacy to defamation. It’s also clear that there is little agreement about a one-size-fits-all rule for AI-generated works that applies across industries. More recently, in July 2023, Mona Awad and Paul Tremblay accused OpenAI of using their copyrighted books to train ChatGPT, pointing to the highly accurate summaries of their works that the chatbot produced.

The Copyright Office’s position is that there is no copyright protection for works created by non-humans, including machines. At the moment, works created solely by artificial intelligence, even if produced from a text prompt written by a human, are not protected by copyright. However, when copyrighted works are used in training sets, a legal conundrum can arise if the original creator did not license that use. To ensure that laws around copyright and fair use are respected, producers of generative AI content should demonstrate due diligence in obtaining proper licenses when possible.

From Data to Decisions

Copyright, Samuelson noted, automatically protects works of authorship, such as written content, illustrations, photographs, videos and musical compositions, from the moment they are fixed in a tangible medium, and it vests in those authors certain exclusive rights, including the ability to control reproductions and displays. However, copyright has limitations, including the fair use doctrine, which allows the unlicensed use of copyrighted works in criticism, news reporting, research and other specific circumstances. Clearly, if the goal is for generative AI providers to itemize all or most of the copyrighted material in their training data sets, with clear identification of rights ownership claims and similar details, then this provision is impossible to comply with. The low threshold of originality, the territorial fragmentation of copyright and its ownership, the absence of a registration requirement for works, and the generally poor state of rights ownership metadata demonstrate this impossibility.

That risk exists even with content that is not generated by AI, as humans can overlook certain facts that would lead to confusion. The notice of inquiry (NOI) is an integral next step for the Office’s AI initiative, which was launched in early 2023. So far this year, the Office has held four public listening sessions and two webinars. This NOI builds on the feedback and questions the Office has received so far and seeks public input from the broadest audience to date in the initiative.

Creators and Companies Alike Take Action

However, ongoing policy discussions signal the possibility that the UK text and data mining (TDM) exception may soon be expanded to include commercial purposes. The changing policy surrounding the use of copyrighted works in large data applications (including AI) shows that governments recognize that updated legal directives are necessary to contend with the fast-developing AI industry. Generative artificial intelligence (ChatGPT, for example) has become very promising for researchers and content creators of all kinds. However, it also poses risks that have yet to be fully explored, copyright infringement among them. As AI tools crawl the Internet and other digital sources for information to respond to users’ queries, the information they collect often belongs to other content creators. As of July 2023, several lawsuits have been brought against AI image and text generation tools that used visual and text content created or owned by others as training material.

Specifically, it could help determine whether companies can claim fair use when their models scrape copyrighted material. “I’m not going to call the outcome on this question,” Sag says of Silverman’s lawsuit. “But it seems to be the most compelling of all of the cases that have been filed.” OpenAI did not respond to requests for comment. Another variable in judging fair use is whether or not the training data and model have been created by academic researchers and nonprofits. So, for example, Stability AI, the company that distributes Stable Diffusion, didn’t directly collect the model’s training data or train the models behind the software.


Bypassing that issue for now and focusing on the proposed text, the question that arises is what exactly it means to document the use of training data protected under copyright law, and to provide a summary thereof. The impact of generative artificial intelligence (AI) has quickly caught the attention of technologists and policymakers around the world. Among others, policymakers in Washington are scrambling to apply intellectual property (IP) laws and concepts in response. Indeed, just this month, the Senate Subcommittee on Intellectual Property held its second hearing on AI and its implications for copyright law. Congressional attention to copyright and AI matches a growing public interest in understanding how AI, and generative AI in particular, uniquely affects what it means to be an author and how ownership of the expression of ideas is determined. The United States Copyright Office denied Stephen Thaler’s application to register a work he described as created autonomously by an AI system, spawning the case Thaler v. Perlmutter, named for Shira Perlmutter, the Register of Copyrights.

  • We’ve probably lost at least one play by Shakespeare (there’s evidence he wrote a play called Love’s Labors Won); we’ve lost all but one of the plays of Thomas Kyd; and there are other playwrights known through playbills, reviews, and other references for whom there are no surviving works.
  • That said, the ruling could prevent AI firms from claiming any copyright protection over the content their tools produce, as seen in Thaler’s case.
  • It would require developers to distinguish between code they wrote with and without generative AI, which is often impractical.
  • 💡 A registry for AI-generated content and authors gains traction as a potential solution.

We plan to submit a comment sharing our perspective, and are eager to learn about the diversity of views on this important issue. Similarly, some have called for collective licensing legislation for copyrighted content used to train generative AI models, potentially as an amendment to the Copyright Act itself. We believe that this would not serve the creators it is designed to protect and we strongly oppose it. Similar efforts several years ago were proposed and rejected in the context of mass digitization based on similar concerns.

The Copyright Office said in a statement that it believes the court reached the correct result and that the office is reviewing the decision. The static nature of these systems poses the risk of inaccuracy, as they are trained on datasets that end at a fixed point in time. For example, suppose you use GenAI to translate a manual on how to treat a disease, and it makes mistakes in the details that nobody detects because the result reads as if a human wrote it. However subtle these mistakes may be, who is responsible if they result in harm? All current GenAI tools explicitly disclaim any guarantee of suitability for their results, leaving all responsibility in the hands of the person who uses them.

The AI is the one that “determines the expressive elements of its output,” so the resulting work “is not protected by copyright and must be disclaimed in a registration application.” Id. It is possible that certain AI-generated output infringes the rights of the creators of works used during the training of the model. Generative models are able to “memorize” content they are trained on, i.e., to produce output that is identical to a work in the training data. Although such cases of identical output are possible and have been reported, they are rare.
