The EU AI Act was officially published on 12 July 2024, which means the countdown for compliance has begun. Similar regulatory discussions are under way in other jurisdictions, each equally significant to the future of AI and responsible training data, and broadly centred on the copyright and licensing of content and data. The music industry, meanwhile, has been front and centre in advocating for solutions and regulations that protect the copyright of its artists and their works.
The US is a significant jurisdiction because some of the largest AI companies, and some of the most consequential copyright lawsuits, sit within its legal system. Three pieces of legislation have emerged as contributions to the regulatory discussion surrounding copyright and training data. First is the ELVIS Act, signed into Tennessee law, which updates personal likeness rights to include voice. Second is the proposed COPIED Act, which formalises the requirement for provenance and gives rights holders legal avenues to take control of their property. Finally, the Generative AI Copyright Disclosure Act of 2024 proposes to require copyrighted-inventory transparency from AI developers and deployers. Each of these is briefly discussed below, weighing up its pros, cons, and impact, before the solutions that will facilitate and support these regulations are briefly examined.
On March 21, 2024, the Ensuring Likeness Voice and Image Security Act (ELVIS Act) was signed into law by Tennessee Governor Bill Lee. The purpose of the Act is to legally protect musicians from unauthorised use of their likeness in deepfakes generated with AI tools. The law achieves this by prohibiting the use of AI to mimic a person's voice without their permission. Violations can be criminally enforced as a Class A misdemeanour, which in Tennessee carries up to a year in prison or a fine of $2,500. This comes amid a rise in the accessibility of AI tools that can recreate an artist's likeness in works they were never involved with. One frequently cited example is the anonymous TikTok user 'Ghostwriter977', who recreated the voices of Drake and The Weeknd in a song that amassed over 11 million views without either artist's involvement, showing just how capable and accessible these AI tools have become. Another example is the controversy over OpenAI coming very close to mimicking Scarlett Johansson's voice from the film 'Her'. Both examples clearly delineate the need to protect personal likeness against a new suite of generative AI cloning models and tools that enable convincing human impersonation.
To summarise, the ELVIS Act directly regulates the use of someone's likeness in AI-generated material and gives rights holders clear legal recourse against violators. The Act falls short in that it applies only within the state of Tennessee, but it could serve as a strong template for rights protections moving forward.
Moving up to the federal level, the Content Origin Protection and Integrity from Edited and Deepfaked Media Act (COPIED Act) takes a slightly different approach to ELVIS. COPIED seeks to protect artists, songwriters, and journalists from having their content used to train AI models or generate AI content without their consent. This is distinctly different from the ELVIS Act because it targets a rights holder's works rather than their likeness. COPIED sets out to achieve this by making it easier to identify AI-generated content and to combat the rise of harmful deepfakes and reproduced works, by enforcing mandatory content provenance information for pre-training data that cannot be removed. The bill itself "requires developers and deployers of AI systems and publications used to generate covered content (any digital representation of someone or something's work) to give users the option to attach content provenance information within 2 years". The Act prohibits any removal of or tampering with this content provenance information, with a limited exception for security research purposes. It also calls upon the National Institute of Standards and Technology (NIST) to create guidelines and standards for content provenance information, watermarking, and synthetic content detection.
In summary, the Act is a strong attempt to modernise IP protection and enforcement through provenance information. However, the bill appears to have been developed outside of established copyright law, which could complicate the attribution of AI-generated content in future.
The Generative AI Copyright Disclosure Act of 2024 is strikingly simple in its design but highly effective in its intended outcome. The Act "would require a notice to be submitted to the Register of Copyrights prior to the release of a new generative AI system with regard to all copyrighted works used in building or altering the training dataset for that system". Models set for public release must submit a list of their copyrighted works 30 days before the models are made publicly available; for existing models, notices must be filed within 30 days of the Act coming into effect. The Register of Copyrights will establish and maintain a publicly available online database containing each notice.
In summary, the Generative AI Copyright Disclosure Act of 2024 forces AI developers to be transparent about which works go into their systems, and uses existing legal infrastructure to do so. However, the complexity of rights management in generative AI may require some fine-tuning of the Act, or of its definitions of what constitutes a copyrighted work that has been used.
If we zoom out and look at the ELVIS Act, the COPIED Act, and the Generative AI Copyright Disclosure Act of 2024, all three support each other in a complementary fashion, attempting to put power back into the hands of rights holders. ELVIS specifically targets the reproduction of an individual's likeness, COPIED allows rights holders to attach provenance information to their works, and the Disclosure Act forces AI developers to be transparent. Covering all three of these bases creates a very strong foundation for the future of rights holders in generative AI, but it is abundantly clear that there will be a period of fine-tuning and redrafting to make the regulation efficient. Across all the Acts, the biggest winners appear to be enterprise-level creative conglomerates (record labels, art and image libraries), while the biggest losers will be SMEs and low-budget AI developers, who will struggle to innovate or be hindered by compliance costs.
Provenance graphs illustrate the supply chain of training data from source to end user, promoting accountability within the data supply chain by ensuring transparency from both data owners and consumers. This accountability incentivises data owners to clearly disclose their content for licensing and allows AI developers to understand their training inputs and outcomes, ensuring rights holders are adequately compensated. Platforms like Valyu’s exchange offer provenance for datasets, supporting copyright and transparency regulations.
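To make the idea concrete, a provenance graph can be modelled as nodes for works, datasets, and models, with edges pointing back to each node's upstream sources, so the full lineage of a trained model can be walked from end to end. The sketch below is purely illustrative (all names, fields, and the licence labels are assumptions, not any platform's actual schema or API):

```python
from dataclasses import dataclass, field

@dataclass
class ProvenanceNode:
    """One node in a training-data provenance graph: a source work,
    a dataset, or a trained model. Field names are illustrative."""
    node_id: str
    kind: str                                    # e.g. "work", "dataset", "model"
    licence: str = "unknown"
    sources: list = field(default_factory=list)  # upstream node_ids

class ProvenanceGraph:
    def __init__(self):
        self.nodes = {}

    def add(self, node: ProvenanceNode) -> None:
        self.nodes[node.node_id] = node

    def lineage(self, node_id: str) -> list:
        """Walk upstream edges to list every source feeding a node."""
        seen, stack = [], [node_id]
        while stack:
            current = self.nodes[stack.pop()]
            for src in current.sources:
                if src not in seen:
                    seen.append(src)
                    stack.append(src)
        return seen

# Example: one song licensed into a dataset, used to train a model.
g = ProvenanceGraph()
g.add(ProvenanceNode("song-001", "work", licence="rights-reserved"))
g.add(ProvenanceNode("audio-set", "dataset", sources=["song-001"]))
g.add(ProvenanceNode("gen-model", "model", sources=["audio-set"]))

print(g.lineage("gen-model"))  # -> ['audio-set', 'song-001']
```

With a structure like this, a developer can answer the question the regulations above keep asking: which rights-reserved works sit upstream of a given model, and is each one licensed?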
AI copyright regulation globally is still at a nascent stage, and redrafting and fine-tuning of most laws can be expected as best practices emerge for each use case and industry. For the time being, both rights holders and developers should watch the regulation closely to understand how it may affect them, and to understand what solutions are being built to address their pain points. For now, the future looks promising.