A bipartisan group of senators introduced a new bill to make it easier to authenticate and detect artificial intelligence-generated content and protect journalists and artists from having their work gobbled up by AI models without their permission.
The Content Origin Protection and Integrity from Edited and Deepfaked Media Act (COPIED Act) would direct the National Institute of Standards and Technology (NIST) to create standards and guidelines that help prove the origin of content and detect synthetic content, like through watermarking. It also directs the agency to create security measures to prevent tampering and requires AI tools for creative or journalistic content to let users attach information about their origin and prohibit that information from being removed. Under the bill, such content also could not be used to train AI models.
Content owners, including broadcasters, artists, and newspapers, could sue companies they believe used their materials without permission or tampered with authentication markers. State attorneys general and the Federal Trade Commission could also enforce the bill, which its backers say prohibits anyone from “removing, disabling, or tampering with content provenance information” outside of an exception for some security research purposes.
(A copy of the bill is in the article; here is the important part, imo:
Prohibits the use of “covered content” (digital representations of copyrighted works) with content provenance to either train an AI-/algorithm-based system or create synthetic content without the express, informed consent and adherence to the terms of use of such content, including compensation)
This is essentially regulatory capture. The article is very lax on calling it what it is.
A few things to consider:
Laws can’t be applied retroactively, so this would essentially close the door behind OpenAI, Google and Microsoft. OpenAI with Sora, in conjunction with the big Hollywood companies, will be the only ones able to do proper video generation.
Individuals will not be getting paid; data brokers will.
They can easily pay pennies to a third world artist to build them a dataset copying a style. Styles are not copyrightable.
The open source scene is completely dead in the water and so is fine tuning for individuals.
Edit: This isn’t entirely true; there is more leeway for non-commercial models, see comments below.
AI isn’t going away; all this does is force us and the economy into a subscription model.
Companies like Disney, Getty and Adobe reap everything.
In a perfect world, this bill would be aiming to make all models copyleft instead but sadly, no one is lobbying for that in Washington and money talks.
Yup, I fucking knew it. I knew this is what would happen with everyone bitching about copyright this and that. I knew any legislation that came as a result was going to be bastardized and dressed up to make it look like it’s for everyone when in reality it’s going to mostly benefit big corps that can afford licensing fees and teams of lawyers.
People could not/would not understand how these AI models actually process images/text or the concept of “If you post publicly, expect it to be used publicly” and here we are…
As always, the anprims/luddites/ecofashies (who downvoted me) are like an anvil to left-wing ideas of progress; we’re too busy arguing amongst ourselves to make a stand to protect open source AI from regulation.
Honestly I blame Hbomberguy personally. People were a lot more open-minded before he tacked on that shitty little AI snark at the end of his plagiarism video.
Why do you think that? The existing data sets won’t be going anywhere. Fine tuning doesn’t require nearly the same amount of training images and it’s not infeasible to get them from individual artists.
Not that that actually matters to open source developers, though, as the developer obligations only apply if you’re making the product available for a commercial purpose, so they’re not relevant to developers of gratis solutions - and most libre developers are also gratis developers. If your platform is not commercial and doesn’t have at least 25 million monthly active users, you don’t need to allow users to add content provenance information in the first place. If it’s not for a commercial purpose, you aren’t prohibited from training on content containing content provenance information, or from removing it and training on it.
I’ll be honest, I read it too fast and didn’t see the “for commercial use” part. I still think this is problematic because a lot of fine tuners and some companies putting out models either have a Patreon or offer their model for individual use but not for hosting on generation services without compensation (a good example of this is Pony for fine tuners or Codestral (I think) for general model providers). It also means anyone building models can’t then commercialize models on their end while still offering them for free to the community; it puts them in a tough position. I don’t know how Meta’s Llama could survive this, or Google’s Gemma. I’m also curious how this affects Hugging Face, since I’m not sure whether hosting a model counts as “making it available” in the bill’s sense.
It does put the bill in a better light though and I will edit my comment.