It’s not over and done with. Pass regulation saying every AI accessible within the country has to have a publicly available training dataset. That way people can see if their works have been stolen or not. When we inevitably see works recreated wholesale in violation of copyright, the AI creators can be sued or fined.
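For what it’s worth, the “people can see” part could be mechanically simple if the published dataset shipped with a content-hash manifest. A minimal sketch in Python, assuming a hypothetical manifest with one SHA-256 hex digest per line (the file names are made up, and exact hashing only catches byte-identical copies, not crops, re-encodes, or excerpts):

```python
# Sketch only: assumes the published "dataset" includes a manifest of SHA-256
# content hashes, one hex digest per line. All file names here are hypothetical.
import hashlib
import sys

def sha256_of(path: str) -> str:
    """Hash a local file's bytes so it can be compared against the manifest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def work_in_manifest(work_path: str, manifest_path: str) -> bool:
    """True if the exact bytes of the work appear in the published corpus manifest."""
    digest = sha256_of(work_path)
    with open(manifest_path) as manifest:
        return any(line.strip() == digest for line in manifest)

if __name__ == "__main__":
    # e.g. python check_manifest.py my_novel.txt corpus_hashes.txt
    work, manifest = sys.argv[1], sys.argv[2]
    print("found in corpus" if work_in_manifest(work, manifest) else "not found in corpus")
```

The catch, of course, is that anything short of a byte-identical copy slips past a check like this, which feeds directly into the enforcement questions below.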
Couple of things here - what do you do with the open-source models already published? There are terabytes of data encapsulated in those. Some have published corpora, some haven’t. How do you plan to determine that a work comes from an unregistered AI?
Also, with respect to “within the country” - VPNs exist. Tor exists. SD cards exist. What’s your plan to control the flow of trained models without violating civil rights?
This is a Teflon slope covered in oil. (IMO)
If they don’t publish what their training data is, they should be considered to be violating copyright. The world’s governments can block sites if they want to. It’s hard to swat down all of the random wikis and such, but the major AI competitors wouldn’t be a big problem.
“Innocent until proven guilty” is a rather important foundation for most justice systems. You’re proposing the exact opposite.
That way people can see if their works have been stolen or not.
Firstly, nothing at all is being “stolen.” The term you’re looking for is “copyright violation.”
Secondly, it does not currently appear that training an AI model on published material is a copyright violation. You’re going to have to point to some actual law indicating that it is; so far, that sort of thing has generally been covered by fair use.