Which of the following sounds more reasonable?
-
I shouldn’t have to pay for the content that I use to tune my LLM model and algorithm.
-
We shouldn’t have to pay for the content we use to train and teach an AI.
By calling it AI, the corporations are able to advocate for a position that’s blatantly pro corporate and anti writer/artist, and trick people into supporting it under the guise of a technological development.
I’ll note that there are plenty of models out there that aren’t LLMs and that are also being trained on large datasets gathered from public sources.
Image generation models, music generation models, etc.
Heck, it doesn’t even need to be about generation. Music recognition and image recognition models can also be trained on the same sort of datasets, and arguably come with similar IP right questions.
It’s definitely a broader topic than just LLMs, and attempting to enumerate exhaustively the flavors of AIs/models/whatever that should be part of this discussion is fairly futile given the fast evolving nature of the field.
Fair enough.