Training "AI" On Public Data Is Totally Fine And Not Stealing.

31337@sh.itjust.works · 9 months ago

Training "AI" On Public Data Is Totally Fine And Not Stealing.

wildncrazyguy138@fedia.io · edit-2 9 months ago

To expand on what you wrote, I’d liken the LLM output to me reading a book. From here on out, until I become senile, the book is part of my memory. I may reference it, or parrot the details I can remember to a friend. My own conversational style and future works may even be influenced by it, perhaps subconsciously.

In other words, it’s not as if a book enters my brain and then is completely gone once I’m finished reading it.

So I suppose, then, that the question is more one of volume. How many works consumed is too many? At what point do we shift from the realm of research to that of profiteering?

There is a certain subset of people in the AI field who believe that our brains are biological forms of LLMs and that, if we feed an electronic LLM enough data, it’ll essentially become sentient. That may be for better or worse for civilization, but I’m not one to get in the way of wonder building.

Hamartiogonic · 9 months ago

A neural network (the machine learning technology) aims to imitate the function of normal neurons in a human brain. If you have lots of these neurons, all sorts of interesting phenomena begin to emerge, and consciousness might be one of them. If/when we get to that point, we’ll also have to address several legal and philosophical questions. It’s going to be a wild ride.