Guess we’ve finally reached the moment where letting the giant intellectual property cartels monopolize human culture is going to cause serious economic side effects for other big corporations rather than simply screwing over the general public.
The future of AI innovation may hinge on the outcome of a global copyright debate.
Meh, the US is not the world.
Nothing. It was pirated for free.
Some have allegedly paid.
“We’ve provided about 20-30 companies/teams with our entire dataset. It’s the same data as on our torrents page, but they get access to high-speed SFTP servers.”
“Usually, this is in exchange for a large monetary donation or, on occasion, in exchange for good datasets they acquired,” ‘Anna’s Archivist’ adds, noting that all data they obtain is shared publicly.
The fact that Anna’s Archive is accepting additional datasets as “payment” makes me comfortable that they’re not in this for the money but rather for ideological reasons.
“We cleaned 860K English and 180K Chinese e-books from Anna’s Archive,” a DeepSeek VL paper, published last March, states.
Hmm.
Bibliotik baybeeeee