It seems pretty clear that all of Huffman’s recent decisions are driven by Reddit’s hoped-for IPO. Hanging over that is the ugly fact that Reddit’s valuation is sinking.
They could probably have turned a great profit if they:
- Scaled down all of the wild “become tiktok” attempts and let the site be as it was (leaving image and video hosting to other sites)
- Focused on incremental performance and quality-of-life improvements for users, moderators and client developers
- Opened up the new GraphQL API
- Created different services / plans depending on usage. For example:
  - For end-user frontends it would be free, but with a requirement that they show ads (or the user could pay a monthly fee to avoid ads).
  - For people who just want to use the data in bulk, the pricing would be similar to what Facebook charges when it sells information (no idea what that is). But that would be a different service, possibly not even an API.
  - A bot plan, which should be free at least if it is tied to a subreddit and a moderator is running it, since it’s in the service of Reddit.
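
To make the plan idea above a bit more concrete, here is a rough sketch of how those tiers could be expressed as rules. Everything in it (the `Plan` names, the `Client` fields, the `access_terms` function and the strings it returns) is made up for illustration; none of it is Reddit's actual API or pricing.

```python
from dataclasses import dataclass
from enum import Enum


class Plan(Enum):
    # Hypothetical plan names, one per usage type in the list above.
    FRONTEND = "frontend"    # third-party clients for end users
    BULK_DATA = "bulk_data"  # people who want the data in bulk
    BOT = "bot"              # automated accounts


@dataclass(frozen=True)
class Client:
    # Hypothetical description of an API consumer.
    plan: Plan
    shows_ads: bool = False
    user_pays_monthly_fee: bool = False
    tied_to_subreddit: bool = False
    run_by_moderator: bool = False


def access_terms(client: Client) -> str:
    """Return the terms a client would get under the sketched tiers."""
    if client.plan is Plan.FRONTEND:
        if client.user_pays_monthly_fee:
            return "ad-free, paid by the user"
        if client.shows_ads:
            return "free, with ads"
        return "denied: must show ads or have a paying user"
    if client.plan is Plan.BOT:
        if client.tied_to_subreddit and client.run_by_moderator:
            return "free"
        # The comment above only covers moderator-run subreddit bots.
        return "unspecified"
    # Bulk data is treated as a separate paid service.
    return "paid bulk pricing"


print(access_terms(Client(Plan.FRONTEND, shows_ads=True)))
print(access_terms(Client(Plan.BOT, tied_to_subreddit=True, run_by_moderator=True)))
```

The point of the sketch is only that the three usage types can be priced separately without a single per-request rate for everyone.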
I can’t understand why they don’t just pass through the ads. I don’t want them, but I could live with them. It’s this alone that makes me think they just want to kill third-party apps and nothing else.
Eh, unless they made the ads indistinguishable from regular posts, 3rd party apps would just filter them out. I guess Reddit could get around that by disallowing it as part of their API TOS, but then they would need to make sure the popular 3rd party apps don’t do it, and possibly sue them when they do.
It’s going to be fairly obvious fairly quickly if the big apps are nixing ads. No need to sue; just block their access to the API for violating the terms.
Ah, that’s true. Then what really is the problem? Do they want the extra analytics that they get from official clients? Or is Steve Huffman actually just an idiot?
Edit: Actually it seems like they do just want the tracking data from the app since they are now blocking people from using the website on mobile https://lemmy.world/post/57306
It was and is a series of poor decisions by Reddit. They’re biting the hand that feeds them: the community itself and their 3P devs. I’m withholding any rationalization of these decisions by Reddit. Whatever the reason(s), they reveal an out-of-touch and tone-deaf CEO and executive team.
It feels like they don’t understand what it is about their own product that is valuable. Reddit ultimately is (or has been) a community hosting repository. The core interaction they need to be supporting, fostering, and ultimately monetizing is community creation. The reason mods have been so reliant on third party apps is that Reddit has neglected the tools necessary to curate communities. Instead it focused its energy on a new user interface that reduced communities’ ability to differentiate themselves, made the site generally less readable and functional, and also made it take much longer to load.
I even think there was a way to monetize the API that could have helped them run the site better by paying for infrastructure costs, or even pushed the site into profitability. But to do that, they needed an API price someone was actually willing to pay. Reddit’s decisions over the past few years have been almost aggressively tone-deaf and short-sighted.
I am wondering if Reddit’s content (and content on other social media sites too) is actually as valuable as everyone seems to think. OpenAI, Microsoft and Google already have enormous amounts of text data to train their LLMs, and I would honestly be surprised if the barrier to making these LLMs better is needing even more data. So who is actually going to pay for this? AI startups don’t have the money. Maybe Amazon, if it wants to get into the business?
I work with LLMs, and yes the barrier currently is needing more data. These models get better when they are larger and trained on more data, so you really need all the data you can get your hands on.
The recent document leaked by Google kinda claims otherwise though. It’s less about quantity now and more about quality.
https://www.semianalysis.com/p/google-we-have-no-moat-and-neither
That’s actually where Reddit is useful as a training corpus, because different subreddits are at different levels of quality. It’s pretty easy to identify the high quality ones for training answers, and the low quality ones are excellent for training basic transforms (making sense out of an input that is niche and flawed in some way).
There are very few other sources of lightly structured training data that span all of humanity broken down into topics, graded to different levels of quality. Over time, the data will become less relevant as society moves on, so a living training set is important.
Having said that, Lemmy could prove to be an even better training source for expert system LLMs, as there could be curated instances of high quality with the ability to pull in more federated data as needed.
Fuck Reddit. I hope their egos and greed kill the site.