Reddit Is Fun was the only way I browsed Reddit on mobile. The official mobile app is crap and doesn’t load posts properly. I get a few scrolls down and the feed just ends, as if there were nothing else to view.

  • RarePepeCollector@lemmy.world
    1 year ago

    It’s too easy to scrape that data: the replies don’t update much after 2 days, and even then it’s pretty easy to re-scrape and check. And the data is not owned by Reddit; it’s actually owned by its users. So if MS wants to scrape it, they need copyright permission from the users, not from Reddit.

    • PrimalAnimist@lemmy.world
      1 year ago

      True, users do retain copyright in anything they write, but they also give Reddit a license to use it however it wants, including sublicensing it to others. That means the corps absolutely DO NOT need the permission of users to train their AI. They just buy the rights to use the data from Reddit. From the user agreement:

      When Your Content is created with or submitted to the Services, you grant us a worldwide, royalty-free, perpetual, irrevocable, non-exclusive, transferable, and sublicensable license to use, copy, modify, adapt, prepare derivative works of, distribute, store, perform, and display Your Content and any name, username, voice, or likeness provided in connection with Your Content in all media formats and channels now known or later developed anywhere in the world. This license includes the right for us to make Your Content available for syndication, broadcast, distribution, or publication by other companies, organizations, or individuals who partner with Reddit. You also agree that we may remove metadata associated with Your Content, and you irrevocably waive any claims and assertions of moral rights or attribution with respect to Your Content.

      This includes images and videos that are uploaded directly to Reddit’s servers.

      Reddit has the right to use the data and to sell it to others. Also, some data you can scrape, but there’s additional data that is available only through the API. Web scraping is not reliable, especially if Reddit actively flags your spider and blocks it. They are not the idiots we want to believe they are. No megacorp is going to risk losing competitive access to data to feed its AIs when the cost of just paying is insignificant to it.