• whenigrowup356@lemmy.world
    1 year ago

    Assuming said data scraping is a real concern for both Twitter and Reddit, are Fediverse servers at similar risk from scrapers and various automated API hits? I don’t really know enough about networks to answer.

    • vinzen@lemmy.world
      1 year ago

      I think the data scraping problem is more about opportunity cost (they think AI companies should pay them more to use their content) than about the traffic scrapers account for. If traffic, and not profit, were the problem, Wikipedia would be saying it can’t support AI scrapers either.

      • bluueberry@lemmy.ml
        1 year ago

        You make a great point about Wikipedia. The claim that scraping is actually why Twitter is doing this is laughable to me. They’re just trying to find a convenient explanation for their failures that doesn’t stem from their own incompetence.

        • fubo@lemmy.world
          1 year ago

          The idea that “AI scraping” is any more expensive than search engine indexing is flat nonsense, credible only to people who have never run a network service at scale.

          Folks need to learn about Common Crawl. https://commoncrawl.org/

        • Billiam@lemmy.world
          1 year ago

          If you were feeling generous, you could grant that scraping Twitter is a problem.

          Of course, I’m sure jacking up the API rates had absolutely no effect on that though. Which means either way, the problem was caused by Elon.