When visiting this link:

http://web.archive.org/web/https://www.businessinsider.nl/cloudflare-ceo-suggests-people-who-report-online-abuse-use-fake-names-2017-5

the #WaybackMachine said the page was not archived. But in fact it was (and still is), here:

https://web.archive.org/web/20171024040313/www.businessinsider.com/cloudflare-ceo-suggests-people-who-report-online-abuse-use-fake-names-2017-5

This query string got appended to the URL at some point in the process: ?international=true&r=US, so it seems the #WaybackMachine is not smart about useless parameters. Perhaps it’s too complex for it to drop the params and compare.
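To make "drop params and compare" concrete, here is a minimal sketch of the idea. It is not how archive.org matches URLs; the `normalize` helper and the `IGNORED_PARAMS` set are hypothetical names, and the choice of which parameters count as noise is an assumption made purely for this example.

```python
from urllib.parse import urlsplit, parse_qsl

# Assumption for illustration: these two parameters carry no page identity.
IGNORED_PARAMS = {"international", "r"}

def normalize(url, ignored=IGNORED_PARAMS):
    """Return a comparison key: (host without 'www.', path, remaining query)."""
    if "://" not in url:
        url = "http://" + url  # archived URLs are sometimes written without a scheme
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    # Keep only the parameters that are not on the ignore list, in sorted order.
    query = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in ignored
    )
    return (host, parts.path.rstrip("/"), tuple(query))

article = "www.businessinsider.com/cloudflare-ceo-suggests-people-who-report-online-abuse-use-fake-names-2017-5"
with_params = "https://" + article + "?international=true&r=US"
print(normalize(article) == normalize(with_params))
```

With that ignore list, the plain URL and the one carrying ?international=true&r=US produce the same key. The sketch would still treat the businessinsider.nl and businessinsider.com hosts in the two links above as different pages.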

  • hydroptic
    6 months ago

    It can’t know which URL parameters are useless for which site, since that depends on how each specific site is implemented.

    • freedomPusherOPM
      6 months ago

      It should know, since the parameters are added by archive.org. If you visit the 1st link (with no timestamp), it says every time that the page was not archived. If you force a save, the parameters apparently get added, so archive.org (which is a MITM in this case) would have to know that. It’s just not making use of the info. I’m not sure whether JavaScript is part of the issue… if archive.org is just passing along JS that does this, that would complicate things. But it’s a bug nonetheless, because it breaks the ability to see a calendar of saves.

      (EDIT) If you visit the 2nd link, the parameters still don’t appear, so it’s unclear why this is failing at all. Those params may just be a red herring from a search engine. When you visit an untimestamped link on archive.org, it should redirect to the most recent timestamped archive, and for this particular article that’s not happening.
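
One way to check what the archive itself reports for an untimestamped URL is the Wayback Machine availability endpoint at archive.org/wayback/available. The sketch below only builds the query and reads the `closest` snapshot out of a decoded JSON reply; the helper names are hypothetical, and nothing here shows what the endpoint returns for this particular article.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API = "https://archive.org/wayback/available"

def availability_query(page_url, timestamp=None):
    """Build the availability request URL, optionally near a YYYYMMDDhhmmss timestamp."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return API + "?" + urlencode(params)

def closest_snapshot(response):
    """Return the closest snapshot URL from a decoded reply, or None if there is none."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None

def lookup(page_url):
    """Perform the network request (not exercised here) and return the snapshot URL."""
    with urlopen(availability_query(page_url), timeout=30) as resp:
        return closest_snapshot(json.load(resp))
```

Calling `lookup()` once with the bare article URL and once with the parameters appended would show whether the two forms resolve to the same snapshot.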