- cross-posted to:
- techtakes@awful.systems
["ideas guy" tier shite]
If I had a site in 2024 like I had in 2004, I’d almost certainly set up some honeypot against those. It would be a single page. The content would be, funnily enough, generated by ChatGPT; the text would be the output of the prompt “provide me eight paragraphs of grammatically correct but meaningless babble”. If you access that page, you’re unable to access any other resource on my site.
And that page would be linked from every single other page of my site, in a way that humans can’t reasonably click it but bots would all the time. Perhaps white text on a white background at the start of the real text?
[/"ideas guy" tier shite]
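For the morbidly curious, the trap above boils down to very little logic. Here’s a minimal sketch of it, with made-up names and no real web framework attached; the path and the styling are just illustrations of the idea, not anything I’ve actually deployed:

```python
# Hypothetical honeypot logic: a trap URL hidden behind an invisible
# link, and a ban list for any client that fetches it. All names here
# are invented for illustration.

TRAP_PATH = "/totally-real-article"  # linked invisibly from every page
banned_ips: set[str] = set()

# The "white text on white background" link a human would never click:
HIDDEN_LINK = (
    '<a href="/totally-real-article" '
    'style="color:#fff;background:#fff;font-size:1px">.</a>'
)

def handle_request(path: str, ip: str) -> int:
    """Return the HTTP status to serve: 403 forever once banned,
    200 (the ChatGPT babble page) for the trap itself, 200 otherwise."""
    if ip in banned_ips:
        return 403            # error 403 is your only recourse
    if path == TRAP_PATH:
        banned_ips.add(ip)    # one visit to the babble page and you're out
        return 200            # serve the meaningless paragraphs, then lock out
    return 200                # normal page for everyone else
```

The only real design choice is that the trap itself still returns 200 with the babble, so the scraper happily ingests the garbage before losing access to everything else.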
Don’t get me wrong. I’m not opposed to the generative A"I" technology itself. My issue is this “might makes right” mindset that permeates the companies behind it - “we have GoOD InTeNsHuNs so this filth complaining about us being a burden might lick a cactus lol lmao haha”.
From the comments in “Reddit LARPs as h4x0rz News”:
If they ignore robots.txt there should be some kind of recourse :(
Error 403 is your only recourse.
Bingo - forbid them from accessing your content.
And in the meantime make robots.txt legally enforceable across multiple countries.
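Incidentally, the honeypot pairs nicely with robots.txt as it exists today: disallow the trap path, and only crawlers that ignore the rules ever get banned. A sketch, using the same made-up path as before:

```
# robots.txt — well-behaved bots skip the trap; rule-ignorers walk into it
User-agent: *
Disallow: /totally-real-article
```

Compliant crawlers never see the babble; the ones that ignore this file self-select for the 403.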
For antisocial scrapers, there’s a WordPress plugin: https://kevinfreitas.net/tools-experiments/
This looks fun. I wish I had thought of it before writing my shitty idea.