Fuck AI LLM scrapers

Wellp, my whack-a-mole approach finally got to be too much to maintain. The last day or so my server has been absolutely inundated with traffic from thousands of IP blocks, all coming from China, and I got sick of trying to keep up with it myself.

I looked into setting up Anubis and preparing to just whitelist a lot of IndieWeb things, but it’s all just so very overwhelming and for now I’ve gone with Cloudflare, problematic as they are, because the amount of energy I can put into this shrinks every day and sometimes I just want things to stop sucking for a while.

All of my DNS has propagated but of course it’ll be a while before the bots decide to update their own DNS caches, so my server is still getting absolutely hammered, but hopefully things will subside, and in the meantime things are at least responsive.

I guess at some point I’ll have to figure out how to actually set up TLS with Cloudflare (since I’ve been using Letsencrypt wildcard certs but obviously those don’t work anymore when Cloudflare is handling my DNS) but that’s a problem for future me. Also I’ll definitely be on the lookout to make sure that Cloudflare is properly honoring my login cookies. It’d definitely be unfortunate if it gets confused about logins, which is one of the more common failure modes with HTTP proxies.

I’m also super worried that this will interfere with IndieWeb stuff, because of course most of the anti-bot things assume that any traffic coming from data centers or from headless/scriptless user agents is abusive. Which is, y'know, 99.99% accurate, but that 0.01% is stuff I really care about (namely interop).

Anyway. I resent that this is the state of the Internet right now. It’s getting really difficult for me to find anything positive about AI when this is how the industry treats everyone.