Post-Cloudflare update

It’s been nearly a week since I removed Cloudflare from my sites. As a quick followup: I did get a slight surge in traffic that lasted a day or so as various bots’ DNS caches expired, but they all seem to have given up after the Cloudflare “managed challenge” interstitial they’d been hitting turned into an HTTP 401 error.

Read more…

This site now Cloudflare-free

About a year ago I set up Cloudflare as a fronting CDN for this site and my music site because it was the most expedient way of dealing with an AI bot onslaught. It helped a bit, but the bots very quickly figured out how to get around it, and while Cloudflare gave me slightly better management tools for some things, I eventually worked out better approaches to bot mitigation on my own.

Cloudflare was also super aggressive about caching some stuff that I didn’t want cached, and of course, there are many, many political and ideological reasons to not want to use Cloudflare. So my plan was always to switch back to not being under Cloudflare, but the longer I waited the harder it seemed like it would be, because of how SSL certificates work. In particular, I use wildcard Let’s Encrypt certificates, which can only be issued via the DNS-01 challenge (i.e. by proving control of the domain’s DNS), and a big thing that Cloudflare does is… take over your DNS.

But tonight I got a hair up my butt and switched back to my own TLS termination, and it wasn’t too hard to do: just a little bit of DNS and TLS juggling to keep my website’s downtime to a minimum.
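For anyone doing a similar cutover, one quick sanity check is to look at which certificate the host actually serves after the DNS change: a Cloudflare-fronted site presents a Cloudflare-issued edge certificate, while a self-terminated one presents your own (e.g. Let’s Encrypt). Here’s a minimal sketch of that check in Python — the hostname is a placeholder, and the “is this Cloudflare?” test is just a string match on the issuer, not anything official:

```python
import ssl
import socket

def issuer_org(peercert: dict) -> str:
    """Pull the issuer organizationName out of the dict returned by
    ssl.SSLSocket.getpeercert()."""
    for rdn in peercert.get("issuer", ()):
        for key, value in rdn:
            if key == "organizationName":
                return value
    return ""

def served_by_cloudflare(peercert: dict) -> bool:
    """Crude heuristic: Cloudflare edge certs carry Cloudflare as issuer org."""
    return "cloudflare" in issuer_org(peercert).lower()

def fetch_peercert(host: str, port: int = 443) -> dict:
    """Connect to the host and return its certificate info."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()

if __name__ == "__main__":
    cert = fetch_peercert("example.com")  # placeholder hostname
    print("issuer:", issuer_org(cert))
```

Running this before and after the cutover makes it obvious when the new termination has actually taken effect, independent of what your browser’s cache is showing you.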

Read more…

Trying out T-Mobile 5G Home Internet

I’d been with CenturyLink Fiber since I moved into this house in early 2021, and was mostly happy with it. I never had any major outages aside from occasional drops due to their crappy provided router, which I replaced with a Linux SBC running OpenWRT. The only real complaints I had were that they used PPPoE + VLAN tagging (which was annoying to set up) and that for IPv6 they only provided 6rd, which is a bit of a half-measure. But I was able to host services to my heart’s content, and they were pretty hands-off about a lot of things.
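For reference, PPPoE over a tagged VLAN on OpenWRT looks roughly like this in /etc/config/network. The VLAN ID, parent interface, and credentials below are placeholders (they vary by market and account); this is a sketch of the general shape, not my actual config:

```
# Tagged VLAN on the physical WAN port (VLAN ID varies by market)
config device
        option name 'eth0.201'
        option type '8021q'
        option ifname 'eth0'
        option vid '201'

# PPPoE session on top of the tagged interface
config interface 'wan'
        option device 'eth0.201'
        option proto 'pppoe'
        option username 'user@example.net'
        option password 'secret'
```

The annoying part is that neither the VLAN ID nor the PPPoE credentials are discoverable from the router itself; you have to get them out of the ISP.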

Unfortunately, a month ago I was switched over to Quantum Fiber, which is sort of a rebrand but sort of a separate company. I’d heard nothing but horror stories about Quantum, and I ended up experiencing two of them myself:

  1. When I first switched, my Internet went out for a few days when CenturyLink shut down the old account; apparently this caused a misconfiguration on their end which led to my network being shut off, and it took a few days (and several tech support calls) for them to figure it out.

  2. I got a DMCA notice for some activity on my network. Normally it’s just a thing you can click through to acknowledge that you received the notice, after which your service is immediately restored, but this time the notice kept coming back every 10 minutes (killing all my web connections each time, although thankfully VPN and ssh sessions were mostly unaffected), and it was coming from CenturyLink, not Quantum.

    I spent hours on the phone with both companies' support, each one blaming the other for the issue, and it took days before my connection was stable again. I had a bodge in place (it was easy enough to run a script that checked for the notification and clicked the button for me) that made things mostly tolerable, but everything was still flaky, and both companies' tech support was bafflingly awful in ways I can only describe as “Kafka-esque.”

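I don’t describe what that bodge actually looked like above, but the idea is simple enough to sketch: poll a page you control, and if the response turns out to be the ISP’s interstitial rather than your own content, submit the acknowledgment form. Everything specific here — the marker text, the URLs, the form field — is hypothetical, reconstructed from the general shape of these click-through notice pages:

```python
import time
import urllib.parse
import urllib.request

# Hypothetical marker text and endpoints; the real values would come
# from inspecting the ISP's actual interstitial page.
NOTICE_MARKER = "Copyright Infringement Notice"
PROBE_URL = "http://example.com/"
ACK_URL = "http://example.com/dmca/acknowledge"

def looks_like_notice(body: str) -> bool:
    """True if the fetched page is the ISP's interstitial, not the real site."""
    return NOTICE_MARKER in body

def acknowledge() -> None:
    """POST the acknowledgment form to restore service."""
    data = urllib.parse.urlencode({"ack": "1"}).encode()
    urllib.request.urlopen(ACK_URL, data=data, timeout=10)

def watch(interval: int = 60) -> None:
    """Poll forever, clicking through the notice whenever it reappears."""
    while True:
        try:
            body = urllib.request.urlopen(PROBE_URL, timeout=10).read()
            if looks_like_notice(body.decode("utf-8", "replace")):
                acknowledge()
        except OSError:
            pass  # transient network failure; try again next round
        time.sleep(interval)
```

Run from cron or a systemd timer, something like this turns a ten-minute outage loop into a one-minute blip — which is why it made things “mostly tolerable” rather than fixed.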
The second issue finally cleared up after about four days, but by then I’d decided it was time to try another ISP. The only other broadband options where I live are Comcast XFinity, which is awful and expensive, and T-Mobile 5G, which costs about the same as Quantum but comes with its own technical tradeoffs. But I’ve had plenty of experience with T-Mobile as a company and figured I’d give them a try, especially given how many people I know who sing their praises.

The access point arrived today and I’ve been putting the service through its paces. My opinion is… mixed, but generally positive.

Read more…

Fuck AI LLM scrapers

Wellp, my whack-a-mole approach finally got to be too much to maintain. The last day or so my server has been absolutely inundated with traffic from thousands of IP blocks, all coming from China, and I got sick of trying to keep up with it myself.

I looked into setting up Anubis and preparing to just whitelist a lot of IndieWeb things, but it’s all just so very overwhelming and for now I’ve gone with Cloudflare, problematic as they are, because the amount of energy I can put into this shrinks every day and sometimes I just want things to stop sucking for a while.

All of my DNS has propagated but of course it’ll be a while before the bots decide to update their own DNS caches, so my server is still getting absolutely hammered, but hopefully things will subside, and in the meantime things are at least responsive.

I guess at some point I’ll have to figure out how to actually set up TLS with Cloudflare (since I’ve been using Let’s Encrypt wildcard certs, but obviously those don’t renew anymore the way I have them set up, now that Cloudflare is handling my DNS), but that’s a problem for future me. I’ll also be on the lookout to make sure that Cloudflare is properly honoring my login cookies; it’d definitely be unfortunate if it gets confused about logins, which is one of the more common failure modes with HTTP proxies.
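On the login-cookie worry: the standard way to keep any fronting cache from serving one user’s logged-in page to another is to mark per-user responses as uncacheable and vary on the cookie. This is generic HTTP behavior, not anything specific to Publ or Cloudflare’s configuration; a sketch of the header logic:

```python
def personalized_headers(logged_in: bool) -> dict:
    """Response headers telling a fronting cache (Cloudflare or otherwise)
    not to store per-user pages, while still letting anonymous pages be
    cached briefly."""
    if logged_in:
        return {
            "Cache-Control": "private, no-store",  # never cache at the edge
            "Vary": "Cookie",
        }
    return {
        "Cache-Control": "public, max-age=300",  # anonymous pages can be shared
        "Vary": "Cookie",
    }
```

The Vary: Cookie on both branches is what prevents the proxy from handing a cached anonymous page to a logged-in user (or vice versa), at the cost of a much lower cache hit rate.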

I’m also super worried that this will interfere with IndieWeb stuff, because of course most of the anti-bot things assume that any traffic coming from data centers or from headless/scriptless user agents is abusive. Which is, y'know, 99.99% accurate, but that 0.01% is stuff I really care about (namely interop).
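If I were building that kind of exception myself, a crude version might look like this: only challenge traffic that originates from datacenter address space, and let through user agents that look like IndieWeb tooling. The substring list is purely illustrative (there’s no standard registry of IndieWeb user agents), and anything UA-based is of course trivially spoofable:

```python
# Example substrings only; a real allowlist would be curated by hand.
INDIEWEB_UA_HINTS = ("webmention", "bridgy", "micropub", "microsub")

def should_challenge(user_agent: str, from_datacenter: bool) -> bool:
    """Challenge datacenter traffic unless the UA looks like an IndieWeb
    tool. Crude and spoofable, but that's true of any UA allowlist."""
    if not from_datacenter:
        return False  # residential traffic passes through
    ua = user_agent.lower()
    return not any(hint in ua for hint in INDIEWEB_UA_HINTS)
```

The real difficulty is exactly the 99.99%/0.01% split: the moment scrapers learn which strings are allowlisted, the heuristic stops working, which is why nobody has a clean answer here.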

Anyway. I resent that this is the state of the Internet right now. It’s getting really difficult for me to find anything positive about AI when this is how the industry treats everyone.

Yes I’ve heard about iocaine

No I will not be running it

It does absolutely nothing to slow crawlers down: it’s not like they wait for a page to finish loading before moving on to the next one; crawlers are heavily optimized to constantly grab as much bandwidth as possible in parallel. There’s already so much AI slop on the web that it’s not going to contribute meaningfully to model collapse. And all you’re doing by running it is wasting even more resources: giving the LLM crawlers more content to slurp up just gives them more reasons to waste resources of their own, and continues the death spiral of making the Internet an even worse place.

This isn’t like interfering with scammer call centers through scambaiting or the like. Computers have no problem with having their time wasted.

And meanwhile it does nothing to actually solve the problem.

Some thoughts on comments

You might have noticed that I’ve made a slight change to the comments on this site: the comment threads are only visible to those who are signed in. This is a temporary experiment just to see if it cuts out the spam I’ve been getting and also if it increases the quality of what comments do come in.

I’ve been thinking about how I can go about improving comments in general, in ways which would also satisfy some of my other general long-term plans around Publ.

Read more…

Spammers are relentless and weird

Lately I’ve been getting a bunch of attempted spam comments on random blog entries. Okay, nothing unusual about that, right?

Well, it’s a little unusual in that I use isso, an obscure comment system that requires JavaScript to work, so at the very least there’s some sort of browser-based automation going on, if not outright sweatshop labor.

But today I just got the weirdest fucking spam comment ever. Not weird because of the content (it was for a list of dental clinics in India, which I guess is pretty weird), but because of where it was posted:

On an entry that requires login.

Read more…