Post-Cloudflare update

It’s been nearly a week since I removed Cloudflare from my sites. As a quick follow-up, I did get a slight surge in traffic that lasted for a day or so after a bunch of bots' DNS caches expired, but they seem to have all given up after the Cloudflare “managed challenge” interstitial turned into an HTTP 401 error for them.

Read more…

Building a lyric search engine

Y'all probably know that my views on AI are somewhat nuanced. I’m not 100% “AI BAD!!!” but I’m also hesitant to rely on AI for a lot of things, and generally do not care for generative AI or any situation where you need AI to “reason” on things.

But, recently I’ve wanted to remember the name of a song that I listened to a lot, and where the lyrics I can remember don’t come up in any of the major lyrics databases. I listen to a lot of obscure indie music that tends to get lost by the major platforms, and I’ve been packratting music for decades now.

Further, it’s only fairly recently that music started to get lyrics embedded into the ID3 tags (thanks to Bandcamp really pushing for that) and even the streaming platforms have taken forever to pick it up. So a lot of the music I listen to has never had its lyrics entered in any sort of machine-searchable way.

But, hey, there are plenty of AI models for vocal extraction and text transcription… so why not actually use them?

Read more…

Fuck AI LLM scrapers

Wellp, my whack-a-mole approach finally got to be too much to maintain. The last day or so my server has been absolutely inundated with traffic from thousands of IP blocks, all coming from China, and I got sick of trying to keep up with it myself.

I looked into setting up Anubis and preparing to just whitelist a lot of IndieWeb things, but it’s all just so very overwhelming and for now I’ve gone with Cloudflare, problematic as they are, because the amount of energy I can put into this shrinks every day and sometimes I just want things to stop sucking for a while.

All of my DNS has propagated but of course it’ll be a while before the bots decide to update their own DNS caches, so my server is still getting absolutely hammered, but hopefully things will subside, and in the meantime things are at least responsive.

I guess at some point I’ll have to figure out how to actually set up TLS with Cloudflare (since I’ve been using Let’s Encrypt wildcard certs but obviously those don’t work anymore when Cloudflare is handling my DNS) but that’s a problem for future me. Also I’ll definitely be on the lookout to make sure that Cloudflare is properly honoring my login cookies. It’d definitely be unfortunate if it gets confused about logins, which is one of the more common failure modes with HTTP proxies.

I’m also super worried that this will interfere with IndieWeb stuff, because of course most of the anti-bot things assume that any traffic coming from data centers or from headless/scriptless user agents is abusive. Which is, y'know, 99.99% accurate, but that 0.01% is stuff I really care about (namely interop).

Anyway. I resent that this is the state of the Internet right now. It’s getting really difficult for me to find anything positive about AI when this is how the industry treats everyone.

Comma 3X: Initial impressions

About a week ago I bought a Comma 3X from comma.ai, based on seeing a bunch of quite glowing reviews of it (and other FSD systems) from a number of car and tech reviewers I trust. In particular, since Kate of Transport Evolved has one and also has the exact same car as mine (2019 Kia Niro EV EX Premium in Galaxy Blue) and speaks highly of it, I decided that this might be a useful thing for handling my ongoing driving anxiety and vertigo issues.

Luckily enough it happened to be during a flash sale, where they included the harness for free ($99 off from usual), so my total cost was $999 (shipping was included and there was no sales tax either).

It arrived last Wednesday, and I installed and calibrated it soon after. I didn’t really get a chance to try it out until Sunday, but so far I’m very impressed with it.

Read more…

DuckDuckGo has DuckDuckWent all-in on AI

DuckDuckGo has been slowly rolling out AI “features,” and now they’ve decided to triple down on them.

So now for me they’re DuckDuckGone.

I’m using Startpage for now. The search results are Okay. Not as good as DDG’s were, but, sigh.

Rabbit R1

The tech world is abuzz with the announcement of the Rabbit R1, a little handheld AI assistant thing that has an interesting goal.

The tl;dr is that it’s a ChatGPT model that will run little AI agents (called “rabbits”) on your behalf to make complex API requests for you. I actually think it’s a pretty cool idea and one of the few things that I don’t hate about the modern AI push (ethics of ChatGPT aside, of course).

At $200 for the hardware it’s obvious that the LLM is running in the cloud somewhere, and it’s not like the other stuff wouldn’t also require cloud to operate anyway. That raises the one big question I have about it: who foots the bill for the actual backend services? Because at $200 it’s probably being sold at cost or for a small profit, and operating the necessary cloud services ain’t free.

Read more…

hashtag NoAI

In yet another conversation about AI art on Discord, someone mentioned that most sites have the #NoAI hashtag to indicate that something should be off-limits to an AI art bot.

So, I installed a Stable Diffusion GUI and decided to see what the various AI models would generate for a prompt of just “NoAI.”

Read more…

“In this new era of AI”

There’s been a lot of discussion about a puff piece by Marc Andreessen (formerly of Netscape fame, now of being-yet-another-also-ran-tech-billionaire-who-is-into-the-self-aggrandizing-fad-of-the-moment fame) talking about how AI will save the world.

I am not going to link to it (it’s easy enough to find anyway) but I just bothered to read it and oh my god the privilege and blinders are so obvious.

Read more…