ActivityPub hot take

2018/07/30 1:47 PM

Most use cases of ActivityPub would be better off as Atom or RSS feeds.

There has been a lot of attention given to ActivityPub as of late, to the extent that any new network software in the F/OSS community absolutely must be done as ActivityPub if anyone is going to care about it at all, all for the sake of “federation” and being “distributed.” But what does this buy us? For the most part it means that every piece of content everyone writes needs to be replicated to every point in the federation mesh to be useful. And each node ends up taking a lot of resources (bandwidth, memory, even CPU) just to exist.

This isn’t creating a distributed network, this is creating a whole bunch of massive points of failure. And these failures grow exponentially as more failures happen. Its very nature also makes it so that it’s much more difficult for this data to be migrated; look at what happened when witches.town went down, or when scifi.fyi’s certificate expired, and the effects this had on all other instances regardless of the users of those instances. And spinning up an instance is expensive, and that cost rises with the number of instances out there.

Meanwhile, what does this buy you? Immediacy of updates? Okay, does it really matter how quickly you see someone’s new content, if it’s worth seeing? Sure, there’s a lot to be said about being “in the now” and having realtime interaction with folks, and there’s definitely a place for that sort of interaction. IRC had this back in 1988, with its own federated protocol. And IRC still exists. (It’s also pretty flawed and there are certainly reasons to not want to use it now. But things like Matrix exist, too.) And it’s something that really benefits the Twitter-style of interaction (e.g. Mastodon) as well. But those aren’t the only ways of interacting with the Internet!

Okay, so ActivityPub is push-based, and this can theoretically reduce the amount of bandwidth and resources used. But! There are very simple mechanisms – already built into HTTP – that allow RSS/Atom to also be pretty bandwidth-preserving. For example, user-agents and RSS providers can use the If-Modified-Since: header to only get the feed updates if the content has changed since the last fetch. And given that most feeds' activity is bursty – changing a lot a couple times in the day but mostly quiescent – surely this would be a much better approach, even if it seems inelegant that all 200 followers of a blog would be checking for updates once an hour? (And smarter fetch intervals can also help; user agents can adaptively fetch based on when a content feed is known to update more frequently, or better yet, only when a user is more likely to be checking for updates.)
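To make the conditional-fetch idea concrete, here is a minimal sketch in Python using only the standard library. The function name `fetch_if_changed` and its return shape are made up for illustration. It also sends `If-None-Match` when the server previously handed out an `ETag`, which is the companion validator to `If-Modified-Since`.

```python
import urllib.error
import urllib.request


def fetch_if_changed(url, last_modified=None, etag=None):
    """Fetch a feed only if it changed since the previous fetch.

    Returns (body, last_modified, etag). body is None when the server
    answers 304 Not Modified, in which case the old validators are kept.
    """
    request = urllib.request.Request(url)
    if last_modified:
        request.add_header("If-Modified-Since", last_modified)
    if etag:
        request.add_header("If-None-Match", etag)
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return (
                response.read(),
                response.headers.get("Last-Modified"),
                response.headers.get("ETag"),
            )
    except urllib.error.HTTPError as error:
        # urllib reports 304 as an HTTPError; here it just means "nothing new".
        if error.code == 304:
            return None, last_modified, etag
        raise
```

A reader would store the returned validators next to the subscription and pass them back in on the next poll, so a quiet feed costs little more than a round trip of headers.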

(9/18/18 update: I just noticed I also forgot to mention WebSub, formerly known as PubSubHubbub, which is a push notification protocol that can easily work with Atom/RSS as well. This is definitely something I meant to include in the original rant, since it serves as a very nice optional extension to Atom that gives you push notifications of content updates.)

NOTE: To clarify, I am not saying that nothing should be done as realtime push! Just that most things don’t actually benefit from it. (For things that do benefit from realtime push, absolutely you should use ActivityPub! But please, please think about your use case for it before you go down that rabbit hole. And at least consider supporting Atom/RSS as an alternative.)

And another thing I really like about Atom/RSS is that it can easily be served up as static files; your CMS can build all the data offline and just upload it to a static content store, for example (back before WordPress took over the blogging world this used to be the standard way of publishing a blog, even, and things like Jekyll and Pelican still work that way). Or it can be served up by a CDN. Perhaps even a distributed one like Coral (RIP) or something built on IPFS.
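As a rough illustration of the static-file approach, here is a sketch of building an Atom document offline with Python's standard library and writing it to disk. The function names and the shape of the `entries` dicts are assumptions for this example, not any particular CMS's API.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def build_atom_feed(feed_id, title, author, updated, entries):
    """Return a complete Atom document as UTF-8 bytes.

    `entries` is a list of dicts with "id", "title", "updated", "url"
    and "summary" keys (a made-up shape for this sketch).
    """
    ET.register_namespace("", ATOM_NS)  # serialize Atom as the default namespace

    def add(parent, tag, text=None, **attrs):
        node = ET.SubElement(parent, f"{{{ATOM_NS}}}{tag}", attrs)
        node.text = text
        return node

    feed = ET.Element(f"{{{ATOM_NS}}}feed")
    add(feed, "id", feed_id)
    add(feed, "title", title)
    add(feed, "updated", updated)
    add(add(feed, "author"), "name", author)
    for entry in entries:
        node = add(feed, "entry")
        add(node, "id", entry["id"])
        add(node, "title", entry["title"])
        add(node, "updated", entry["updated"])
        add(node, "link", href=entry["url"])
        add(node, "summary", entry["summary"])
    return ET.tostring(feed, encoding="utf-8")


def write_static_feed(path, **feed):
    """Write the feed to `path`, ready to be uploaded to any static host."""
    with open(path, "wb") as handle:
        handle.write(build_atom_feed(**feed))
```

The resulting file needs nothing running behind it; any web server, object store, or CDN that can serve bytes will do.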

Do you really want users who subscribe to the same content to be able to share those resources? Well, server-side RSS/Atom readers can already do that! And I can also see a world in which we have a distributed content store that’s fed by RSS/Atom, perhaps something IPFS-like in that regard as well. And with RFC 5005 there’s a perfectly reasonable mechanism for allowing Atom to become the backbone of a fully distributed Social Experience, if the last several years of Twitter, Tumblr, and Facebook haven’t made you automatically cringe at the thought of everyone being connected 24/7 with algorithmically-promoted Content.
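For a sense of what RFC 5005 asks of a client, here is a small sketch of walking an archived feed by following its `rel="prev-archive"` links back through history. The function name is hypothetical, and `fetch` is a stand-in for whatever HTTP client you already have; the sketch only assumes it maps a URL to the document's text or bytes.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

ATOM = "{http://www.w3.org/2005/Atom}"


def walk_archived_feed(start_url, fetch, max_documents=100):
    """Yield (entry_id, title) pairs, starting from the newest document.

    `fetch` is any callable mapping a URL to feed text or bytes.
    """
    seen_urls, seen_ids = set(), set()
    url = start_url
    while url and url not in seen_urls and len(seen_urls) < max_documents:
        seen_urls.add(url)
        root = ET.fromstring(fetch(url))
        for entry in root.findall(f"{ATOM}entry"):
            entry_id = entry.findtext(f"{ATOM}id")
            if entry_id not in seen_ids:  # the copy in the newer document wins
                seen_ids.add(entry_id)
                yield entry_id, entry.findtext(f"{ATOM}title")
        # Archived feeds point at the next-older document with rel="prev-archive".
        older = [
            link.get("href")
            for link in root.findall(f"{ATOM}link")
            if link.get("rel") == "prev-archive"
        ]
        url = urljoin(url, older[0]) if older and older[0] else None
```

Since archive documents are meant to be stable, a reader or a distributed store can cache them indefinitely and only ever re-poll the subscription document at the head of the chain.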

Oh, and ActivityPub stuff generally also means that all attachments and other content also get replicated. This leads to some pretty major legal issues; Mastodon has had a huge problem with child pornography being distributed between instances, for example, and the legal repercussions on instance owners can be extremely dire. And when these attachments are also being distributed and replicated, they become very difficult to revoke or update.

Some things like PeerTube “solve” this by only distributing what are essentially torrent links to the data, to be distributed between clients when they want to view the content. Well, that can also be done via RSS/Atom just fine. There’s nothing special about ActivityPub that enables WebTorrent to work, for example.

Oh, and what about private/friends-only content? That ActivityPub provides? Well, it isn’t really all that private. If you post something that’s followers-only or even a DM on Mastodon, it actually still replicates pretty widely on the network (at least to every instance that has a recipient of the message), meaning any admin on those instances has complete, unfettered access to the content, and really the message just has a flag saying “please treat this as private information” when it chooses what to display to folks. Not very private, is it? An approach that is at least as secure – if not more so – is having unique per-user RSS/Atom feeds so that when someone follows a blog (or whatever) the protocol itself only conveys what information is available to that follower. This does make the feed discovery mechanism a little more complicated for the case of following via a server-side reader, but at the same time it also means that if you migrate between readers, all of your follower status also goes with you. This does have some security implications, of course, but the reality is that all private content on the Internet comes down to a “gentleman’s agreement” at some point; nothing stops a “friend” from copy-pasting or screenshotting something private and putting it somewhere else. (And at least that requires a manual, intended action on the bad actor’s part!)
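One way the per-user feed idea could look, as a sketch and not a vetted design: each follower gets a feed URL containing a token derived from a server-side secret, and the server filters entries by audience before rendering. Every name here (`feed_token`, `visible_entries`, `entries_for_request`, the `audience` field) is hypothetical.

```python
import hashlib
import hmac


def feed_token(secret, follower):
    """Derive an unguessable per-follower token to embed in their feed URL."""
    return hmac.new(secret, follower.encode("utf-8"), hashlib.sha256).hexdigest()


def visible_entries(entries, follower=None):
    """Keep public entries plus private ones whose audience lists this follower."""
    return [
        entry for entry in entries
        if entry["audience"] is None or follower in entry["audience"]
    ]


def entries_for_request(secret, entries, follower, token):
    """Serve the follower's view if the token checks out, else the public view."""
    expected = feed_token(secret, follower)
    # compare_digest avoids leaking how much of the token matched via timing.
    if hmac.compare_digest(expected.encode("ascii"), token.encode("utf-8")):
        return visible_entries(entries, follower)
    return visible_entries(entries)
```

The point is that a private entry never appears in a document the follower isn't entitled to, so there is no "please treat this as private" flag for anyone else's server to honor or ignore. The feed URL itself becomes the credential, which is one of the security implications mentioned above.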

Something that would be pretty great as far as private content goes is something like the OTR protocol, or like DVD-CSS where the content is distributed encrypted, with the master key encrypted by all the known friends' public keys; if you receive some content you can try decrypting it with your private key, and if you get a valid decryption key as a result, then you can read the entry. And if not, at best you can see that there’s a private entry of a certain length, and whatever other plaintext metadata came with it. (This obviously still carries a privacy implication, but this could also be combined with subscriber-specific feeds so that only content they have access to gets floated to them in the first place.)

That would be a much better, more secure model for distributing private content than anything provided by ActivityPub, and it would be reasonably straightforward to extend into RSS/Atom with some simple custom namespaces.

Also, RSS/Atom is way, way easier to implement a client for than ActivityPub. And way easier to make consuming services for. And way easier to make generators for. Anyone can join in on RSS/Atom with very simple tools – even a text editor. You can parse RSS/Atom with a shell script if you want. (Not that I’d recommend this, though; please use something like SimplePie or feedparser.)
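To give a sense of scale, here is roughly what a bare-bones Atom consumer looks like using nothing but Python's standard library. It is a sketch that only handles well-formed Atom (the `parse_atom` name and the returned dict shape are made up for this example); a real library copes with RSS variants, malformed XML, and hostile input, which is why one is still the better choice.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"


def parse_atom(xml_text):
    """Return the feed title and a list of entry dicts from an Atom document."""
    root = ET.fromstring(xml_text)
    entries = []
    for entry in root.findall(f"{ATOM}entry"):
        link = entry.find(f"{ATOM}link")
        entries.append({
            "id": entry.findtext(f"{ATOM}id"),
            "title": entry.findtext(f"{ATOM}title"),
            "updated": entry.findtext(f"{ATOM}updated"),
            "link": link.get("href") if link is not None else None,
        })
    return root.findtext(f"{ATOM}title"), entries
```

That is the whole client side of "following" someone: fetch a document, read a list.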

You know what’s also great? You can adapt walled-garden things to RSS and thus make them available with whatever subscription engine you want. And this can all happen even with a cron job that runs a simple shell script that emits a static file! Whereas ActivityPub needs things to already speak ActivityPub, and be way more, well, active to do anything with it, including all of the basic verbs requiring POST requests. The current story for adapting content to ActivityPub involves running a Mastodon instance and a bot to repost things to Mastodon, which then just posts a link that people have to click on, with no useful metadata to see unless there happens to be an OpenGraph tag on the destination link and the Mastodon client understands OpenGraph, and so on.

Basically, don’t write off Atom or RSS as protocols for sharing just yet. They have a lot more going for them than you might think.

Anyway, time to post this to my site, where it will ironically be shared on Mastodon via an overly-convoluted series of tubes

…initiated by an Atom feed.
