Why I dislike .well-known
There is a growing trend for new protocols to express themselves in the form of a well-known URI. It’s seen as an easy place to stash something where other interoperable software might want to probe for protocol support, as an improvement to the old ad-hoc behavior of things like robots.txt and favicon.ico and the like.
I am not personally a fan of it, for a few reasons.
The big one is that it means that all discovery for all uses of a protocol must be uniform across all areas on a single domain. For example, if you have URLs like https://example.com/~alice/ and https://example.com/~bob/ and both of them want to support the foo protocol, then https://example.com/.well-known/foo needs to have some means of distinguishing the two. And the way of doing that can be tricky. Do you have something like https://example.com/.well-known/foo?resource=/~alice/? How do you deal with path normalization (e.g. /~alice vs. ~alice/ vs. ~alice/homepage.html)? Do you have to consider things like cross-domain attacks? What if multiple users on a tilde site want to support different sets of protocols, or use different implementations?
Putting things into the query string (the most typical approach for this) also means that you’re going to have to have some sort of dynamic mapping or request routing, which means this can’t work with purely static hosting. It also means that you might have to probe for this protocol support across every single URL being accessed on a domain.
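To make the routing problem concrete, here is a minimal sketch of what a server would have to do behind a query-string-based /.well-known/foo. Everything in it is hypothetical: the foo protocol, the resource parameter, the FOO_ENDPOINTS table and the function names are illustrative assumptions, not part of any real specification.

```python
import posixpath
from urllib.parse import parse_qs, urlsplit

# Hypothetical per-user configuration that the server must maintain centrally.
FOO_ENDPOINTS = {
    "~alice": "/~alice/foo-support.xml",
    "~bob": "/~bob/foo-support.xml",
}


def owner_of(resource):
    """Return the top-level directory that an (already decoded) resource belongs to."""
    path = urlsplit(resource).path
    # Force exactly one leading slash so "/~alice" and "~alice/" agree, then
    # let normpath collapse "//", "." and ".." segments.
    normalized = posixpath.normpath("/" + path.lstrip("/"))
    first_segment = normalized.split("/")[1]
    return first_segment or None


def lookup(well_known_url):
    """Resolve a /.well-known/foo?resource=... request to a per-user endpoint."""
    query = parse_qs(urlsplit(well_known_url).query)  # parse_qs percent-decodes values
    resources = query.get("resource", [])
    if len(resources) != 1:
        return None  # missing or ambiguous parameter
    return FOO_ENDPOINTS.get(owner_of(resources[0]))
```

Even this toy version needs request-time logic and a central table of users, which is exactly the part a purely static host cannot provide.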
Another concern is that every protocol you want to support requires a separate HTTP request, and these things can add up pretty quickly. For example, whenever I post an article to my website and it goes out on my Mastodon feed, I get many dozens of Mastodon instances each probing an absolute litany of related URLs trying to determine whether this is a Mastodon instance or similar, and sometimes this ends up even overwhelming my server. Even without that, I’m getting a constant flood of requests for things like /.well-known/traffic-advice and /.well-known/wp-login.php and the like.
A much better approach, in my opinion, is to have the discovery baked into the resource itself. In HTML you can use the <link> tag (e.g. <link rel="foo" href="/~alice/foo-support.xml">) and in HTTP in general you can use the Link: header (e.g. Link: <https://example.com/%7Ealice/foo-support.xml>; rel=foo). A single HTTP request can tell a client about all possibly-supported protocols, all at once, and it can be baked in statically and thus supports static hosting. It also means there is no special case for handling multiple resources across a single domain, and if two pages have the exact same link target it can also be inferred that they are the same as far as that protocol is concerned.
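As a rough sketch of the client side, the following collects every rel value from both the HTML <link> tags and the Link: header of a single response, using only the Python standard library. The names LinkCollector, links_from_header and discover are made up for this example, and the header parsing is deliberately simplified (it assumes no commas inside quoted parameter values), so treat it as an illustration rather than a full RFC 8288 parser.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect <link rel=... href=...> targets, keyed by rel value."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = {}

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        href = attr.get("href")
        if not href:
            return
        # rel is a space-separated list; each value gets its own entry.
        for rel in (attr.get("rel") or "").lower().split():
            self.links.setdefault(rel, []).append(urljoin(self.base_url, href))


def links_from_header(header, base_url):
    """Simplified Link: header parsing; ignores every parameter except rel."""
    links = {}
    for target, params in re.findall(r"<([^>]*)>([^,]*)", header):
        match = re.search(r';\s*rel\s*=\s*"?([^";]*)"?', params, re.IGNORECASE)
        if not match:
            continue
        for rel in match.group(1).lower().split():
            links.setdefault(rel, []).append(urljoin(base_url, target))
    return links


def discover(html, link_header, base_url):
    """Merge header links and HTML links from one response into one mapping."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    merged = links_from_header(link_header, base_url)
    for rel, targets in collector.links.items():
        merged.setdefault(rel, []).extend(targets)
    return merged
```

A client that has already fetched https://example.com/~alice/ can pass the body and the Link: header value to discover() and look up whichever rel values it understands, with no extra probing requests.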
One concern that comes up is that of page bloat, when every page needs to include a bunch of <link> tags to express a growing list of supported protocols. This is, to me, a non-issue; for starters, you’re only declaring the protocols that you actually support, and it’s a single point of reference for all clients to discover all of the supported protocols. But also, the amount of bandwidth added to a page for even a few dozen protocols is minuscule compared to the amount of bandwidth taken by other accepted optimizations, such as inline CSS and data: blobs for images, as well as the bandwidth taken up by the incessant .well-known probes that are taking place at this point.
A suite of related protocols could also be offered as a forwarding URL, such as IndieAuth’s current practice of bundling it all together into indieauth-metadata. This does require an extra HTTP request, but it only has to happen once, as that URL is extremely cacheable.
The IndieWeb wiki has more to say about the growing usage of .well-known and why it’s considered an antipattern.
As an aside, I am also not a fan of Webfinger, because not only does it require .well-known to work, but it attempts to flatten a namespace in ways that are difficult to deal with. On this website, you can easily get the update feed for any given section by discovering its rel="alternate" URLs, but there is no such mechanism in Webfinger; you can follow the site as a whole as @beesbuzz.biz@beesbuzz.biz (thanks to Bridgy Fed) but you can’t follow just the /code/ section, for example. There are some hacks such as making @code@beesbuzz.biz, @comics@beesbuzz.biz and so on, but then what about nested subdirectories (e.g. /food/coffee/)? Why not just use the URL itself as the specifier? With <link>-based discovery, you already can, and there’s nothing special about it.
So, anyway: When designing a new protocol, please consider not using .well-known URLs for discovery purposes. Let the URL itself provide its own information.