Rotting Links on the Web
A while ago, a few others from the Micro.blog community and I did a weekly roundup of posts shared and discussed on the timeline. I was recently reminded of one such digest from six years ago. What saddened me was that most links mentioned in that digest are unavailable today. The posts are either unreachable at the shared links or broken, with missing content or images.
Link rot is painfully prevalent on the web today, as this statistic from early 2024 highlights.
Since January 2013, 66.5% of the links pointing to the 2,062,173 websites we sampled have rotted. We found another 6.45% with temporary errors. We don't know if they're still there or not.
This is unsurprising in a world where the internet constantly evolves and shiny new things launch daily.
Blogs are even more prone to this problem. The incentive to maintain the old posts is too small to overcome the effort required to keep all the links working. We like to believe blogs are part of the open web and revere the platforms that allow us to take our content and migrate it to a different platform. However, because of the way they handle permalinks and resources, many links get left behind.
For example, Micro.blog brilliantly handles migration to the platform. It sets up the redirects well so that none of the old links break. But if it provided an option to configure the permalink format, there would be no need to set aliases or redirects in the first place. The same limitation creates a massive problem when migrating away from the platform, as you need to set up new redirects to match whatever permalink format the new platform supports or you desire.
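To make the redirect chore concrete, here is a minimal sketch of how a redirect map might be generated during a migration. The old permalink shape (/YYYY/MM/DD/slug.html), the new template, and the function names are all my assumptions for illustration; they are not the format of Micro.blog or any other specific platform.

```python
import re

# Hypothetical old permalink shape: /YYYY/MM/DD/slug.html
OLD_PERMALINK = re.compile(
    r"^/(?P<year>\d{4})/(?P<month>\d{2})/(?P<day>\d{2})/(?P<slug>[^/]+)\.html$"
)


def new_permalink(old_path, template="/{year}/{month}/{slug}/"):
    """Translate one old path into the new format, or None if it does not match."""
    match = OLD_PERMALINK.match(old_path)
    if match is None:
        return None
    # Named groups the template does not use (here, day) are simply ignored.
    return template.format(**match.groupdict())


def build_redirects(old_paths, template="/{year}/{month}/{slug}/"):
    """Map every old path that changes to its new location."""
    redirects = {}
    for path in old_paths:
        target = new_permalink(path, template)
        if target is not None and target != path:
            redirects[path] = target
    return redirects
```

Paths that do not match the old pattern are left out of the map rather than guessed at, so they can be reviewed by hand. If the new platform let you keep the old template, the map would be empty, which is exactly the point.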
Another problem is how every blogging platform handles the resources, or assets, as they are generally called. The ones necessary for the blog's functionality or look and feel, like JavaScript and CSS files, are usually taken care of. What gets left behind are the images. Every platform handles them differently and usually generates its own peculiar image URLs. So, even if the post links aren't broken, the posts themselves often are, because of missing images.
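One way to see what is at risk before moving is to list the images in a post that still point at the old platform's host. The sketch below uses only Python's standard library; the class and function names, and the idea of matching on a single host name, are my assumptions rather than anything a particular platform provides.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ImageCollector(HTMLParser):
    """Collect the src attribute of every <img> tag in a post."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


def images_on_host(post_html, old_host):
    """Return image URLs in the post that are served from old_host."""
    collector = ImageCollector()
    collector.feed(post_html)
    collector.close()
    return [src for src in collector.sources if urlparse(src).hostname == old_host]
```

Relative image paths have no host, so this sketch does not flag them; they would need a separate check against the new site's asset layout.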
CDNs could solve part of this problem, as highlighted in the discussion on this thread. But the fact that so many images continue to fall through the cracks tells me the approach is not yet integrated well enough to be adopted generally.
Permalink and alias/redirect configuration should be a critical consideration for interoperability between blogging platforms. The larger open web pays the price when platforms neglect permalink and asset management during migration.
I acknowledge that it's not an easy problem to solve. But it is an important one.