Full Network Conversion, pt. 1: The Tragedy of Athena

I’ve recently migrated my infrastructure to the otl-hga.net domain, after some financial instability made paying for two domain names every year untenable. While I was at it, I took the time to start converting all of my existing networking to container-based solutions.

It was earlier in December, and I had made a lot of progress on it. I had managed to port most of my Git repositories over to the new infrastructure, attach a Jenkins instance to them for CI/CD, get that to actually run automatically when I push a commit, and then wrap the whole thing up in a single sign-on environment that works better than raw Kerberos ever did.
Then one day I went to show a picture on my Nextcloud to some friends of mine on the Internet and found that my whole server would not boot. I went and looked at the box itself through the provider dashboard’s VNC, and found a kernel panic waiting for me.
After that, I walked into the kitchen and started a large pot of coffee.
Data Recovery Phase
First, I’m gonna spoil the ending: everything’s fine! No significant data loss!
The first thing I did was send the VPS into OVH’s rescue image. The next thing I tried, just to see if the system would heal itself, was running the old rootfs in a chroot and re-running `apt update && apt upgrade`. It didn’t fix anything, but the important thing is I tried it.
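For anyone who hasn’t done this dance before, it goes roughly like the sketch below. The device name and mount points are illustrative, from memory; your provider’s rescue image will have its own layout, so check `lsblk` for the real root device.

```sh
# Rough sketch of chrooting into the old rootfs from a rescue image.
# /dev/sdb1 is a stand-in for wherever your old root actually lives.
mount /dev/sdb1 /mnt
mount --bind /dev  /mnt/dev
mount --bind /proc /mnt/proc
mount --bind /sys  /mnt/sys
chroot /mnt /bin/bash

# Inside the chroot: the hail-mary upgrade that didn't fix anything.
apt update && apt upgrade
```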
I then began backing up my Docker volumes. Most of my docker-compose files were already in a Git repository, which thankfully lives on another fully-working system, and my important data lived either on other servers or somewhere within Nextcloud (whose volume I had already backed up).
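Since Docker itself wasn’t running in the rescue environment, “backing up a volume” just meant tarring its directory out of the mounted rootfs. Something like the following, where the volume names are made up for illustration (your real ones are under `/var/lib/docker/volumes/` on the old root):

```sh
# Illustrative volume names -- substitute your own.
for vol in nextcloud_data gitea_data jenkins_home; do
  tar -czf "/root/backup/${vol}.tar.gz" \
      -C "/mnt/var/lib/docker/volumes/${vol}/_data" .
done
# ...then scp the tarballs somewhere that isn't this machine.
```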
I say “most” because I didn’t think to put my docker-compose files in a Git repository until later in the process, and my original Nextcloud compose file didn’t make it. The files themselves were safe, thankfully, and my partner hadn’t yet put anything on her account, so I was the only one who had to copy old files into a “new” Nextcloud instance. I lost a pozole recipe, and that was about it.
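For the record, a bare-bones Nextcloud compose file looks something like the sketch below. This is not the lost original, just a minimal reconstruction along the lines of the official image’s documentation; service names, volume names, and passwords are all placeholders.

```yaml
# Minimal Nextcloud + MariaDB sketch. Everything here is a placeholder.
services:
  db:
    image: mariadb
    restart: always
    environment:
      MYSQL_ROOT_PASSWORD: change-me
      MYSQL_DATABASE: nextcloud
      MYSQL_USER: nextcloud
      MYSQL_PASSWORD: change-me
    volumes:
      - db:/var/lib/mysql

  app:
    image: nextcloud
    restart: always
    depends_on:
      - db
    ports:
      - "8080:80"
    environment:
      MYSQL_HOST: db
      MYSQL_DATABASE: nextcloud
      MYSQL_USER: nextcloud
      MYSQL_PASSWORD: change-me
    volumes:
      - nextcloud:/var/www/html

volumes:
  db:
  nextcloud:
```

And yes, this one went straight into Git.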

The Rebuilding (Or, How I Learned To Stop Yelling At snapd And Love Red Hat)
It was clear at this point that I was gonna have to reinstall the VPS from scratch. Not a whole lot of skin off my nose; I was gonna upgrade to the latest Ubuntu LTS anyway, and it was about that time of year.
…or was I? It’s true I’d been running Ubuntu servers since 2016, since at the time it was the only Linux distro that could reliably run srcds for my Garry’s Mod server without a whole lot of excess finagling. But as the years went on, Ubuntu stopped being “the best fit for my needs” and started being “historically the distro I used”. Canonical had been wielding the idiot ball off and on here and there, but most of the time that was in the desktop space, and we server operators were largely unaffected. Who cared if Ubuntu was pushing Mir extra hard while everyone else was on Wayland? I was headless! What did it matter if support for most of multilib got dropped? All the important server programs had amd64 builds anyway; this wasn’t Windows!
Then came snapd. This is Ubuntu’s answer to things like Flatpak: taking the general concept of “containers” and converting it into a sandboxed environment for desktop apps. This sounds cool in theory, but it came with some problems:
- Apps running within a snap have very minimal access to the host filesystem. You get read-only access to specific dirs in `/etc`, read/write access to `${HOME}`, and some tmpfs space. That’s it. That’s all. This sounds fine for the most basic of apps in the most basic of configurations, but I don’t really keep any data in my home directory. I keep it on a different, much larger drive, because I learned some hard lessons a long, long time ago about OS meltdowns and data loss. (Which makes the first part of this article quite ironic, but I digress.)
- In addition, all inter-application communication is restricted, or even outright forbidden. From a server perspective this is bad: anyone reading this article should understand what kind of problem it would be for your server programs to be forbidden from accessing your databases just because they were different packages on the OS. It’s also a problem for the desktop users snap is actually targeted at: even basic D-Bus communication is blocked. Media player integrations don’t work, browser integrations don’t work, hell, in many cases basic desktop notifications don’t even work. You know how on literally every other platform, if a desktop app has anything it’d like to inform you about, it gives you a little tooltip or speech balloon in the corner of your screen? Not possible with an Ubuntu snap!
- All of the above could be ignored by simply not using snap. But Canonical won’t allow that to happen either: every new version of Ubuntu results in another set of application packages being completely replaced with snaps. The Debian packages in `apt` get replaced with dummy packages that just run `snap install ${PKG}`, the previous set of dummy packages get deleted, and yet another key feature of the environment ceases to work correctly because snap has gotten in the way. (You can watch this happen yourself; see the commands after this list.) It seems their eventual goal is to be rid of `apt` entirely, at least as much as possible, and have `snap` be the only package management available.
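If you want to see both problems first-hand on a stock Ubuntu desktop install, a couple of commands will do it. These are real snapd and apt commands; I’m describing their output in comments rather than quoting it, since it varies by release, and Firefox is just the canonical (sorry) example of a converted package.

```sh
# Confinement: list which interfaces a snap is allowed to use.
# Anything not connected here -- removable media, D-Bus services,
# arbitrary system files -- effectively doesn't exist as far as
# the app is concerned.
snap connections firefox

# You can poke holes manually, one interface at a time:
sudo snap connect firefox:removable-media

# The dummy-package trick: on recent releases the firefox "deb" is
# a transitional shim whose description admits it just installs the
# snap behind apt's back.
apt show firefox
```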
There are legitimate use cases for snap. Mostly, I would use it for older or more proprietary programs that require a very tuned set of libraries or environmental characteristics, but not much inter-process communication outside of things they launch themselves. Which, in my case, includes Steam, and uh… Steam. This isn’t a technology I would use for general-purpose package management. Canonical somehow disagrees, and this is a disagreement so big I’ve had to start distro shopping.
I ended up rebuilding my whole network with Fedora instead. I’m saving the rest of this tale for Part 2.