Single-stepping init systems

The Linux init systems are a bit in flux at the moment. That is, they're in flux in Debian; most other distributions have already stepped away from sysvinit and towards something else (systemd, openrc, or upstart). I've not been a proponent of any switch, though I understand the reasoning, and it probably makes sense for us to switch at some point. But yesterday, the fact that a customer's system was running sysvinit and not systemd or upstart saved me quite a bit of trouble.

There's a server. It has one quadcore processor. For reasons that I won't go into here, the customer wants an extra quadcore processor to be added to the system.

After having done so, I power on the system... only to see it power itself off at some point during boot. I did notice some kernel messages fly by just moments before the system would power itself off, but it was impossible for me to read them. So what did I do?

  • Boot the system with init=/bin/bash.
  • Once it's up, go to /etc/rcS.d and manually run each and every one of the scripts there in turn. When the system powers off, I know which script is causing the problem.
  • Disable that init script, and boot the system normally.

That last bit is, obviously, a bit of an ugly workaround; the better way to fix this would have been to debug what the actual issue was, and implement a proper fix. However, I didn't have time for that (the fact that a second quadcore chip was needed shows how heavily this system is used), and the workaround was acceptable for the customer. It is not the first time that this ability to single-step the init system has saved me. The fact that sysvinit is so simplistic is what makes this possible, and I consider that one of its most important features.
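For what it's worth, the manual stepping through /etc/rcS.d described above could be scripted roughly as follows. This is only a sketch of the idea, not what I typed at the time: the function name step_rc_scripts and the prompt are made up for illustration, and it assumes the scripts follow the usual sysvinit convention of S-prefixed names that take "start" as their argument. The real rc script does a bit more than this (on some setups it sources certain scripts rather than executing them, for instance).

```shell
#!/bin/bash
# Sketch: single-step the boot-time init scripts after booting with
# init=/bin/bash. step_rc_scripts is a hypothetical helper, not part
# of sysvinit.

step_rc_scripts() {
    local dir=${1:-/etc/rcS.d} script answer
    # The glob expands in sorted order, so S01... runs before S02...
    for script in "$dir"/S*; do
        # Skip non-executable entries (and the literal pattern if
        # nothing matched).
        if [ ! -x "$script" ]; then
            continue
        fi
        # Print the name before running anything, so that it is the
        # last thing on screen if this script takes the machine down.
        printf 'Next: %s  [Enter = run, s = skip, q = quit] ' "$script"
        read -r answer || return 0
        case $answer in
            q) return 0 ;;
            s) printf 'skipped %s\n' "$script"; continue ;;
        esac
        "$script" start || printf 'warning: %s exited with status %s\n' "$script" "$?"
    done
}
```

Calling step_rc_scripts with no argument walks /etc/rcS.d; passing another directory (say, /etc/rc2.d) steps through that runlevel instead. The point of waiting for a keypress before each script is that whatever name was printed last is the one to look at once the machine has powered off.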

Recently, I came into contact with a distribution that uses systemd as its init system (in casu, Arch Linux). I had made a configuration mistake: I had installed and enabled a graphical login system, but had no xterm or similar available, and had done something else wrong that left me unable to get a regular shell on the console, either. To fix this, I tried doing something like the above (running with init=/bin/bash and single-stepping the init system), but found that doing so with systemd is nigh impossible. In the end, I knew exactly what the problem was and could stop the login manager from starting automatically by removing a symlink, but it brought home the point that debugging a similar issue under systemd rather than sysvinit might be a lot harder.
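To make the symlink bit slightly more concrete: as far as I understand it, enabling a unit under systemd comes down to creating symlinks below /etc/systemd/system, so from an init=/bin/bash shell one can at least list them and see what is going to be started. The sketch below does just that. The function name list_unit_links is made up for illustration, and the default directory is an assumption about where the distribution keeps these links.

```shell
#!/bin/bash
# Sketch: list the symlinks below the systemd configuration directory,
# together with their targets. list_unit_links is a hypothetical helper.

list_unit_links() {
    local root=${1:-/etc/systemd/system} link
    # -type l matches the symlinks themselves, dangling or not.
    find "$root" -type l | sort | while IFS= read -r link; do
        printf '%s -> %s\n' "$link" "$(readlink "$link")"
    done
}
```

Once the offending link has been spotted, removing it with rm keeps that unit from being started at the next boot. Note that with init=/bin/bash the root filesystem is usually still mounted read-only, so it will probably need a "mount -o remount,rw /" first.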

We'll see what the future brings.