Rock migration finished

This weekend, I started modifying the way I use disk space on rock, my home desktop and server.

rock used to be a Pentium III running at 650MHz, until I received an SMP box from Osamu Aoki when he moved back to Japan and couldn't take it with him. So rock is now a dual Celeron 433MHz, and the machine that originally served as my desktop went on to replace pop, my parents' box.

A little while later, rock's hard disk died, and I was left with a single 13G hard disk (or so I thought). At that point, I used the sarge installer to install rock on an LVM system, so that I could easily enlarge the volumes in the installation later on, without having to start copying files for no good reason. When I later bought a second-hand 80G hard disk to add to the LVM system, I found that there were in fact two more hard disks inside rock, which simply hadn't been connected to the IDE controllers; one was 20G, the other 40G.

So I added them all, and enlarged all the volumes that could use some extra space.
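Growing things under LVM is pleasantly boring; it boils down to something like the following (the device, volume group and volume names here are made up, and the resize tool depends on the filesystem):

    # Turn the new disk (or a partition on it) into a physical volume
    pvcreate /dev/hdc1

    # Add it to the existing volume group, then grow a logical volume by 10G
    vgextend vg0 /dev/hdc1
    lvextend -L +10G /dev/vg0/home

    # Finally, grow the filesystem to fill the enlarged volume
    resize2fs /dev/vg0/home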

A while later, I started worrying. What if one of the disks died? Reading the documentation, I found that I would lose every LVM volume that was on the dead disk, even if only part of it lived there. There is an option to get LVM to activate partial volumes so you can get at the data that's still available, but it didn't sound too hopeful. In short, I became convinced that what I was doing wasn't all that safe for my data. Obviously I have backups of the important stuff, but avoiding failure is always preferable to having to use recovery procedures, even if they're good recovery procedures.
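The escape hatch I found looks something like this (just a sketch; the volume group and volume names are made up, and I never actually had to try it):

    # Activate whatever parts of the volume group are still reachable;
    # volumes that lived (even partially) on the dead disk are gone or damaged.
    vgchange -ay --partial vg0

    # What's left can then be mounted read-only to salvage data.
    mount -o ro /dev/vg0/home /mnt/rescue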

So I decided that I would migrate to a setup that would use at least some redundancy; that way, I could stop worrying as much. And since I had four disks, there should've been a way to do that.

I started partitioning, and found that I had overlooked a second 40G hard disk that rock was using. So rather than creating a RAID5 array on top of two 33G partitions (on the 40G and 80G disks) and a RAID0 array or something similar (combining the 13G and 20G disks), I decided to create a 40G RAID5 array composed of the two 40G disks and one 40G partition on the 80G disk. All other disk space (the other 40G on the 80G disk, plus the 20G and 13G disks) would be combined into an LVM volume group for less important data, such as the squid cache and other large parts of /var, swap space, and a bunch of digitized CDs of which I still have the originals.

Also, the RAID5 array wouldn't just hold one large volume; instead, I created another LVM volume group on top of the RAID array. Theoretically I could of course have combined everything into one volume group and used pvmove and/or the right options to lvcreate to force the important volumes onto the RAID array, but having separate volume groups for RAID and non-RAID storage forces me to think carefully before managing volumes, which is never a bad idea.
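In command form, the intended end result looks roughly like this, assuming the array already exists as /dev/md0; the device, volume group and volume names are made up for illustration and depend on how the disks are cabled:

    # Volume group for the important data, living entirely on the RAID5 array
    pvcreate /dev/md0
    vgcreate vg_raid /dev/md0
    lvcreate -L 10G -n home vg_raid

    # Separate volume group for the less important data, straight on the
    # leftover space (second half of the 80G disk, plus the 20G and 13G disks)
    pvcreate /dev/hda2 /dev/hdd1 /dev/hde1
    vgcreate vg_plain /dev/hda2 /dev/hdd1 /dev/hde1
    lvcreate -L 5G -n squid vg_plain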

So on Friday I started moving data. This involved running pvmove on a particular physical volume until all data had been moved off it; then running vgreduce -a <vgname> to remove the now-empty physical volume from the volume group; creating new partitions on the just-freed drive to hold the live data; copying the live data over; and starting pvmove again. Rinse, repeat, until all data has been copied over and/or you've freed enough disks to create the RAID array.
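One round of that loop looks roughly like this (device and volume group names are examples only):

    # Move all extents off the disk we want to free, onto the other
    # physical volumes in the same volume group
    pvmove /dev/hde1

    # Drop all now-empty physical volumes from the volume group
    vgreduce -a vg0

    # The freed disk can now be repartitioned for the new layout, the live
    # data copied onto it, and the next pvmove started.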

Luckily one can create a RAID array in degraded mode, or the procedure would've involved updating my backups, verifying them, wiping the hard disks, and reinstalling. As it is, I could get away with creating partitions that were just large enough to hold all data, and hoping people wouldn't try creating more data.
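The trick is to list the word missing in place of one of the member devices; something like this (made-up device names again):

    # Create a three-device RAID5 with one member deliberately absent;
    # the array comes up degraded but usable.
    mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/hdb1 /dev/hdc1 missing

    # Once the last disk has been freed and partitioned, add it and
    # let the array rebuild.
    mdadm /dev/md0 --add /dev/hda1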

I started working on this on Friday, and am just now, over 48 hours later, finishing up. It should've been possible to do all this in far less time; but rather than explaining what went wrong, let me just say, for the record, that I hate hardware. And that I should plan better.

Anyway. The last stumbling block was the fact that the system simply wouldn't boot from the new root device. The reason was fairly obvious: the initrd had been generated when mdadm was not yet installed, so it had to be regenerated. But even after calling yaird with the right options, it still wouldn't work.

It took me a while to figure it out; but eventually, I found that yaird reads your /etc/fstab to find out what your root device is; that it then works out how it can get at that root device (it's smart enough to know about md devices, LVM devices, and so on); and that it adds the right software to the initrd based on that.

Sure enough, the /etc/fstab on my temporary root device, which I had put outside of LVM on /dev/hdb3, still listed /dev/hdb3 as the root partition. So yaird didn't think mdadm was necessary. Heh.

Quick edit fixed that.
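For the record, the fix boiled down to pointing the root entry in /etc/fstab at the new device before rerunning yaird; something along these lines (the volume group and volume names are made up):

    # Before: root on the temporary partition, so yaird saw no need for mdadm or LVM
    /dev/hdb3                   /    ext3    defaults    0    1

    # After: root on a logical volume that sits on the RAID5 array
    /dev/mapper/vg_raid-root    /    ext3    defaults    0    1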

So now I have my important data on an LVM system on RAID5, and my less important data still on LVM.

Me happy.