Western broken

western, my Sun Ultra10 which I used to use as server/gateway/firewall, broke down last weekend.

Since western had only one 9G hard disk, which was getting full, I wanted to add another disk. The 80G disk that I had bought for my desktop turned out not to be needed there (since there were 2 extra disks hidden in the machine anyway), so I added it to the Ultra. However, it didn't get detected; according to Philip, an Ultra10 has a PIO-only IDE interface, which may be the cause of the problem: it's not that unlikely that the newest IDE disks, such as this 80G one, no longer support outdated transfer modes such as PIO. So I decided to take one of the disks, a 40G one, out of rock, my desktop, put the 80G one in its place, and give the 40G disk to western. 40G is still plenty for a server which until recently had only 9G.

Thank $DEITY for LVM

Connected the 80G disk as /dev/hdd to rock, and ran the following commands:

rock:~# cfdisk /dev/hdd
[create a partition]
rock:~# pvcreate /dev/hdd1
rock:~# vgextend rock /dev/hdd1
rock:~# pvmove /dev/hde1
rock:~# vgreduce -a rock
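
For anyone wanting to repeat this, here is a minimal sketch of the same sequence as a script. pvcreate initialises the new partition as a physical volume, vgextend adds it to the volume group, pvmove moves all allocated extents off the old physical volume, and vgreduce then drops the emptied volume from the group. The device and volume group names are the ones used above; the DRY_RUN switch and the run and migrate_pv helpers are hypothetical additions for illustration. With DRY_RUN left at 1 the script only prints what it would do.

```shell
#!/bin/sh
# Sketch only: move all data from one LVM physical volume to a new one.
# VG, OLD_PV and NEW_PV default to the names used in this post.
# DRY_RUN, run() and migrate_pv() are hypothetical helpers, not LVM features.
VG=${VG:-rock}
OLD_PV=${OLD_PV:-/dev/hde1}
NEW_PV=${NEW_PV:-/dev/hdd1}
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Each step only runs if the previous one succeeded.
migrate_pv() {
    run pvcreate "$NEW_PV" &&
    run vgextend "$VG" "$NEW_PV" &&
    run pvmove "$OLD_PV" &&
    run vgreduce "$VG" "$OLD_PV"
}

migrate_pv
```

Setting DRY_RUN=0 (as root, and with the new partition already created) would run the commands for real; here vgreduce names the old physical volume explicitly instead of using -a, so nothing else gets removed by accident.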

Now shut down the box again, and remove /dev/hde. And no, the hde part is not a typo: rock has four IDE controllers.

Meanwhile, however, I noticed something peculiar about western: for some reason, its external network card no longer got a DHCP lease. That card was a 3Com 3c905 with a slightly loose RJ45 connector; while unplugging and reconnecting the cable, I may have moved it around a bit too much and broken it. While it's a pity that I lost that 3c905, it's not really much of a problem: put in a different card, boot again, done.

The Realtek ne2k-pci board that I put in next didn't work. It did get a DHCP lease, but for some reason it couldn't use the connection. Ne2k boards are crap, so that could be related.

I put in yet another card, and booted the machine again, this time without replacing the cover (in case that card wouldn't work either). After it had been running for about 1 minute, the system suddenly shut itself off for no apparent reason. So I put the cover back on and tried to boot. After 10 seconds, even before silo was loaded, the system shut itself off for no apparent reason. I thought maybe it was the card, and even tried the 3c905 again. The system shut itself off for no apparent reason.

It hasn't stopped doing that. Finally, on Sunday evening, I set up rock as a replacement server and put the 9G disk in there too, so that I could at least access the data and config files. But I'd love to get western to boot again. Stupid thing.

Any hints are very welcome. Really. I'll even give you a Free Beer at FOSDEM if you give me the hint that fixes it.