Tuesday, May 20, 2008

Linux Grub bootloader harddisk UUID inanity

I spent the last couple of hours figuring out why my Ubuntu machine was not booting. Everything was fine...and then I removed a backup IDE hard disk to put in my safe deposit box. Shut down, pulled the disk, rebooted, and then ugh....hang...then an initramfs prompt. My Linux skills are rusty. WTF is initramfs? Oh yeah, low-level initialization and startup stuff that I don't want to know about. Super.

So....UUIDs are supposed to make life easier by giving each device a unique identifier. However, it doesn't work like it should. The unique identifier is unique alright...unique to the machine and apparently impermanent as well, making the whole concept.....worthless. I personally can't tell one long sequence of hexadecimal from another. Meaningless noise. The old system of using simple cryptic names like /dev/sda1 worked just fine, and if there was a boot problem I could diagnose it by checking whether my partitions had changed names (which they shouldn't, ever, but they do, and that is the reason UUIDs exist in the first place). Ending up at the initramfs prompt can be caused by lots of things that I don't really care to concern myself with. One of them is incorrect boot parameters in /boot/grub/menu.lst.

So the fix was easy...just not documented anywhere in the startup output, and not hinted at by the non-error-like weirdness of ending up with a non-bootable system only because a stupid machine-unique-id was rendered incorrect due to some inanity in IDE versus SCSI devices. Basically, at the Grub menu, hit ESC, select the kernel entry you'd like to use, and press e to edit its kernel line. Where it reads root=UUID=somelongsequenceofhex just remove the UUID=somemachinereadablebullshit and replace it with the partition that you know holds your root filesystem. In my case this was /dev/sda2 so it reads something like root=/dev/sda2. Press Enter and then b to boot. If you don't know your root partition, boot from a Linux live CD, get a command shell, look at your /proc/partitions file and figure it out. Knoppix is good for that, or use your distro CD. I've got two RAID arrays with 5 disks each and 2 IDE disks. I can spot my root disk solely based on partition size.
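If staring at /proc/partitions on a box with a dozen disks is not your idea of fun, here is a rough Python sketch of the "spot it by size" trick. It assumes the usual four-column layout (major, minor, #blocks, name) with sizes in 1 KiB blocks. The function name parse_partitions is made up for this post, and the sizes in the sample are invented, not from my machine.

```python
def parse_partitions(text):
    """Return (name, size_in_gib) pairs from /proc/partitions-style text,
    largest first. Assumes four columns: major, minor, #blocks, name."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header line and blank lines: a data row has four
        # fields and a purely numeric block count in the third column.
        if len(fields) != 4 or not fields[2].isdigit():
            continue
        blocks, name = int(fields[2]), fields[3]
        rows.append((name, blocks / (1024 * 1024)))  # 1 KiB blocks -> GiB
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Invented sample in the /proc/partitions layout (not real output).
SAMPLE = """major minor  #blocks  name

   8        0  156290904 sda
   8        1     104391 sda1
   8        2  156183930 sda2
"""

for name, gib in parse_partitions(SAMPLE):
    print(f"/dev/{name}\t{gib:.1f} GiB")
```

On a live CD you would feed it open("/proc/partitions").read() instead of the sample string. It only sorts by size; deciding which one is actually your root partition is still on you.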

Once you've got the right boot command your system will boot normally, like it should have done to begin with. The edit made at the Grub menu only lasts for that one boot, so make it permanent: run the blkid command on your root partition to retrieve its current machine-unique-id and replace the invalid UUIDs in your /boot/grub/menu.lst file. You also need to change /etc/fstab to reflect the apparently random UUID changes. That was a complete waste of time, but now your Linux system will boot without being poked and prodded.
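The find-and-replace part can be sketched in a few lines of Python, if you'd rather not hand-edit hex. This assumes blkid prints lines of the form /dev/sda2: UUID="..." TYPE="...", which is what I saw. The helper names uuid_from_blkid and replace_uuid are hypothetical, and it works on text you pass in, so nothing touches your real files until you decide to write the result back.

```python
import re

# \b keeps this from matching inside other keys such as PARTUUID="...".
BLKID_UUID = re.compile(r'\bUUID="([^"]+)"')


def uuid_from_blkid(line):
    """Extract the UUID value from one line of blkid output, or None."""
    match = BLKID_UUID.search(line)
    return match.group(1) if match else None


def replace_uuid(config_text, old_uuid, new_uuid):
    """Swap every UUID=<old> reference for UUID=<new>.

    Plain string replacement: everything else in the file is left alone.
    Assumes old_uuid is a full UUID, not a prefix of some other one.
    """
    return config_text.replace("UUID=" + old_uuid, "UUID=" + new_uuid)
```

Run the same replace_uuid over the contents of both /boot/grub/menu.lst and /etc/fstab, since both refer to the root filesystem by UUID. Keep a backup copy of each file before writing anything back.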

I really, really love Linux. I love the philosophy of open-source and I love the feeling of personal power that I have knowing that I can (and do) modify my system to my liking from the ground up. But I don't like arbitrary bullshit. Proprietary or open-source, it doesn't matter. These things should work...flawlessly...autonomously... My root partition didn't fucking move, change, or disappear. The magic fucking 8-ball that Grub is using just gave the wrong answer even though the game itself is rigged. Damn it. Don't play games with my time.