327780

Yes, I know that number by heart now. It's not a silly number, it's a bug: a grave bug against binutils which I opened, since the current version of binutils in unstable is, well, rather broken. I spent today tracking it down, and I found the cause. I must say I didn't expect that; I know next to nothing about the ELF file format, and even less about the binutils internals, so...

But anyway, this is what I did:

  • I compiled a simple application with a broken binutils installed, and ran it inside gdb. It of course segfaulted before it even reached main(), but that was expected. The interesting bit of information was where it segfaulted.
  • With objdump -D, I checked where that was. Turned out it was in some obscure section called '<.plt>', which I had never heard of before; as I found out, that is the "Procedure Linkage Table"—whatever that may mean.
  • I then compiled the same application against a known good binutils, and compared. The result of that comparison can be found in the bug log (which I linked to above); anyhow, there was a difference in one of the opcodes. The result of that was that rather than jumping to some bit inside the application (at 0x8000267c), it would jump to address 0x28b6.
    That can't be right, of course.
  • The issue was now to find out what the relevant change was. Running debdiff over the sources of both packages produced a 30M diff, so going through that would've been a little too cumbersome.
    IRC to the rescue.
    P2 told me that the code responsible for the PLT lives in libbfd, and that I should look at the source files in that library's directory. Looking at the diff, I found that the file bfd/elf32-m68k.c had seen quite a lot of changes relating to ColdFire V4E support. The ColdFire V4E is a processor whose instruction set is based on the m68k instruction set, and it should technically be possible to produce an ABI that will run on both systems, though Debian doesn't support it yet. However, it appeared that the relocation code which was added in support of the V4E was broken somehow.
    After reviewing patches for a few hours, I finally stumbled upon revision 1.74, whose commit log says "Add support for generating PLT lookups for the ColdFire". That is so closely related that it can't be a coincidence.
    I built myself a new cross-binutils based on the most recent source package but with that revision reverted, and tested it, which seemed to succeed. Now I'm double-checking by building a native version, which might take a while; but I expect that one to work, too.
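
The comparison step in the list above (disassemble a known good and a broken binary, then look at what changed in the PLT) could be scripted roughly as follows. This is only a minimal sketch, not what was actually done for the bug report: the function names (disassemble, extract_section, diff_section) are made up for illustration, and it assumes GNU objdump is on the PATH and prints its usual "Disassembly of section NAME:" header before each section. For a cross build, the cross objdump would be needed instead.

```python
import difflib
import re
import subprocess

# GNU objdump starts each section of a -D listing with this header line.
SECTION_RE = re.compile(r"^Disassembly of section (\S+):\s*$")


def disassemble(path, objdump="objdump"):
    """Run `objdump -D` on a binary and return the listing as text.

    `objdump` is the program name to run; pass a cross objdump if needed.
    """
    result = subprocess.run([objdump, "-D", path],
                            capture_output=True, text=True, check=True)
    return result.stdout


def extract_section(listing, name=".plt"):
    """Return the non-empty lines belonging to one section of a listing."""
    lines = []
    inside = False
    for line in listing.splitlines():
        match = SECTION_RE.match(line)
        if match:
            # A new section header: start or stop collecting.
            inside = (match.group(1) == name)
            continue
        if inside and line.strip():
            lines.append(line.rstrip())
    return lines


def diff_section(good_listing, bad_listing, name=".plt"):
    """Unified diff of one section between a good and a broken listing."""
    return list(difflib.unified_diff(
        extract_section(good_listing, name),
        extract_section(bad_listing, name),
        fromfile="good", tofile="bad", lineterm=""))


if __name__ == "__main__":
    import sys
    # Usage (hypothetical file names): python pltdiff.py hello.good hello.bad
    good, bad = sys.argv[1], sys.argv[2]
    for line in diff_section(disassemble(good), disassemble(bad)):
        print(line)
```

Restricting the diff to a single section keeps the output short enough to read by eye, which is the whole point when the interesting difference is a single opcode.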

I'm pretty excited about this, both because we've had this bug for almost two months now (and we really needed to get rid of it) and because it's actually the first time I've tracked down a bug in anything toolchain-related. This is nice.