Debugging the ASRock Boards - round 1
SUCCESS! It turns out that our problem was GRUB trashing the VESA framebuffer before the kernel could get to it. The solution is to run a much older (dumber) version of GRUB that is not too clever for its own good. Tracking this down was not trivial, since the bootloader usually gets the kernel going and is then entirely out of the picture.
I started by comparing everything I could between the stage we knew did not work (LIBERATED) and the stages we knew did (LIVE, DISKLESS). I checked loaded kernel modules, executed init scripts, kernel output, and kernel command line options. Wherever I could, I tried making LIBERATED behave exactly like LIVE and DISKLESS, but I couldn't get the framebuffer to come up with more than a few kilobytes of memory or with anything beyond EGA/CGA support.
The only substantive difference I could find was in the kernel log output (dmesg). LIVE and DISKLESS had log lines starting with "vgaarb" /and/ "vesafb", while LIBERATED had only lines starting with "vgaarb". I looked at the kernel source and discovered that the build option controlling vesafb (CONFIG_FB_VESA) is only available as a built-in, not a module. Functionally, that means that the kernel is entirely responsible for the order in which it is initialized (a module would allow us to load it in a different order relative to other modules), which scrubbed an idea I had to rebuild the kernel with vesafb as a module.
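The dmesg comparison above can be made repeatable with a small helper. This is only a sketch: the function name `fb_drivers_in_log` and the idea of saving each stage's dmesg output to a file are my own assumptions, not something taken from the BCCD build scripts. It reports which of the two drivers left lines in a saved log, allowing for the optional timestamp prefix that dmesg may print.

```shell
# Hypothetical helper: report which framebuffer-related drivers left
# lines in a saved kernel log (e.g. the output of "dmesg > stage.log").
# Usage: fb_drivers_in_log FILE
fb_drivers_in_log() {
    log="$1"
    found=""
    for drv in vgaarb vesafb; do
        # Match the driver name at the start of the message, allowing
        # for an optional "[   timestamp]" prefix added by dmesg.
        if grep -Eq "^(\[[ 0-9.]+\] *)?${drv}" "$log"; then
            # Append the name, separated by a space after the first hit.
            found="${found}${found:+ }${drv}"
        fi
    done
    # Print the drivers seen, or "none" if neither appeared.
    echo "${found:-none}"
}
```

Run against a log saved from each stage, this should print "vgaarb vesafb" for a stage where the framebuffer came up and just "vgaarb" for one where it did not.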
Despite seeing that vesafb was built into the kernel, I wanted to verify that nothing in user-space was altering its behavior. I had already grepped through all the initrds looking for references to video or framebuffer behavior and found nothing, but to be sure I stuck some debugging code at the top of the init script in each initrd. This consisted of the following bash snippet:
    /bin/busybox clear
    /bin/busybox echo "=== OUTPUT ==="
    /bin/busybox dmesg | /bin/busybox grep vga
    /bin/busybox dmesg | /bin/busybox grep vesa
    /bin/busybox sleep 10
This clears the screen of other output, prints a recognizable line even if the greps return nothing, looks in dmesg for the vga and vesa strings, then sleeps for 10 seconds so I have some time to read what is displayed (since this comes before loading the USB modules, the keyboard is not functional yet). After rebuilding each initrd and rebooting, I saw what I expected to see: LIBERATED had only vgaarb, while LIVE and DISKLESS had both vgaarb and vesafb.
This showed that the difference had to be something that happened no later than the end of kernel initialization. I verified using md5sum that the kernels in question really were identical, which stumped me for a while, since nothing really executes before the kernel except the bootloader. I then remembered that modern GRUB has a plethora of modules, including lots of video and framebuffer drivers, while syslinux, isolinux, and pxelinux (the bootloaders in use for USB, CD, and network booting, respectively) have no equivalent functionality.
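The md5sum check on the kernels can be wrapped up as follows. Again, this is a sketch under assumptions: `same_md5` is a name I made up, and the kernel paths in the usage note are placeholders rather than the real BCCD layout. It succeeds only when every file given has the same digest.

```shell
# Hypothetical helper: succeed (exit 0) only if every file given has
# the same MD5 digest. Exits 1 if digests differ, 2 if md5sum fails
# (for example on a missing or unreadable file).
# Usage: same_md5 FILE...
same_md5() {
    sums=$(md5sum "$@") || return 2
    # md5sum prints "<digest>  <name>"; count distinct digests.
    printf '%s\n' "$sums" | awk '
        { seen[$1] = 1 }
        END {
            n = 0
            for (d in seen) n++
            exit (n == 1 ? 0 : 1)
        }'
}
```

With the kernel images from each stage copied somewhere convenient (placeholder paths), the check would look something like `same_md5 live/vmlinuz diskless/vmlinuz liberated/vmlinuz && echo "kernels identical"`.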
Remembering that v3.4.0 had working X11 even in LIBERATED mode, I booted that up and tried to figure out the differences between v3.4.0 and v3.3.2. It turns out that v3.3.2 uses GRUB 1.99 while v3.4.0 uses GRUB 0.97 (I'm not sure why this is yet). I snarfed up the v3.4.0 GRUB configs, dropped them into my v3.3.2 build, and X11 started working without fuss after a reboot.
Tomorrow I'll work on getting this fix into the v3.3.2 builds.