GRUB couldn't find the kernel after applying update 162 and rebooting with a file system check

To my great surprise, my IPFire box no longer booted after I applied update 162 and rebooted with a forced file system check. My setup has Pakfire pointing to the Testing repository. For whatever reason, grub.cfg was still pointing to kernel 5.10.76-ipfire instead of 5.15.6-ipfire. The fix was simple: take the drive out, edit grub.cfg from another system, put the drive back, and boot. (I also ran fsck on the partitions while I was at it.) The box runs on an ASRock motherboard, which looks like it is UEFI, if that helps.
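For anyone who wants to check for this mismatch before rebooting, here is a rough Python sketch. It assumes the failure mode is a grub.cfg entry naming a kernel image that is no longer present in /boot, and that kernel images are named vmlinuz-<version> as in the posts here. The function names are made up for illustration and are not part of IPFire or Pakfire.

```python
import re

# Matches kernel image references such as "/vmlinuz-5.15.6-ipfire".
KERNEL_REF = re.compile(r"vmlinuz-([\w.\-]+)")


def kernels_in_grub_cfg(cfg_text):
    """Return the kernel versions referenced in grub.cfg, in order, without duplicates."""
    seen = []
    for version in KERNEL_REF.findall(cfg_text):
        if version not in seen:
            seen.append(version)
    return seen


def missing_kernels(cfg_text, boot_files):
    """Return versions that grub.cfg references but that have no vmlinuz-<version> file.

    boot_files is a plain list of file names, e.g. the result of os.listdir("/boot").
    """
    prefix = "vmlinuz-"
    present = {name[len(prefix):] for name in boot_files if name.startswith(prefix)}
    return [v for v in kernels_in_grub_cfg(cfg_text) if v not in present]
```

To use it on a real box you would pass in the contents of grub.cfg (on my guess at the usual location, /boot/grub/grub.cfg, which may differ on your install) and the list from os.listdir("/boot"). A non-empty result means GRUB would try to boot a kernel that is not there.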

I would be happy to look at the right logs to understand what happened and possibly help the community. Which log should I look at to find out what happened to the GRUB update tasks?

Edit: I thought I was updating to 160, but then read the blog post about the 162 release, which includes kernel 5.15. Updated the post to reflect the right version.

My generic x86_64 box upgraded to core 162 testing, without issues. grub.cfg is set to boot /vmlinuz-5.15.6-ipfire and “uname -a” reports the same.

With the plethora of hardware in use, occasional hiccups occur.

I’m starting to think it’s a hardware issue that happened at the “wrong” moment. The system has frozen hard twice in a 24-hour period since I manually updated grub.cfg to point to the new kernel. Not sure what’s causing it at the moment, but it is most likely unrelated to IPFire.

This is not the first report of a grub.cfg that was not updated after a core update that ships a new kernel.
I have already added a filesystem ‘sync’ as the last command of the update script.
But I cannot reproduce it yet and have no idea what causes it.

Which filesystem did you use?
Any clues in the pakfire logs? (/opt/pakfire/logs/*)

This happened to me too yesterday, and it wasn’t “testing” but the full update. The update (which was run from the web UI over a VPN link, which might be suboptimal) just stopped, and when I remoted into the machine (IPFire runs in a VM) it had stopped at GRUB. Notably, I don’t think I actually restarted the machine, so it might have segfaulted during the update as well; fsck ran when I got into it, which seems consistent with that theory. I got into it by changing to the new kernel in the emergency GRUB editor (in two places) and just booting. After boot, a pakfire update/pakfire upgrade from the command line seems to have fixed the rest.

The logs (in reverse):

01:24:01 pakfire: PAKFIRE UPGR: core-upgrade-162: Upgrading files and running post-upgrading scripts…
01:24:01 pakfire: DECRYPT FINISHED: core-upgrade-162 - Status: 0
01:24:00 pakfire: DECRYPT STARTED: core-upgrade-162
01:24:00 pakfire: CLEANUP: tmp
01:24:00 pakfire: PAKFIRE UPGR: core-upgrade-162: Decrypting…
01:24:00 pakfire: DOWNLOAD FINISHED: pub/network/security/ipfire/pakfire2/2.27-x86_64/paks/core-upgrade-2.27-162.ipfire
01:24:00 pakfire: DOWNLOAD INFO: Signature of core-upgrade-2.27-162.ipfire is fine.
01:23:59 pakfire: DOWNLOAD INFO: File received. Start checking signature…
01:23:59 pakfire: DOWNLOAD INFO: HTTP-Status-Code: 200 - 200 OK
01:23:45 pakfire: DOWNLOAD INFO: pub/network/security/ipfire/pakfire2/2.27-x86_64/paks/core-upgrade-2.27-162.ipfire has size of 80142736 bytes

…and kernel:

01:24:06 kernel: <27>udevd[542]: specified group ‘kvm’ unknown
01:24:06 kernel: <27>udevd[542]: specified group ‘render’ unknown
01:24:06 kernel: <27>udevd[542]: specified group ‘input’ unknown

/opt/pakfire/logs/update-core-upgrade-162.log seems to contain two updates indeed; the first one stops after the line
var/ipfire/dhcpc/dhcpcd-hooks/01-test
and then come 913 NUL characters.
Then the second (successful) update starts.
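In case someone else wants to check their own log for the same pattern, this is roughly how the NUL run can be located. It is only a sketch in Python; find_nul_runs is a name I made up, and it works on the raw bytes of the file rather than on anything Pakfire-specific.

```python
import re


def find_nul_runs(data, min_len=1):
    """Return (offset, length) for every run of NUL bytes that is at least min_len long."""
    return [
        (match.start(), len(match.group()))
        for match in re.finditer(rb"\x00+", data)
        if len(match.group()) >= min_len
    ]
```

You would read the log in binary mode, e.g. open(path, "rb").read(), and pass the result in. A long run in the middle of the file, like the one described above, suggests the first update run was cut off before its pending writes reached the disk.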