Core update 182 caused Grub boot failure

FWIW, I eventually used the 182 flash image to reinstall IPFire, and then restored a recent backup. That worked without problems and obviated the repair issue.

Thanks @shyciii, but as I remember the Detect and Show menu only gave me options which did not boot.

I had IPFire running previously (in hindsight I think I had booted it from a fedora server USB stick similarly to how Supergrub works) and had run install-bootloader and updated the grub.cfg but the problem is exactly the same.

I’ve run out of ideas, but unlike Lincoln, I can’t afford to reinstall because my installation is complex and I don’t have the time to go through all the steps it would take to reproduce it all, even with various backups. My system runs fine once started, but boot is totally broken.

I’ve run into something similar on one firewall. I switched the machine from UEFI to Legacy and it booted without issue.

I have a hypothesis to answer your question.

This is what I have in my grub.cfg:

if [ x$feature_platform_search_hint = xy ]; then
    search --no-floppy --fs-uuid --set=root --hint-bios=hd0,msdos1 --hint-efi=hd0,msdos1 --hint-baremetal=ahci0,msdos1  4b1989dc-209b-4d7d-b581-4c64a1f9920b

Note: the UUID is not really mine.

When GRUB executes the search command, it uses the hints provided to locate the boot partition. The hints (--hint-bios, --hint-efi, --hint-baremetal) are essentially suggestions to GRUB about where it might find the partition in different boot modes or system configurations. My guess is that if the search using the BIOS hint (--hint-bios=hd0,msdos1) fails, GRUB then tries the EFI hint (--hint-efi=hd0,msdos1), which finds the files but fails to boot because an EFI boot does not match the firmware's legacy mode.

If this conjecture is true, the problem could be due to a change in the disk layout, hardware alterations, or a misconfiguration, or the specific partition (hd0,msdos1 in my case) may be inaccessible or may have been modified.

Considering that you are using a KVM VM, there are further complications that can compromise the boot process after the GRUB update. For example, is the KVM VM set up to use BIOS? Is there a problem due to the VM disk access mode? Could a change in the host system be involved? Are there filesystem-level issues in the VM? What about the underlying partition on the host? Finally, could it be a new bug in GRUB revealed by your particular KVM setup?

Can you try to manually issue a bare-metal search command (assuming hint-bios fails) in the GRUB console? For example, using the bare metal option:

search --no-floppy --fs-uuid --set=root --hint-baremetal=ahci0,msdos1 <UUID>

followed by the linux and initrd commands and finally boot.
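
For anyone following along, the full console sequence can be sketched as below. This is only an illustration: the helper name print_manual_boot_commands is made up, the insmod lines and kernel parameters are copied from the generated grub.cfg posted later in this thread (an XFS /boot on an msdos partition table is assumed), and the UUIDs and kernel version are placeholders that must be replaced with the real values.

```shell
#!/bin/sh
# Hypothetical helper: prints the commands to type at the GRUB console.
# It does not boot anything itself.
#   $1 = UUID of the /boot filesystem
#   $2 = UUID of the root filesystem
#   $3 = kernel version suffix, e.g. 6.1.61-ipfire
print_manual_boot_commands() {
    boot_uuid="$1"
    root_uuid="$2"
    kver="$3"
    cat <<EOF
insmod part_msdos
insmod xfs
search --no-floppy --fs-uuid --set=root --hint-baremetal=ahci0,msdos1 $boot_uuid
linux /vmlinuz-$kver root=UUID=$root_uuid ro panic=10 rd.auto
initrd /initramfs-$kver.img
boot
EOF
}

# Placeholder values only; substitute the ones from your own system.
print_manual_boot_commands "<BOOT-UUID>" "<ROOT-UUID>" "<KERNEL-VERSION>"
```

If the search line fails at the console, that would at least narrow the problem down to GRUB not seeing the partition rather than to the kernel or initramfs.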

You never tried to reinstall GRUB to the disk MBR with grub-install /dev/sdX. Could this solve the issue?

It’s a long thread and a bit complicated. If you read it from the beginning you will notice that @dnl has always used legacy mode. The problem is that after the GRUB update in the latest IPFire version, GRUB fails to locate the partition using the hint-bios method, finds the partition using the hint-efi method, but then fails to boot due to the mismatch (at least this is my understanding of the issue).

@cfusco thanks for wading in to this lengthy thread!

I can’t easily image the entire hard disk, but will spare you the details here.

The /usr/bin/install-bootloader script, which I ran previously, is an IPFire written shell script which includes various grub-install commands. So I have done that and it didn’t resolve the problem. (Interestingly the copy of that script on my system is different to a copy from a newly built test VM, although not in a way which would affect this problem).

You may be on to something about Grub being unable to find the partition though. The SSD in my IPFire system is reporting:

Device     Boot    Start       End   Sectors   Size Id Type
/dev/sda1  *        2048    264191    262144   128M 83 Linux
/dev/sda2         264192    329727     65536    32M ef EFI (FAT-12/16/32)
/dev/sda3         329728   2292233   1962506 958.3M 82 Linux swap / Solaris
/dev/sda4        2293760 125045423 122751664  58.5G  5 Extended
/dev/sda5        2295808  96667647  94371840    45G 83 Linux
/dev/sda6       96669696 125045423  28375728  13.5G 83 Linux
GPT PMBR size mismatch (31955 != 30285823) will be corrected by write.
The backup GPT table is not on the end of the device.

EDIT: Ignore that. When I run parted I am prompted:

Warning: Not all of the space available to /dev/sdb appears to be used, you can
fix the GPT to use all of the space (an extra 30253868 blocks) or continue with
the current setting?

Note that it says /dev/sdb, which is the USB stick that was left short-partitioned when the supergrub image was copied to it. So sadly this is not the cause of my problem.

Neither fdisk -l /dev/sda nor parted /dev/sda unit s print reports any problems.
END EDIT

I’d really like to be able to trigger a full rebuild of the initramfs and everything, but I’m unsure how to do that in this Linux distro.

EDIT: Came across the steps in this historic bug.

EDIT2: Regenerating the initramfs and running a grub-install --no-floppy --recheck --force --target=i386-pc /dev/sda didn’t change a thing. Exactly the same problem!

@cfusco I think you’re right - the system is failing to boot in legacy BIOS mode, so it is attempting to fall back(?!) to EFI, which is unable to boot. It then opens the grub rescue> prompt.

The BIOS is pretty limited and contains no obvious EFI configuration, unlike a variety of other computers I’ve worked with, however it does indicate that it has “UEFI 2.4 compliance” and offers a UEFI shell for what that’s worth.
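
Once the system is up (for example via a boot stick), one way to see which firmware path the running kernel came from is to check for /sys/firmware/efi, which Linux generally only creates after a UEFI boot. A minimal sketch; the function name boot_mode and its optional argument (an alternative sysfs root, only useful for testing) are made up for illustration and are not part of IPFire:

```shell
#!/bin/sh
# Report how the *running* kernel was started.
# /sys/firmware/efi is normally present only after a UEFI boot.
#   $1 = optional sysfs root (hypothetical test hook), defaults to /sys
boot_mode() {
    sysfs_root="${1:-/sys}"
    if [ -d "$sysfs_root/firmware/efi" ]; then
        echo "efi"
    else
        echo "bios"
    fi
}

boot_mode
```

Note that this only reflects how the current session was booted. If the machine was started through a rescue stick, the result describes that boot path, not what the GRUB on /dev/sda would do on its own.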

Full copy of grub.cfg follows (looks standard to me!)

#
# DO NOT EDIT THIS FILE
#
# It is automatically generated by grub-mkconfig using templates
# from /etc/grub.d and settings from /etc/default/grub
#

### BEGIN /etc/grub.d/00_cloud ###
### END /etc/grub.d/00_cloud ###

### BEGIN /etc/grub.d/00_header ###
if [ -s $prefix/grubenv ]; then
  load_env
fi
if [ "${next_entry}" ] ; then
   set default="${next_entry}"
   set next_entry=
   save_env next_entry
   set boot_once=true
else
   set default="${saved_entry}"
fi

if [ x"${feature_menuentry_id}" = xy ]; then
  menuentry_id_option="--id"
else
  menuentry_id_option=""
fi

export menuentry_id_option

if [ "${prev_saved_entry}" ]; then
  set saved_entry="${prev_saved_entry}"
  save_env saved_entry
  set prev_saved_entry=
  save_env prev_saved_entry
  set boot_once=true
fi

function savedefault {
  if [ -z "${boot_once}" ]; then
    saved_entry="${chosen}"
    save_env saved_entry
  fi
}

function load_video {
  if [ x$feature_all_video_module = xy ]; then
    insmod all_video
  else
    insmod efi_gop
    insmod efi_uga
    insmod ieee1275_fb
    insmod vbe
    insmod vga
    insmod video_bochs
    insmod video_cirrus
  fi
}

if [ x$feature_default_font_path = xy ] ; then
   font=unicode
else
insmod part_msdos
insmod xfs
set root='hd0,msdos5'
if [ x$feature_platform_search_hint = xy ]; then
  search --no-floppy --fs-uuid --set=root --hint-bios=hd0,msdos5 --hint-efi=hd0,msdos5 --hint-baremetal=ahci0,msdos5  6aef81a5-4a87-4c4d-bded-71320c47231a
else
  search --no-floppy --fs-uuid --set=root 6aef81a5-4a87-4c4d-bded-71320c47231a
fi
    font="/usr/share/grub/unicode.pf2"
fi

if loadfont $font ; then
  set gfxmode=auto
  load_video
  insmod gfxterm
  set locale_dir=$prefix/locale
  set lang=en_US
  insmod gettext
fi
terminal_output gfxterm
insmod part_msdos
insmod xfs
set root='hd0,msdos1'
if [ x$feature_platform_search_hint = xy ]; then
  search --no-floppy --fs-uuid --set=root --hint-bios=hd0,msdos1 --hint-efi=hd0,msdos1 --hint-baremetal=ahci0,msdos1  84a89654-8aa2-4551-a916-090de2501475
else
  search --no-floppy --fs-uuid --set=root 84a89654-8aa2-4551-a916-090de2501475
fi
insmod png
background_image -m stretch /grub/splash.png
if [ x$feature_timeout_style = xy ] ; then
  set timeout_style=menu
  set timeout=5
# Fallback normal timeout code in case the timeout_style feature is
# unavailable.
else
  set timeout=5
fi
### END /etc/grub.d/00_header ###

### BEGIN /etc/grub.d/10_linux ###
menuentry 'IPFire 2.27 (x86_64) - core182 GNU/Linux' --class ipfire --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-simple-6aef81a5-4a87-4c4d-bded-71320c47231a' {
	load_video
	set gfxpayload=keep
	insmod gzio
	insmod part_msdos
	insmod xfs
	set root='hd0,msdos1'
	if [ x$feature_platform_search_hint = xy ]; then
	  search --no-floppy --fs-uuid --set=root --hint-bios=hd0,msdos1 --hint-efi=hd0,msdos1 --hint-baremetal=ahci0,msdos1  84a89654-8aa2-4551-a916-090de2501475
	else
	  search --no-floppy --fs-uuid --set=root 84a89654-8aa2-4551-a916-090de2501475
	fi
	echo	'Loading Linux 6.1.61-ipfire ...'
	linux	/vmlinuz-6.1.61-ipfire root=UUID=6aef81a5-4a87-4c4d-bded-71320c47231a ro panic=10 rd.auto 
	echo	'Loading initial ramdisk ...'
	initrd	/initramfs-6.1.61-ipfire.img
}
submenu 'Advanced options for IPFire 2.27 (x86_64) - core182 GNU/Linux' $menuentry_id_option 'gnulinux-advanced-6aef81a5-4a87-4c4d-bded-71320c47231a' {
	menuentry 'IPFire 2.27 (x86_64) - core182 GNU/Linux, with Linux 6.1.61-ipfire' --class ipfire --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-6.1.61-ipfire-advanced-6aef81a5-4a87-4c4d-bded-71320c47231a' {
		load_video
		set gfxpayload=keep
		insmod gzio
		insmod part_msdos
		insmod xfs
		set root='hd0,msdos1'
		if [ x$feature_platform_search_hint = xy ]; then
		  search --no-floppy --fs-uuid --set=root --hint-bios=hd0,msdos1 --hint-efi=hd0,msdos1 --hint-baremetal=ahci0,msdos1  84a89654-8aa2-4551-a916-090de2501475
		else
		  search --no-floppy --fs-uuid --set=root 84a89654-8aa2-4551-a916-090de2501475
		fi
		echo	'Loading Linux 6.1.61-ipfire ...'
		linux	/vmlinuz-6.1.61-ipfire root=UUID=6aef81a5-4a87-4c4d-bded-71320c47231a ro panic=10 rd.auto 
		echo	'Loading initial ramdisk ...'
		initrd	/initramfs-6.1.61-ipfire.img
	}
}

### END /etc/grub.d/10_linux ###

### BEGIN /etc/grub.d/20_linux_xen ###

### END /etc/grub.d/20_linux_xen ###

### BEGIN /etc/grub.d/25_bli ###
if [ "$grub_platform" = "efi" ]; then
  insmod bli
fi
### END /etc/grub.d/25_bli ###

### BEGIN /etc/grub.d/30_os-prober ###
### END /etc/grub.d/30_os-prober ###

### BEGIN /etc/grub.d/30_uefi-firmware ###
if [ "$grub_platform" = "efi" ]; then
	fwsetup --is-supported
	if [ "$?" = 0 ]; then
		menuentry 'UEFI Firmware Settings' $menuentry_id_option 'uefi-firmware' {
			fwsetup
		}
	fi
fi
### END /etc/grub.d/30_uefi-firmware ###

### BEGIN /etc/grub.d/40_custom ###
# This file provides an easy way to add custom menu entries.  Simply type the
# menu entries you want to add after this comment.  Be careful not to change
# the 'exec tail' line above.
### END /etc/grub.d/40_custom ###

### BEGIN /etc/grub.d/41_custom ###
if [ -f  ${config_directory}/custom.cfg ]; then
  source ${config_directory}/custom.cfg
elif [ -z "${config_directory}" -a -f  $prefix/custom.cfg ]; then
  source $prefix/custom.cfg
fi
### END /etc/grub.d/41_custom ###
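
To rule out a UUID mismatch, the UUIDs this grub.cfg searches for can be compared with what the disks actually report. A rough sketch, assuming the file lives at /boot/grub/grub.cfg and that blkid from util-linux is available; the function name grub_search_uuids is hypothetical:

```shell
#!/bin/sh
# List the unique filesystem UUIDs that grub.cfg passes to "search --fs-uuid".
# In a grub-mkconfig generated file the UUID is the last field of such lines.
#   $1 = path to grub.cfg (defaults to /boot/grub/grub.cfg)
grub_search_uuids() {
    cfg="${1:-/boot/grub/grub.cfg}"
    [ -r "$cfg" ] || return 1
    awk '$1 == "search" && /--fs-uuid/ { print $NF }' "$cfg" | sort -u
}

# Check each UUID against the block devices (usually needs root).
if [ -r /boot/grub/grub.cfg ]; then
    for uuid in $(grub_search_uuids); do
        blkid -U "$uuid" || echo "no block device found for UUID $uuid"
    done
fi
```

For the file above this should list two UUIDs, one for the /boot partition (hd0,msdos1) and one for the root partition (hd0,msdos5). If both resolve to the expected /dev/sda partitions, the configuration itself is probably not the culprit.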

I just noticed that SuperGrub also only detects EFI operating system boot options. It’s only when I choose to extract entries from grub.cfg that I can choose an IPFire boot option which works (with legacy boot).

Looks like there are two bugs in conjunction with grub-2.12rc:

  1. On some systems GRUB tries to boot in UEFI Secure Boot mode (shim), which needs signatures from Microsoft for shim, grub… This mode is not supported by IPFire.
  2. On systems with an XFS filesystem on the boot partition, GRUB does not seem to be installed correctly (always?).

I will update to the 2.12 final release in core183, which fixes at least the second issue. (I cannot test the first because this error never occurs on my UEFI system.)
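
To check whether a given box falls into the second category, the filesystem type of /boot can be read from /proc/mounts. A small sketch, assuming /boot is a separate mount as in the partition layout shown earlier; the function name fs_type_of and its optional second argument are invented for illustration:

```shell
#!/bin/sh
# Print the filesystem type mounted at a given mount point.
# /proc/mounts fields: device, mount point, fs type, options, dump, pass.
#   $1 = mount point, e.g. /boot
#   $2 = mounts file (hypothetical test hook), defaults to /proc/mounts
fs_type_of() {
    mountpoint="$1"
    mounts="${2:-/proc/mounts}"
    [ -r "$mounts" ] || return 1
    # Keep the last match, since a later mount hides an earlier one.
    awk -v mp="$mountpoint" '$2 == mp { t = $3 } END { if (t != "") print t }' "$mounts"
}

boot_fs=$(fs_type_of /boot || true)
echo "/boot filesystem: ${boot_fs:-not a separate mount}"
```

A result of xfs would suggest the system is a candidate for the second bug; anything else points back toward the first one or something different.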

This is strange because the RPi4 should boot via u-boot (except for some special installations on USB disks that fall back to GRUB). But none of them has an XFS filesystem or Secure Boot.

Thanks very much @arne_f ! I do use XFS.

I’ve previously raised https://bugzilla.ipfire.org/show_bug.cgi?id=13509 for this issue.

I have updated grub in the next tree.

You can try this build: Index of /next/2024-01-10 06:26:25 +0000-a2af8c71, or get it via pakfire “unstable” (only x86_64 has finished building so far).

Thank you very much. I’ll be able to test this in a day.

Can I change back to the stable release after using testing here?

Thank you

It is possible to switch back, but on stable setups it may not be a good idea because of possible inconsistencies and leftovers. core183 also has a new kernel, so it may be a bit dangerous.

We have rolled back grub in core182 and updated the updater (but the corrected version may not be on all mirrors yet, so you should wait a bit before trying again on a stable install).

Thanks @arne_f . Unfortunately I cannot see the updated version of core182 now, even if I manually refresh the list in pakfire, many hours after your post. I would have expected it to be available on all mirrors now.

You may have already tried this. If you have, speak up first before running a second time.

Do this:

And then change the dropdown from Testing to Stable.

I fixed my boot by booting manually (SuperGrub would work well), then rolling back the version to 181 in the configuration:

echo 181 > /opt/pakfire/db/core/mine

then updating (since the current core 182 has a version of grub that works):

pakfire update && pakfire upgrade

and finally updating the bootloader (these last two commands might be superfluous but would not hurt):

/usr/bin/install-bootloader
grub-mkconfig -o /boot/grub/grub.cfg
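
The same steps can be collected into one script. This is only a sketch of the sequence above, not an official IPFire tool: the run wrapper, the RUN variable and the function name rollback_and_reapply_core are assumptions added so that the script prints the commands by default instead of executing them.

```shell
#!/bin/sh
# Dry run by default: each command is printed with a leading "+".
# Set RUN=1 (as root, on a system that was already booted manually)
# to really execute them.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# $1 = core version to pretend we are on, e.g. 181
rollback_and_reapply_core() {
    previous_core="$1"
    # Tell pakfire we are on the previous core so it reapplies the update.
    run sh -c "echo $previous_core > /opt/pakfire/db/core/mine" || return 1
    run pakfire update || return 1
    run pakfire upgrade || return 1
    # Possibly superfluous, as noted above, but harmless.
    run /usr/bin/install-bootloader || return 1
    run grub-mkconfig -o /boot/grub/grub.cfg
}

rollback_and_reapply_core 181
```

Run it once without RUN=1 to review what would happen, and only then with RUN=1. The script stops at the first failing step, mirroring the && in the original commands.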

Thanks for explaining how to trigger an update to reinstall @dejan ! That’s the information I’d been waiting on (sorry Jon, your advice was unhelpful as Arne had just warned us not to use testing).

Unfortunately, modifying that text DB file and reinstalling 182 did not fix the problem on my system. It’s still trying to boot EFI for some reason, and I can’t get it to boot legacy BIOS without using a boot stick (like Supergrub) and having that load an entry from grub.cfg.

Also now the problem with the console font (or display?) is back. I’ll update the bugzilla ticket. (EDIT: This happened once before but I rebuilt Grub again and it went away, so I removed it from this thread. The thread is just too long sorry)

Looks like a buggy video BIOS on an Intel onboard graphics chip. The grub shipped with IPFire has a patch for this

https://git.ipfire.org/?p=ipfire-2.x.git;a=blob;f=src/patches/grub/grub-2.02_disable_vga_fallback.patch;h=0cf30cff4899ba6b1fb7af336af2cb81ecef8779;hb=7270984c460653f2215271b86286f74e6e9fb6ca

but if you replace it with a different build it fails again.
