After update ipfire to 157, no boot

Hey Jon, my log/zip file is here, BTW:
https://community.ipfire.org/t/after-update-ipfire-to-157-no-boot/5641/10?u=neopegasus

Hi All,
It happened to me too. I have a relatively new firewall appliance with an Intel J3160 processor. I purchased the device in January of this year, so it has not been running long. At the initial install I would have put the latest version on the device, so it would have been a January 2021 version (not sure which release). To recover, I had to do a new install followed by a backup restore to get it up and running again. The profile is 052661e7f9c06c03afdb1fb849ac27cbbf7f860e, assuming it does not change after a complete rebuild. (Which raises a question: does the profile ID change after a complete rebuild of the system? Just curious.) On the other hand, my backup firewall device, which has an i3-4005U processor, updated just fine (it started at release 154); its profile ID is 3dce541a03a2724dab08ca9ccfd03875fc3dc899.

I am not sure that telling you the disk allocation after recovery will tell you anything, as I have no way to determine whether things changed from before the Core 157 upgrade.

I hope this helps, P

@pmueller: my input for consideration as you are narrowing in on the root cause. I purchased my IPFire Mini Appliance about one year ago. Your assumption that it was running an old installation (more than 4 years) is still possible, but it would be strange.

I can’t provide evidence, as I reinstalled right away, before I saw your request for logs.

I would recommend that everybody create a full image with Clonezilla before installing a new Core Update.

Yes, it’s additional downtime, but it’s worth it!

Greetz

I did update my machine, which is an apu2. Everything worked as intended. The partitioning follows the latest IPFire scheme:

Filesystem      Size  Used Avail Use% Mounted on
devtmpfs        2.0G  4.0K  2.0G   1% /dev
tmpfs           2.0G   12K  2.0G   1% /dev/shm
tmpfs           2.0G  584K  2.0G   1% /run
/dev/sda4        54G  3.0G   49G   6% /
/dev/sda1       110M   35M   67M  34% /boot
/dev/sda2        32M  254K   32M   1% /boot/efi
/var/lock       8.0M   12K  8.0M   1% /var/lock

Over the weekend I will check my friend’s apu1 IPFire machine and see what happened. To be precise, I only know that it did not boot after the update. Maybe it is not this issue but something else. I will report here asap.

For now, I can only state that my update went well, without any problem.

Hi all,

thanks for your replies. Looks like we can rule out the /boot partition size issue.

While upgrading to Core Update 157, we reconfigure Grub (see this for details), which should look like this in /var/log/pakfire/update-core-upgrade-157.log:

Generating grub configuration file ...
Found background: /boot/grub/splash.png
Found linux image: /boot/vmlinuz-4.14.232-ipfire
Found initrd image: /boot/initramfs-4.14.232-ipfire.img
done

However, in the log file provided by @neopegasus (thanks), it looks like Grub missed the kernel:

Generating grub configuration file ...
Found background: /boot/grub/splash.png
done

Why that is, I have no idea at the moment. Apparently, storage size, partition layout and age of the IPFire installation are not the root cause of this.
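For anyone who wants to compare their own log against the two snippets above, here is a minimal sketch. It assumes the log lines look exactly as quoted in this thread; the function name is made up for illustration.

```shell
# Succeed if the file contains at least one "Found linux image:" line,
# i.e. grub-mkconfig reported a kernel.
grub_found_kernel() {
    grep -q 'Found linux image: ' "$1"
}

# Path as quoted in this thread; only checked if the log is readable.
log=/var/log/pakfire/update-core-upgrade-157.log
if [ -r "$log" ]; then
    if grub_found_kernel "$log"; then
        echo "Grub reported a kernel"
    else
        echo "no kernel reported, check before rebooting"
    fi
fi
```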

The only thing I can think of currently is some extremely slow storage, as we do not conduct a sync after running extract_files() - but I don’t know whether this could possibly (in terms of file system internals) cause Grub to miss the kernel files completely.

@neopegasus, @angrytux, @thier28, @cfusco: You are not running IPFire on SD or CompactFlash cards, are you? (At least the IPFire Mini appliances should not.) :slight_smile:

Clonezilla is not necessary for this. The web interface is capable of creating backups as well, and does so automatically before installing a Core Update.

@cwensink: Could you run grub-mkconfig -o /boot/grub/grub.cfg manually on one of your systems that has Core Update 157 installed but has not been rebooted since? If the output of that command looks like the first log snippet above, you can safely reboot. If not, please let us know. :slight_smile:
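As an optional extra check, the paths Grub reports can be tested for existence. This is only a sketch: both function names are hypothetical, and it assumes the "Found ... image:" message format shown above. As far as I know grub-mkconfig prints these messages to stderr, hence the 2>&1 in the usage comment.

```shell
# Read grub-mkconfig messages on stdin and print every path reported in
# "Found linux image:" and "Found initrd image:" lines.
reported_images() {
    sed -n -e 's/^Found linux image: //p' -e 's/^Found initrd image: //p'
}

# Read paths on stdin; succeed only if at least one path was given and
# every path is an existing, non-empty file.
images_ok() {
    count=0
    while IFS= read -r path; do
        [ -s "$path" ] || return 1
        count=$((count + 1))
    done
    [ "$count" -gt 0 ]
}

# Usage (run as root on the firewall itself):
#   grub-mkconfig -o /boot/grub/grub.cfg 2>&1 | reported_images | images_ok \
#       && echo "kernel and initrd reported and present"
```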

Thanks, and best regards,
Peter Müller

Looks like I am safe, here is the output for both systems

[root@ipfire ~]# grub-mkconfig -o /boot/grub/grub.cfg
Generating grub configuration file ...
Found background: /boot/grub/splash.png
Found linux image: /boot/vmlinuz-4.14.232-ipfire
Found initrd image: /boot/initramfs-4.14.232-ipfire.img
done
[root@ipfire ~]#

Hi,

may I ask why? Hardware failures? Was the Grub issue reproducible?

Thanks, and best regards,
Peter Müller

Hi,

all right, I wish you all the best and keep my fingers crossed. :slight_smile:

Thanks, and best regards,
Peter Müller

Peter,

For those who are affected by this issue, is there going to be a 157.1 update or a 158 update published to fix it once the root cause is identified?

Chris

Hi,

I cannot answer your question at the moment as we do not even have an educated guess about what the root cause is. Sorry.

Thanks, and best regards,
Peter Müller

For my part, no: mSATA SSD.

Hi Peter,

No, before the problem IPFire was running on a normal but old (2008) WD SATA HDD. Because I noticed how old the HDD was when I took it out, I decided to replace it with an SSD.

The HDD was 350GB.
The SSD is 500GB.

The HDD is still in the box as a backup in case something goes wrong.

Best regards.

Neopegasus.

Peter,
My terminal emulation (serial, macOS) behaved weirdly. Menus were barely readable. I played around with the terminal config, but without success. There was also an error about an unsupported graphics mode, but I did not capture the message :frowning: After some trial and error I was able to start the installation. From then on, the menus were readable and the installation went smoothly. No more issues with Grub.

Sh… happens. I will consider backup hardware.

Hi all,

so bad flash storage isn’t it, either.

To be honest, I am out of ideas by now. :frowning: If anybody comes up with a reasonable one, please let us know.

Thanks, and best regards,
Peter Müller

I think it’s a bunch of uncorrelated issues involving several people, including myself. I believe this has nothing to do with the 157 release.

I posted in this thread after a friend of mine informed me that his IPFire (apu1) could not boot after the 156 to 157 update. So I wrote here (big mistake) before checking what was really going on. My apologies for wasting your time, @pmueller.

After I went ahead and decided to update, my own machine updated without any problem. My friend’s machine, using his backup hard disk, did the same. It turned out that my friend’s main hard disk did not boot properly because of a file system error:

EXT4-fs (sda4): mounted filesystem with ordered data mode. Opts: (null)
dracut: Checking ext4: /dev/disk/by-uuid/cea89658-1ad9-49dc-bded-c7f1184ac72b
dracut: issuing e2fsck -a  /dev/disk/by-uuid/cea89658-1ad9-49dc-bded-c7f1184ac72b
random: fast init done
ata1.00: READ LOG DMA EXT failed, trying PIO
ata1.00: exception Emask 0x0 SAct 0x7ffff8 SErr 0x0 action 0x0
ata1.00: irq_stat 0x40000008
ata1.00: failed command: READ FPDMA QUEUED
ata1.00: cmd 60/28:20:d6:6b:d5/01:00:00:00:00/40 tag 4 ncq dma 151552 in
         res 53/40:08:f6:6c:d5/00:00:00:00:00/40 Emask 0x409 (media error) <F>
ata1.00: status: { DRDY SENSE ERR }
ata1.00: error: { UNC }
ata1.00: configured for UDMA/133
sd 0:0:0:0: [sda] tag#4 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
sd 0:0:0:0: [sda] tag#4 Sense Key : Medium Error [current] 
sd 0:0:0:0: [sda] tag#4 Add. Sense: Unrecovered read error - auto reallocate failed
sd 0:0:0:0: [sda] tag#4 CDB: Read(10) 28 00 00 d5 6b d6 00 01 28 00
print_req_error: I/O error, dev sda, sector 13987062
ata1: EH complete
ata1.00: exception Emask 0x0 SAct 0x60000003 SErr 0x0 action 0x0
ata1.00: irq_stat 0x40000008
ata1.00: failed command: READ FPDMA QUEUED
ata1.00: cmd 60/02:e8:f6:6c:d5/00:00:00:00:00/40 tag 29 ncq dma 1024 in
         res 53/40:02:f6:6c:d5/00:00:00:00:00/40 Emask 0x409 (media error) <F>
ata1.00: status: { DRDY SENSE ERR }
ata1.00: error: { UNC }
ata1.00: configured for UDMA/133
sd 0:0:0:0: [sda] tag#29 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
sd 0:0:0:0: [sda] tag#29 Sense Key : Medium Error [current] 
sd 0:0:0:0: [sda] tag#29 Add. Sense: Unrecovered read error - auto reallocate failed
sd 0:0:0:0: [sda] tag#29 CDB: Read(10) 28 00 00 d5 6c f6 00 00 02 00
print_req_error: I/O error, dev sda, sector 13987062
Buffer I/O error on dev sda4, logical block 6326104, async page read
ata1: EH complete
ata1.00: exception Emask 0x0 SAct 0x1000000 SErr 0x0 action 0x0
ata1.00: irq_stat 0x40000008
ata1.00: failed command: READ FPDMA QUEUED
ata1.00: cmd 60/02:c0:f6:6c:d5/00:00:00:00:00/40 tag 24 ncq dma 1024 in
         res 53/40:02:f6:6c:d5/00:00:00:00:00/40 Emask 0x409 (media error) <F>
ata1.00: status: { DRDY SENSE ERR }
ata1.00: error: { UNC }
ata1.00: configured for UDMA/133
sd 0:0:0:0: [sda] tag#24 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
sd 0:0:0:0: [sda] tag#24 Sense Key : Medium Error [current] 
sd 0:0:0:0: [sda] tag#24 Add. Sense: Unrecovered read error - auto reallocate failed
sd 0:0:0:0: [sda] tag#24 CDB: Read(10) 28 00 00 d5 6c f6 00 00 02 00
print_req_error: I/O error, dev sda, sector 13987062
Buffer I/O error on dev sda4, logical block 6326104, async page read
ata1: EH complete


/dev/disk/by-uuid/cea89658-1ad9-49dc-bded-c7f1184ac72b: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
        (i.e., without -a or -p options)
dracut Warning: e2fsck returned with 4
dracut Warning: /dev/disk/by-uuid/cea89658-1ad9-49dc-bded-c7f1184ac72b contains a file system with errors, check forced.
dracut Warning: Error reading block 1581526 (Input/output error) while reading directory block.
dracut Warning: *** An error occurred during the file system check.
dracut Warning: *** Dropping you to a shell; the system will try
dracut Warning: *** to mount the filesystem(s), when you leave the shell.
dracut Warning: filesystem)


Generating "/run/initramfs/rdsosreport.txt"
You might want to save "/run/initramfs/rdsosreport.txt" to a USB stick or /boot
after mounting them and attach it to a bug report.

To get more debug information in the report,
reboot with "rd.debug" added to the kernel command line.

Dropping to debug shell.

(Repair:/# 

Thanks for helping us out. I will try to be more careful next time. I think we should close this thread.

Hi all,

for the record: The only thing that occurred to me regarding this problem was a missing sync statement in the extract_files() function (see here for its source code).

On systems that either

  • have extremely slow/bad (flash) storage or
  • run under very high I/O load,

this might be useful to ensure we have actually written the upgraded files to disk before letting Grub search for a new kernel.

It’s somewhat homeopathic, but I guess it is better than nothing. I will suggest this on the mailing list.

EDIT: Done, please refer to this patch, which resulted in this commit.
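To illustrate the ordering only (extract, flush, then let Grub look for the kernel), here is a rough sketch. It is not the actual extract_files() code or the patch: the function name, the archive name and the use of a plain tar archive are assumptions made for the example.

```shell
# Hypothetical stand-in for the update step: unpack an archive into a
# target directory, then flush pending writes to disk before anything
# else inspects the extracted files.
extract_and_sync() {
    archive=$1
    target=$2
    tar -xf "$archive" -C "$target" || return 1
    sync
}

# Usage (archive name is a placeholder):
#   extract_and_sync files.tar / && grub-mkconfig -o /boot/grub/grub.cfg
```

The point of the sketch is only that sync runs after a successful extraction and before grub-mkconfig; whether that makes a difference on a given system is exactly the open question in this thread.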

Thanks, and best regards,
Peter Müller

Hi @cfusco,

thank you for reporting back. :slight_smile:

Indeed, that disk seems to have reached the end of its technical lifetime. Input/output errors are never a good sign when it comes to mass storage…
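For a long kernel log like the one posted, a small filter can show which sectors keep failing. This is a sketch with a made-up function name; it assumes the "I/O error, dev ..., sector ..." message format seen in that log, which may differ between kernel versions.

```shell
# Read kernel messages on stdin and print "device sector count" for each
# distinct sector that appears in an "I/O error, dev X, sector N" line.
failing_sectors() {
    sed -n 's|.*I/O error, dev \([^,]*\), sector \([0-9][0-9]*\).*|\1 \2|p' |
        sort | uniq -c | awk '{ print $2, $3, $1 }'
}

# Usage:
#   dmesg | failing_sectors
# If smartmontools is installed, "smartctl -a /dev/sda" gives the drive's
# own view of reallocated and pending sectors.
```

Fed with the log posted in this thread, it prints `sda 13987062 3`: the same sector failed on every retry.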

There is absolutely no need to apologise: I am glad for this constructive discussion, people reporting back, providing logfiles - everything went exactly the way I like supporting/debugging to go. :slight_smile:

Given the fact that we still have two other users in this thread experiencing the same or a similar error, I still suspect some quirk being shipped with Core Update 157. It would be odd to see three hard drives die at the same time… :wink:

Thanks, and best regards,
Peter Müller

Hello all,
I was running 157 on my NanoPi R1 and now I can’t boot. This is what I have on the screen (see attached text file in zip). Can anyone help please?

Thanks in advance…

ipfire_error.zip (5.1 KB)

The output indicates that the ext4 filesystem on your main partition has errors.

If you are running from a uSD card, it is probably time to replace the card.

If you are running from eMMC, you could try installing to a uSD card, booting from that and running fsck on the eMMC partitions.
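When running fsck by hand, the exit status tells you what happened. Below is a small sketch that decodes it according to the exit codes documented in the e2fsck man page (the value is a sum of the listed conditions). The function name is made up, and the device name in the usage comment is a placeholder you need to replace.

```shell
# Print one line per condition encoded in an e2fsck exit status.
describe_fsck_status() {
    code=$1
    if [ "$code" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    if [ $((code & 1)) -ne 0 ]; then echo "errors corrected"; fi
    if [ $((code & 2)) -ne 0 ]; then echo "errors corrected, reboot recommended"; fi
    if [ $((code & 4)) -ne 0 ]; then echo "errors left uncorrected"; fi
    if [ $((code & 8)) -ne 0 ]; then echo "operational error"; fi
    if [ $((code & 16)) -ne 0 ]; then echo "usage or syntax error"; fi
    if [ $((code & 32)) -ne 0 ]; then echo "cancelled by user"; fi
    if [ $((code & 128)) -ne 0 ]; then echo "shared library error"; fi
}

# Usage (read-only check of an unmounted partition; device is a placeholder):
#   e2fsck -f -n /dev/mmcblkXpY; describe_fsck_status $?
```

A status of 4 means errors were found but left uncorrected.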