Core 170: general protection fault

Hello community,

I’ve been using IPFire for quite some time, on an older version, in a home environment.
It was running quietly on an older Black Dwarf box with a VIA CPU.

Two days ago I updated to the latest version (Core 170) and also changed the SSD, and since then I’m getting general protection faults regularly. (Example screen attached)
The whole system stops and I have to manually reboot the firewall.

How can I analyse the root cause of the GPF? I’m a novice Linux user and found nothing in the (kernel) logs that points me to the problem.
Could it be a hardware failure (RAM, SSD, etc.), or is it a problem with Core 170?

Thanks for any advice

I think the kernel function __switch_to is doing bad things with the memory. I have no idea whether this comes from a bug specific to your hardware, introduced or exposed by some kernel change when IPFire moved to Linux kernel 5.15.59, or whether there is some corrupted data on your SSD.

I’m not an expert at all. For this reason I would downgrade and watch carefully whether the problem goes away. You could also consider writing a bug report.

I hope you get a better suggestion from someone who understands these things far better than I do.

Thanks for the fast reply - I hadn’t realised before that ‘__switch_to’ is the kernel function!
So you’ve already helped a lot!
I’ll check whether it’s always the same function that triggers the GPF.

I will then fall back to an older version of IPFire and check for a while whether the problem reoccurs.
(I only switched to Core 170 because of the IP blocklist feature.)

(The other option is to change the SSD again)

Regards
Alex

How do you know it’s the SSD? What are the system components? Also, you haven’t told us anything about the behaviour. Does it crash right after boot-up, or does it fail after an unpredictable amount of time?

Hello Terry,

Sorry for being unclear. Here is the error description in a little more detail:

  • since switching to IPFire 2.27 Core 170, the system sporadically freezes after a while with a general protection fault
  • at the moment I can’t predict when the system freezes: sometimes shortly after boot, sometimes hours later
  • along with the IPFire version I also changed the physical SSD (because I had some trouble with the old one while erasing/partitioning), so this might be a cause of the error as well

I was asking the community in case someone has encountered a similar problem with Core 170, and to get a first hint on how to analyse the problem (I have never had a GPF before).

I will watch the problem a little longer, try falling back to the previous IPFire version and see whether it comes up there as well; in that case I would change the SSD back to the old one.

I’ll keep you informed. Thanks
Regards
Alex

Still no system information. What CPU? What chipset? I wonder if there are any systems out there since 2012. Also, I wonder if the VIA SATA controllers even support TRIM. Otherwise you will break every SSD quite quickly.

Hello Terry,
there is little information about the hardware available.
It is an old Terra Black Dwarf G2 firewall with a VIA Nano U3500 @ 1 GHz.

And it looks like this:
https://www.ais-computer.ch/shop/index.php?controller=attachment&id_attachment=843

Another IPFire user also had problems (but already during installation):

He also has a fireinfo profile linked.

I know the hardware is old, but I was just wondering, because my setup was running smoothly until I upgraded recently.

Btw: it‘s always the gpf with:
RIP: 0010:__switch_to+0x299/0x3d0
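To check this across several crashes rather than by eye, here is a minimal Python sketch. It assumes the oops text was captured somewhere as plain text (a saved kernel log, a serial console capture, or text typed off the screen photo); on a hard freeze the message may never reach the disk. The function name `count_faulting_functions` is made up for this illustration, and the sample line is the one quoted above.

```python
import re
from collections import Counter

# Matches the "RIP:" line of a kernel oops, for example:
#   RIP: 0010:__switch_to+0x299/0x3d0
# Captures the function name, the offset into it, and the function size.
RIP_RE = re.compile(
    r"RIP:\s+[0-9a-fA-F]{4}:(?P<func>[\w.]+)"
    r"\+(?P<off>0x[0-9a-fA-F]+)/(?P<size>0x[0-9a-fA-F]+)"
)


def count_faulting_functions(log_text):
    """Count how often each (function, offset) pair shows up on RIP lines."""
    counts = Counter()
    for match in RIP_RE.finditer(log_text):
        counts[(match.group("func"), match.group("off"))] += 1
    return counts


# Sample input: the single RIP line quoted in this thread.
SAMPLE = "RIP: 0010:__switch_to+0x299/0x3d0\n"

if __name__ == "__main__":
    for (func, off), n in count_faulting_functions(SAMPLE).most_common():
        print(f"{func}+{off}: {n} time(s)")
```

If every crash lands on the same function and offset, that hints at a reproducible code path on this CPU; widely scattered locations would fit random memory corruption better. That reading is a rule of thumb, not a diagnosis.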

I can’t answer the SATA TRIM question… ¯_(ツ)_/¯ I just know that several people run this hardware with firewall distros like pfSense, OPNsense and IPFire.

Kind regards
Alex

I don’t know the internals of the basic OS functions (__switch_to seems to be the task switcher), but I suppose there is a problem with the contents of the task descriptor. This may be caused by memory errors or disk errors (processes are loaded from disk).
Are you able to check your system? The cited article contains some info about doing this.

Hm, hard to tell which southbridge is used. Only the VT8237S, VT8251 and VT8261 should be able to run an SSD, because of their support for AHCI with NCQ and TRIM.

But what’s the media? HDD, SSD or CF card? Do you actually use a SATA SSD or a CF card?
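Both questions (rotating disk or flash, and whether TRIM reaches the drive) can be checked from the running Linux system through sysfs. The following is a minimal Python sketch, assuming the disk shows up as `sda` (an assumption, adjust as needed); the helper names are made up for this illustration. It reads `queue/rotational` and `queue/discard_max_bytes`, where a `discard_max_bytes` of 0 means the kernel sees no discard (TRIM) support on that device.

```python
from pathlib import Path


def read_queue_attr(dev, attr, sys_block=Path("/sys/block")):
    """Return an integer from /sys/block/<dev>/queue/<attr>, or None if unreadable."""
    path = Path(sys_block) / dev / "queue" / attr
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def describe_disk(dev, sys_block=Path("/sys/block")):
    """Summarise what the kernel reports about a block device."""
    rotational = read_queue_attr(dev, "rotational", sys_block)
    discard_max = read_queue_attr(dev, "discard_max_bytes", sys_block)
    return {
        "device": dev,
        # 1 = spinning disk, 0 = non-rotational (SSD, CF card, ...)
        "rotational": None if rotational is None else bool(rotational),
        # 0 bytes means discard/TRIM is not available on this device
        "discard_supported": None if discard_max is None else discard_max > 0,
    }


if __name__ == "__main__":
    # "sda" is an assumption; the device name may differ on your system.
    print(describe_disk("sda"))
```

This only shows what the kernel sees through the current controller and driver, so it says nothing about which southbridge is installed.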

As already mentioned you should test the RAM with MemTest86 and the drive with CheckDrive or CrystalDiskInfo.
