e1000e driver failing after Core Update 192

kernel: e1000e 0000:01:00.0 orange0: Detected Hardware Unit Hang:

Initially this was on the orange interface, but for the last few hours it seems to be on the green interface. It typically lasts a couple of minutes and then the connection comes back up - very frustrating!! Any suggestions on 1) what to look for next and 2) a resolution? (I will probably roll back if nothing is found right away.)

This has been a well-known problem with the Intel e1000e driver since 2013.

You can easily find lots of people having this problem by searching on the internet. It is not an IPFire specific issue.

The IPFire forum has these posts about it:

https://community.ipfire.org/t/e1000e-green0-detected-hardware-unit-hang/6324

https://community.ipfire.org/t/e10001-detected-hardware-unit-hang/13414

https://community.ipfire.org/t/after-the-core-158-update-the-green-network-card-says-goodbye-after-a-few-days/5896

https://community.ipfire.org/t/network-problem-with-e1000e-after-upgrade-from-core-173-to-191/13594

Several of the above posts mention the workaround, which is to disable some of the offloading functions on the NICs:

ethtool -K green0 gso off gro off tso off
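
To confirm the command took effect, the current offload state can be read back with `ethtool -k` (lower-case `k`). This is only a minimal sketch: the helper name `show_offloads` is made up for illustration, and it just filters the three top-level features that the command above switches off.

```shell
# Hypothetical helper: reduce `ethtool -k` output to the three offloads
# touched by the workaround (TSO, GSO, GRO). Reads from stdin.
show_offloads() {
    grep -E '^(tcp-segmentation-offload|generic-segmentation-offload|generic-receive-offload):'
}

# Usage on the firewall (interface name as in the command above):
#   ethtool -k green0 | show_offloads
```

After the workaround, all three lines should report `off`.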

When you reboot, the NICs will re-enable those functions, so you will need to add the command to the rc.local file in IPFire.

Look through the posts linked above and the IPFire documentation on rc.local:

https://www.ipfire.org/docs/pkgs/rc-local
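
A minimal sketch of what the rc.local addition could look like. The function name `disable_offloads` is hypothetical, and the interface list (green0 and orange0, the two mentioned in this thread) is an assumption: list only the interfaces that use the e1000e driver on your system. The check against /sys/class/net simply skips names that do not exist.

```shell
# Hypothetical helper for rc.local: switch off GSO, GRO and TSO
# on every interface name passed as an argument.
disable_offloads() {
    for iface in "$@"; do
        # Skip interfaces that are not present on this machine.
        [ -e "/sys/class/net/$iface" ] || continue
        ethtool -K "$iface" gso off gro off tso off
    done
}

# Assumption: green0 and orange0 are the affected e1000e interfaces.
disable_offloads green0 orange0
```

Keeping the interface names in one place makes it easy to add red0 or blue0 later if they turn out to be affected as well.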


Thanks - I thought I’d seen it before and figured there was a workaround - thanks much!!

Here the driver hang occurred after installing Core 198 (on Core 196 it works as expected).
I tried to apply the workaround and am now monitoring the system.
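
For the monitoring, one rough way to see whether hangs still occur is to count the kernel message quoted at the top of the thread. The helper name `count_hangs` is made up, and which log source to feed it (dmesg, or a log file such as /var/log/messages) is an assumption that depends on the system.

```shell
# Hypothetical helper: count "Hardware Unit Hang" messages read from stdin.
count_hangs() {
    # grep -c prints the number of matching lines; `|| true` keeps the
    # exit status at 0 when the count is zero.
    grep -c 'Detected Hardware Unit Hang' || true
}

# Usage, assuming the kernel ring buffer still holds the messages:
#   dmesg | count_hangs
```

If the count stops growing after the offloads are disabled, the workaround is probably holding.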

Is there any chance that the faulty driver can be rolled back in the kernel?

Hello @lhsei

Welcome to the IPFire community.

The problem stems from a driver/hardware fault that has existed since 2013.

Changes to the Linux Firmware and/or other changes in the kernel can end up triggering this problem for users who previously had no issue.

I doubt that the kernel developers will have any idea which changes triggered the issue with the driver, and they would likely not be willing to revert all their changes to deal with a problem that Intel should have fixed back in 2013.

As there has been no fix from Intel, and in my view there is unlikely ever to be one, disabling some of the offloading features of the NICs looks to be the only available solution.
