Running IPFire 2.25 (x86_64) - Core Update 145 on a Lenovo ThinkCentre i5, with GREEN on a Netgear Gbit card and RED on the motherboard’s Intel NIC. I have been getting random disconnects on the RED interface since earlier in the year, and they seemed to get worse after Core Update 144.
I’ve found that I can trigger a network drop about 9 out of 10 times by running the speed test at https://www.dslreports.com/speedtest. The Ookla test (speedtest.net) also causes disconnects, though less often. Both fail during the upload phase. I saw elsewhere in the forum that the Intel NIC driver might be misbehaving.
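To put numbers on drops like these while a speed test runs, a small probe loop on a client behind the firewall can help. This is only a minimal Python sketch: the function names (`probe`, `find_outages`, `monitor`) are made up for illustration, and the target 1.1.1.1 on TCP port 53 is an assumption, so substitute any host you trust to be reachable.

```python
import socket
import time


def probe(host="1.1.1.1", port=53, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    The default host and port are assumptions, not anything IPFire-specific.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def find_outages(samples):
    """Turn [(timestamp, ok), ...] into a list of (start, end) outage spans."""
    outages, start = [], None
    for ts, ok in samples:
        if not ok and start is None:
            start = ts                   # first failed probe of an outage
        elif ok and start is not None:
            outages.append((start, ts))  # connection came back
            start = None
    if start is not None:                # still down at the last sample
        outages.append((start, samples[-1][0]))
    return outages


def monitor(duration=60, interval=1.0):
    """Probe once per interval for duration seconds and return the samples."""
    samples = []
    end = time.time() + duration
    while time.time() < end:
        samples.append((time.time(), probe()))
        time.sleep(interval)
    return samples
```

Starting `find_outages(monitor(duration=120))` just before the speed test and printing the result afterwards gives the start and end time of each drop, which can then be compared with the timestamps in the kernel log.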
Thanks for your responses. I guess the easiest solution is to install a second NIC with a chip other than Intel. As I wrote, the disconnects were there before but got worse, which might have been a coincidence. I will add the second NIC and report back. I have to do it when my wife is not using the connection or there will be hell to pay…
Well, I finally got to add a second NIC with a Realtek chip and assigned RED to it. So far IPFire has run without disconnects for 28 hours. Previously it was dropping the connection several times a day, so that confirms it is the Intel NIC. The kernel log shows:
Time      Section  Message
09:08:02  kernel:  e1000e 0000:00:19.0 red0: Detected Hardware Unit Hang:
09:08:02  kernel:  TDH
09:08:02  kernel:  TDT
09:08:02  kernel:  next_to_use
09:08:02  kernel:  next_to_clean
09:08:02  kernel:  buffer_info[next_to_clean]:
09:08:02  kernel:  time_stamp <10a4426cc>
09:08:02  kernel:  next_to_watch
09:08:02  kernel:  jiffies <10a442800>
09:08:02  kernel:  next_to_watch.status <0>
09:08:02  kernel:  MAC Status <80083>
09:08:02  kernel:  PHY Status <796d>
09:08:02  kernel:  PHY 1000BASE-T Status <3c00>
09:08:02  kernel:  PHY Extended Status <3000>
09:08:02  kernel:  PCI Status <10>
09:08:04  kernel:  e1000e 0000:00:19.0 red0: Detected Hardware Unit Hang:
09:08:04  kernel:  TDH
09:08:04  kernel:  TDT
09:08:04  kernel:  next_to_use
09:08:04  kernel:  next_to_clean
09:08:04  kernel:  buffer_info[next_to_clean]:
09:08:04  kernel:  time_stamp <10a4426cc>
09:08:04  kernel:  next_to_watch
09:08:04  kernel:  jiffies <10a442a40>
09:08:04  kernel:  next_to_watch.status <0>
09:08:04  kernel:  MAC Status <80083>
09:08:04  kernel:  PHY Status <796d>
09:08:04  kernel:  PHY 1000BASE-T Status <3c00>
09:08:04  kernel:  PHY Extended Status <3000>
09:08:04  kernel:  PCI Status <10>
09:08:06  kernel:  e1000e 0000:00:19.0 red0: Detected Hardware Unit Hang:
09:08:06  kernel:  TDH
09:08:06  kernel:  TDT
09:08:06  kernel:  next_to_use
09:08:06  kernel:  next_to_clean
09:08:06  kernel:  buffer_info[next_to_clean]:
09:08:06  kernel:  time_stamp <10a4426cc>
09:08:06  kernel:  next_to_watch
09:08:06  kernel:  jiffies <10a442cc0>
09:08:06  kernel:  next_to_watch.status <0>
09:08:06  kernel:  MAC Status <80083>
09:08:06  kernel:  PHY Status <796d>
09:08:06  kernel:  PHY 1000BASE-T Status <3c00>
09:08:06  kernel:  PHY Extended Status <3000>
09:08:06  kernel:  PCI Status <10>
09:08:08  kernel:  e1000e 0000:00:19.0 red0: Detected Hardware Unit Hang:
09:08:08  kernel:  TDH
09:08:08  kernel:  TDT
09:08:08  kernel:  next_to_use
09:08:08  kernel:  next_to_clean
09:08:08  kernel:  buffer_info[next_to_clean]:
09:08:08  kernel:  time_stamp <10a4426cc>
09:08:08  kernel:  next_to_watch
09:08:08  kernel:  jiffies <10a442f00>
09:08:08  kernel:  next_to_watch.status <0>
09:08:08  kernel:  MAC Status <80083>
09:08:08  kernel:  PHY Status <796d>
09:08:08  kernel:  PHY 1000BASE-T Status <3c00>
09:08:08  kernel:  PHY Extended Status <3000>
09:08:08  kernel:  PCI Status <10>
09:08:10  kernel:  e1000e 0000:00:19.0 red0: Detected Hardware Unit Hang:
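For anyone who wants to check their own kernel log for the same symptom, here is a minimal Python sketch that counts the "Detected Hardware Unit Hang" reports per interface and extracts the jiffies values. The function name `summarize_hangs` and the shortened sample text are illustrative assumptions, not part of IPFire or the e1000e driver. The sketch relies only on the message wording seen in the log above.

```python
import re
from collections import Counter

# Matches the driver's hang report, e.g.
# "e1000e 0000:00:19.0 red0: Detected Hardware Unit Hang:"
HANG_RE = re.compile(r"(\S+) (\S+) (\w+): Detected Hardware Unit Hang")
# Matches the hexadecimal counter in lines such as "jiffies <10a442800>"
JIFFIES_RE = re.compile(r"jiffies\s*<([0-9a-fA-F]+)>")


def summarize_hangs(log_text):
    """Return (hang count per interface, list of jiffies values as ints)."""
    hangs = Counter(m.group(3) for m in HANG_RE.finditer(log_text))
    jiffies = [int(m.group(1), 16) for m in JIFFIES_RE.finditer(log_text)]
    return hangs, jiffies


# Shortened excerpt of the log above, used only as sample input.
sample = """
kernel: e1000e 0000:00:19.0 red0: Detected Hardware Unit Hang:
kernel: time_stamp <10a4426cc>
kernel: jiffies <10a442800>
kernel: e1000e 0000:00:19.0 red0: Detected Hardware Unit Hang:
kernel: time_stamp <10a4426cc>
kernel: jiffies <10a442a40>
"""

hangs, jiffies = summarize_hangs(sample)
print(dict(hangs))                                    # hang reports per interface
print([b - a for a, b in zip(jiffies, jiffies[1:])])  # jiffies between reports
```

In the log above, time_stamp stays at 10a4426cc in every report while jiffies keeps increasing, which suggests the same transmit buffer stayed stuck for the whole period.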
Indeed, I had done that earlier. I had a PUMA 6 chipset modem that was resetting itself several times a day, so I bugged the ISP until they sent a technician, who confirmed the problem, replaced every connector between the modem and the utility pole outside, and also found low power on the highest frequency channels at the pole. A second technician came and probably replaced all the connectors on the main cable going around the neighborhood, and after that I had good power on all channels and good SNR. But the problem continued, so I got a new DOCSIS 3.1 modem with a Broadcom chipset from the ISP. This reduced the number of disconnects and shortened their duration, but they were still happening. I read somewhere in this forum about similar behaviour on Intel NICs, and lo and behold, my RED was an Intel NIC. Just yesterday I installed a Realtek chip NIC on the firewall box and reassigned RED to it, and so far no more disconnects. I hope that will be the end of the problem.
Since I moved to new hardware and installed Core 144 (the old hardware would not support updates past 122), I have had many drops that required restarting my cable modem and then the firewall to get a DHCP lease once the modem was back up. However, this looks to be just a coincidence: the modem logs show timeouts from the modem to the ISP, and the ISP was reporting issues in my area when this started. It still happens occasionally. The ISP no longer finds problems, but I still think an ISP issue is causing my problem. You may want to double-check everything upstream of IPFire on your end.