New and first installation - randomly drops internet

Hi,

Just moved from IPCop to IPFire.
Installation went well, although even with the image on a USB stick the installer still wanted to download the ISO. The screenshots in your installation guide show that it should find my HDD, but despite trying every possible BIOS setting and swapping between 3 HDDs, I still had to download the ISO. Two steps later I got to choose whether to install on my thumb drive or the HDD.

My real issue now is that it randomly drops the Internet connection for 5-10 seconds.
This is a major pain, especially for my kids, who play a lot of games. If it drops during a tournament they may get banned.
It never happened with IPCop which I have used for 10+ years.

Hardware is fine, no heat issues. The only configuration done so far is adding static routes on green.
On that note, why does adding a static route take two steps? First Add, then Update.
IPCop just added it.

Hardware is self-built: DFI SB630 motherboard with an Intel i7 2600 CPU, 16GB RAM and a 500GB HDD, plus an extra Intel dual-port gigabit NIC.

https://fireinfo.ipfire.org/profile/fb1f9a941898b42a431b433484a4b0309e64827c

If you downloaded the IMG.XZ file, then that is intended to run permanently from either

  • a USB device (stick, USB SSD etc.), or
  • an internal HDD/SSD, after extracting the image directly onto it

With your hardware, the more suitable approach is to download the ISO file and follow the procedure to write it to a USB key: https://wiki.ipfire.org/installation/step2

You do appear to be faced with a re-install.

Well, I did use the flash image first. After realizing it didn’t install on the SSD, I wrote the ISO to the USB key with Rufus. After this it installed on the HDD. (I had switched disks, as I didn’t think it found the SSD.)

But I will re-burn the ISO to a new USB key and do a re-install later.
It is way past midnight here and I am a bit tired.

Thanks

A SATA SSD ought to get found. Perhaps troubleshoot that, at BIOS level, before doing yet another install.

It might be advisable to disconnect any other HDDs during the install. Those can always be re-enabled later as “Extra HD”, for storage.

I have only one disk in the system, currently a 400GB Intel SSD.

Ok, second try.
Burned the ISO to USB using Rufus, and for reference I also burned the same ISO to a CD-ROM.

Booting from USB first.

  • Install IPFire
  • Language
  • Start installation

And here it wants me to download the installation image.

Inserted the CD-ROM and reset the PC.
So, Install IPFire again:

  • Language
  • Start installation
  • License agreement
  • Here it found my SSD and asked whether to delete all data
  • Choose filesystem
  • Done

So I guess the old-fashioned CD-ROM is still the best.
I have never had this issue with any other Linux distro or with IPCop.

So it is installed, and the initial configuration before going to the GUI is done.
Using RED + GREEN.
My ISP provides fibre, 250/250, so I am using DHCP.
Here is another strange thing: with IPCop I get the hostname from my ISP, but in IPFire I have to set my own hostname?

Adding fixed leases for all my devices. Again, WHY is there an Add followed directly by an Update?? IPCop has only Add. OpnSense has Add, and I can keep adding entries and then click Update once.
And IPFire doesn’t even blank the fields after Add or Update.

About 15 minutes after that was done, it disconnected and reconnected the WAN. I can’t find anything in the logs about the cause.

For reference I installed OpnSense yesterday and had it run all Saturday. Absolutely no problem at all.

While typing all this it has reconnected the WAN 3 times.
I was looking forward to running IPFire, as it is very similar to IPCop, which I have loved for the past 10 years or so. If it can’t be solved I will go to OpnSense. :slightly_frowning_face:

Since my last post it has reconnected the WAN about 6 times.
There is no obvious entry in any log that I can see.

So I switched back to OpnSense.
I have the IPFire install intact on another disk, so I can swap quickly if someone has a solution I can try out.

Does your ISP allow you to use a static IP address on your RED interface?
Is the IP address public or private?

Not sure. I think they use some kind of dynamic addressing.
I sometimes get a new IP.

Been using OpnSense since Sunday, no hiccup with red/WAN at all.

Are you using DNS over TLS?
If so, I would try standard DNS over UDP.
I have multiple gamers at home too.
No problems here.

Hi,

in order to find out why your WAN connection disappeared, could you please post the corresponding log entries (in /var/log/messages or via the web interface, choose section ‘RED’ please) here?

Thanks, and best regards,
Peter Müller

Ok, this just happened
IPFire diagnostics
Section: red
Date: May 31, 2020

12:01:03 dhcpcd[14728] : red0: carrier lost
12:01:03 dhcpcd[14728] : red0: deleting route to 1.1.1.0/24
12:01:03 dhcpcd[14728] : red0: deleting default route via 1.1.1.1
12:01:06 dhcpcd[14728] : red0: carrier acquired
12:01:06 dhcpcd[14728] : red0: IAID 00:00:00:00
12:01:06 dhcpcd[14728] : red0: rebinding lease of 1.1.1.62
12:01:06 dhcpcd[14728] : red0: probing address 1.1.1.62/24
12:01:07 dhcpcd[14728] : red0: soliciting an IPv6 router
12:01:11 dhcpcd[14728] : red0: leased 1.1.1.62 for 43200 seconds
12:01:11 dhcpcd[14728] : red0: adding route to 1.1.1.0/24
12:01:11 dhcpcd[14728] : red0: adding default route via 1.1.1.1

And once again after 14 minutes

12:15:02 dhcpcd[14728] : red0: carrier lost
12:15:02 dhcpcd[14728] : red0: deleting route to 1.1.1.0/24
12:15:02 dhcpcd[14728] : red0: deleting default route via 1.1.1.1
12:15:05 dhcpcd[14728] : red0: carrier acquired
12:15:05 dhcpcd[14728] : red0: IAID 00:00:00:00
12:15:06 dhcpcd[14728] : red0: soliciting an IPv6 router
12:15:07 dhcpcd[14728] : red0: rebinding lease of 1.1.1.62
12:15:07 dhcpcd[14728] : red0: probing address 1.1.1.62/24
12:15:12 dhcpcd[14728] : red0: leased 1.1.1.62 for 43200 seconds
12:15:12 dhcpcd[14728] : red0: adding route to 1.1.1.0/24
12:15:12 dhcpcd[14728] : red0: adding default route via 1.1.1.1

And a third time
12:49:41 dhcpcd[14728] : red0: carrier lost

Checked the kernel log, maybe some clue here.

IPFire diagnostics
Section: kernel
Date: May 31, 2020

12:01:01 kernel: e1000e 0000:00:19.0 red0: Detected Hardware Unit Hang:
12:01:01 kernel: TDH
12:01:01 kernel: TDT <6>
12:01:01 kernel: next_to_use <6>
12:01:01 kernel: next_to_clean
12:01:01 kernel: buffer_info[next_to_clean]:
12:01:01 kernel: time_stamp <10001bc72>
12:01:01 kernel: next_to_watch
12:01:01 kernel: jiffies <10001c3c0>
12:01:01 kernel: next_to_watch.status <0>
12:01:01 kernel: MAC Status <80083>
12:01:01 kernel: PHY Status <796d>
12:01:01 kernel: PHY 1000BASE-T Status <7800>
12:01:01 kernel: PHY Extended Status <3000>
12:01:01 kernel: PCI Status <10>
12:01:03 kernel: NETDEV WATCHDOG: red0 (e1000e): transmit queue 0 timed out
12:01:03 kernel: ------------[ cut here ]------------
12:01:03 kernel: WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:319 dev_watchdog+0x1ef/0x200
12:01:03 kernel: Modules linked in: ipt_MASQUERADE nf_nat_masquerade_ipv4 cfg80211 rfkill 8021q garp xt_hashlimit xt_mark xt_policy xt_TCPMSS nf_nat_irc nf_conntrack_irc nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat_h323 nf_conntrack_h323 xt_CT xt_helper nf_nat_sip nf_conntrack_sip xt_conntrack xt_comment ipt_REJECT nf_reject_ipv4 nf_log_ipv4 nf_log_common xt_LOG xt_limit iptable_raw iptable_mangle iptable_filter vfat fat sch_fq_codel snd_hda_codec_realtek snd_hda_codec_generic iTCO_wdt iTCO_vendor_support i2c_algo_bit x86_pkg_temp_thermal fb_sys_fops syscopyarea intel_powerclamp sysfillrect sysimgblt coretemp kvm snd_hda_intel snd_hda_codec snd_hda_core irqbypass snd_hwdep snd_pcm crct10dif_pclmul crc32_pclmul i2c_i801 snd_timer ghash_clmulni_intel pcspkr lpc_ich pcc_cpufreq e1000e
12:01:03 kernel: i2c_core mfd_core snd ptp pps_core soundcore video ata_generic pata_acpi pata_jmicron
12:01:03 kernel: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.14.173-ipfire #1
12:01:03 kernel: Hardware name: To be filled by O.E.M. To be filled by O.E.M./SB630-CRM, BIOS 4.6.5 09/12/2012
12:01:03 kernel: task: ffffffffa5c134c0 task.stack: ffffffffa5c00000
12:01:03 kernel: RIP: 0010:dev_watchdog+0x1ef/0x200
12:01:03 kernel: RSP: 0018:ffff96141e203e78 EFLAGS: 00010246
12:01:03 kernel: RAX: 000000000000003a RBX: ffff961407e4ca00 RCX: 0000000000000000
12:01:03 kernel: RDX: 0000000000000000 RSI: 00000000000000f6 RDI: 0000000000000300
12:01:03 kernel: RBP: ffff961407ee8000 R08: ffff96141e2163f8 R09: 0000000000000001
12:01:03 kernel: R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
12:01:03 kernel: R13: 0000000000000000 R14: ffff961407ee8000 R15: ffff961407ee8478
12:01:03 kernel: FS: 0000000000000000(0000) GS:ffff96141e200000(0000) knlGS:0000000000000000
12:01:03 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
12:01:03 kernel: CR2: 00007526c8204180 CR3: 000000020480a001 CR4: 00000000000606f0
12:01:03 kernel: Call Trace:
12:01:03 kernel:
12:01:03 kernel: ? dev_deactivate_queue.constprop.0+0x60/0x60
12:01:03 kernel: call_timer_fn+0x30/0x130
12:01:03 kernel: run_timer_softirq+0x2fe/0x9c0
12:01:03 kernel: ? trigger_load_balance+0x3a/0x230
12:01:03 kernel: ? tick_sched_timer+0x35/0x70
12:01:03 kernel: ? tick_sched_do_timer+0x40/0x40
12:01:03 kernel: __do_softirq+0xe8/0x2e2
12:01:03 kernel: irq_exit+0xcb/0xd0
12:01:03 kernel: smp_apic_timer_interrupt+0x7a/0x140
12:01:03 kernel: apic_timer_interrupt+0x85/0x90
12:01:03 kernel:
12:01:03 kernel: RIP: 0010:cpuidle_enter_state+0xb6/0x2c0
12:01:03 kernel: RSP: 0018:ffffffffa5c03e80 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10
12:01:03 kernel: RAX: ffff96141e221100 RBX: ffff96141e229740 RCX: 000000000000001f
12:01:03 kernel: RDX: 0000000000000000 RSI: fffffffbbcc8801e RDI: 0000000000000000
12:01:03 kernel: RBP: 000000a00cd285fc R08: 000000a00cd285fc R09: 0000000000000004
12:01:03 kernel: R10: 000000000000a26c R11: ffff96141e220004 R12: 0000000000000004
12:01:03 kernel: R13: ffffffffa5ca4078 R14: 000000a001ed6517 R15: 00000000d7c53018
12:01:03 kernel: do_idle+0x170/0x1d0
12:01:03 kernel: cpu_startup_entry+0x6f/0x80
12:01:03 kernel: start_kernel+0x665/0x69a
12:01:03 kernel: secondary_startup_64+0xa5/0xb0
12:01:03 kernel: Code: 95 60 04 00 00 eb 90 48 89 ef c6 05 0c 9b a8 00 01 e8 e6 46 fd ff 44 89 e1 48 89 ee 48 c7 c7 08 a5 ad a5 48 89 c2 e8 dc a1 a7 ff <0f> 0b eb bc 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 66 66 66 66
12:01:03 kernel: ---[ end trace c78ca47e05b026cb ]---
12:01:03 kernel: e1000e 0000:00:19.0 red0: Reset adapter unexpectedly
12:01:06 kernel: e1000e: red0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
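
The two excerpts line up: the kernel logs the e1000e hang at 12:01:01 and dhcpcd reports `carrier lost` on red0 at 12:01:03. A minimal sketch for tallying both kinds of event from syslog-style input follows. The function name `link_drop_summary` is made up for illustration, and the patterns are taken only from the log lines quoted in this thread, so other driver or dhcpcd versions may word things differently.

```shell
# Hypothetical helper: counts e1000e hardware hangs and dhcpcd carrier losses
# in syslog-style text read from stdin. Patterns match the lines quoted above.
link_drop_summary() {
    awk '
        /e1000e .*Detected Hardware Unit Hang/ { hangs++ }
        /dhcpcd.*carrier lost/                 { drops++ }
        END { printf "hangs=%d carrier_lost=%d\n", hangs, drops }
    '
}
```

On the firewall itself this could be run as `link_drop_summary < /var/log/messages`. If the hang count tracks the carrier-loss count, that would point at the NIC or its driver rather than at the ISP.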

Going to switch back to OpnSense again.

Hi,

thanks for your reply and providing some logs.

12:01:01 kernel: e1000e 0000:00:19.0 red0: Detected Hardware Unit Hang:
[...]
12:01:03 kernel: e1000e 0000:00:19.0 red0: Reset adapter unexpectedly

This looks like the e1000e NIC driver is messing around again; I believe @arne_f is aware of that issue. Anyway:

12:01:03 dhcpcd[14728] : red0: deleting route to 1.1.1.0/24
12:01:03 dhcpcd[14728] : red0: deleting default route via 1.1.1.1

This should not happen - I am pretty sure 1.1.1.1 is not providing gateway connectivity to your system. Are you sure you configured the correct default gateway?

Thanks, and best regards,
Peter Müller

Hi,

I’m having this same problem on a two port Intel NIC with the e1000e driver, on the green0 interface.

My hardware is an HP Thin Client with an Intel Pro NIC (two Ethernet ports), which uses the e1000e driver. It fails with “Detected Hardware Unit Hang” and a log very similar to the one above.

This same hardware works flawlessly on pfSense.

If I ssh to the machine and run:

ethtool -K eth0 gso off gro off tso off

Then it works like a charm.
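
To confirm the change actually took effect, `ethtool -k` (lowercase) prints the current feature states. Below is a minimal sketch, assuming the usual `name: on|off` layout of that output. `offloads_still_on` is a hypothetical helper name, and it only looks at the three top-level features that the `gso`, `gro` and `tso` keywords map to.

```shell
# Hypothetical helper: reads `ethtool -k <iface>` output on stdin and prints
# the names of the three offloads from the workaround that are still "on".
offloads_still_on() {
    awk -F': *' '
        $1 == "tcp-segmentation-offload" ||
        $1 == "generic-segmentation-offload" ||
        $1 == "generic-receive-offload" {
            # The value may carry a suffix such as "[fixed]"; keep the first word.
            split($2, v, " ")
            if (v[1] == "on") print $1
        }
    '
}
```

Used as `ethtool -k eth0 | offloads_still_on`, it prints nothing once all three are off. Settings made with `ethtool -K` are runtime settings and do not survive a reboot, so they need to be reapplied at boot.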

Is there a fix for this?

Thank you!!

I also see the same issue on all computers in the green zone that point to Google DNS servers. Also notable is that the IPFire mini PC sounds its beeper alarm, as if it has lost the connection on the cable modem side to the ISP. The issue recurs every few minutes. When IPFire is taken out of the network path everything works fine, so it is not a problem with the internet connection itself. I suspected two areas. One is the IPS, but disabling it did not change anything. The other is the geolocation-based blocking, which could affect video streams served from around the world. I use the geo-based filtering to isolate inbound VoIP calls, but it may have broader impact. I will disable that as well and try again.

------------[ cut here ]------------
03:47:09 kernel: NETDEV WATCHDOG: red0 (e1000e): transmit queue 0 timed out
03:47:09 kernel: WARNING: CPU: 1 PID: 0 at net/sched/sch_generic.c:477 dev_watchdog+0x255/0x260
03:47:09 kernel: Modules linked in: it87 hwmon_vid act_mirred act_connmark cls_u32 em_ipt act_gact cls_basic ifb sch_ingress xt_layer7 sch_htb xt_NFQUEUE nfnetlink_queue xt_MASQUERADE cfg80211 rfkill 8021q garp xt_set ip_set_hash_net xt_connlimit nf_conncount ip_set xt_hashlimit xt_policy xt_TCPMSS xt_conntrack xt_comment ipt_REJECT nf_reject_ipv4 xt_LOG xt_limit xt_mark xt_connmark nf_log_syslog iptable_raw iptable_mangle iptable_filter vfat fat sch_cake x86_pkg_temp_thermal snd_hda_codec_realtek intel_powerclamp i2c_algo_bit coretemp fb_sys_fops snd_hda_codec_generic syscopyarea ledtrig_audio sysfillrect at24 kvm_intel sysimgblt iTCO_wdt regmap_i2c snd_hda_intel iTCO_vendor_support kvm snd_intel_dspcfg i2c_i801 snd_hda_codec irqbypass psmouse i2c_smbus pcspkr i2c_core snd_hda_core snd_hwdep e1000e snd_pcm lpc_ich ptp mfd_core pps_core snd_timer snd soundcore crct10dif_pclmul crc32_pclmul ghash_clmulni_intel serio_raw video
03:47:09 kernel: CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.15.35-ipfire #1
03:47:09 kernel: Hardware name: NEXCOM NDISB533/SHARKBAY, BIOS 4.6.5 11/29/2013
03:47:09 kernel: RIP: 0010:dev_watchdog+0x255/0x260
03:47:09 kernel: Code: 91 55 fd ff eb a6 48 89 ef c6 05 0c 8c fc 00 01 e8 80 f1 f9 ff 44 89 e9 48 89 ee 48 c7 c7 f8 da fa b7 48 89 c2 e8 ed dd 1c 00 <0f> 0b eb 87 0f 1f 80 00 00 00 00 0f 1f 44 00 00 41 54 55 53 48 89
03:47:09 kernel: RSP: 0018:ffff9ff2800ecea8 EFLAGS: 00010246
03:47:09 kernel: RAX: 0000000000000000 RBX: ffff938b454fb400 RCX: 0000000000000000
03:47:09 kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
03:47:09 kernel: RBP: ffff938b4498c000 R08: 0000000000000000 R09: 0000000000000000
03:47:09 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff938b4498c480
03:47:09 kernel: R13: 0000000000000000 R14: 0000000000000001 R15: ffff938b5aa9c740
03:47:09 kernel: FS: 0000000000000000(0000) GS:ffff938b5aa80000(0000) knlGS:0000000000000000
03:47:09 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
03:47:09 kernel: CR2: 0000708143bbd960 CR3: 000000008b60c003 CR4: 00000000001706e0
03:47:09 kernel: Call Trace:
03:47:09 kernel:
03:47:09 kernel: ? pfifo_fast_reset+0x140/0x140
03:47:09 kernel: call_timer_fn+0x26/0x100
03:47:09 kernel: __run_timers+0x1f2/0x270
03:47:09 kernel: run_timer_softirq+0x28/0x60
03:47:09 kernel: __do_softirq+0xc6/0x27e
03:47:09 kernel: irq_exit_rcu+0x89/0xb0
03:47:09 kernel: sysvec_apic_timer_interrupt+0x72/0x90
03:47:09 kernel:
03:47:09 kernel:
03:47:09 kernel: asm_sysvec_apic_timer_interrupt+0x12/0x20
03:47:09 kernel: RIP: 0010:cpuidle_enter_state+0xc7/0x380
03:47:09 kernel: Code: 8b 3d 65 d1 c8 48 e8 d8 d8 98 ff 49 89 c5 0f 1f 44 00 00 31 ff e8 99 e4 98 ff 45 84 ff 0f 85 15 01 00 00 fb 66 0f 1f 44 00 00 <45> 85 f6 0f 88 21 01 00 00 49 63 ce 48 8d 04 49 48 8d 14 81 48 c1
03:47:09 kernel: RSP: 0018:ffff9ff2800a7ea8 EFLAGS: 00000246
03:47:09 kernel: RAX: 0000000000000000 RBX: ffff938b5aab5430 RCX: 0000000000000000
03:47:09 kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
03:47:09 kernel: RBP: 0000000000000005 R08: 0000000000000000 R09: 0000000000000000
03:47:09 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffb83b01c0
03:47:09 kernel: R13: 000000a55b08fe8e R14: 0000000000000005 R15: 0000000000000000
03:47:09 kernel: ? cpuidle_enter_state+0xb7/0x380
03:47:09 kernel: cpuidle_enter+0x29/0x40
03:47:09 kernel: do_idle+0x1bf/0x200
03:47:09 kernel: cpu_startup_entry+0x19/0x20
03:47:09 kernel: secondary_startup_64_no_verify+0xb0/0xbb
03:47:09 kernel:
03:47:09 kernel: ---[ end trace d3106f691f406835 ]---

When it disconnects, I also see the following entries:

03:47:08 kernel: e1000e 0000:00:19.0 red0: Detected Hardware Unit Hang:

03:47:09 kernel: e1000e 0000:00:19.0 red0: Reset adapter unexpectedly
03:47:12 kernel: e1000e 0000:00:19.0 red0: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx /Tx

To trigger the disconnect I only need intensive video streaming, or to run the Google speed test a few times. When IPFire is not in the path, no issues are observed.

It also looks as though the upload part of the speed test, rather than the download part, is what causes the red interface to restart.

It looks like a similar issue was logged a while back.

Hi @skndr,

first, welcome to the IPFire community. :slight_smile:

To keep the original thread on topic, I moved your posts into this one, since I believe you are suffering from the same e1000e driver issue.

Have you tried executing the command @rramalho posted earlier on your IPFire machine? If so, does it make a difference?

Thanks, and best regards,
Peter Müller

Hi Peter,

Thanks for the lead. I had not tried it, as that post was from a few versions back and I assumed it would no longer be an issue in recent ones. I have now added the following to /etc/sysconfig/firewall.local:

# Disable GSO/GRO/TSO offloads on every interface except loopback
ifconfig | grep flags | cut -d: -f1 | grep -v '^lo$' | while read -r IFACE; do
    ethtool -K "$IFACE" gso off gro off tso off
done

So far it looks promising, as I can no longer trigger the disconnections with the speed test. Thanks for the support and the great software!
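
The loop in this post turns the offloads off on every interface. If only some NICs use e1000e, a narrower variant is possible. The sketch below is an untested alternative built on assumptions, not what was actually deployed here: it reads the driver name from the standard sysfs link `/sys/class/net/<iface>/device/driver` and only prints the `ethtool` commands, so they can be reviewed before being run. `e1000e_offload_cmds` is a hypothetical name.

```shell
# Hypothetical helper: prints one ethtool command per interface bound to the
# e1000e driver. The optional argument overrides the sysfs network directory.
# The body runs in a subshell so its variables do not leak into the caller.
e1000e_offload_cmds() (
    sysnet="${1:-/sys/class/net}"
    for path in "$sysnet"/*; do
        # Virtual interfaces such as lo have no device/driver link; skip them.
        [ -L "$path/device/driver" ] || continue
        driver=$(basename "$(readlink "$path/device/driver")")
        [ "$driver" = "e1000e" ] || continue
        echo "ethtool -K $(basename "$path") gso off gro off tso off"
    done
)
```

Running `e1000e_offload_cmds` on its own shows what would be changed, and `e1000e_offload_cmds | sh` would apply it.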
