Connection on RED randomly stops working (maybe after DHCP renewal?)

For a few months now I have been experiencing problems with my RED connection.
At random intervals the connection just dies.
Internally everything still works, and the IPFire GUI works just fine as well.
When I reboot the box (a PC Engines APU4C4) the connection comes back up and the box gets a new IP from the ISP.
The only thing I have found out so far is that I get a new IP every time I reboot after the problem appears.
When I reboot while there’s no problem, the IP stays the same. Before the problems started, the IPs on my ISP’s network were so stable that they were nearly fixed; I had to change my A records maybe once every 6-12 months.

I tried daily reboots and reconnects, but that didn’t really help.
The problem still appeared every now and then, and scheduled reconnects/reboots every 12h are only partially helpful: in the worst case I’m out of the house and have no access to my servers for 11:59h.

I recently tried a script, run every 5 minutes via fcrontab, that checks whether it can ping my domain and restarts the box if it can’t. It didn’t work the last time the problem occurred, even though the script itself works.
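
For illustration, here is a minimal sketch of such a check that only signals action after several consecutive failures, so a single lost ping does not trigger a reboot. The probe targets, the state file path and the threshold of 3 are assumptions, not taken from the script mentioned above. It pings IP addresses rather than a domain name, on the assumption that a name resolving to the box’s own RED address could still be answered locally while the uplink is down.

```shell
#!/bin/bash
# Sketch of a RED connectivity check with a consecutive-failure counter.
# All defaults below are assumptions chosen for illustration.

TARGETS="${TARGETS:-1.1.1.1 8.8.8.8}"                # assumed probe targets (plain IPs, no DNS needed)
STATE_FILE="${STATE_FILE:-/tmp/red-watchdog.fails}"  # hypothetical counter file
MAX_FAILS="${MAX_FAILS:-3}"                          # assumed threshold

# Returns 0 if at least one target answers a single ping within 3 seconds.
red_is_up() {
    local target
    for target in $TARGETS; do
        if ping -c 1 -W 3 "$target" > /dev/null 2>&1; then
            return 0
        fi
    done
    return 1
}

# Runs one check and prints "ok", "wait" or "act".
# "act" is printed once MAX_FAILS checks in a row have failed.
check_once() {
    local fails=0
    if red_is_up; then
        rm -f "$STATE_FILE"
        echo ok
        return 0
    fi
    if [ -f "$STATE_FILE" ]; then
        fails=$(cat "$STATE_FILE")
    fi
    fails=$((fails + 1))
    if [ "$fails" -ge "$MAX_FAILS" ]; then
        rm -f "$STATE_FILE"
        echo act
    else
        echo "$fails" > "$STATE_FILE"
        echo wait
    fi
}
```

Called from fcrontab every 5 minutes, the caller would reboot or reconnect only when check_once prints "act".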

Does anyone have an idea what I could do to improve the situation?
Do I need to reconnect the RED interface more often?
I’m also a bit curious why the problem started in the first place. Is my box maybe close to dying, or did my ISP maybe change something?

4 LAN interfaces? Are all of them occupied?

Also: is the cable between your PC Engines box and the ISP device brand new?
Which speed is negotiated between your RED and the ISP device?

Only 3 interfaces, and only two are connected (RED and GREEN).
It’s not a new cable, but it shouldn’t really be damaged since it was never moved.
I’ve now replaced it anyway with a CAT6 cable for good measure.
It’s a 1 Gbit/s connection, and the connection itself was and still is very good and stable apart from the new problem.
However, I think the problem is more likely on my side, since the ISP never detected a problem when I called them and the modem never needs a reboot.
Restarting the firewall is enough.
When the problems started I had the ISP send me a new modem anyway, since the previous one was very old; that didn’t fix it though.

Did you try going back to the old one? Is the problem still there?

Maybe old values in some config files? ARP cache, MTU size or similar?

I threw the old one away because the ISP deactivated it, but since the problem happened with the old one as well, I don’t think it’s very likely that it would have helped.

I didn’t have to configure anything special in order to connect to the ISP and the cache should get cleaned after a reboot, no?

BTW I found the lease of the ISP in /var/ipfire/dhcpc/red0.lease:

BucSc=?

       32-129-172-5dyn.cable.fcom.ch38@6_:;18

I don’t really understand the format, but maybe I need to reconnect the IPFire box every x hours to match the ISP’s lease?
AFAIK DHCP should do this automatically, however.
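
On the renewal question: per RFC 2131 a DHCP client by default tries to renew at half the lease time (T1) and to rebind at 7/8 of it (T2), unless the server sends explicit values, so a manual reconnect to match the lease should not normally be needed. As far as I know the lease file looks garbled because dhcpcd stores the raw DHCP reply in it; depending on the dhcpcd version, `dhcpcd --dumplease red0` may print it in readable form. Below is a small sketch of the timer arithmetic. The 24h lease in the usage line is an assumed example value, not read from the lease file above.

```shell
#!/bin/bash
# Prints the RFC 2131 default renewal (T1) and rebinding (T2) times,
# in seconds, for a lease time given in seconds.
lease_timers() {
    local lease="$1"
    local t1=$((lease / 2))       # T1 = 0.5 * lease
    local t2=$((lease * 7 / 8))   # T2 = 0.875 * lease
    echo "T1=${t1} T2=${t2}"
}

# Usage with an assumed lease time of 24 hours:
#   lease_timers 86400
```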

I experienced a similar problem a few weeks ago and identified a buggy version of dhcpcd.
Every time I lost the connection on RED, I found an entry in the kernel log: “dhcpcd [Process-Number] segfault at 7fffec9e7000 …”
Which Core version and which version of dhcpcd are you running?
I found a fix in this thread in the old forum, and after the upgrade to Core 137 everything worked fine again.
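
A minimal sketch of how one might search the kernel log for such entries. The pattern is an assumption based on the message quoted above; the exact wording can differ between kernel versions.

```shell
#!/bin/bash
# Reads log lines on stdin and prints those that look like a dhcpcd segfault.
# The pattern allows an optional space before "[pid]" and an optional colon
# after it, since the exact formatting is an assumption.
find_dhcpcd_segfaults() {
    grep -E 'dhcpcd ?\[[0-9]+\]:? segfault at'
}

# Usage:
#   dmesg | find_dhcpcd_segfaults
#   find_dhcpcd_segfaults < /var/log/messages
```

grep exits with a non-zero status when nothing matches, which in this case is the good outcome.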

Thank you for the pointer to the kernel log.
I’ve created a pastebin (https://pastebin.com/dBh1P40T) of the moment when the connection breaks. You can see that basically nothing is going on for about 2h, then collectd does its thing, and then there is silence again for 2h until 6:18, when I wake up.

I’m currently on Core version 138 and the dhcpd version is isc-dhcpd-4.4.1.
No segfault appears in my kernel logs.

I’m having the same issue with dhcpd randomly crashing. I’ve enabled debug, but nothing shows up in the logs.

I’m running IPFire 142. To avoid outages, I’ve set up a cron job with a small script to relaunch it:

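#!/bin/bash
# Reconnect RED if the DHCP client (dhcpcd) is no longer running.

RESTART="/usr/local/bin/connscheduler reconnect"
PGREP="/usr/bin/pgrep"
PROC="dhcpcd"

# pgrep exits non-zero when no matching process exists
if ! "$PGREP" "$PROC" > /dev/null
then
    $RESTART
fi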
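
A possible variant, sketched under assumptions: since nothing shows up in the logs, it may help to have the watchdog itself write a syslog entry whenever it triggers, so the times of the crashes can be correlated with other events. The tag red-watchdog is made up for this example, and pgrep -x is used to match the process name exactly.

```shell
#!/bin/bash
# Variant of the watchdog that records each triggered reconnect in syslog.
PROC="${PROC:-dhcpcd}"
RESTART="${RESTART:-/usr/local/bin/connscheduler reconnect}"

# Does nothing if PROC is running; otherwise logs the event and runs RESTART.
ensure_running() {
    if pgrep -x "$PROC" > /dev/null; then
        return 0
    fi
    # Logging is best effort: a failing logger must not block the reconnect.
    logger -t red-watchdog "$PROC not running, triggering reconnect" 2> /dev/null || true
    $RESTART
}

# Usage from cron: source this file, then call ensure_running
```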

Thank you for your input. That looks like a saner approach than what I did (checking every 5 minutes and restarting the whole firewall when I couldn’t reach Google) :).