Sadly, this hasn't led to anything helpful, other than me finding some source code for dhcpcd that mentions checking for BOOTREPLY, but the code is unclear to me.
I'm sure my ISP uses Cisco equipment, but either way this is looking like something I'll have to ask them about.
Unless anyone has any good ideas, I'm going to have to go through the RFCs for DHCP!
Might need to get a PCAP from red0 during the problem.
I've since seen Mar 23 19:57:36 gateway dhcpcd[13592]: red0: op (141) is not BOOTREPLY
in a few attempts. Usually the responses which are being marked as "not BOOTREPLY" arrive within the first 4 seconds, making me wonder if a recent change to dhcpcd could have caused this to be a problem for me.
tcpdump for red0 (with no filters) was unable to capture whatever packets caused that error. When I eventually was able to record a successful connection I captured the DHCP offer and DHCP ACK without problems.
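For anyone inspecting a capture by hand: the op field that the log message refers to is the first byte of the BOOTP/DHCP message (the UDP payload), where 1 means BOOTREQUEST and 2 means BOOTREPLY per RFC 951 and RFC 2131. The following is only a minimal sketch of that check, not dhcpcd's actual code; the function names and the wording of the returned strings are made up for illustration.

```python
BOOTREQUEST = 1  # op code for client-to-server messages (RFC 951 / RFC 2131)
BOOTREPLY = 2    # op code for server-to-client messages


def bootp_op(payload: bytes) -> int:
    """Return the op field, which is the first byte of a BOOTP/DHCP message."""
    if not payload:
        raise ValueError("empty payload")
    return payload[0]


def describe_op(payload: bytes) -> str:
    """Classify a UDP payload by its op byte (hypothetical helper)."""
    op = bootp_op(payload)
    if op == BOOTREPLY:
        return "BOOTREPLY"
    if op == BOOTREQUEST:
        return "BOOTREQUEST"
    return f"unexpected op {op}"


# First four header bytes of a reply: op=2, htype=1 (Ethernet), hlen=6, hops=0
print(describe_op(bytes([2, 1, 6, 0])))
# A payload whose first byte is 141, as in the log line above
print(describe_op(bytes([141, 0, 0, 0])))
```

A first byte of 141 is neither valid value, which suggests the payload dhcpcd received on the DHCP client port was not a well-formed DHCP message at all.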
The problem is that we updated it with core136 and then reverted it because users reported such problems with 8.02 to 8.10.
And now users have the same problems with the old 7.2.3 version too. I'm not sure what is happening here. Maybe some changes at the ISPs are the real reason... But I'm not sure...
Thank you.
Do you have any theories as to what may have changed?
I've not been deliberately breaking my connection to test and, without doing that, dhcpcd has been reliably renewing every 5 minutes (my ISP lease time is only 10 minutes).
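The 5 minute renewal matches the RFC 2131 defaults, where the renewal timer T1 is half the lease time and the rebinding timer T2 is 0.875 of it. A small worked sketch of that arithmetic follows; it assumes the server does not override T1/T2 with DHCP options 58 and 59, and the function name is made up for illustration.

```python
def renewal_timers(lease_seconds: int) -> tuple[float, float]:
    """Default T1 (renew) and T2 (rebind) times from RFC 2131 section 4.4.5."""
    t1 = 0.5 * lease_seconds    # client starts unicast renewal to its server
    t2 = 0.875 * lease_seconds  # client falls back to broadcast rebinding
    return t1, t2


t1, t2 = renewal_timers(10 * 60)  # 10 minute lease
print(f"renew after {t1:.0f}s, rebind after {t2:.0f}s")
```

For a 600 second lease this gives a renewal at 300 seconds and a rebind at 525 seconds.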
The problem I described above happens if, for any reason, the modem/router is restarted, my link to the ISP goes down, or the ISP themselves reset their router. While these things happen rarely, when they do IPFire remains disconnected forever and needs manual intervention.
I'm hoping that the dhcp rapid_commit option may have been causing this issue. I have had to run the script I mentioned here every minute for the past 5 years to ensure my IPFire system reconnects to the internet if it is ever disconnected.
The RED interface can now be configured to no longer require the RFC4039 Rapid Commit option. This is a default option in almost all DHCP clients for over 20 years, but we have recently observed ISPs running broken DHCP servers which no longer work if this option is enabled. It can now be enabled or disabled using the setup command.
Hey @hvacguy thank you for posting but I'm confused.
What did you mean to add by linking to what I have just mentioned?
I was aware of the change in Core Update 190 and I've now posted to this old thread explaining that I hope it will fix my issue.
I had to update the wiki as until yesterday it said that the rapid_commit option had to be changed in an /etc/ configuration file which no longer exists.
This says that your IPFire solicited a dhcp lease and was offered a lease, and then a very short time later was offered the same lease again, which was ignored by dhcpcd.
The same lease being offered twice in short succession is not what should happen but from what was said by Roy Marples (originator and developer of dhcpcd) in an issue report, it would not cause dhcpcd any issues because it will just ignore the repeated version.
The only other advice that was given by Roy Marples was to comment out some options in dhcpcd.conf, one at a time, and see if things improve, as your ISP's dhcp server may not be RFC compliant and may not like one of them.
You have already excluded the rapid_commit option but maybe there is another option that your ISP's dhcp server has been designed to not work with even though it should.
Of course the ISP's dhcp server should work with all the options listed in the dhcpcd.conf file, as they have been around for a long time and are RFC defined but, as with rapid_commit, some ISPs just have incorrectly working dhcp servers.
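The "one at a time" approach can be made a bit less error-prone by generating the test variants up front. This is only a rough sketch: the sample text below is a hypothetical stand-in for the real dhcpcd.conf (whose contents and location on a given IPFire release are not assumed here), and the function name is made up for illustration.

```python
from typing import Iterator, Tuple

# Hypothetical sample; the real file on a given system will differ.
SAMPLE_CONF = """\
hostname
option rapid_commit
option domain_name_servers, domain_name, domain_search
option classless_static_routes
option interface_mtu
require dhcp_server_identifier
"""


def one_option_disabled(conf: str) -> Iterator[Tuple[str, str]]:
    """Yield (disabled_line, variant_text) with exactly one 'option' line commented out."""
    lines = conf.splitlines()
    for i, line in enumerate(lines):
        if not line.strip().startswith("option "):
            continue  # leave comments and non-option directives untouched
        variant = lines.copy()
        variant[i] = "#" + line
        yield line.strip(), "\n".join(variant) + "\n"


for disabled, _text in one_option_disabled(SAMPLE_CONF):
    print(f"variant without: {disabled}")
```

Each variant differs from the original in a single line, so if one of them makes the reconnect work, the option that was disabled in it is the likely culprit.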
My previous ISP had a dhcp server that worked fine with the rapid_commit option but they were then bought out by another ISP and their dhcp server won't work properly with that option.
No response back from them about the issue.
The problem is it's going to be hard for Roy to duplicate.
The ISP has set the lease too short. 1000 seconds should be the shortest lease. Looking at what others did (OpenWrt, OPNsense, pfSense), they install a script that runs from cron to check whether the internet is up and, if not, manually brings the interface down, brings it back up and initiates dhcp. They added this as a "keep alive" option for these Fiber DSL systems, because this issue is with the way the ISP configured the system and not with the router OS or modules like dhcpd.
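A keep-alive check of that kind could look roughly like the sketch below. It is not the script used by any of those projects or the one shared earlier in this thread: the ping target is a documentation placeholder address, the function names are made up, and the restart action is deliberately left as a callable supplied by the caller because the right command depends on the system.

```python
import subprocess
from typing import Callable


def is_online(host: str = "192.0.2.1", timeout_s: int = 3) -> bool:
    """Send a single ping; 192.0.2.1 is a placeholder, substitute a real target."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def watchdog(check: Callable[[], bool],
             restart: Callable[[], None],
             attempts: int = 3) -> bool:
    """Call restart() only if every check fails; return True if it was called."""
    for _ in range(attempts):
        if check():
            return False  # link is up, nothing to do
    restart()
    return True
```

Run from cron every minute, this would be called as `watchdog(is_online, restart)` with `restart` wrapping whatever command restarts the RED interface on the system in question. Requiring several failed checks in a row avoids bouncing the interface on a single lost ping.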
Thanks again @bonnietwin. This has been in my "too hard basket" for a while but I should invest the time to experiment with DHCP options and try to get to the root cause.
Unfortunately I have two problems. From my post 5 years ago:
The first issue is arguably more important.
I have been ignoring this for 5 years because I have a script which checks for connectivity running from /etc/fcron.cyclic which I shared previously in this post for someone who had a similar issue.
Does anyone have any ideas how to diagnose this please?
I've taken some packet captures of red0 before pulling the VDSL cable, waiting for the modem to disconnect, then reconnecting the cable. I can't see anything of value in them.
Unfortunately the packet captures contain a lot of personal metadata, so I'd prefer not to share them generally with this forum (but could PM an expert).
On the second issue, one retry resulted in this log in IPFire:
Mar 30 20:26:45 gateway dhcpcd[13171]: timed out
Mar 30 20:26:45 gateway dhcpcd[13171]: main: control_stop: No such file or directory
Mar 30 20:26:45 gateway dhcpcd[13171]: dhcpcd exited
but at least I can experiment with removing DHCP options as suggested.
So was this used equipment that Windows 10 was run on?
Because there was an unconfirmed notice from Arch Linux to Main that a Windows 10 update flashed something into the firmware of network cards, which later caused DHCP renew issues when the cards were used in Arch Linux. But no one from Arch responded when Main asked what hardware was being altered like this. So it might be worth trying a new card or USB interface that has never seen Windows, because that was the reported fix for them. They should communicate better on this subject to Linux Main, though.
I remember this in the Main Dev email about two years ago, but I imagine it was discussed in Arch Linux. Since you've mentioned this, I am going to search and see if it's been brought up. I just wanted to mention it because of the possibility. Further investigation might be needed to rule out whether the driver itself is malfunctioning too. The issue is being able to duplicate it in a dev lab and knowing which interfaces are involved. I doubt the Realtek driver I helped integrate is causing issues, because of all the testing that was done, and it didn't have many issues with other modules. I did have to add a buffer flush before dhcp renew, but I would think Linus would not overlook that because he's pretty thorough with things; he tasked himself with the Intel driver, and that may or may not be needed there. Grimm (Linus) is very intimidating, so some of us will sneak the patch in instead of getting scolded for bringing it up, because it's everybody's responsibility to double check everyone's work. I was too busy to double check the Intel driver, but there are 128 more people who were tasked to do so. I haven't looked to see if the dhcpd devs are having an issue with things, but rarely do they have something new pop up that they can't resolve quickly.
But I did notice a big delay with my server handing out dhcp leases to clients when they boot up, since 6.6.56, and that was another Intel driver that was edited. I should investigate it but I haven't got around to that.
Internet Systems Consortium DHCP Server 4.4.3-P1
Copyright 2004-2022 Internet Systems Consortium.
All rights reserved.
This is old, and one would have to check if RTA_EXPIRES and RTA_MULTIPATH are configured in the kernel, which was one of the things that was going to be removed last year. Also, I think there was a path change in a kernel call too, and that could explain why the file not found error occurred.
You have confused the dhcpd server with the dhcpcd client.
The dhcpd server is used to provide the dynamic and fixed IPs to clients on the green and blue networks in IPFire.
The version in IPFire is the latest and last version from the ISC DHCP server.
ISC have changed from the DHCP package for the server to the Kea package.
The migration from DHCP to Kea is on the list but it is not easy, as the syntax and config file construction are radically different. We need to figure out how to migrate all the users' systems, irrespective of the options etc. that they have used.
The dhcpcd client is the package from Roy Marples and that is the one being used as a client on the red interface.
We currently have dhcpcd-10.2.2 in Core Update 193 Testing, which is the latest version, from Feb this year.