IPFire went down last night, can't find cause

Hi all,
@cwensink I would check your OpenVPN logs and not only grep them for errors; you may find more information there. Nevertheless, something seems to be really wrong with your OpenVPN configuration. You can possibly also find this message in the OpenVPN logs: "TLS Error: TLS key negotiation failed to occur within 60 seconds". Since this message appears so frequently, OpenVPN has created a dedicated page/checklist for this problem → https://openvpn.net/faq/tls-error-tls-key-negotiation-failed-to-occur-within-60-seconds-check-your-network-connectivity/ Maybe you already know it or have checked it?
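
To illustrate looking beyond a plain grep, here is a rough sketch that counts the TLS negotiation failures per hour and per remote address. The syslog-style line layout and the regular expression are assumptions on my side, not something taken from your logs, so the pattern will likely need adjusting.

```python
import re
from collections import Counter

# The message text quoted above; everything else about the line layout is assumed.
TLS_FAIL = "TLS key negotiation failed to occur within 60 seconds"

# Assumed layout: "Mon DD HH:MM:SS host process[pid]: IP:PORT TLS Error: ..."
LINE_RE = re.compile(
    r"^(\w{3}\s+\d+\s+\d{2}):\d{2}:\d{2}\s.*?(\d+\.\d+\.\d+\.\d+):\d+\s+TLS Error"
)

def tls_failures(lines):
    """Return (failures per hour, failures per remote address)."""
    by_hour = Counter()
    by_peer = Counter()
    for line in lines:
        if TLS_FAIL not in line:
            continue
        match = LINE_RE.search(line)
        if match:
            by_hour[match.group(1)] += 1
            by_peer[match.group(2)] += 1
    return by_hour, by_peer

if __name__ == "__main__":
    # Hypothetical path; point this at wherever your OpenVPN messages end up.
    with open("/var/log/messages", errors="replace") as log:
        hours, peers = tls_failures(log)
    print(hours.most_common(5))
    print(peers.most_common(5))
```

If one address or one time window dominates the counts, that narrows down whether it is a single misbehaving client or something broader.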

Another thing: it would be great if you could use code tags when you post logs (easier to read).

Those are some ideas for one of the problems. It might also be helpful to separate the different problems (OOM, OpenVPN, Unbound, SMART status) a little more, both for your own investigation and for the community; help may come faster that way.
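
As a rough sketch of that sorting, something like the following could split log lines into one bucket per problem. The keyword lists are my own assumptions and certainly not complete, so treat them as a starting point only.

```python
from collections import defaultdict

# Assumed keywords per problem area; tune these to what your logs actually contain.
CATEGORIES = {
    "OOM": ("out of memory", "oom-killer"),
    "OpenVPN": ("openvpn",),
    "Unbound": ("unbound",),
    "SMART": ("smartd", "smartctl"),
}

def sort_by_problem(lines):
    """Group log lines by the first category whose keyword they contain."""
    buckets = defaultdict(list)
    for line in lines:
        lowered = line.lower()
        for name, keywords in CATEGORIES.items():
            if any(keyword in lowered for keyword in keywords):
                buckets[name].append(line)
                break
        else:
            # No keyword matched; keep the line so nothing is silently dropped.
            buckets["other"].append(line)
    return dict(buckets)
```

Posting each bucket separately (in code tags) would make it easier for others to pick up the part they know best.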

Best,

Erik

Not sure if it helps.
I have had no problems with Core 169.
The core update did add two-factor authentication to OpenVPN,
a feature I'm not using.
As a side note, I'm not a Tor add-on user either.

How long has this appliance been in service and were any updates or changes made, no matter how minor, before the first crash instance?
What version and core update are you running?

The most worrying thing I've seen in this entire post is the drive I/O errors. If the drive is failing, it will cause all sorts of random errors. The same can be said for memory, though. I get that SMART says the drive is healthy, but it's not entirely reliable. It could also be a physical issue, like a loose power or data cable, or maybe a failing port.
Not sure how long you can take this appliance out of service, but if no software changes were made, I would install a new drive, restore the latest backup file, and run memtest.

Tuesday night I built a new IPFire machine, and the next two days have been quiet. According to the messages log, the OpenVPN errors ended on 9/20. It's entirely possible that all of this had nothing to do with IPFire and was related to a hardware failure or possibly a security breach on the router, but the router is offline now.

Any suggestions on what utilities to run to test the hardware of the device? The machine is a Micro ITX Supermicro box with a quad-core Atom processor (2.7 GHz), 8 GB RAM, and a 120 GB SSD, and I am OK with any testing app in an offline environment.

What apps do you guys like to use?

DisturbedDragon,

This unit has been in service for about 5 years and has had a number of configuration changes along the way. It first started having these major problems when running 169, and for troubleshooting I tried updating to 170, but that did not fix the issue. This appliance has an external power supply similar to a laptop's, except the plug screws in on the end (like a water hose).

Chris

With the new machine in place and working, I suppose the issue is resolved.
The issue appears to be an acute one: the machine was working fine one day and then became crashy. I'm going with a hardware error, again pointing to the I/O issues logged. You could still troubleshoot the old machine with the steps provided previously, since it is now offline.
I will say that upgrading a machine that is experiencing issues is a bad idea. Updating the software could introduce a new bug, compounding the existing problems with new ones. Better to resolve any issues first and then upgrade.