IPFire went down last night, can't find cause

Since the new certificate has been in place, have you had any more memory spikes? Also, did you have to set up every OpenVPN client install again?

Chris

The system went through some kind of reboot yesterday and again today, but today it happened so fast that I didn’t even notice. I don’t know what is causing it. The bootlog is attached.
bootlog.gz (14.3 KB)

Here is the memory graph:

You can see the tiny white line between 3:30 and 4:00 PM today. The uptime on IPFire is 1 hour 3 minutes as of 4:51 PM today, so the system went down / restarted at about 3:48 PM for some unknown reason. This time it was not because of memory load (the graph doesn’t show a spike), but something is still causing our system to go down daily. Here’s /var/log/messages for the 3:00 PM hour, with security-related info removed.
10-19-22-1500.txt.gz (159.1 KB)
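For anyone who wants to double-check the restart time, the arithmetic above can be sketched in a few lines of Python. The helper name `boot_time` is made up for illustration, and the year 2022 is an assumption taken from the attachment file name.

```python
from datetime import datetime, timedelta


def boot_time(observed_at: datetime, uptime: timedelta) -> datetime:
    """Return the moment the system came up, given when uptime was read."""
    return observed_at - uptime


# Values from the post: uptime of 1 hour 3 minutes, read at 4:51 PM on Oct 19.
# The year is assumed from the attachment name "10-19-22-1500.txt.gz".
observed = datetime(2022, 10, 19, 16, 51)
uptime = timedelta(hours=1, minutes=3)

print(boot_time(observed, uptime).strftime("%H:%M"))  # 15:48
```

That agrees with the 3:48 PM estimate in the post.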

If anyone could help I would appreciate it.

The 05:30 crash is not in the message log. That might be the important one since memory usage starts climbing near 05:00.

I can find the one that happened at 15:46:43 (up at 15:48:05):

Oct 19 15:46:43 ipfire kernel: DROP_INPUT IN=red0 OUT= MAC=<red-mac-address>:84:bb:69:d2:b8:b0:08:00 SRC=47.35.152.95 DST=<External-IP> LEN=80 TOS=0x00 PREC=0x00 TTL=111 ID=57233 PROTO=UDP SPT=55342 DPT=64645 LEN=60 MARK=0x80000000

Oct 19 15:48:05 ipfire syslogd 1.5.1: restart (remote reception).
Oct 19 15:48:05 ipfire kernel: usb 1-5.2.2.1: New USB device found, idVendor=0557, idProduct=2419, bcdDevice= 1.00

I don’t see anything interesting in the message log. To me it acts like a power issue (lost power for whatever odd reason) or maybe a hardware problem. Maybe someone else can look and add their comments.
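The way the 15:46:43 / 15:48:05 gap was spotted by eye can be roughly automated. The sketch below is only an illustration: the function names are invented, the year is assumed, and the sample lines are shortened copies of the excerpt above. It looks for `syslogd ... restart` lines and reports the timestamp of the last message before each one. A restart line may also appear when the log daemon is merely reloaded, so treat a hit as a hint, not as proof of a reboot.

```python
from datetime import datetime


def parse_ts(line, year=2022):
    """Parse the leading 'Oct 19 15:46:43' stamp (assumes English month names)."""
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def find_restarts(lines, year=2022):
    """Return (last_seen, restarted_at, gap) for every syslogd restart line."""
    restarts = []
    previous = None
    for line in lines:
        try:
            ts = parse_ts(line, year)
        except ValueError:
            continue  # not a normal syslog line, skip it
        if "syslogd" in line and "restart" in line and previous is not None:
            restarts.append((previous, ts, ts - previous))
        previous = ts
    return restarts


# Shortened copies of the three lines quoted above.
sample = [
    "Oct 19 15:46:43 ipfire kernel: DROP_INPUT IN=red0 OUT= ...",
    "Oct 19 15:48:05 ipfire syslogd 1.5.1: restart (remote reception).",
    "Oct 19 15:48:05 ipfire kernel: usb 1-5.2.2.1: New USB device found",
]

for before, after, gap in find_restarts(sample):
    print(f"logging stopped {before:%H:%M:%S}, resumed {after:%H:%M:%S}, gap {gap}")
```

On the sample this reports a gap of 1 minute 22 seconds with nothing logged just before it, which fits the observation that the log shows no warning ahead of the restart.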

I have the same problem: the openvpn-authent process fills memory until it is full. This happens every night or after a new certificate is created.
Other processes are then terminated.

I have now deleted the certificates and deactivated OpenVPN. Since then the problem has not occurred again.

Same thing occurred again this morning:

Oct 17 05:00:34 ipfire openvpnserver[31339]: MANAGEMENT: Client connected from /var/run/openvpn.sock
Oct 17 06:25:26 ipfire kernel: openvpn-authent invoked oom-killer: gfp_mask=0x1100dca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
Oct 17 06:25:26 ipfire kernel: CPU: 0 PID: 16199 Comm: openvpn-authent Not tainted 5.15.59-ipfire #1
Oct 17 06:25:26 ipfire kernel: [  16199]     0 16199  2154150  1858074 16859136   236139             0 openvpn-authent
Oct 17 06:25:26 ipfire kernel: [  31339]    99 31339     1910       32    49152      145             0 openvpn
Oct 17 06:25:26 ipfire kernel: [  31389]     0 31389     4059        0    73728     1802             0 openvpn-authent
Oct 17 06:25:26 ipfire kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/,task=openvpn-authent,pid=16199,uid=0
Oct 17 06:25:26 ipfire kernel: Out of memory: Killed process 16199 (openvpn-authent) total-vm:8616600kB, anon-rss:7432056kB, file-rss:240kB, shmem-rss:0kB, UID:0 pgtables:16464kB oom_score_adj:0
grep: messages: binary file matches
[root@ipfire log]# uptime
 09:09:18 up  4:16,  1 user,  load average: 1.43, 1.37, 1.24
[root@ipfire log]#
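To put the numbers in that kernel output into perspective, here is a minimal Python sketch that pulls the PID, process name and memory figures out of an "Out of memory: Killed process" line. The regular expression and function name are my own, written against the line quoted above, and the 4 kB page size is an assumption (the usual value on x86_64). The kernel truncates task names to 15 characters, so `openvpn-authent` is presumably the openvpn-authenticator program mentioned further down the thread.

```python
import re

KILLED_RE = re.compile(
    r"Out of memory: Killed process (?P<pid>\d+) \((?P<comm>[^)]+)\) "
    r"total-vm:(?P<total_vm>\d+)kB, anon-rss:(?P<anon_rss>\d+)kB"
)

KB_PER_GIB = 1024 ** 2
PAGE_KB = 4  # assumed page size in kB


def parse_oom_kill(line):
    """Return pid, name and sizes in GiB from a 'Killed process' line, or None."""
    m = KILLED_RE.search(line)
    if m is None:
        return None
    return {
        "pid": int(m["pid"]),
        "comm": m["comm"],
        "total_vm_gib": int(m["total_vm"]) / KB_PER_GIB,
        "anon_rss_gib": int(m["anon_rss"]) / KB_PER_GIB,
    }


line = (
    "Oct 17 06:25:26 ipfire kernel: Out of memory: Killed process 16199 "
    "(openvpn-authent) total-vm:8616600kB, anon-rss:7432056kB, file-rss:240kB, "
    "shmem-rss:0kB, UID:0 pgtables:16464kB oom_score_adj:0"
)

info = parse_oom_kill(line)
print(f"{info['comm']} (pid {info['pid']}): {info['anon_rss_gib']:.2f} GiB resident")

# Cross-check against the task table row for pid 16199: rss is listed in pages.
print(1858074 * PAGE_KB == 7432056 + 240)  # rss pages vs. anon-rss + file-rss
```

So a single openvpn-authent process was holding roughly 7 GiB of RAM when it was killed, and the per-task table agrees with the summary line under the 4 kB page assumption.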

Is there any way to downgrade to Core 168, where these problems did not occur, or do you have to delete and rebuild the whole system?

4 restarts of some sort between Wednesday at 15:00 and today at 09:00.

Sorry to say you have to rebuild.

EDIT:
Please open a bug report about the openvpn-authent issue. This will help make sure the Development team reviews this information.

Log in using your IPFire email address and the IPFire password.

Information to add a bug report in IPFire Bugzilla:

Is there an archive ftp server that holds the older build versions?

Chris

yes

https://mirror1.ipfire.org/releases/ipfire-2.x/

Chris - FYI

I already downloaded that version, but I can’t reboot until after hours, probably this weekend.

Chris

Hi Chris, stop OpenVPN, delete the certificate, upgrade to Core 171, and reboot. Then create a new certificate and start OpenVPN.

It seems to have helped me. It’s worth a try and less work than reinstalling everything.

Many greetings

Jürgen Schamberger

If it is a bug in OpenVPN < 2.5.7, then it is not a universal problem; it must be a combination of an earlier version and something else.

I have been using OpenVPN with IPFire for several years and have not seen any Out Of Memory events in all that time (I have grepped through the IPFire logs for "oom" and "Out of memory" and found nothing), and my memory graph has not shown any large spikes. The used memory has varied between 7% min and 27% max.

It might have some relation to interactions with other packages. I have seen some references to qemu also being impacted but I don’t use qemu on my IPFire system.
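For others who want to run the same kind of check, a small Python sketch follows. It is only an illustration with an invented function name: it works on raw bytes, so stray non-text bytes in the log (the situation behind a "binary file matches" message from grep) do not hide any hits, and it matches both "oom-killer" and "Out of memory" regardless of case.

```python
import re

OOM_RE = re.compile(rb"oom-killer|out of memory", re.IGNORECASE)


def oom_lines(raw_lines):
    """Return the decoded lines that mention the OOM killer.

    raw_lines is any iterable of bytes, for example a file opened with "rb".
    """
    hits = []
    for raw in raw_lines:
        if OOM_RE.search(raw):
            # Undecodable bytes are replaced instead of raising an error.
            hits.append(raw.decode("utf-8", errors="replace").rstrip("\n"))
    return hits


# Made-up sample data; on a real system one would pass the log file instead:
#   with open("/var/log/messages", "rb") as f:
#       hits = oom_lines(f)
sample = [
    b"Oct 17 05:00:34 ipfire openvpnserver[31339]: MANAGEMENT: Client connected\n",
    b"\x00\xff some non-text bytes\n",
    b"Oct 17 06:25:26 ipfire kernel: openvpn-authent invoked oom-killer: order=0\n",
    b"Oct 17 06:25:26 ipfire kernel: Out of memory: Killed process 16199\n",
]

for hit in oom_lines(sample):
    print(hit)
```

An empty result over all rotated logs would match the experience described above, where no OOM events were ever recorded.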

I’ll try that and post the results later today.

Chris

Unfortunately, I was happy too soon. The same thing happened again tonight:

Oct 21 01:49:21 famschamrouter kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/,task=openvpn-authent,pid=7917,uid=0
Oct 21 01:49:21 famschamrouter kernel: Out of memory: Killed process 7917 (openvpn-authent) total-vm:8616456kB, anon-rss:6662008kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:15108kB oom_score_adj:0


So far it has not happened to me, uptime is now a whopping 14 hours and counting.

I have run into another issue though. Since upgrading both sides of an IPsec tunnel to 171, we have lost communication between the sites. Even though the tunnel appears connected, I cannot ping either side from the other. Has anyone else experienced this?

I created a bug report in IPFire Bugzilla:

https://bugzilla.ipfire.org/show_bug.cgi?id=12963

Jürgen

I installed Core 168, 169, 170 and 171 on another machine today and tested it with a minimal configuration.

The problem seems to have existed since Core 169. On Core 168 the openvpn-authenticator process doesn’t start because the file in /usr/sbin doesn’t exist.

I created a certificate in Core 168 and then upgraded to Core 171. With this “old” certificate, OpenVPN can be stopped and started without any problems.

I think there is a problem with Core 169’s certificate generation.
Was a new OpenVPN version installed with Core 169, or is it TOPT?

Jürgen

I have no idea what TOPT means.

OpenVPN was updated from 2.5.4 to 2.5.6 in CU169 and then to 2.5.7 in CU171.

From 2.5.4 to 2.5.6 there was no change to the rootfile and openvpn-authenticator does not come from OpenVPN. It is a separate program created as part of the 2FA addition to OpenVPN, which was introduced with CU169.

Are you using the 2FA option with OTP selected on your clients?

I am not using OTP with my OpenVPN connection and I don’t have openvpn-authenticator running at all, although it is available in the sbin directory. If you are using OTP, what happens if you don’t use it? Does the problem with openvpn-authenticator maxing out memory still occur?

If the issue is actually with the certificate generation then that would not be coming from OpenVPN but from OpenSSL.

OpenSSL was updated from 1.1.1o to 1.1.1p in CU169 and then to 1.1.1q in CU170.

In OpenVPN-2.5.6 the only change mentioned that relates to certificates is:

"repair handling of EC certificates on Windows with pkcs11-helper"
So this relates to Windows and to pkcs11, neither of which is relevant to what IPFire OpenVPN uses.

Sorry, I’m not an expert, just a user. I meant OTP. I can only report what I observe. The problem seems to occur from Core 169 onward when creating a new certificate. I don’t know if it’s OpenSSL.
I don’t use OTP.