WIO & OpenVPN - fcron[nnn]: process already running: root’s ‘test -x /usr/local/bin/run-parts

As I have now discovered by chance, I have recently had the same problem (30% of an 8-core CPU and 11% RAM of 12 GiB).

WIO has worked well for several years now and regularly sent warnings via email when a computer failed. Emails about OpenVPN connections were also sent correctly.

The problem probably occurred when I installed the update 2.27 core 182 a few weeks ago (previously I [probably] had 2.27 core 178 on it), because the day of the update was “Last checked” in WIO.
I noticed it now when I installed IPFire (now updated to 2.29 core 184) on new hardware and imported the backup of the previous system.

Uninstalling, reinstalling and setting up WIO again without using the backup didn’t bring any improvement.

If I deactivate “Enable OpenVPN RW and IPsec Statusmails?”, kill the viovpn.pl process, and then activate “Enable OpenVPN RW and IPsec Statusmails?” again, it works again for a few hours.

You can also easily identify the problem via the GUI, as “Last checked” is no longer current in the event of an error.
I activated fcron logging and found the following in the log in the event of an error:

07:44:00 fcron[7458]: process already running: root’s ‘test -x /usr/local/bin/run-parts && /usr/local/bin/run-parts /etc/fcron.minutely’
07:43:00 fcron[7458]: process already running: root’s ‘test -x /usr/local/bin/run-parts && /usr/local/bin/run-parts /etc/fcron.minutely’
07:41:59 fcron[7458]: process already running: root’s ‘test -x /usr/local/bin/run-parts && /usr/local/bin/run-parts /etc/fcron.minutely’
07:41:00 fcron[7458]: process already running: root’s ‘test -x /usr/local/bin/run-parts && /usr/local/bin/run-parts /etc/fcron.minutely’
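
In case it helps anyone checking their own logs for the same symptom, here is a minimal Python sketch that picks the "process already running" lines out of a log excerpt. The line layout (time, then `fcron[pid]:`) is assumed from the excerpt above, and the function name `find_skipped_runs` is just something made up for this example.

```python
import re

# Layout assumed from the excerpt above: "HH:MM:SS fcron[pid]: process already running: <job>"
SKIP_RE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}) fcron\[(\d+)\]: process already running: (.*)$"
)


def find_skipped_runs(lines):
    """Return (time, pid, job) tuples for every 'process already running' line."""
    hits = []
    for line in lines:
        match = SKIP_RE.match(line.strip())
        if match:
            hits.append(match.groups())
    return hits


# Two fcron lines from the log above plus one unrelated WIO line from this thread.
sample = [
    "07:44:00 fcron[7458]: process already running: root’s ‘test -x /usr/local/bin/run-parts && /usr/local/bin/run-parts /etc/fcron.minutely’",
    "07:43:00 fcron[7458]: process already running: root’s ‘test -x /usr/local/bin/run-parts && /usr/local/bin/run-parts /etc/fcron.minutely’",
    "19:35:42 wio: Client: test1.domain.org - IP: 192.168.128.20 - Status: Active",
]

for time, pid, job in find_skipped_runs(sample):
    print(f"{time} pid={pid} skipped: {job}")
```

Feeding it the real log lines instead of `sample` would show at what time the skipped runs start, which could then be compared with the “Last checked” time in WIO.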

I hope I can contribute something to the solution with the information. The emails about when a VPN connection was established would be very important to me.

Hi @wniva

Welcome to the IPFire community.

No change has been made to the WIO code in the period CU178 to CU184.

The last code change to WIO was in Aug 2020, which was CU148.

When you say new hardware, what was the difference in the hardware? Did both systems have the same CPU architecture?

Your logs indicate a problem with a file in the /etc/fcron.minutely/ directory.
Can you confirm that you only have the wio helper script in that directory? It will be named wio and be around 400 bytes in size.
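
If you would rather check that from a script than by hand, something like the following Python sketch lists every entry in the directory with its size. The function name `list_entries` is only illustrative; the directory path is the one mentioned above.

```python
from pathlib import Path


def list_entries(directory):
    """Return a sorted list of (name, size_in_bytes) for regular files in directory."""
    entries = []
    for item in Path(directory).iterdir():
        if item.is_file():
            entries.append((item.name, item.stat().st_size))
    return sorted(entries)


if __name__ == "__main__":
    target = Path("/etc/fcron.minutely")
    if target.is_dir():
        for name, size in list_entries(target):
            print(f"{name}: {size} bytes")
    else:
        print(f"{target} not found on this machine")
```

Anything listed besides the wio helper script would be worth a closer look.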

I presume that you have logging enabled on the WIO Configuration page.
If you look in the WUI menu Logs - System Logs, select Who Is Online? in the drop-down box labelled Section:, and then press the Update button, are there any messages in those logs indicating any problems?

If there are only messages such as:-

19:35:42 wio: Client: test1.domain.org - IP: 192.168.128.20 - Status: Active
19:35:42 wio: Client: ipfire2.domain.org - IP: 192.168.26.220 - Status: Inactive

then could you try the following.

Edit line 35 of /var/ipfire/wio/wio.pl

to uncomment

#use warnings;

This should then give more info in the logs when it is having a problem.

How many clients do you have listed in your WIO page?

On my system with 18 clients defined in WIO a full refresh of all the clients takes between 40 and 45 seconds.

I just ran a quick test. I turned on the emails for OpenVPN RW’s and made and closed connections.

I received the status emails without any issues.

My memory is running at 15% for a 4GB memory system.

The CPU is a 4 core system and is running at 2% to 3%.

When running the status emails there was no change in the memory or cpu % values.

I am running with Core Update 184.

At the moment I cannot reproduce your issue.

Hopefully getting some more log info might help to see what is going on.

In the past I used to have problems with WIO that resulted from tests using FQDNs and a failing unbound service on the firewall itself. Sometimes (a dozen or so core updates ago) unbound just stopped and needed a restart.

As soon as the FQDNs did not resolve, I received an alarm for every host in the list. Changing the WIO tests to IP only as a workaround was okay as long as unbound was causing troubles. Now everything is working as designed again.

Sorry for the late reply.

Regarding the questions (C=comment/Q=question/A=answer):

@bonnietwin

Q: When you say new hardware, what was the difference in the hardware. Did both systems have the same CPU architecture?
A: Yes, there was a change here:
old: AMD Opteron
new: Intel Xeon

Q: Can you confirm that you only have the wio helper script in that directory.
A: There is an info.txt and wio file in there.

C: …are there any messages in those logs indicating any problems.
A: There is no abnormality found in the logs (as long as it works). Entries like this also appear: “Client: WIO OVPN [Name] - IP: ... - Status: ACTIVE”

C: Edit line 35 of /var/ipfire/wio/wio.pl to uncomment #use warnings;
A: I’ll try that today.

Q: How many clients do you have listed in your WIO page?
A: 20 OVPN and 28 other connections
I’m currently using 5 minute intervals, but may test again with 10 minutes.

C: At the moment I can’t reproduce your issue.
A: My system sometimes runs for several hours before the situation described occurs. During this time, WIO updates and email notifications work fine.

@datamorgana

C: In the past I used to have problems with WIO that resulted from tests using FQDNs and a failing unbound service on the firewall itself.
Changing the WIO tests to IP only as a workaround was okay as long as unbound was causing troubles.
A: In the meantime (after about 4 weeks) I can narrow down the problem to OpenVPN because I have switched off OpenVPN RW and IPsec and all clients (IP or FQDN) are working without any problems.

I have “Enable OpenVPN RW and IPsec status emails?” activated and “Time interval for checking the OpenVPN RW and IPsec Status” set to 5 minutes.
Additionally, I activated use warnings in wio.pl.

At first this worked, but after about 1 hour the following messages appeared:

fcron[7454]: process already running: root’s ‘test -x /usr/local/bin/run-parts && /usr/local/bin/run-parts /etc/fcron.minutely’

and the clients (IP/FQDN) were no longer updated.

There were no warnings in the WIO system log.

The warnings would be in `/var/log/httpd/error_log`.

This still suggests to me that fcron is trying to start the WIO status check while the previous check has not yet finished.
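
To illustrate what I mean, here is a small Python sketch of a once-a-minute schedule where a new run is skipped while the previous one is still busy. It is only a simplified model of what the log message suggests, not how fcron is actually implemented, and the run times passed in are invented values for the demonstration.

```python
def simulate(job_seconds, interval_seconds=60, ticks=5):
    """Model a periodic job that is skipped while the previous run is still busy.

    Returns a list of (tick_time_in_seconds, action), where action is
    'start' or 'skip'.
    """
    busy_until = 0
    events = []
    for i in range(ticks):
        now = i * interval_seconds
        if now < busy_until:
            # Previous run has not finished: this corresponds to a
            # "process already running" message in the log.
            events.append((now, "skip"))
        else:
            events.append((now, "start"))
            busy_until = now + job_seconds
    return events


# A run that finishes within the minute never gets skipped ...
print(simulate(45))
# ... while a run that needs longer than the interval causes skipped ticks.
print(simulate(130))
```

If a run never finishes at all (for example a hung check), every following tick would be a skip, which would match a “Last checked” time that stops updating.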

Is the “Time interval for checking:” also set to 5 minutes?

There are two different Time interval settings. One is for all the clients and the other one for the OpenVPN and IPSec connections.

If you have both of those settings set to 5 minutes, I would suggest trying something like 15 or 20 minutes to see if that resolves the problem.
