Fcron strange behavior

binabik · 28 July 2021 12:00

Hello,

I have been using IPFire for a few years. The computer boots every day at 05:30 using the wakeup BIOS mechanism and shuts down automatically using a scheduled script at 22:30 (fcrontab). There have been no problems for a long time. But since the upgrade to core 157 I’m experiencing problems with it. Some days, randomly, the shutdown script runs about 2 minutes after system boot and shuts down the system. When that happens I simply turn on the computer manually and all is well, the system works fine. I have gone through my fcrontab file and the shutdown script only appears once, scheduled at 22.30 every day. I have no entries for the script in any of the fcron folders under /etc. I am the only user on the computer.
Any idea what is going on? Why is the script running when it is not scheduled?

Note: when I upgraded to core 157 I lost the grub configuration and had to reinstall IpFire using a backup to restore.

Thank you very much.

bbitsch · 28 July 2021 12:18

How exactly is the fcrontab schedule line?
Are there any options to execute ‘undone events’?

binabik · 28 July 2021 12:55

Hello,

This is the fcrontab line:
30 22 * * * /scripts/apagado/apagado_firewall.sh

Apagado means shutdown in spanish.

This is the content of the apagado_firewall.sh (no rocket science):

#!/bin/bash

# sync time
hwclock --systohc

# send a report (uptime, disk usage, and some data more)
source /scripts/sendreports/full_report.sh

# # shutdown the firewall
/sbin/shutdown -h now

Today, this line was in my /var/logs/messages file:

Jul 28 05:32:20 wall shutdown[5188]: shutting down for system halt

And using a custom script and ps to monitor the processes I got this:

root      3595  0.0  0.0   3544  1716 ?        Ss   05:30   0:00 /usr/sbin/fcron -y
root      5125  0.0  0.0   3544   196 ?        S    05:32   0:00  \_ /usr/sbin/fcron -y
root      5126  0.0  0.0   4188  2900 ?        Ss   05:32   0:00      \_ /bin/bash /scripts/apagado/apagado_firewall.sh

Thanks for your reply.

binabik · 28 July 2021 13:07

Hi again,

Sorry Bernhard, I don’t know what you mean by ‘undone events’.

And yes, the system shut down yesterday at 22:30 and sent me the report.

Thanks.

bbitsch · 28 July 2021 16:58

Okay, was just an idea.
If IPFire was stopped before 22:30, the scheduled task could have been executed after restart.
But your line and the sequence of events doesn’t confirm this case.

mfischer · 28 July 2021 18:31

Hi,

just curious: why don’t you use the “Connection Scheduler”?

conn_schedul_shutdown

Best,
Matthias

binabik · 29 July 2021 09:03

Hi Matthias,

I used to use Connection Scheduler but then I needed to perform some tasks before shutdown so I switched to a script. Now I don’t need to perform those tasks anymore so I guess I could schedule sending reports and use the Connection Scheduler to shutdown the computer. I will do that if there is no other solution. However, I would like to know why this is happening, if I am doing something wrong or if it is a system problem.

Thank you.

jon · 29 July 2021 15:09

You may want to check if the clock is correct from boot time (05:30) to sudden/unexpected shutdown. Maybe the clock states ~22:28 at boot time…

This info might be in one of the logs.

binabik · 30 July 2021 09:02

Hi Jon,

I believe this is not the case. The only line I have found in the message log shows the correct system time:

Jul 28 05:32:20 wall shutdown[5188]: shutting down for system halt

The strange thing is that the problem occurs randomly with no apparent pattern.

Thanks a lot.

bbitsch · 30 July 2021 09:34

Does your system have a RTC?
Can you monitor the boot process from a console?
What time is given for the system start ( lines like

Jul 27 18:37:13 BitschCop acpid: exiting
Jul 27 18:37:15 BitschCop kernel: Kernel logging (proc) stopped.
Jul 27 18:37:15 BitschCop kernel: Kernel log daemon terminating.
Jul 27 18:37:16 BitschCop exiting on signal 15
Jul 27 18:39:22 BitschCop syslogd 1.5.1: restart (remote reception).

in /var/log/messages) ?

Just some thoughts.
Background: If there is no (functioning) RTC the last time setting is valid at system restart. If the fetch from a time server lasts longer than the activation of fcron, old time events are executed.

binabik · 30 July 2021 10:58

Hi,

The system starts at 05:30. At that time I am still sleeping But I have created a script that is launched at boot time (fcrontab, @reboot). That script runs for the first 20 minutes taking a snapshot of the running processes using the ps command every second (ps auxf). This is how I was able to display the lines:

root      3595  0.0  0.0   3544  1716 ?        Ss   05:30   0:00 /usr/sbin/fcron -y
root      5125  0.0  0.0   3544   196 ?        S    05:32   0:00  \_ /usr/sbin/fcron -y
root      5126  0.0  0.0   4188  2900 ?        Ss   05:32   0:00      \_ /bin/bash /scripts/apagado/apagado_firewall.sh

The system has the RTC from the BIOS:

hwclock -r
2021-07-30 11:43:54.669813+0:00

I think the clock is OK at boot time:


Jul 27 22:30:41 wall acpid: exiting
Jul 27 22:30:42 wall kernel: Kernel logging (proc) stopped.
Jul 27 22:30:42 wall kernel: Kernel log daemon terminating.
Jul 27 22:30:43 wall exiting on signal 15
Jul 28 05:30:27 wall syslogd 1.5.1: restart (remote reception).
..............
Jul 28 05:32:50 wall acpid: exiting
Jul 28 05:32:51 wall kernel: Kernel logging (proc) stopped.
Jul 28 05:32:51 wall kernel: Kernel log daemon terminating.
Jul 28 05:32:52 wall exiting on signal 15
Jul 28 07:57:24 wall syslogd 1.5.1: restart (remote reception).

The last line is manual power on.

Thanks.

bbitsch · 30 July 2021 11:05

Okay.
This isn’t the problem. ( If fcrond isn’t started before syslogd )
At the moment I haven’t a further idea.

binabik · 30 July 2021 13:10

Hi,

Look at this:

2021/07/28 05:31:09 (Wednesday)
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root      2109  0.0  0.0   2440  1624 ?        Ss   05:30   0:00 syslogd -m 0 -r
root      3595  0.0  0.0   3544  1716 ?        Ss   05:30   0:00 /usr/sbin/fcron -y

According to this, the PID of the syslog is lower than the fcron, so I believe syslog started before frcon. Am I right?

Thanks.

binabik · 23 August 2021 10:33

Hi,
I’ve solved the issue simply adding a delay to the shutdown command in the apagado_firewall.sh, so the content of the script is now this:

#!/bin/bash

# sync time
hwclock --systohc

# send a report (uptime, disk usage, and some data more)
source /scripts/sendreports/full_report.sh

# # shutdown the firewall
/sbin/shutdown -h +10

I was using sendEmail for sending reports for a while. When I upgraded to core 158, sendEmail was removed so I created my own script for sending reports, using python and smtplib module. That was when the problems started.
I use my script for sending reports in others machines without any problem, and even in this IPfire machine the script works well and the reports are sent. But for some unkown reason, I think this is the source of the random problem (connection to smtp server not closed properly maybe).

How I said, adding a delay between the sent of the report and the shutdown, has soved de problem.

Thanks to all.