Pmacctd - shuts down at 1:25

hellfire · 15 November 2019 07:40

Hi,

I’m starting pmacctd from command line as a daemon, using

pmacctd -f /etc/pmacct/ -D

The daemon runs, of course, but I noticed that every day at 1:25 it shuts itself down and does not restart. I’ve been using pmacct/d for testing purposes for quite a few days; however, each night it stops and does not restart.

The IPFire log says nothing about the cause. Some lines are added to /var/log/messages; I cannot post them at the moment but will do so when I am home again. Nevertheless, the logs indicate no error or anything similar at that point in time, neither from pmacct nor from IPFire.

As already mentioned, it is strange that this happens every day at the very same time. IPFire of course does not shut down, nor does it lose its internet connection or its connection to the switch/LAN, nor does it run low on memory at this time (I’m monitoring this, too, and can say this for sure).

Any idea what might happen here?
Michael

hellfire · 15 November 2019 15:04

So here is the promised log extract at 1:25

Nov 10 01:24:04 ipfire pakfire: PAKFIRE INFO: IPFire Pakfire 2.23-x86_64 started!
Nov 10 01:24:04 ipfire pakfire: CORE INFO: core-list.db is 2399 seconds old. - DEBUG: noforce
Nov 10 01:25:07 ipfire pmacctd[11359]: WARN ( default/core ): connection lost to ‘green_ports-memory’; closing connection.
Nov 10 01:25:08 ipfire pmacctd[11359]: WARN ( default/core ): connection lost to ‘green_basic-memory’; closing connection.
Nov 10 01:25:09 ipfire pmacctd[11359]: WARN ( default/core ): connection lost to ‘green_full-memory’; closing connection.
Nov 10 01:25:09 ipfire pmacctd[11359]: WARN ( default/core ): no more plugins active. Shutting down.
Nov 10 01:25:09 ipfire kernel: device green0 left promiscuous mode
Nov 10 01:25:11 ipfire pmacct: GeoIP database has been updated
Nov 10 01:25:13 ipfire pakfire: PAKFIRE INFO: IPFire Pakfire 2.23-x86_64 started!

What’s interesting: I’m logging the output of the pmacct print plugin from within a Python script, and I just saw that the corresponding CSV file was still being processed at the moment when pmacctd shut down:

2019-11-15 01:25:09,952 - traffic - DEBUG - Performing reverse-resolve of 157.230.131.25
2019-11-15 01:25:09,954 - traffic - DEBUG - Resolved 157.230.131.25 to: [‘on-us-cloudsync4.synology.com’, ‘157.230.131.25’]
2019-11-15 01:25:10,049 - traffic - DEBUG - Performing reverse-resolve of 192.168.6.97
2019-11-15 01:25:10,050 - traffic - DEBUG - Returning cached result for 192.168.6.97 ([‘Donau’, ‘192.168.6.97’])
2019-11-15 01:25:10,051 - traffic - DEBUG - Performing reverse-resolve of 165.22.130.180
2019-11-15 01:25:10,053 - traffic - DEBUG - Resolved 165.22.130.180 to: [‘on-us-cloudsync5.synology.com’, ‘165.22.130.180’]

Obviously, pmacctd exited a few seconds after the last print trigger was set. Since then, no more logs were written into my DB.

Still no clue though where to look for the shutdown of pmacctd.
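
For anyone chasing a similar timed shutdown, one way to narrow it down is to list which other programs wrote to /var/log/messages around the moment pmacctd logged "Shutting down". The following is only a minimal sketch, assuming the classic syslog line layout seen in the extract above; the names `parse`, `neighbours` and `SAMPLE` are illustrative and not part of pmacct or IPFire, and the year has to be supplied because syslog lines do not carry one.

```python
import re
from datetime import datetime, timedelta

# Matches classic syslog lines such as
# "Nov 10 01:25:09 ipfire pmacctd[11359]: WARN ( default/core ): ..."
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) (?P<host>\S+) "
    r"(?P<prog>[^\s:\[]+)(?:\[\d+\])?: (?P<msg>.*)$"
)


def parse(line, year=2019):
    """Return (timestamp, program, message) or None if the line does not match.

    The year is an assumption passed in by the caller.
    """
    m = LINE_RE.match(line)
    if m is None:
        return None
    ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
    return ts, m["prog"], m["msg"]


def neighbours(lines, marker="Shutting down", window=timedelta(seconds=90)):
    """List programs that logged within `window` of the first marker line."""
    events = [e for e in map(parse, lines) if e is not None]
    hits = [ts for ts, _, msg in events if marker in msg]
    if not hits:
        return []
    t0 = hits[0]
    progs = []
    for ts, prog, _ in events:
        if abs(ts - t0) <= window and prog not in progs:
            progs.append(prog)
    return progs


# A few lines taken from the log extract above.
SAMPLE = [
    "Nov 10 01:24:04 ipfire pakfire: PAKFIRE INFO: IPFire Pakfire 2.23-x86_64 started!",
    "Nov 10 01:25:09 ipfire pmacctd[11359]: WARN ( default/core ): no more plugins active. Shutting down.",
    "Nov 10 01:25:09 ipfire kernel: device green0 left promiscuous mode",
    "Nov 10 01:25:11 ipfire pmacct: GeoIP database has been updated",
]

print(neighbours(SAMPLE))
```

Anything that shows up next to pmacctd in the result (here the "pmacct: GeoIP database has been updated" entry) is a candidate to look at more closely.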

hellfire · 16 November 2019 15:35

Same happened again, today.

I’ve now set up a cron job to restart pmacctd at 1:28. I guess this will work; however, I would rather know the source of the shutdown.

I’ve found an option (plugin_exit_any) in the pmacct Wiki which might solve this, but reading the description, I feel that the opposite will happen if I set it to true.

Hence, I added the option to my config, but I explicitly set the value to false, which is the default value, btw. But one never knows…

Michael

ummeegge · 16 November 2019 19:42

Hi Michael,
is there a segfault to be found anywhere via dmesg? I have also found some problems here with the print plugin in the 1.7.3 version. Isn’t 1:25 also fcron time?

Currently not sure what’s happening there…

Best,

Erik

hellfire · 16 November 2019 19:52

Hi Erik,

nothing special here on my side: dmesg | grep segfault does not return a single line, if I used this command correctly.

Cron job? Cron logging is disabled for good reasons (too many log entries), and I could not find a single line in the crontab that points exactly to 1:25.

Btw, I’ve opened a ticket in git: https://github.com/pmacct/pmacct/issues/340

Michael

ummeegge · 16 November 2019 19:54

Great! I will see if I can find some more useful info here as well, and may come around if it fits.

Best,

Erik

hellfire · 16 November 2019 19:59

Btw, are you running pmacctd 24/7? I mean, did you also get those shutdowns of the daemon at a specific point in time?

OT: will pmacct work with core 137, already?

Michael

ummeegge · 16 November 2019 20:09

I have found it once:

Nov 13 01:25:12 ipfire pmacctd[18528]: WARN ( default/core ): connection lost to ‘plugin1-memory’; closing connection.
Nov 13 01:25:12 ipfirer pmacctd[18528]: WARN ( default/core ): no more plugins active. Shutting down.
Nov 13 01:25:15 ipfire pmacctd[27939]: INFO ( default/core ): Start logging …
Nov 13 01:25:15 ipfire pmacctd[27940]: INFO ( default/core ): Promiscuous Mode Accounting Daemon, pmacctd 1.7.3-git (20190418-00+c4)

although it did restart at that time.

Otherwise I have found this one

Nov 13 01:25:15 ipfire-server pmacct: GeoIP database has been updated

at that time. Which is the GeoIP updater in fcron.

I currently have this problem with the memory plugin:

Nov 12 20:22:16 ipfire-server pmacctd[18528]: WARN ( plugin1/memory ): Failed during write: Resource temporarily unavailable
Nov 12 20:22:17 ipfire-server pmacctd[18531]: WARN ( plugin1/memory ): Missing data detected (plugin_buffer_size=344 plugin_pipe_size=4096000).
Nov 12 20:22:17 ipfire-server pmacctd[18531]: WARN ( plugin1/memory ): Increase values or look for plugin_buffer_size, plugin_pipe_size in CONFIG-KEYS document.#012
Nov 13 01:25:12 ipfire-server pmacctd[18528]: WARN ( default/core ): connection lost to ‘plugin1-memory’; closing connect

which does not seem to be a configuration issue, but I need to investigate it a little deeper.

Best,

Erik

EDIT: Can you please also check the version?

hellfire · 16 November 2019 20:16

OK, you hit the issue with the memory plugin; mine is with the print plugin, both at 1:25. Funny and also strange.

Which version of what do you mean? I’m still using the pmacct build you compiled recently and put here for download: https://people.ipfire.org/~ummeegge/pmacct/pmacct+rabbitmq+GeoIP/

[root@ipfire metrics]# pmacct -V
pmacct IMT plugin client, pmacct 1.7.3-git (20190418-00+c4)
'--prefix=/usr' '--sysconfdir=/etc/pmacct' '--enable-static=no' '--enable-sqlite3' '--enable-l2' '--enable-plabel' '--enable-rabbitmq' '--enable-geoipv2' '--enable-jansson' 'CFLAGS=-O2 -pipe -Wall -fexceptions -fPIC -m64 -mindirect-branch=thunk -mfunction-return=thunk -mtune=generic -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fstack-protector-strong' '--enable-64bit' '--enable-traffic-bins' '--enable-bgp-bins' '--enable-bmp-bins' '--enable-st-bins'

hellfire · 16 November 2019 20:34

Same here, as posted above from my logs. Where is this job configured? crontab does not list it. Is it buried somewhere in the /etc/fcron.xxx subfolders?

Edit: Found! /etc/fcron.daily. But this means that cron is executing the jobs in this subfolder each day at 1:25? Can I temporarily disable the GeoIP updater?

Edit2: Yes, according to this line in crontab: &nice(10),bootrun 25 1 * * * test -x /usr/local/bin/run-parts && /usr/local/bin/run-parts /etc/fcron.daily
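
To read the schedule out of such a line without counting fields by hand, here is a minimal sketch. It assumes the simplified layout `&options minute hour day month weekday command` as seen in the entry above and does not cover the rest of the fcrontab syntax; `fcron_schedule` and `LINE` are hypothetical names used only for illustration.

```python
def fcron_schedule(line):
    """Split a '&' time-and-date fcron entry into its parts (simplified)."""
    if not line.startswith("&"):
        raise ValueError("only '&' entries are handled in this sketch")
    body = line[1:]
    options = ""
    if body and not body[0].isspace():
        # Options follow '&' directly, e.g. "nice(10),bootrun".
        options, _, body = body.partition(" ")
    fields = body.split(None, 5)
    if len(fields) != 6:
        raise ValueError("expected five time fields and a command")
    minute, hour, day, month, weekday, command = fields
    return {
        # Naive split; options with commas inside parentheses are not handled.
        "options": options.split(",") if options else [],
        "minute": minute,
        "hour": hour,
        "day": day,
        "month": month,
        "weekday": weekday,
        "command": command,
    }


# The crontab entry quoted above.
LINE = ("&nice(10),bootrun 25 1 * * * test -x /usr/local/bin/run-parts "
        "&& /usr/local/bin/run-parts /etc/fcron.daily")

entry = fcron_schedule(LINE)
print(f"runs at {int(entry['hour'])}:{int(entry['minute']):02d}")  # runs at 1:25
print(entry["command"])
```

The minute field comes first and the hour second, so "25 1" is 1:25 at night, which matches the time of the shutdown.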

So still the question: can I disable the GeoIP updater without deleting it from the folder?

Michael

ummeegge · 16 November 2019 21:16

You can just remove the script from /etc/fcron.daily.

hellfire · 17 November 2019 08:47

So far I have not removed the cron job; however, I restart pmacct at 1:28 and this worked today.
It seems it is somehow related to the GeoIP updater, isn’t it?

ummeegge · 17 November 2019 09:13

You can check it by executing the geoip-updater manually. While pmacctd is running, you can execute the script with shell tracing enabled via
bash -x /etc/fcron.daily/geoip-updater
If pmacctd is running both before (ps aux) and after executing the updater, the problem should not be caused by the script, in my opinion.
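
The before/after check can also be scripted. The following is only a sketch under assumptions: it relies on `pgrep -x` being available on the system, and `is_running` and `survives` are hypothetical helper names, not part of pmacct or IPFire.

```python
import subprocess


def is_running(name="pmacctd"):
    """True if `pgrep -x name` finds at least one process with that exact name."""
    result = subprocess.run(["pgrep", "-x", name], stdout=subprocess.DEVNULL)
    return result.returncode == 0


def survives(command, alive=is_running):
    """Run `command` and report whether the daemon was up before and after."""
    before = alive()
    result = subprocess.run(command)
    after = alive()
    return {"before": before, "after": after, "exit_code": result.returncode}


# Example call (runs the real updater, so only use it on the firewall itself):
# print(survives(["bash", "-x", "/etc/fcron.daily/geoip-updater"]))
```

If the result shows "before" as True and "after" as False, the updater (or something it calls) is what takes pmacctd down.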

Best,

Erik

hellfire · 17 November 2019 14:37

Found the source of the issue by executing the updater manually:
+ /etc/init.d/pmacct restart
Stopping the pmacct daemon… [ OK ]
Starting the pmacct daemon…
ERROR: [/etc/pmacct/pmacct.conf] file not found. [ FAIL ]

The init script tries to restart the daemon with the config file pmacct.conf. Mine, however, is named differently, so the daemon won’t start in this case. Now I have two options: rename my configuration file or change the init script.

The first option is not that easy: I will monitor the green and red interfaces later (right now just the green one), and using the mapping file I cannot properly distinguish between the two interfaces.

When using two config files later on, I guess I will have to change the init script anyway to start two pmacctd daemons at once.
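
A quick way to spot this kind of mismatch is to pull the expected path out of the error message and compare it with what is actually in that directory. This is a rough sketch that assumes the error text looks exactly like the output above; `missing_config` and `candidates` are hypothetical helpers.

```python
import re
from pathlib import Path

# Pattern based on the init script output quoted above.
MISSING_RE = re.compile(r"ERROR: \[(?P<path>[^\]]+)\] file not found")


def missing_config(output):
    """Return the config path the init script complained about, or None."""
    m = MISSING_RE.search(output)
    return Path(m["path"]) if m else None


def candidates(expected):
    """List the other regular files in the directory of the expected config."""
    directory = expected.parent
    if not directory.is_dir():
        return []
    return sorted(p.name for p in directory.iterdir()
                  if p.is_file() and p != expected)


OUTPUT = "ERROR: [/etc/pmacct/pmacct.conf] file not found. [ FAIL ]"

expected = missing_config(OUTPUT)
print("init script expects:", expected)
print("files found instead:", candidates(expected))
```

On a box where the config file has a different name, the second line lists that file, which makes clear what the init script would need to point to.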

Or do you know how to bring the interface name and not the index from the mapping file into the aggregation filter and into the resulting CSV-file?
The numeric index for the interface seems to be possible, but I don’t know if this is also true for the interface name, like green.

Michael

ummeegge · 17 November 2019 16:16

Hi Michael,

this has also been a problem with a newer build, since I wanted to standardize the directory names to ‘pmacctd’. That is fixed here in my environment, but it is a separate matter. OK, so the first suspect, fcron, was indeed the problem… We should really find a way to mark such topics as solved!

Why not use the networks.lst → https://github.com/pmacct/pmacct/blob/master/examples/networks.lst.example ?
Configuration directive:
networks_file: /etc/pmacctd/networks.lst
aggregate option:
src_net, dst_net

Why do you want to use two Core instances? Isn’t it possible to solve this via two or more plugins in one configuration, and therefore with only one Core process?

Maybe via ‘CustID#1’ or via the subnet address?
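
If the subnet route is taken, the zone name could also be attached afterwards in the Python script that reads the CSV, instead of inside pmacct. This is a minimal sketch and not a pmacct feature: the `ZONES` table is hypothetical, and treating 192.168.6.0/24 as the green network is only an assumption based on the 192.168.6.97 address in the log above.

```python
import ipaddress

# Hypothetical zone layout; adjust to the real IPFire networks.
ZONES = {
    "green": ipaddress.ip_network("192.168.6.0/24"),
}


def zone_of(address, zones=ZONES, default="other"):
    """Return the name of the first zone whose network contains `address`."""
    ip = ipaddress.ip_address(address)
    for name, network in zones.items():
        if ip in network:
            return name
    return default


# Addresses taken from the script log earlier in this thread.
for addr in ("192.168.6.97", "157.230.131.25"):
    print(addr, zone_of(addr))
```

The returned name can then be written as an extra CSV column next to the src_net or dst_net value.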

Best,

Erik