Hello, I have never had much of an issue using the IPFire Mail Service on any deployment. I recently deployed Core 196 to a new appliance, activated the mail service as I usually would, and noticed some strange, unexplained issues with its operation.
We have checked that all credentials are accurate and valid. We verified the SMTP server and port, authentication is fine, and the sender and recipients are valid.
This morning, I was setting a new WIO alarm for a device and for IPSec when I suddenly encountered a nearly unresponsive box that is deployed about 120 miles from where I am. Thankfully SSH was still available, and quick thinking led to the decision to reboot the box to hopefully regain control and determine what was hogging all the resources.
Evaluating the logs, particularly the Kernel and Mail logs, it appears that the dma binary managed to queue up about 50k mails in /var/spool/dma within about 6 minutes. To make a long story short, I ultimately needed to killall -9 dma to reclaim my system before it ground to a halt from CPU and memory overload with a maxed-out swap…and then had to comment out the conf settings in auth.conf and dma.conf under /var/ipfire/dma/
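For anyone who wants to catch this kind of runaway queue before the box becomes unresponsive, here is a minimal sketch of a spool check. It only counts files in the spool directory, which is not the same as counting mails, since a queued mail may occupy more than one file. The function names and the threshold of 1000 are my own assumptions for illustration and are not part of IPFire or dma. Reading /var/spool/dma will most likely require root.

```python
import os
from pathlib import Path


def count_queued(spool_dir="/var/spool/dma"):
    """Count regular files directly inside the spool directory.

    Returns 0 if the directory does not exist. Subdirectories and
    symlinks are not counted. This counts files, not messages.
    """
    spool = Path(spool_dir)
    if not spool.is_dir():
        return 0
    return sum(
        1
        for entry in os.scandir(spool)
        if entry.is_file(follow_symlinks=False)
    )


def queue_is_runaway(spool_dir="/var/spool/dma", threshold=1000):
    """Return True when the spool holds more files than the threshold.

    The threshold is an arbitrary illustrative value; pick one that
    fits how much mail the box normally sends.
    """
    return count_queued(spool_dir) > threshold
```

Something like this could be run from cron every minute and call queue_is_runaway() to decide whether to raise an alert through a channel other than mail. What to do when it trips (stopping the mail service, clearing the spool) is a separate decision that I have left out on purpose.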
I noticed in some of the Mail log entries that it thought the recipient email was invalid. After some in-depth troubleshooting, I suspect that DMA did not like the TLD I provided, or maybe the fact that the address was on a subdomain with a less common TLD, .zone (rather than .com, .net, .gov, .edu, or .org). Addresses on those common TLDs do not appear to run into the issues I experienced today.
I guess my ask is: are there any recommended settings, aside from what is in the IPFire Documentation, to prevent something like this? Has anyone else run into strange bugs while using DMA or any aspect of the IPFire Mail Service, or is this perhaps a new, undiscovered issue? It was a rather terrifying experience to face the possibility of an unresponsive system 120+ miles away that might have required a visit to perform a hard reboot.