Since installing V198 I have noticed the system is being bogged down by IPS/Suricata. Key indicators:
I have a speedtest to my ISP running periodically (in an lscr.io speedtest-tracker Docker container on one of my servers), and the graph shows that since V198 my 1 Gbps download speed has gone from an average of 920 Mbps to about 620 Mbps.
I started investigating what might be causing this, initially thinking it was an ISP/NBN issue, but when I checked the FW I found that the load average was higher than usual and that Suricata was by far chewing up the most resources: on average north of 20% of CPU time. This has never been the case before, and I have been running IPS for a very long time.
My FW green network has a 10 Gb Ethernet connection, as most of my servers on green are 10 Gb connected. Running iperf3 between the FW and one of my servers (to and from) yielded a bitrate of ~9.4 Gbps with IPS switched off and ~900 Mbps with IPS switched on.
Further to this, the IPS throughput graph in the FW GUI shows that where most of the traffic was "offloaded" and a small amount was "scanned" prior to V198, it is now almost exactly the opposite, i.e. the majority of traffic is "scanned" and a small amount is "offloaded".
Finally, I ran several speedtests from the container mentioned above to be sure IPS was causing the issue, and sure enough I got a ~300 Mbps drop when IPS was on vs when it was off.
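To put the figures above in proportion, the relative loss can be worked out directly. This is a minimal sketch in Python that uses only the numbers quoted in this post; the function name relative_drop is just illustrative.

```python
def relative_drop(before_mbps: float, after_mbps: float) -> float:
    """Return the throughput loss as a percentage of the 'before' value."""
    if before_mbps <= 0:
        raise ValueError("before_mbps must be positive")
    return (before_mbps - after_mbps) / before_mbps * 100.0


# Figures quoted above (Mbps): WAN speedtest average, and iperf3 on green.
wan_drop = relative_drop(920, 620)
green_drop = relative_drop(9400, 900)
print(f"WAN speedtest: {wan_drop:.1f}% lower")
print(f"green iperf3:  {green_drop:.1f}% lower")
```

On these numbers the WAN speedtest loses roughly a third of its throughput, while the 10 Gb green link loses about nine tenths with IPS switched on.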
All the degradation I can see started when I went to V198 at the end of last month.
Has anyone else seen this performance hit due to IPS? Any ideas?
I am seeing the same behavior here with my mini appliance. When I turn off monitoring of the IPS system for RED, I get the original transfer rates (200 Mbps). When turned on, the download speed is halved.
So with the latest Suricata version, 8.0.1, which is able to cover more protocols than previously, more of the traffic can be analysed.
That seems to me to be a good thing, but the total has gone from 105 kbps to 122 kbps, which is not a big difference, and I have certainly not seen any impact on my speed checks.
The vectorscan library that suricata uses had an update in CU198 and this might well have enabled suricata to handle analyses quicker so that might be related to the total numbers for traffic through suricata having increased.
I have a script that collects the speed data every 4 hours. I have not noticed any change in that performance.
My cpu% has not changed. It is still at around 0.25%
There has been no change in the Load Average graph over the change from CU197 to CU198. Also no change to the memory, Connections or Firewall Hits levels.
So no I am not seeing any performance hit due to IPS with CU198.
I am only running Emerging Threats ruleset on my system.
I have a problem with my upgraded internet access (Vodafone Cable Internet) at the moment.
I get only 100 MBit/s; 1000 MBit/s is provisioned. One factor may be the connection at higher frequencies. This is being investigated by the provider.
After following this thread I disabled IPS, and now I get 400 MBit/s, which could be explained by the physical problems.
You are going to have to look in your kernel and suricata logs to see if you can see what is triggering that, as I don't experience that at all, so I have no idea what could be causing it.
Bernhard, I'm seeing that step change as well from an on/off perspective. As I said, this has only started since CU198 was installed; nothing else has changed. Just to recap, the IPS throughput graph has gone from roughly 10:1 offloaded vs scanned to almost exactly the other way around. I also did some simple testing using iperf3 during the night (everyone in bed and nothing really happening on the network). The connection to the ISP was disconnected to remove internet traffic from the equation:
IPS off on every interface: iperf3 on green (over 10 Gbps) consistently generates north of 9.4 Gbps bitrate (this is normal)
IPS activated only on the green interface: iperf3 consistently generates ~900 Mbps bitrate (note that 900 Mbps is normal for a 1 Gb link)
ethtool confirms that the connections on both the IPFire box and the destination server are at 10 Gbps
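The on/off comparison above can be made repeatable by letting iperf3 emit JSON (its -J option) and reading the received bitrate from the report. This is a rough sketch, assuming a TCP test against an iperf3 server that is already running. The function names are made up for illustration, and the address in the usage note is a placeholder.

```python
import json
import subprocess


def received_mbps(report: dict) -> float:
    """Extract the receiver-side bitrate (Mbit/s) from an iperf3 -J TCP report."""
    return report["end"]["sum_received"]["bits_per_second"] / 1e6


def run_iperf3(host: str, seconds: int = 10, reverse: bool = False) -> float:
    """Run one iperf3 client test and return the received bitrate in Mbit/s."""
    cmd = ["iperf3", "-c", host, "-t", str(seconds), "-J"]
    if reverse:
        cmd.append("-R")  # reverse mode: the server sends, the client receives
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return received_mbps(json.loads(result.stdout))
```

Calling run_iperf3("192.0.2.10") once with IPS disabled and once with IPS enabled on green gives two numbers that can be compared directly. Repeating both runs with reverse=True covers the other direction.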
Like I said above, 900 Mbps is what I see as normal throughput between two 1 Gbps connections. It's as if suricata tells the 10 Gb NIC to push only 1 Gbps max instead of 10 Gbps when it is activated on that network? Very weird. I'm also trying to understand whether the following is relevant: "11.5. High Performance Configuration" in the Suricata 9.0.0-dev documentation.
Also, maybe suricata doesn't handle iperf well.
In the end, something changed in CU198 that's causing this.
The article cited focuses only on NICs and their drivers.
I can see high CPU usage during the speedtest, which is based on x parallel iperf3 tasks.
I therefore checked with htop.
Each iperf3 task shows high CPU usage (~12%).
If I enable IPS, a couple of suricata tasks show up with a CPU usage of 90% - 135%.
This shows that suricata isn't able to keep up with the amount of packets.
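The htop observation above can also be captured as numbers by sampling per-thread CPU time from /proc on Linux. This is a minimal sketch, assuming the standard /proc/<pid>/task/<tid>/stat layout, in which utime and stime are fields 14 and 15. The function names are illustrative, and the process ID has to be looked up separately, for example with pidof suricata.

```python
import os
import time


def cpu_ticks(stat_line: str) -> int:
    """Return utime + stime (clock ticks) from one /proc/.../stat line."""
    # The thread name sits in parentheses and may contain spaces,
    # so split on the last ')' before counting fields.
    fields = stat_line.rsplit(")", 1)[1].split()
    return int(fields[11]) + int(fields[12])  # fields 14 and 15 overall


def cpu_percent(ticks_before: int, ticks_after: int,
                interval_s: float, clk_tck: int) -> float:
    """Convert a tick delta over an interval into percent of one core."""
    return (ticks_after - ticks_before) / clk_tck / interval_s * 100.0


def sample_threads(pid: int, interval_s: float = 1.0) -> dict:
    """Return {thread id: CPU percent} for every thread of a process."""
    clk_tck = os.sysconf("SC_CLK_TCK")
    task_dir = f"/proc/{pid}/task"

    def snapshot() -> dict:
        ticks = {}
        for tid in os.listdir(task_dir):
            try:
                with open(f"{task_dir}/{tid}/stat") as fh:
                    ticks[tid] = cpu_ticks(fh.read())
            except FileNotFoundError:
                continue  # thread exited between listing and reading
        return ticks

    before = snapshot()
    time.sleep(interval_s)
    after = snapshot()
    return {tid: cpu_percent(before[tid], after[tid], interval_s, clk_tck)
            for tid in after if tid in before}
```

A single thread cannot exceed about 100% of one core, so a figure such as 135% would be a total across several threads. Sampling once with IPS off and once with IPS on during the same iperf3 run shows which threads account for the difference.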
I have enabled no rules from Talos.
I'm seeing the CPU hit as well, but the drop in NIC throughput looks too coincidental. I'm going to build a much faster test machine and just see if it can handle it better. For the record, the current machine is an i5-6500 (4 cores, 3.2/3.4 GHz) with 8 GB RAM, PCIe 3.0, an Intel I350 Gigabit card for Orange and Blue, an Intel X540 10 Gigabit card for Green, and the onboard Intel I219 for Red, which is well above the listed minimum requirements for IPFire. IPS bottoms it all out when switched on, so IPFire may at a minimum need to change its listed minimum requirements? However, I still think someone needs to see what changed in CU198 with IPS that is causing an obvious drag that wasn't there in CU197 and below.
I don't think it is the NIC throughput. That would limit the bandwidth without IPS as well.
Switching IPS on (without activated rules!) produces the high load from the suricata tasks. These aren't able to pass on the network packets at the speed of the NIC.
Maybe this can be handled with a much more powerful CPU, but that cannot be the goal.
Another idea, which brings the dependency on the NIC up again, is that suricata must hand a lot of packet handling to the NIC to be efficient. That would be another possible deficiency of suricata.
I don't know what changed with the step to suricata 8.xx.
Because the IPFire CU coincided with the upgrade of my internet connection, I don't know what the situation was before the CU.
I can not reproduce this at all. I have tried with two separate systems.
Disabled IPS, then unchecked all rules for the emerging threats provider I am using and then enabled IPS again. The suricata graph traffic went down and the memory consumption went down. This is not surprising as running suricata without any rules enabled, means very little is being scanned.
I could not see an increased high load of suricata tasks.
Do a fresh install of CU197, test that out with and without IPS and then upgrade to CU198 and re-do the tests. That will give you both sets of data with your new internet connection.
I did another test with the same results.
Lacking a spare system, I cannot easily downgrade my production system, but I will try at night over the next few days.
The fact is, even with no rules activated suricata scans all packets. This is documented in the graph.
But with that high load suricata can't fulfill the realtime conditions. suricata is part of the network stack between NIC and application. Realtime processes should not consume nearly all CPU power.
-> suricata 8.0.1 in IPFire CU198 is not usable (with arbitrary HW fulfilling the system requirements).
This is the cpu usage graph. The peak circled in red is the cpu usage from restarting of suricata after disabling all the rules. The peak circled in black is the cpu usage from restarting suricata after enabling all my defined rules again.
In between is the cpu usage from when traffic was going through suricata but not being scanned as there are no rules for the traffic to be scanned against. After the black circled peak and before the red circled peak is the cpu usage when the defined rules were enabled and the traffic being scanned against it. It is all very low.
This is the screenshot of the suricata traffic graph. The red circled portion shows the traffic during the time all the rules were disabled. The peak in the section to the right, after the rules were enabled again and suricata was restarted, is when I was watching a YouTube video. That maxed out at 80 Kbps. While the rules were disabled it was between 10 and 20 Kbps.
This screenshot shows when the change was made from CU197 to CU198. The total traffic rate stayed the same. What changed was the proportion that suricata-8 is able to scan, whereas suricata-7 had to ignore it and it was offloaded, as suricata couldn't do anything with it.
This is an improvement in suricata-8, which is now able to scan more protocols than it could in the past. So to me the fact that more can be scanned and doesn't need to be offloaded is a positive situation. The bulk of the total traffic, with both CUs, is around 90 to 100 Kbps.
I can see the issues only with an iperf3 speedtest to the WAN.
My speed is 1000MBit/s.
If I whitelist the iperf server, I can measure ~950MBit/s.
The peaks in the IPS graphic change from red to green.
The CPU% in htop remains at a moderate level.
If no rule categories or rules are selected, the only 13 rules active are the "local rules". Is there some problem there?
Hi, I had a similar problem recently. After upgrading my internet modem, my internet connection was slowly but surely losing speed. On top of that I was losing connection with all DNS providers, and DHCP would totally die every 2 to 4 days.
I solved the problem by changing the mtu on red.
I found my red MTU with this command: tracepath -4 -b xxx.xxx.xxx.xxx (x = internet IP)
I modified the DHCP MTU in /var/ipfire/ethernet/settings with the result of tracepath (65535)
To apply the modified setting: /etc/init.d/network restart
Now it's been 3 weeks since the modification and no more slowdowns, DNS problems, or DHCP server dying.
Bernhard, I think Loup means just run the command against an external address (for ease I used a Microsoft web address), i.e. tracepath -4 -b www.micro….com. This will give a "pmtu" figure at the top of the trace, in my case 1492. You can check your red MTU by running "ip link show" and looking for the red0: line.
You can set the red0 MTU at the command line: "ip link set mtu 1492 dev red0"
Firstly, this did nothing noticeable for me (my red MTU before was 1500).
Now as far as making this change stick: in my /var/ipfire/ethernet/settings there are no MTU references, and I cannot force the DHCP MTU via setup, as it won't let me touch the field because I have set the interface to PPP DIALUP etc. So I guess I have to use the above ip link set command in rc.local?
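The tracepath step above can be scripted so that the reported path MTU is picked out automatically. This is a minimal sketch, assuming tracepath prints lines containing "pmtu <number>" as described above. The sample text is illustrative rather than captured output, and the helper names are made up.

```python
import re


def parse_pmtu(tracepath_output: str) -> int:
    """Return the last 'pmtu N' value reported in tracepath output."""
    values = re.findall(r"\bpmtu (\d+)", tracepath_output)
    if not values:
        raise ValueError("no pmtu value found in tracepath output")
    # tracepath updates the figure as it goes; the last one is the final result.
    return int(values[-1])


def mtu_command(mtu: int, dev: str = "red0") -> str:
    """Build the ip link command quoted in the post for a given MTU."""
    return f"ip link set mtu {mtu} dev {dev}"


# Illustrative sample only (not real captured output).
sample = """ 1?: [LOCALHOST]                      pmtu 1500
 1:  gateway                              0.412ms
 2:  gateway                              0.398ms pmtu 1492
     Resume: pmtu 1492
"""
print(mtu_command(parse_pmtu(sample)))
```

For the sample this prints the same command as quoted in the post. Whether rc.local is the right place to run it on an IPFire system with a PPP red interface is still the open question here.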