Core 168 (testing) squid problem

baruch234 · 15 May 2022 20:54

We recently updated 3 different IPFire machines from 2.27 (x86_64) - core167 to core168 Development Build: master/9f42266a. Machines #1 and #2 use comparatively simple configurations and hardware equipment ‘out of the box’:

#1 IPFire Duo Box: everything OK
#2 APU 2C4 Board (including firmware update): everything OK
#3 IPFire Business appliance with squid proxy & update accelerator enabled, additional RAM and two extra HDDs. This machine uses a comparatively complex configuration (IPSec, OpenVPN, three local subnets, intrusion prevention, firewall rules etc.). It was running properly before the update and had serious problems with squid after the update:

Communication between client machines in the local subnets and IPFire/Web-Proxy was heavily disturbed: external websites, especially those accessed via ‘https://xxx’, either could not be reached or took very long (20 seconds or more) to load. Certificates were misinterpreted and so on. The firewall log showed a massive number of ‘DROP_CTINVALID’ entries concerning connections between local clients and squid.
Switching off squidclamav and update-accelerator did not help.

Unfortunately we cannot provide more helpful details: this machine is our main firewall for production purposes - so we had to roll back to core167 immediately to successfully resolve the problem.

pmueller · 18 May 2022 14:44

Hi,

thank you for testing and reporting this.

This is odd indeed; like you, I do not observe any Squid-related hiccups with Core Update 168 on my systems, but I run neither the update accelerator nor the URL filter.

However, your description strongly reminds me of bug #12812, which is a Suricata-related problem, although I really hoped the Suricata update in Core 168 would have fixed the underlying issue.

To validate this theory: Do you use the IPS on the IPFire systems where the proxy is running fine after Core Update 168?

That you had to roll back immediately is completely understandable, but it makes investigating tricky nevertheless…

Thanks, and best regards,
Peter Müller

baruch234 · 18 May 2022 16:01

Hi,
We make intensive use of Suricata on all 3 IPFire machines - IPFire #2 (APU) runs squid & Suricata and did not show any problems; IPFire #1 is simply used as an OpenVPN/IPSec access point, and we didn’t observe any disturbance there.
It seems worth noting that network connections were not cut off completely, and the failures were not reproducible: sometimes it took about 20s to load a page (e.g. google, denic, wikipedia, our own website etc.), sometimes pages didn’t load at all. All kinds of web traffic were impaired: browsers, mail clients, anti-malware software. IPSec traffic (site-to-site between #1 and #3) and OpenVPN traffic were not affected. Sending all traffic through IPFire #2 did solve the problems.
Hope this could add some further information.

pmueller · 21 May 2022 10:05

Hi,

thank you for reporting back.

May I ask which IPS ruleset provider you use? Were there any noticeable occurrences of certain IPS hits during the outage?

Sorry for insisting on the IPS side of this issue; I have experienced very similar-looking behaviour, and it always eventually traced back to Suricata, not Squid.

Thanks, and best regards,
Peter Müller

baruch234 · 21 May 2022 13:16

Hi,

Every machine (#1 to #3) makes use of the ‘Emergingthreats.net Community Ruleset’.

In addition, IPFire #3 makes use of

‘Etnetera Aggressive Blacklist Rules’
‘OISF Traffic ID Rules’
‘Abuse.ch SSLBL Blacklist Rules’
‘PT Attack Detection Team Rules’

On every machine, IPS is watching RED, ORANGE (if applicable) and OpenVPN.

On every machine, the IPs of the local IPFire itself (GREEN, BLUE, ORANGE - if applicable) and the corresponding IPs of the other two IPFire-machines are registered in the list of IPS exceptions.

Unfortunately I didn’t keep any logs, and because of the immediate rollback I don’t remember any log entries either. So I can’t report any IPS hits during the outage - sorry.

pmueller · 24 May 2022 07:42

Hi,

thank you for your reply.

I will try to reproduce the problem with the IPS configuration of your third IPFire machine - perhaps that will get us somewhere.

Thanks, and best regards,
Peter Müller

baruch234 · 13 June 2022 13:52

We performed the update to IPFire 2.27 (x86_64) - core168 (stable) today and observed the same problems.

The firewall logfile again showed a couple of DROP_CTINVALID entries on the green interface (which is not monitored by IPS) for requests from local machines to IP addresses (TCP port 443) that belong to ESET anti-malware services (we use ESET extensively).

I did the following and managed to get rid of the problem:

(1) deactivated IPS
(2) deactivated all ruleset-providers
(3) activated IPS again
(4) activated all ruleset-providers one-by-one again
(5) reboot

IPFire now seems to run fine.
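
For anyone who wants to check their own firewall log for the same pattern (DROP_CTINVALID on green towards port 443), here is a minimal Python sketch. It assumes the entries follow the usual netfilter LOG layout of `KEY=value` fields after the prefix; the sample lines, addresses and interface names below are invented for illustration and are not taken from our machines. The names `parse_line` and `summarise` are just illustrative helpers.

```python
import re
from collections import Counter

# Hypothetical sample lines in the usual netfilter LOG layout. Addresses,
# MACs and interface names are invented for illustration only.
SAMPLE_LOG = """\
Jun 13 13:40:01 ipfire kernel: DROP_CTINVALID IN=green0 OUT= MAC=00:11:22:33:44:55:66:77:88:99:aa:bb:08:00 SRC=192.168.1.20 DST=203.0.113.10 LEN=40 TTL=128 ID=1 DF PROTO=TCP SPT=50111 DPT=443 WINDOW=0 RES=0x00 ACK RST URGP=0
Jun 13 13:40:02 ipfire kernel: DROP_CTINVALID IN=green0 OUT= MAC=00:11:22:33:44:55:66:77:88:99:aa:bb:08:00 SRC=192.168.1.21 DST=203.0.113.10 LEN=40 TTL=128 ID=2 DF PROTO=TCP SPT=50112 DPT=443 WINDOW=0 RES=0x00 ACK URGP=0
Jun 13 13:40:03 ipfire kernel: DROP_CTINVALID IN=green0 OUT= MAC=00:11:22:33:44:55:66:77:88:99:aa:bb:08:00 SRC=192.168.1.20 DST=203.0.113.11 LEN=40 TTL=128 ID=3 DF PROTO=TCP SPT=50113 DPT=443 WINDOW=0 RES=0x00 ACK URGP=0
Jun 13 13:40:04 ipfire kernel: DROP_INPUT IN=red0 OUT= MAC=00:11:22:33:44:55:66:77:88:99:aa:bb:08:00 SRC=198.51.100.7 DST=198.51.100.1 LEN=60 TTL=52 ID=4 DF PROTO=TCP SPT=40000 DPT=22 WINDOW=64240 RES=0x00 SYN URGP=0
"""

# Matches KEY=value pairs; the value may be empty (e.g. "OUT=").
FIELD_RE = re.compile(r"(\w+)=(\S*)")


def parse_line(line, prefix="DROP_CTINVALID"):
    """Return the KEY=value fields of a line carrying the given log prefix,
    or None if the line does not carry that prefix."""
    marker = f" {prefix} "
    if marker not in line:
        return None
    _, _, rest = line.partition(marker)
    return dict(FIELD_RE.findall(rest))


def summarise(lines, prefix="DROP_CTINVALID"):
    """Count matching entries per (inbound interface, destination IP, destination port)."""
    counts = Counter()
    for line in lines:
        fields = parse_line(line, prefix)
        if fields is None:
            continue
        key = (fields.get("IN", ""), fields.get("DST", ""), fields.get("DPT", ""))
        counts[key] += 1
    return counts


if __name__ == "__main__":
    for (iface, dst, dport), n in summarise(SAMPLE_LOG.splitlines()).most_common():
        print(f"{n:5d}  IN={iface}  DST={dst}  DPT={dport}")
```

On a real system you would feed the lines of the firewall log file into `summarise` instead of `SAMPLE_LOG`; where that file lives and what exactly each line looks like may differ from what is assumed here. Grouping by destination makes it easy to see whether the drops cluster on a few remote services, as they did for us with ESET.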

pmueller · 15 June 2022 20:03

Hi,

apologies for my belated response and the inconvenience.

During testing, I was unfortunately unable to reproduce this issue, no matter what I tried. Also, there was no other similar feedback, which suggests that this is an isolated case. Nevertheless, sorry for the hiccup, and for the odd feeling that remains because we don’t know the root cause of this incident.

Sorry to disappoint, and best regards,
Peter Müller