Iperf3 maxes at 2 Gbps on PC router box - dual AQN-107 NICs

I built a PC from spare parts I had on hand, and installed IPFire on it.
Specs are:
Asus Prime X470 Pro motherboard
AMD Ryzen 2700 CPU
32GB DDR4-2800
Patriot 120GB SATA SSD
GTX 1050Ti GPU
2 x Aquantia AQN-107 10GBASE-T NICs

Install went very smoothly - kudos for that!

I then tested iperf3 through the router, with an additional client machine on the LAN side, and an additional server machine on the WAN side.

The throughput maxed out at about 1.4 Gbps with one TCP stream and about 2 Gbps with two TCP streams.
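
For reference, the test was along these lines (the address below is just a placeholder for my iperf3 server on the WAN side):

# on the server box on the WAN (red) side
iperf3 -s

# on the client box on the LAN (green) side; 192.0.2.10 stands in for the server
iperf3 -c 192.0.2.10          # one TCP stream
iperf3 -c 192.0.2.10 -P 2     # two parallel TCP streams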

CPU usage is between 20% and 25%, mostly in IRQ handling.

If I connect my server and client boxes directly (no IPFire box in between), I can max out the line rate (approximately 9.4 Gbps), so I know the IPFire box is causing the reduction in throughput.

Is there anything I can do to improve the throughput? I was hoping this box could handle 10 Gbps, and maybe even 20 Gbps (10 Gbps full-duplex). Do I need a faster CPU? Or a better NIC?

Just to be clear, the intrusion detection feature is turned off.

Also, QoS affects the throughput, even though it has been greatly improved.

It also depends on whether the network controller's driver has been coded to utilise all the cores of your processor.

When you run iperf3 and watch htop, are all the cores being used evenly, or is one core maxed out while the rest sit idle or at low usage?
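
Something like this while the test is running will show it (mpstat is part of sysstat and may not be installed on IPFire by default; pressing 1 in plain top gives a similar per-CPU view):

# per-core utilisation once per second; watch the %irq and %soft columns
mpstat -P ALL 1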

Thanks. QoS is disabled.

I just checked with top.
I see that cpu0 and cpu2 are at 100%, cpu3 is at 70%, and cpu6 is at 50%.
The other 12 are all idling.

I think you found the bottleneck. Another one could be the PCIe bandwidth of the motherboard.

So, you are saying the NIC drivers are unable to utilize all cores, and that is the bottleneck?

As far as PCIe bandwidth goes, one NIC is in an x8/PCIe 3.0 slot and the other NIC is in an x4/PCIe 2.1 slot; that slot is forced to x4 in the BIOS (the default is x2). Both slots have enough bandwidth to support 10 Gbps. The AQN-107 supports up to PCIe 3.0 x4.
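
For what it's worth, the negotiated link can also be read back with lspci; the bus address below is only an example and will differ on your board:

# find the NICs, then check the LnkSta line (negotiated speed/width)
lspci | grep -i aquantia
lspci -vv -s 01:00.0 | grep -i lnksta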

I find this the simplest explanation (Occam's razor). If the traffic does not scale to 10 Gbps, one core is maxed out while the others are idle, and the PCIe links are capable of supporting that speed, what other bottleneck would be the culprit? I am not 100% certain, because I am not knowledgeable enough to exclude other factors I am not aware of.
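
One way to narrow it down would be to check whether the card is spreading its work across queues at all, roughly like this (red0/green0 are the IPFire interface names; "atlantic" is the upstream driver for the AQC107, and the IRQ labels depend on the driver, so grep for whatever shows up in the table):

# how many RX/TX queues the driver exposes, if it supports the query
ethtool -l red0
# where the card's interrupts are landing
grep -E 'red0|green0|atlantic' /proc/interrupts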

Am I right to understand that the connecting cables are of the correct category?

Best

iptom,
Yes, all cables on those 4 AQN-107s (two on the IPFire box, one each on the client and server) are either CAT6 or CAT6A. The link lights indicate they are all connected at 10 Gbps.
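
ethtool reports the same thing, e.g.:

# negotiated link speed per interface
ethtool red0 | grep -i Speed
ethtool green0 | grep -i Speed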

FYI, I bought an Intel X550-T2 NIC (PCIe x8, dual 10 Gbps ports). I'm hitting the same limit of about 2 Gbps with that NIC in IPFire with iperf3. In OPNsense 23.1, on the same hardware, I was able to hit nearly the full 20 Gbps (bidirectional 10 Gbps). It seems odd that there would be a 10x performance difference between the two.

Assuming you are comparing vanilla distro against vanilla distro, with no filtering, no QoS, etc., just pure routing: OPNsense is a BSD-based firewall, while IPFire is Linux-based. Different OS, different drivers. If you find a Linux distro that gets 10x the throughput of IPFire, then there is a bottleneck introduced by IPFire somewhere. Such a report would be very valuable for the project. My guess is that there is a problem with the Linux drivers for these network cards.

EDIT: this is the Linux kernel documentation for that class of cards. Is it possible that in IPFire the driver needs some tuning?
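
If you want to compare what the two systems are doing, ethtool can at least show which driver and firmware IPFire has loaded and which offloads are enabled, something along these lines:

# driver and firmware version in use
ethtool -i red0
# offload features that matter for high throughput
ethtool -k red0 | grep -E 'segmentation|scatter-gather|receive-hashing'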

Yes, I'm aware OPNsense is BSD-based and IPFire is Linux-based. I expected some difference between the two, just not 10x. I haven't tried other Linux distros for routing.

I am comparing things out of the box, pretty much untuned. AFAIK QoS and filtering are off by default in both IPFire and OPNsense.

You are right that it could be a driver issue, but I have experienced the same bottleneck with 2 different types of NICs - the AQN-107 (two of them) and the X550-T2 (a single dual-port NIC) - so I'm skeptical that both NIC drivers in IPFire would have the same issue. But I really don't know where else to look or what to tune.

FYI, the AQN-107s didn't perform as well in OPNsense: I didn't reach the full 20 Gbps, more like 11 Gbps. Still much more than the 2 Gbps I got in IPFire. The single-card X550-T2 also draws fewer watts.

Old thread, I know, but:

I am having the same issue with hardware that went from being a Debian desktop to an IPFire box in the same evening. So with no hardware change between the two and no network cable change (it wasn't even unplugged), I was maxing out at 9-and-change Gbps with Debian and really can't get past 1 Gbps on IPFire. I swapped in this hardware specifically because of the low iperf3 scores on the other IPFire box.

Dual 10G SFP+ with fiber; again, the same hardware as the Debian box, and my other 10G-capable computers don't break a sweat maintaining 9+ Gbps.

You have to use the two PCIe 3.0 x16 slots for the NICs, and the PCIe 2.0 x16 slot or one of the three PCIe 2.0 x1 slots for the video card.

Other than that, we could check kernel CPU profiles and IRQ timer settings if both cards are plugged into the two PCIe 3.0 slots.
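
A quick way to check is to look at where the interrupts land and, if necessary, pin them by hand; the IRQ number and CPU mask below are examples only:

# see how interrupts are spread across cores (look for the NIC's rows)
cat /proc/interrupts
# pin IRQ 85 to CPU1 (mask 0x2) - example values, adjust to your system
echo 2 > /proc/irq/85/smp_affinity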

The transmit packet queue size (txqueuelen) is probably not set very well.

I would do this and see if this helps:

# raise the transmit queue length on each IPFire zone interface (default is 1000)
ifconfig green0 txqueuelen 10000
ifconfig blue0 txqueuelen 10000
ifconfig orange0 txqueuelen 10000
ifconfig red0 txqueuelen 10000

Take a look at /etc/sysctl.conf in Debian and compare it to the settings installed in IPFire.

The TCP send parameters probably need to be gone through. The most common mistake is setting a group of parameters but leaving one out, like the one that limits empty data in a TCP packet when setting TCP segment lengths and memory sizes. I haven't gone through it yet, but that is where most of the fine tuning most likely needs to be done. When Linux is installed in a datacenter, this file is edited so that transmission has the best throughput with the lowest latency.

On Debian, sysctl.conf usually just relies on the kernel's internal defaults and the lines are commented out with a #.
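
To be clear about which knobs I mean, these are the sort of TCP buffer entries involved; the values here are illustrative only, not a recommendation for IPFire:

# receive/send socket buffer ceilings and TCP autotuning ranges (example values)
# net.core.rmem_max = 16777216
# net.core.wmem_max = 16777216
# net.ipv4.tcp_rmem = 4096 87380 16777216
# net.ipv4.tcp_wmem = 4096 65536 16777216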

This is an insane recommendation and will seriously harm the performance of your network.

@dr_techno I don’t think there is much value in digging out threads that are years old and giving out really bad advice. We have not learned enough about the problem with this hardware and some insane software fixes are not going to cut it.

Normally, 5000 is set for 1 Gbps interfaces in business switches with a 10 Gbps uplink.

I recommended 10000 because that is a normal setting for a 10 Gbps to 100 Gbps uplink, but this is just the TX queue, so the optimal setting is a balance between queue length and latency. Normal client workstations have a TX queue length of 200 because of the type of network load, but when using a port as a switch you definitely raise the TX queue. That is why the system uses the base queue size (1000) in the drivers for this application. Adjusting this setting on the command line will also tell you whether the driver is communicating correctly with the kernel.
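
The current value can be read back to confirm the change took, e.g.:

# the qlen field shows the interface's current txqueuelen
ip link show red0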

As far as sysctl.conf is concerned, the adjustments are much broader and do take finesse, but since I have some time today, I'll look at it to see what obviously stands out. The hardware in use can determine some of these adjustments, so it's not one-size-fits-all if you really want to get the most out of the hardware.

BTW, if for some reason adjusting it causes issues, you just set it back to 1000 or reboot. So I really do question why you think it's going to harm anything.

You might want to read up on Bufferbloat: https://www.bufferbloat.net

IPFire employs dynamic queue management. The queue of the NIC should never fill up, which is why it makes no sense to supersize it. We are also not only optimising IPFire for maximum throughput, but for low latency.
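
For reference, the queueing discipline that is actually active on an interface can be checked with:

# show the active qdisc on the WAN interface
tc qdisc show dev red0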
