Please help, as I’m at a loss. There must be something I’m missing.
I have both TCP and UDP OpenVPN configurations set up. The server is a bare-metal Intel(R) Xeon(R) CPU E5-2630 (AES-NI CPU Crypto: Yes (active)) with 12 cores and 32GB of RAM. It’s running the latest pfSense, 2.4.4-p3. It’s colo’d at the datacenter on a gigabit link (CARP’d). I’m trying to VPN in from home, over a gigabit link, and transfer files, and at best I get 200 Mbit/s over OpenVPN with TCP. TCP seesaws around an average of 200, but it’s consistently faster than UDP, which holds a solid 180ish. Disconnected from the VPN, I can pull 106 megabytes/second from the same servers over SFTP on their external interface. What magic settings am I missing to get anywhere near line speed? There’s 7% CPU load on the pfSense server.
I’ve tried multiple ciphers (AES-128, AES-256), adding sndbuf & rcvbuf buffers, and turning Fast I/O on and off. I’ve even tried changing ports from 1194 to 1198, in case my local ISP was throttling 1194.
pfSense has dedicated NICs to the WAN and LAN switches.
7% CPU load is suspiciously close to 8.333%, which would be a single core fully loaded (100/12=8.333). Can you check your core-by-core usage to make sure you’re not capping a single core out? It sounds like you’re not properly making use of your hardware acceleration for crypto if that’s the case.
OpenVPN is single-threaded at the moment (OpenVPN 3 is multi-threaded!), so it’s worth checking. I strongly prefer CPUs with high clock speed and low core count for pfSense.
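The arithmetic behind that hunch can be sketched in a few lines of Python. This is only an illustration: the function names and the 2-point tolerance are made up here, and the real check is still to look at per-core usage on the box itself.

```python
def single_core_share(n_cores: int) -> float:
    """Percent of total CPU that one fully busy core represents."""
    if n_cores <= 0:
        raise ValueError("n_cores must be positive")
    return 100.0 / n_cores


def looks_single_core_bound(total_load_pct: float, n_cores: int,
                            tolerance_pct: float = 2.0) -> bool:
    """True when overall load sits within `tolerance_pct` points of one pegged core.

    The tolerance is an arbitrary assumption, not a measured threshold.
    """
    return abs(total_load_pct - single_core_share(n_cores)) <= tolerance_pct


if __name__ == "__main__":
    share = single_core_share(12)
    print(f"One pegged core out of 12 shows up as {share:.2f}% total load")
    print("Is 7% total load suspicious?", looks_single_core_bound(7.0, 12))
```

A match here only says the numbers are consistent with one saturated core; it does not prove that OpenVPN is the process pinning it.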
I too would like to know a good answer to this. I am in the EXACT same boat, and I am thinking of switching away from OpenVPN.
I have a colo box running an E3-1270 V2 on a 10G/10G pipe; at home I have 1G/1G with pfSense on an E3-1220 V2, and I get the same speed as you (literally the same…).
HTTP download speed between the two is right around a full gigabit.
There is 6ms latency between the sites, with Cogent on the colo side and AT&T Fiber on the home side.
Weirdly, I can get 600+ Mbit/s when downloading through an OpenVPN client interface from PIA on the same boxes.
So I am able to get 500-600 Mbit/s over OpenVPN (on a 1Gb connection) with an Intel i5-7200U in pfSense; my desktop CPU at the time was a 3930K. I plan to try to optimize this further, but even this is much better than the performance we used to get.
Some tips:
Use AES-GCM, as it parallelizes well with hardware acceleration, even though OpenVPN itself stays single-threaded (I am using AES-256-GCM with SHA384)
UDP should get better performance, but run some tests of your own with TCP and UDP to see
Try the UDP Fast I/O option
Experiment with the Send/Recv buffer options – I am currently just using default but will probably experiment with it a bit more in the future.
Maybe your performance is being limited on the client. Are you running the OpenVPN client on your Windows desktop/laptop? Does it support AES-NI? Or are you running the client on something else, and does that support AES-NI?
I may be off base, but does each connection use the same NIC? I ask because I was having shit speeds when routing through the built-in Broadcom NIC. I reconfigured to route through an add-on Intel NIC and all speeds jumped to damn near gigabit, where they seemed a lot more reasonable.
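For the send/receive buffer tip above, one common starting point is the bandwidth-delay product (BDP): the amount of data in flight on the path. The sketch below is plain arithmetic, assuming the 1 Gbit/s link and the 6ms latency quoted earlier in the thread; the function name is made up for illustration, and the result is a value to experiment around, not a recommended setting.

```python
def bdp_bytes(link_mbit_per_s: float, rtt_ms: float) -> int:
    """Bandwidth-delay product in bytes.

    Mbit/s * ms = kilobits, so multiply by 1000 to get bits, then divide by 8.
    """
    bits_in_flight = link_mbit_per_s * rtt_ms * 1000
    return round(bits_in_flight / 8)


if __name__ == "__main__":
    # Assumed inputs: 1 Gbit/s link, 6 ms round trip (figures quoted in the thread).
    size = bdp_bytes(1000, 6)
    print(f"BDP: {size} bytes")
    # sndbuf/rcvbuf are the OpenVPN directives mentioned in the thread;
    # the value printed here is just the BDP, a candidate to test against.
    print(f"sndbuf {size}")
    print(f"rcvbuf {size}")
```

If the buffers are well below the BDP, the tunnel cannot keep the pipe full, so it is worth testing values at and above this figure before blaming the cipher.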
If the opposing site of your VPN supports multiple connections you could do that to improve speed. In general, add more OpenVPN Clients to the same endpoint and bind them to different Interfaces. Now add those Interfaces to an Interface Group and let it load balance.
This will help utilize more cores on your CPU thus potentially increasing overall throughput.
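As a back-of-the-envelope model of that idea (not a measurement), assume each OpenVPN process pins one core and throughput scales linearly until it hits either the core count or the link speed. Real traffic will probably scale worse than this, since a single flow still rides a single tunnel. All names and numbers below are illustrative assumptions.

```python
def estimated_aggregate_mbit(per_tunnel_mbit: float, tunnels: int,
                             cores: int, link_mbit: float) -> float:
    """Idealized upper bound on combined throughput across several tunnels.

    Assumes one single-threaded OpenVPN process per tunnel, perfect load
    balancing, and no overhead; treat the result as a ceiling only.
    """
    if tunnels < 0 or cores <= 0:
        raise ValueError("tunnels must be >= 0 and cores must be > 0")
    active = min(tunnels, cores)  # extra tunnels beyond the core count share cores
    return min(per_tunnel_mbit * active, link_mbit)


if __name__ == "__main__":
    # Assumed: ~200 Mbit/s per tunnel, 12 cores, 1 Gbit/s link.
    for n in (1, 2, 4, 8):
        ceiling = estimated_aggregate_mbit(200, n, 12, 1000)
        print(f"{n} tunnel(s): ceiling of {ceiling:.0f} Mbit/s")
```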
Could it be your provider? I am easily saturating my entire WAN link with OpenVPN when running a speedtest. I have enabled AES-NI and BSD Crypto Device under System > Advanced > Miscellaneous, and my OpenVPN client has the Intel RDRAND engine selected for hardware acceleration. I am pulling over 300 Mbps with about 15% CPU usage, while my WAN connection caps at 350 Mbps download. I am using a Qotom Q555 mini PC that has an Intel i5-7200U in it, so I believe your specs are quite beefy compared to mine.
Nailed it! So, because it apparently doesn’t use AES-NI and it’s single-threaded, that explains the performance I’m getting with a Xeon system. The only way to do better is to dumb down the encryption on the tunnel. At least OpenVPN on Windows appears to be multi-threaded.
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
MTU is blank on pfSense, so we’re assuming 1500. My side is 1500. I enabled fast path ages ago. Hardware Checksum, TCP Segmentation, and Hardware LRO are all set to disabled.
My client: take your pick of desktops, either an i7-4930K, Windows 10, 32GB, or an AMD Ryzen 9 3900X, Windows 10, 64GB, both using the latest OpenVPN client. All connections are wired, no Wi-Fi.
I can fire up my MacBook Pro and use Tunnelblick and get the same results.
What’s the advantage of AES-256-GCM with SHA384 over AES-128-GCM with SHA256?
I’m thinking it’s none. What I’ve read so far is that GCM is secure at 128-bit even though CBC was considered to require 256-bit to be secure, so using 256-bit with GCM adds unnecessary complexity and heavily increases overhead for no benefit. Similarly, while SHA1 is considered vulnerable with 160 bits, SHA256 is considered secure, so what do you gain from 384-bit SHA? I am under the impression that some processors with hardware crypto acceleration implement higher-bit functions by performing multiple 128-bit operations, so this could potentially cause a small performance hit for some clients with older chips, and a huge performance hit for clients that don’t have hardware crypto at all. Does that sound correct to you?
WAN & LAN are on the onboard Intel igb0/igb1. It’s a SuperMicro motherboard. I also have an Intel quad-port card for various other networks, like SYNC and other hard-wired private networks. WAN & LAN are each on their own physical interface, plugged into their own respective switches. I have 6 interfaces in the pfSense box; 5 are used.
Connecting to the same servers at the data center on their external interface, not using pfSense/OpenVPN, I get 106 megabytes/sec, roughly 850 Mbit/s. When trying to get to the LAN side, through pfSense with OpenVPN, that’s when speeds collapse.
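To put the two figures on the same scale, here is a small Python conversion sketch. It uses the numbers quoted in this thread (106 MB/s direct, about 200 Mbit/s through the tunnel); the function names are made up for illustration, and protocol overhead is ignored.

```python
def mbytes_to_mbits(mbytes_per_s: float) -> float:
    """Convert megabytes/second to megabits/second (8 bits per byte)."""
    return mbytes_per_s * 8


def tunnel_efficiency(tunnel_mbit: float, baseline_mbit: float) -> float:
    """Fraction of the direct (no-VPN) throughput that survives the tunnel."""
    if baseline_mbit <= 0:
        raise ValueError("baseline_mbit must be positive")
    return tunnel_mbit / baseline_mbit


if __name__ == "__main__":
    direct = mbytes_to_mbits(106)      # SFTP rate without the VPN
    ratio = tunnel_efficiency(200, direct)
    print(f"Direct: {direct:.0f} Mbit/s")
    print(f"Tunnel delivers {ratio:.0%} of the direct rate")
```

In other words, the tunnel is delivering a bit under a quarter of what the same path manages without it.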
Hrmm… My advanced settings are set to AES-NI, and I see there’s now an option for both AES-NI and BSD-Crypto. On the next OS update, I think I’ll make that change.
Multiple gets over the same tunnel increase the throughput slightly, and barely increase OpenVPN CPU load. So there goes that theory. You’d swear this thing is QoS’d.
My setup is nothing like yours, as I run a virtual pfSense under Proxmox with a dozen other VMs on a Pentium G3220 (no AES-NI). What is similar is that I had a very similar problem, and I also have an Intel 4-port 1Gig add-on card. When I first set up the machine, I was using one port on the built-in NIC and one port on the add-on card. I had terrible results that at first I chalked up to the Pentium CPU, but there was no real load happening, so I moved all my pfSense stuff over to the add-on card and magically everything was fine. For whatever reason, pfSense just didn’t like my built-in NICs. I only have 150Mbit, but I can fully saturate that (vs. only pulling around 30Mbit before moving the connections around).