Hi all, I wanted to check something: my company recently moved us from Cisco AnyConnect to Zscaler. I haven’t heard anything good about Zscaler, mostly around speed and latency. I’m now seeing latency doubled on everything I do. From what I understood, only our company apps should be running through Zscaler, yet I see all my traffic go over it. My speed is now between a quarter and a half of what it should be. Where I used to have a stable 5ms to the Azure services I was developing, I now have anywhere between 11 and 280ms. This makes testing very difficult. Is this normal for Zscaler? The endpoint is still within the same country as the Azure services.
Have your Zscaler admins check the MTU on the forwarding profile assigned to the app profile you’re assigned to.
A lot of times, lowering it to 1360 helps, especially on wireless Internet connections (not Wi-Fi), like T-Mobile Home.
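If you want to sanity-check an MTU value like 1360 from your own machine before asking the admins, a rough sketch is to send pings with the Don't Fragment bit set and see which sizes get through. The helper names below are made up for illustration, and the non-Windows branch assumes Linux iputils ping (macOS uses different flags). Whether ICMP even goes through the tunnel depends on the forwarding profile, so treat the result as a hint only.

```python
import platform

IPV4_HEADER = 20  # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo header


def icmp_payload_for_mtu(mtu: int) -> int:
    """Largest ICMP echo payload that fits in a single IPv4 packet of size `mtu`."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def df_ping_command(host: str, mtu: int, system: str = "") -> list[str]:
    """Build (but do not run) a ping command with Don't Fragment set."""
    system = system or platform.system()
    size = str(icmp_payload_for_mtu(mtu))
    if system == "Windows":
        # -f: Don't Fragment, -l: payload size, -n: number of echo requests
        return ["ping", "-f", "-l", size, "-n", "3", host]
    # Assumption: Linux iputils ping. -M do: prohibit fragmentation,
    # -s: payload size, -c: number of echo requests
    return ["ping", "-M", "do", "-s", size, "-c", "3", host]
```

For example, df_ping_command("example.com", 1360) gives a command you can pass to subprocess.run. If 1360 goes through but 1500 reports that fragmentation is needed, the path MTU sits somewhere in between.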
We have been chasing a latency issue with support for over a month now. No matter how you look at it (ZIA with Tunnel 2.0, 1.0, TWLP), we lose half our throughput.
It’s at the point we’re probably not renewing with them and are looking for alternatives. Their support has been scripted and useless.
Silly question but it seems like Zscaler doesn’t support IPv6 at the edge. Isn’t that indirectly stifling IPv6 adoption?
Zscaler runs SLAs on latency. If you frequently see more than 100ms you need to log it with them and they’ll look into it. If it’s too bad you’ll get service credits.
From the SLA doc:
Percentage of Qualified Transactions and Data Packets With Average Latency of 100 Milliseconds or Less | Service Credit
>= 95.00% | N/A
< 95.00% but >= 94.00% | 7 days
< 94.00% but >= 90.00% | 15 days
< 90.00% | 30 days
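To see which tier a given month falls into, the tiers quoted above can be sketched as a small lookup. The function name is made up, N/A is represented as 0 days, and it assumes the top tier means 95.00% or better.

```python
def service_credit_days(pct_within_100ms: float) -> int:
    """Map the share of transactions at <= 100 ms average latency to credit days.

    Assumption: the top tier is ">= 95.00%" and "N/A" means no credit (0 days).
    """
    if pct_within_100ms >= 95.0:
        return 0
    if pct_within_100ms >= 94.0:
        return 7
    if pct_within_100ms >= 90.0:
        return 15
    return 30
```

So a month where 93.2% of qualified transactions stayed under 100ms would land in the 15-day tier, going by the quoted table.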
Private or public application?
Have you submitted a support ticket or reached out to your account team to have them investigate why?
It should not be this slow. If you are using the Zscaler client, check if it shows both Private Access and Internet Security as connected.
Also, check the tunnel version under Internet Security. If it shows DTLS, ask your admin to configure TLS as a test.
Check ip.zscaler.com; if it shows a far-away DC, that too could contribute to the delay.
If you are using Z-Tunnel 2.0, enable path MTU discovery in the forwarding profile.
There is also an option for Dynamic service edge selection in the forwarding profile, which should also help alleviate anything contributing to latency.
We had one of our sites’ IP geolocation show up as being on the other side of Australia, causing traffic to go via Singapore instead of Sydney.
It was a pain getting it back to the right location; it took 2-3 weeks.
Apart from that, MTU etc. seem to help the most, but every now and then you just get a weird day for some users.
Lots of missing context. 280ms is equivalent to going around the other side of the earth.
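To put a number on that: light in fibre travels at very roughly 200,000 km/s (about two thirds of c), so a round-trip time puts an upper bound on how far away the other end can be. A back-of-the-envelope sketch that ignores queuing and processing delay entirely:

```python
FIBER_KM_PER_MS = 200.0  # rough figure: ~200,000 km/s, about 2/3 of c


def max_one_way_km(rtt_ms: float) -> float:
    """Upper bound on one-way fibre distance for a given round-trip time."""
    return (rtt_ms / 2.0) * FIBER_KM_PER_MS


print(max_one_way_km(280))  # 28000.0
print(max_one_way_km(5))    # 500.0
```

Half the Earth's circumference is about 20,000 km, so 280ms is more than enough propagation time to reach the far side of the planet, while 5ms keeps you within roughly 500 km. When the PoP is in the same country, the extra delay is unlikely to be distance.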
ZDX could shed light.
Is your Zscaler PoP appropriate for your location? (Speedtest will show this with latency.)
Is it just a single app having this issue?
Okay, tested today with Zscaler. We did a packet capture and tested TLS 1.0, which gave much worse speeds. A reverse MTR showed packet loss within the data center to my IP.
Is there any noticeable performance difference between TLS 2.0 and DTLS?
Update: the packet loss has been resolved. However, the latency and speed remain really poor. Without Zscaler I can have a stable 8ms ping to Azure. With Zscaler running (to the same datacenter as the breakout) my latency is anywhere between 18ms and 350ms.
Anyone know if there is a way to address this? I’ve been told that what I’m seeing is due to my home ISP.
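For a like-for-like comparison with and without Zscaler, it can help to collect the same set of timings in both states and compare min, median and max rather than single pings. A minimal sketch using only the standard library: it times a full HTTPS request (DNS, TCP, TLS and first byte), so the numbers will be higher than ICMP ping. The function names are made up, and the URL is whatever endpoint you are testing against.

```python
import statistics
import time
import urllib.request


def timed_get_ms(url: str, timeout: float = 5.0) -> float:
    """Time one request up to the first byte of the response body, in ms."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read(1)
    return (time.perf_counter() - start) * 1000.0


def collect(url: str, count: int = 20, pause_s: float = 0.5) -> list[float]:
    """Take `count` samples, pausing between requests."""
    samples = []
    for _ in range(count):
        samples.append(timed_get_ms(url))
        time.sleep(pause_s)
    return samples


def summarize(samples_ms: list[float]) -> dict:
    """Reduce samples to min, median, max and the min-to-max spread."""
    ordered = sorted(samples_ms)
    return {
        "min": ordered[0],
        "median": statistics.median(ordered),
        "max": ordered[-1],
        "spread": ordered[-1] - ordered[0],
    }
```

Run summarize(collect(url)) once with the client connected and once without. A low median with a huge max suggests intermittent spikes rather than a uniformly slow path, which is a different conversation to have with support.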
We are seeing similar issues and have been going back and forth with support and management for more than six months now. We monitor the proxy latency and observe huge peaks during office hours.
Their load balancing seems to be poor dogsh… Single nodes can be overloaded by other customers, and your users stay connected to the overloaded nodes.
In September, they will change to block connections to single nodes directly. Hopefully this will help with the latency spikes.
Happy to get any updates from your case, since we’ve tried everything and are losing our minds…
For future reference, we found a fix on Ethernet by disabling IPv6 and Receive Segment Coalescing (RSC), but it has to be applied to every new adapter connected. For Wi-Fi adapters you have to disable it in the registry.
Look under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Class in the registry. Search that key for the Wi-Fi adapter name, which is normally "Wi-Fi", and set *RscIPv4 and *RscIPv6 to 0. They may also be named *WdiRscIPv4 and *WdiRscIPv6.
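Before changing anything, a read-only look at what the adapters currently have set might be useful. This is a sketch, not a tested fix: it assumes the adapters live under the network adapter class GUID shown in the code (the post above only names the Class key), that each adapter subkey has a DriverDesc value, and that the RSC values are stored as "0"/"1". It only reads, it does not write. The function names are made up.

```python
# Assumption: network adapters sit under this device class GUID.
NET_CLASS_KEY = (
    r"SYSTEM\CurrentControlSet\Control\Class"
    r"\{4d36e972-e325-11ce-bfc1-08002be10318}"
)
RSC_VALUE_NAMES = ("*RscIPv4", "*RscIPv6", "*WdiRscIPv4", "*WdiRscIPv6")


def rsc_still_enabled(values: dict) -> list:
    """Return the RSC value names that are present and not set to 0."""
    return [n for n in RSC_VALUE_NAMES if n in values and str(values[n]) != "0"]


def read_adapter_rsc() -> dict:
    """Windows only: read the RSC values for every adapter subkey."""
    import winreg  # imported here so the module still loads on other systems

    results = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, NET_CLASS_KEY) as class_key:
        index = 0
        while True:
            try:
                sub = winreg.EnumKey(class_key, index)
            except OSError:
                break  # no more subkeys
            index += 1
            try:
                with winreg.OpenKey(class_key, sub) as adapter_key:
                    desc, _ = winreg.QueryValueEx(adapter_key, "DriverDesc")
                    values = {}
                    for name in RSC_VALUE_NAMES:
                        try:
                            values[name], _ = winreg.QueryValueEx(adapter_key, name)
                        except FileNotFoundError:
                            pass  # this adapter does not expose that value
            except OSError:
                continue  # unreadable subkey or no DriverDesc
            if values:
                results[f"{sub} {desc}"] = values
    return results
```

Looping over read_adapter_rsc().items() and printing rsc_still_enabled(values) for each adapter shows which ones still have RSC on, which also helps catch new adapters that appeared after the fix was applied.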
Will check with them. Mostly this has been on fibre and the office WiFi.
Support gets slightly better if you have a technical account manager.
What has support actually had you do though? Most of the time I see latency when someone is outside a PoP and should be using a PSE to increase performance, or when they try to do GRE with Tunnel 2.0 routing through it. Or using TLS vs DTLS.
Throughput is rarely the key contributing factor to performance unless the job involves sending really large files. It’s typically latency or loss. You should focus on the actual site slowdown and not a throughput test. If there are limited complaints, you are chasing ghosts.
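One way to see why latency matters more than the raw pipe: a single TCP flow can have at most one window of data in flight per round trip, so its throughput is capped at roughly window / RTT regardless of link speed. A quick sketch of that arithmetic, where the 64 KB window is purely illustrative (real stacks scale the window):

```python
def max_throughput_mbps(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound for one TCP flow: one window of data per round trip."""
    bits_per_round_trip = window_bytes * 8
    round_trips_per_second = 1000.0 / rtt_ms
    return bits_per_round_trip * round_trips_per_second / 1_000_000


for rtt in (20, 40, 280):
    print(rtt, "ms ->", round(max_throughput_mbps(65535, rtt), 2), "Mbps")
```

With the window held fixed, doubling the RTT halves the ceiling, so a speed test that drops to half can simply be the added latency showing up, not a bandwidth limit.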
Thanks for this. What about things that are under 100ms, but where you used to have 20ms now it’s 40-50ms? Meaning your normal latency doubled.