Need good ideas for analyzing L7 network slowness with L3/L4 being OK.
So I have this backend server that serves several frontend/edge servers, and for some reason a trivial HEAD health check takes about 2 seconds round trip from most edge servers. But iperf and ping tests (L3) are OK, and so are mtr TCP tests on L4.
The same L7 health check from a different server in the same DC takes about 25ms.
How do I find out why the L7 is so slow in comparison to L3/L4?

Comments
Ok, so I brought the rtt down from >2 seconds to 54ms by using a vpn between edge and backend.
Looks like the DC heavily rate limits one (standard) port on their network....
Hey! My guess is that the problem isn't with the server but with the network equipment between your two endpoints, which is slowing down standard web traffic. Since the latency disappears over a VPN, filters or data center security systems (DPI) are probably "slowing down" packets on the standard ports 80 and 443.
Most likely, the network is forcibly inspecting the contents of your requests, which adds the extra two seconds of latency.
Yepp, this is what I'm guessing at the moment as well (though I never touched switches/routers etc. that are not small home devices).
At the moment it's not one of the ports you mentioned but a typical replacement port (for those ports) where I see this.
If they start rate limiting/inspecting my VPN port as well, I'm out.
The only thing I don't get is that
mtr -P xxx -T xxx.xxx.xxx.xxx
is not showing the problem.
That makes sense.
If switching to a different common port doesn't change anything, it really looks like some kind of traffic limitation or inspection on their end, not just a port issue. The fact that a VPN solves the problem is a pretty strong indication that direct traffic is being handled differently. And yes... if they start limiting the VPN as well, then something is clearly happening at the network level, and you'll probably need to talk to your ISP.
mtr -T -P only sends TCP SYN probes toward the port and times the replies, so it only shows that the connection opens quickly.
It doesn't exercise the HTTP request itself. So the connection setup may be fine while the slowdown only appears after the connection is established, when data transfer begins. (That's a first conclusion and needs to be checked further.)
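One way to check this is to time the two phases of a single plain-HTTP HEAD request separately: the TCP handshake, and the wait for the first response byte. Here is a minimal Python sketch, assuming no TLS (as in this thread). The names time_head and phase_breakdown are made up for illustration, and host, port and path are placeholders you'd fill in yourself.

```python
import socket
import time


def phase_breakdown(t_start, t_connected, t_first_byte):
    """Split one request into connect time and wait-for-first-byte, in ms."""
    return {
        "connect_ms": (t_connected - t_start) * 1000.0,
        "first_byte_ms": (t_first_byte - t_connected) * 1000.0,
        "total_ms": (t_first_byte - t_start) * 1000.0,
    }


def time_head(host, port, path="/", timeout=10.0):
    """Send one plain-HTTP HEAD request and time its two phases."""
    request = (
        f"HEAD {path} HTTP/1.1\r\n"
        f"Host: {host}:{port}\r\n"
        "Connection: close\r\n\r\n"
    ).encode("ascii")
    t_start = time.perf_counter()
    # Phase 1: TCP handshake only, roughly the part a TCP probe exercises.
    with socket.create_connection((host, port), timeout=timeout) as sock:
        t_connected = time.perf_counter()
        # Phase 2: send the request and wait for the first byte of the reply.
        sock.sendall(request)
        first = sock.recv(1)
        if not first:
            raise ConnectionError("connection closed before any response byte")
        t_first_byte = time.perf_counter()
    return phase_breakdown(t_start, t_connected, t_first_byte)


# Example call (placeholder address and port):
# print(time_head("xxx.xxx.xxx.xxx", 8080))
```

If connect_ms stays small and first_byte_ms carries the ~2 seconds, that would fit the idea that the delay only shows up once payload data flows. Running it both directly and through the VPN gives two sets of numbers to compare.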
You can run an mtr on the HTTP port and on the VPN port. If the routes differ between the two ports but are consistent for each port, it's an almost sure sign of connection tampering. Not all criminal providers are so careless as to let it show in a traceroute, so an identical route isn't proof that there is no tampering.
Assuming your application is running on TCP and iperf3 TCP seems fine, a difference between HEAD request and iperf3 TCP is that the HEAD request needs new TCP connections and TLS handshake, while iperf3 measures persistent throughput after TCP connection is established.
@yoursunny actually there is no TLS involved, it's just a plain HTTP HEAD request. Also, those HEAD requests are fast (like 10 to 15ms) when sent from a different server in the same DC on the common port. And they are fast when run through the VPN from elsewhere, so it's not the server having problems.
@OpaqueRegistrant checked with mtr -P (port) -T with both ports, both take the same route.
So for me it really looks like @Hosting_b2b described: some kind of rate limiting or other kind of tinkering on the network for that common port by the DC or upstreams (the latter unlikely).
Just use tcpdump on both ends then to figure out what's going on.
Oops, there is another thing I forgot: the VPN is using UDP while the normal HTTP is plain TCP, so that might be the difference as well.
I have to admit that this "just" is exactly my problem here. While I surely can do the tcpdump (I would do a tcpdump -tulpen | grep xxx.xxx.xxx.xxx) and see the packets, how do I read out where the problem is from there? I'm not a big fan of reading tcpdump reports but would love to learn...
tcpdump -pni IFNAME -w 1.pcap "port 80"
Open the file in Wireshark, look for timestamps, packet losses, retransmissions, MSS adjustments, ICMP errors related to the flow, "expert info" on TCP flow, etc.
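If you want a quick first look at the timestamps before opening Wireshark, a small script can list the longest pauses between consecutive packets in the capture. This is only a rough sketch: it assumes the classic pcap format that tcpdump -w normally writes (not pcapng), looks at nothing but the timestamps, and the function names read_timestamps and largest_gaps are made up here.

```python
import struct

# Magic bytes of the classic pcap format -> (byte order, timestamp divisor)
_PCAP_MAGIC = {
    b"\xd4\xc3\xb2\xa1": ("<", 1_000_000),      # little-endian, microseconds
    b"\xa1\xb2\xc3\xd4": (">", 1_000_000),      # big-endian, microseconds
    b"\x4d\x3c\xb2\xa1": ("<", 1_000_000_000),  # little-endian, nanoseconds
    b"\xa1\xb2\x3c\x4d": (">", 1_000_000_000),  # big-endian, nanoseconds
}


def read_timestamps(data):
    """Return the capture time (seconds, float) of every packet in a pcap blob."""
    if len(data) < 24 or data[:4] not in _PCAP_MAGIC:
        raise ValueError("not a classic pcap file (pcapng is not handled here)")
    order, divisor = _PCAP_MAGIC[data[:4]]
    offset = 24  # skip the 24-byte global header
    stamps = []
    while offset + 16 <= len(data):
        # Per-packet record header: seconds, fraction, captured len, original len
        sec, frac, incl_len, _orig_len = struct.unpack_from(
            order + "IIII", data, offset
        )
        stamps.append(sec + frac / divisor)
        offset += 16 + incl_len  # skip the record header and the captured bytes
    return stamps


def largest_gaps(stamps, top=5):
    """Return the biggest pauses as (gap_seconds, zero_based_index_of_later_packet)."""
    gaps = [
        (later - earlier, i + 1)
        for i, (earlier, later) in enumerate(zip(stamps, stamps[1:]))
    ]
    return sorted(gaps, reverse=True)[:top]


# Example, using the file name from the tcpdump command above:
# with open("1.pcap", "rb") as fh:
#     print(largest_gaps(read_timestamps(fh.read())))
```

A gap close to 2 seconds points at the packet worth inspecting in Wireshark; whether it sits before the response or before a retransmission is something the script does not tell you.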
I'd do something like
tcpdump -npi intf host xxx.xxx.xxx.xxx and port yyyy
(so you only get the relevant packets, maybe also add a -v) on both sides, and then basically just compare the two sides. Mainly look out for missing packets (and re-sends).
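The comparison itself can be sketched in a few lines of Python once you have pulled the data-carrying TCP segments of one direction out of both captures. The extraction step is not shown, and the (sequence number, payload length) tuple layout and the name compare_sides are assumptions for illustration only. Note that segmentation offload or MSS clamping along the path can re-split segments, in which case the tuples won't line up one to one.

```python
from collections import Counter


def compare_sides(sent, received):
    """Compare TCP segments seen at the sender with those seen at the receiver.

    Each segment is a (sequence_number, payload_length) tuple for one
    direction of one connection. Zero-length segments (pure ACKs) are ignored.
    """
    sent_counts = Counter(seg for seg in sent if seg[1] > 0)
    recv_counts = Counter(seg for seg in received if seg[1] > 0)
    return {
        # Left the sender but never showed up on the other side.
        "missing_at_receiver": sorted(
            seg for seg in sent_counts if seg not in recv_counts
        ),
        # Seen more than once at the sender, i.e. it was re-sent.
        "retransmitted": sorted(
            seg for seg, count in sent_counts.items() if count > 1
        ),
    }


# Example with made-up segments:
# compare_sides([(1, 100), (101, 100), (101, 100)], [(1, 100)])
```

Segments that appear as retransmitted on the sending side and missing on the receiving side are the ones getting lost or held up somewhere in between.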