Hurricane Electric - is this packet loss normal?
proofofsteak
Member
Migrating data between a US server and a Hetzner Germany server and noticed the connection dropping every couple of gigabytes or so.
Ran a few MTRs and saw this; seems there is massive packet loss on data routed through HE?
Not a networking/IP guy at all, but surely this can't be normal. Any idea what is going on? Is this normal HE performance?
Included a Cogent route at the bottom for comparison, no loss.
HE Route #1 (USA Utah > Hetzner DE, big loss)
HOST: xx62 Loss% Snt Last Avg Best Wrst StDev
1.|-- original.server.location 0.0% 200 1.1 1.1 0.6 8.5 0.8
2.|-- core-100ge0-8-1-5.slc01.fiberstate.com 0.0% 200 2.1 1.4 0.3 26.3 2.9
3.|-- e0-2.switch1.slc4.he.net 63.0% 200 1.4 3.3 1.1 30.2 4.1
4.|-- port-channel9.core2.den1.he.net 92.0% 200 13.1 15.8 12.8 41.5 7.1
5.|-- port-channel8.core2.oma1.he.net 79.0% 200 23.4 24.1 22.8 39.6 2.7
6.|-- 100ge0-69.core2.chi1.he.net 83.5% 200 32.2 33.9 32.0 44.9 3.6
7.|-- port-channel13.core3.chi1.he.net 89.5% 200 31.2 40.8 31.2 90.2 14.0
8.|-- port-channel1.core2.nyc4.he.net 86.0% 200 61.2 50.8 47.9 61.4 4.2
9.|-- port-channel20.core3.lon2.he.net 93.0% 200 114.9 119.6 114.9 142.9 8.6
10.|-- port-channel4.core1.ams1.he.net 11.5% 200 120.3 122.5 119.5 199.1 8.7
11.|-- ??? 100.0 200 0.0 0.0 0.0 0.0 0.0
12.|-- core5.fra.hetzner.com 0.0% 200 143.9 141.8 141.3 147.2 0.5
13.|-- core24.fsn1.hetzner.com 0.0% 200 156.4 151.6 151.2 156.4 0.5
14.|-- ex9k1.dc14.fsn1.hetzner.com 0.0% 200 153.4 152.2 151.3 167.2 2.1
15.|-- final.destination.fsnl.hetzner.com 0.0% 200 162.2 151.9 151.3 162.2 0.8
HE Route #2 (USA Utah > Hetzner DE, big loss)
HOST: xx62 Loss% Snt Last Avg Best Wrst StDev
1.|-- original.server.location 0.0% 200 1.0 1.4 0.7 23.1 2.2
2.|-- core-100ge0-8-1-5.slc01.fiberstate.com 0.0% 200 0.7 1.6 0.3 37.6 4.0
3.|-- e0-2.switch1.slc4.he.net 58.5% 200 1.4 3.9 1.4 37.6 5.0
4.|-- port-channel9.core2.den1.he.net 91.0% 200 12.9 17.8 12.8 47.0 10.4
5.|-- port-channel8.core2.oma1.he.net 77.5% 200 23.9 23.8 22.7 33.5 1.8
6.|-- 100ge0-69.core2.chi1.he.net 87.0% 200 32.6 32.9 31.9 38.8 1.4
7.|-- port-channel13.core3.chi1.he.net 92.5% 200 31.2 34.2 31.0 49.4 4.9
8.|-- port-channel4.core3.nyc4.he.net 88.5% 200 50.8 57.0 48.0 87.9 11.3
9.|-- port-channel20.core3.lon2.he.net 85.5% 200 129.1 128.6 114.7 251.0 28.2
10.|-- port-channel4.core1.ams1.he.net 4.0% 200 119.9 122.6 119.4 191.3 8.6
11.|-- ??? 100.0 200 0.0 0.0 0.0 0.0 0.0
12.|-- core5.fra.hetzner.com 0.0% 200 138.2 138.3 137.8 142.9 0.5
13.|-- core11.nbg1.hetzner.com 0.0% 200 138.4 138.9 138.0 167.3 2.8
14.|-- spine16.cloud1.nbg1.hetzner.com 34.0% 200 1403. 1187. 928.8 1476. 97.1
15.|-- spine2.cloud1.nbg1.hetzner.com 0.0% 200 139.0 142.7 138.4 246.7 11.8
16.|-- ??? 100.0 200 0.0 0.0 0.0 0.0 0.0
17.|-- 13102.your-cloud.host 0.0% 200 140.5 140.5 140.1 150.4 0.8
18.|-- static.some.ip.here.clients.your-server.de 0.0% 200 138.2 138.4 138.0
HE Route #1 (USA Utah > Romania, big loss)
HOST: xx62 Loss% Snt Last Avg Best Wrst StDev
1.|-- original.server.ip 0.0% 200 1.3 1.1 0.7 7.9 0.6
2.|-- core-100ge0-8-1-5.slc01.fiberstate.com 0.0% 200 0.7 1.4 0.3 34.2 3.8
3.|-- e0-2.switch1.slc4.he.net 63.5% 200 2.1 3.8 1.4 22.5 4.3
4.|-- port-channel9.core2.den1.he.net 92.0% 200 14.1 13.0 11.8 21.4 2.6
5.|-- port-channel8.core2.oma1.he.net 79.5% 200 22.0 23.9 21.7 35.2 3.1
6.|-- 100ge0-69.core2.chi1.he.net 81.5% 200 47.3 33.7 30.9 59.9 5.5
7.|-- port-channel13.core3.chi1.he.net 94.0% 200 60.0 41.8 30.0 60.0 10.0
8.|-- port-channel1.core2.nyc4.he.net 84.5% 200 47.4 50.9 46.8 71.5 6.7
9.|-- port-channel20.core3.lon2.he.net 86.5% 200 123.3 121.1 113.5 158.1 11.0
10.|-- ??? 100.0 200 0.0 0.0 0.0 0.0 0.0
11.|-- 10.0.240.146 0.0% 200 156.8 157.1 156.5 166.8 1.2
12.|-- 10.0.240.214 0.0% 200 154.1 154.4 153.4 168.3 1.7
13.|-- 10.0.245.74 0.0% 200 156.2 156.5 155.6 170.7 1.7
14.|-- 92.80.113.70 0.0% 200 156.2 157.8 155.9 177.2 2.6
15.|-- maybe.calins.basement 0.0% 200 156.6 157.6 155.7 170.4 2.1
Cogent #1 (USA Utah > Scaleway FR, no loss)
HOST: xx62 Loss% Snt Last Avg Best Wrst StDev
1.|-- original.server.ip 0.0% 200 1.1 1.2 0.7 5.9 0.7
2.|-- core-100ge0-8-1-5.slc01.fiberstate.com 0.0% 200 1.1 1.3 0.3 31.1 2.8
3.|-- ??? 100.0 200 0.0 0.0 0.0 0.0 0.0
4.|-- be3917.rcr51.b056940-0.slc01.atlas.cogentco.com 0.0% 200 1.5 1.3 1.0 2.0 0.2
5.|-- be2257.ccr32.slc01.atlas.cogentco.com 0.0% 200 2.3 2.2 1.8 2.9 0.2
6.|-- be3038.ccr22.den01.atlas.cogentco.com 0.0% 200 12.7 13.5 12.1 80.4 7.6
7.|-- be3036.ccr22.mci01.atlas.cogentco.com 0.0% 200 23.8 25.0 23.4 102.0 8.1
8.|-- be2832.ccr42.ord01.atlas.cogentco.com 0.0% 200 35.4 36.4 35.1 102.8 6.8
9.|-- be2718.ccr22.cle04.atlas.cogentco.com 0.0% 200 42.0 43.1 41.7 109.9 6.6
10.|-- be2879.ccr22.alb02.atlas.cogentco.com 0.0% 200 52.9 52.6 52.2 54.1 0.2
11.|-- be3600.ccr32.bos01.atlas.cogentco.com 0.0% 200 56.1 56.5 55.7 78.9 2.5
12.|-- be2101.ccr42.lon13.atlas.cogentco.com 0.0% 200 118.0 120.5 117.7 181.0 9.4
13.|-- be12489.ccr42.par01.atlas.cogentco.com 0.0% 200 128.6 130.6 128.0 185.3 9.7
14.|-- be3184.ccr31.par04.atlas.cogentco.com 0.0% 200 129.7 130.3 128.4 177.9 7.1
15.|-- be3750.rcr21.b022890-0.par04.atlas.cogentco.com 0.0% 200 129.4 129.1 128.8 129.7 0.2
16.|-- online.demarc.cogentco.com 0.0% 200 126.0 125.6 125.2 126.2 0.2
17.|-- 51.158.8.183 0.0% 200 129.0 129.2 128.9 130.2 0.2
18.|-- 51.158.8.5 0.0% 200 125.3 125.4 125.1 125.8 0.1
19.|-- final.destination.5.the.movie 0.0% 200 125.8 125.2 124.9 127.3 0.3
Comments
As long as there is no packet loss at the last hop (your destination), everything is OK. Network operators usually deprioritize ICMP handling on their routers, which causes this.
None of the MTRs you present here show any real packet loss. You should probably run long-running MTRs (>3000 packets) to surface any packet loss, but your reports here (with 200) show 0.0% loss at the destination.
Ah ok, so it only matters if there is packet loss on the last line of an MTR?
Correct.
This. You'll see the same thing in most routes.
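The rule the replies above keep repeating (only last-hop loss matters) can be sketched as a quick check over parsed MTR hop data. The hop lists and function name here are invented for illustration:

```python
# Sketch: decide whether an MTR report indicates real end-to-end loss.
# Rule of thumb from this thread: only loss at the final hop matters;
# loss at intermediate hops is usually just routers deprioritizing
# ICMP generation and tells you nothing about transit traffic.

def real_loss(hops):
    """hops is a list of (host, loss_pct); True only if the destination shows loss."""
    if not hops:
        return False
    _last_host, last_loss = hops[-1]
    return last_loss > 0.0

# Mirrors the shape of HE Route #1 above: scary mid-path loss, clean ending.
he_route = [
    ("e0-2.switch1.slc4.he.net", 63.0),
    ("port-channel9.core2.den1.he.net", 92.0),
    ("core5.fra.hetzner.com", 0.0),
    ("final.destination", 0.0),
]

print(real_loss(he_route))  # False: the path is healthy end to end
```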
It's a somewhat newish and horribly confusing practice. Fuckery by throttling or outright blocking ICMP is explicitly noted in the RFC as bad practice. The legitimate cases here are congestion, floods, or whatever other kind of bad weather, so one could say it's a sign of degradation once you see more than zero loss in the middle of a route.
No. Protecting the control plane is very far from a new practice. Confusing? Sure, but that stems from end-users trying to perform diagnostics on topics they're unfamiliar with.
You could punt millions of ICMP packets through a forwarding plane without issue; you can't say the same for ICMP to the control plane on the same router.
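The forwarding-plane/control-plane split can be sketched as a rate limiter in front of ICMP generation. A token bucket is used here as a stand-in for whatever policer a real router applies (CoPP or similar); the rates and probe counts are invented for illustration:

```python
# Sketch of why transit traffic is unaffected while ICMP replies from
# the router itself get "lost": replies generated by the control plane
# sit behind a rate limiter. Token bucket used as a stand-in policer.

class TokenBucket:
    def __init__(self, rate, burst):
        self.rate = rate      # tokens refilled per second
        self.tokens = burst   # current fill
        self.burst = burst    # maximum fill

    def allow(self, now, last):
        # Refill for elapsed time, then spend one token if available.
        self.tokens = min(self.burst, self.tokens + (now - last) * self.rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# 100 TTL-expired probes arrive within one second, but the router only
# generates ICMP time-exceeded at ~10/s: most probes show as "loss" at
# this hop, even though forwarded traffic is untouched.
bucket = TokenBucket(rate=10, burst=10)
answered = 0
t = 0.0
for i in range(100):
    prev, t = t, i / 100.0
    if bucket.allow(t, prev):
        answered += 1
print(answered)  # far fewer than 100 probes get a reply
```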
That's the equivalent of seeing smoke above the train station, yet they deliver you to your destination on time. Even when everyone says it's fine, the smoke is there. QoS-limiting diagnostic protocols is a last resort as per spec.
As others said, no packet loss at the endpoint means no issues happening.
As others said, ICMP is typically rate-limited and dropped by a lot of networking equipment.
Far from it. If you want to draw analogies, then it'd be the equivalent of the train driver not responding to every passenger unnecessarily knocking at his door while the train is in motion, because the driver is focused on getting the train to the destination on time.
If you truly believe there should be no restrictions on ICMP requests to the control-plane then I've nothing further to add to this. I just hope individuals reading your posts decide to fact check for themselves.
Source & experience: Currently work for a Tier 1 in network security, and yourself?
Nah, it's an ancient-ass protocol that doesn't scale properly to meet spec in current use cases. It suffers from limitations made in a different age. Sure, it's efficient in the grand scale of things to serve so many users, but unfortunately there will be knocking on doors, rightfully so, and it will become more apparent on busy routes over time.
As long as packets get where they need to go then it’s not a big issue though. It only causes confusion for people who don’t have experience reading MTRs.
The below link is a very good resource for those with minimal networking experience who want to learn more about correctly interpreting traceroutes/MTRs.
https://archive.nanog.org/meetings/nanog47/presentations/Sunday/RAS_Traceroute_N47_Sun.pdf
If enough people start raising a fuss about this, then more providers will start using MPLS on their infra, so you only see something like:
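A purely invented illustration of the idea (hostnames and numbers are made up): with MPLS and TTL propagation disabled, the carrier's internal hops simply never appear, and the full transit latency shows up in one jump between the edges.

```
HOST: example                   Loss%   Snt   Last   Avg  Best  Wrst StDev
1.|-- edge1.provider.example     0.0%   200    1.2   1.3   0.9   5.1   0.4
2.|-- ???                       100.0   200    0.0   0.0   0.0   0.0   0.0
3.|-- edge2.provider.example     0.0%   200   88.4  88.9  88.1  97.0   1.1
4.|-- destination.example        0.0%   200   89.0  89.2  88.7  95.3   0.9
```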
There are examples of that in the wild, just don't have any src/dst offhand to show you a real one.
I'm still trying to figure out how the ipv6 route between Tokyo and the Philippines takes 150-200ms (ipv4 is 46-59ms).
I think the lax2.he.net rdns must not really be in Los Angeles. But nonetheless it must be getting routed all around to be that bad. Even if it was getting routed through Singapore it shouldn't be more than 100ms.
Nobody's blocking ICMP requests here. Routers throttle the generation of ICMP time exceeded messages... No drops, they're just not being sent in the first place. The outgoing packets from traceroute aren't ICMP, they're UDP.
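The mechanics described above can be simulated without touching the network: the probe is UDP with a deliberately small TTL, each router decrements it, and whichever router sees it hit zero generates the reply (time-exceeded mid-path, port-unreachable at the destination). Path names and the helper here are invented for illustration:

```python
# Pure simulation of classic UDP traceroute mechanics (no raw sockets):
# TTL is decremented per hop; the router where it reaches zero sends
# ICMP time-exceeded, while the destination answers the UDP probe to
# an unused port with ICMP port-unreachable.

def probe(path, ttl):
    """Return (responder, reply type) for a probe sent with the given TTL."""
    for hop, router in enumerate(path, start=1):
        ttl -= 1
        if ttl == 0:
            if hop == len(path):
                return router, "ICMP port-unreachable (destination)"
            return router, "ICMP time-exceeded (intermediate router)"
    # TTL larger than the path length: the destination still answers.
    return path[-1], "ICMP port-unreachable (destination)"

path = ["slc4.example", "den1.example", "fra.example", "dest.example"]
for ttl in range(1, len(path) + 1):
    print(ttl, *probe(path, ttl))
```

Note that the UDP probes themselves are never dropped here; only the ICMP replies are subject to the router's rate limiting, which is the commenter's point.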