Can you tell which route you're using?

david · February 19

If I do a mtr -zur from my server to my home, I can tell there are multiple routes. I notice this also, with my wireguard connection; sometimes the latency will change if I restart wireguard. I assume it's taking a different route.

Is there any way to tell which route it's taking? Are there different tools besides mtr and traceroute to look into it? Or some different parameters I'm missing? And is there a way to get it to prefer one route over the other?

ehab · February 19

i only know

$ ip r

yoursunny · February 19

We take the most direct route.
We'll swim across the river or dig through the mantle, if that's the most direct route.

Levi · February 19

P2P. We meet, we do, we separate.

Neoon · February 19

As a customer you can barely influence any routing, unless you pay for it or the people involved are willing to optimize routes.
However, you can use your existing virtual servers to do route bending.

Cutting-edge solution for people that either have no ASN or don't do BGP.
https://github.com/Ne00n/route-bender-4000

You let that run on your Homeserver or Raspberry Pi, profit.
Connect the virtual servers you want to use together with a latency optimized mesh vpn and you are in business!

https://github.com/Ne00n/wg-mesh
10/10 Enjoying my sub 150ms to Tokyo and Hong Kong.

Don't Delay, Try route bending Today!

tentor · February 19

@david said: If I do a mtr -zur from my server to my home, I can tell there are multiple routes. I notice this also, with my wireguard connection; sometimes the latency will change if I restart wireguard. I assume it's taking a different route.

This is due to ECMP and MTR sending multiple packets with different src/dst ports, which results in different hashes and therefore in different paths being used.

Load balancing by per-packet multipath routing was generally disfavored due to the impact of rapidly changing latency, packet reordering and maximum transmission unit (MTU) differences within a network flow, which could disrupt the operation of many Internet protocols, most notably TCP and path MTU discovery. RFC 2992 analyzed one particular multipath routing strategy involving the assignment of flows through hashing flow-related data in the packet header. This solution is designed to avoid these problems by sending all packets from any particular network flow through the same path while balancing multiple flows over multiple paths in general.
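The flow-hashing idea can be sketched in a few lines of shell. This is only an illustration: real routers use vendor-specific hash functions and may include other header fields, and the CRC from cksum, the addresses and the ports below are stand-ins chosen for the example.

```shell
#!/usr/bin/env bash
# Illustrative only: not the hash any particular router uses.
# pick_path SRC_IP SRC_PORT DST_IP DST_PORT NUM_PATHS
# Prints the 0-based index of the path chosen for this flow.
pick_path() {
    local tuple="$1:$2-$3:$4" paths="$5" crc
    # cksum prints "<crc> <byte count>"; keep only the CRC
    crc=$(printf '%s' "$tuple" | cksum | cut -d' ' -f1)
    echo $(( crc % paths ))
}

# The same flow always lands on the same path:
pick_path 198.51.100.7 52903 203.0.113.9 51820 2
pick_path 198.51.100.7 52903 203.0.113.9 51820 2

# A new source port is a new flow and may (or may not) hash to another path:
for port in 40001 40002 40003 40004; do
    echo "port $port -> path $(pick_path 198.51.100.7 "$port" 203.0.113.9 51820 2)"
done
```

Because only the header fields feed the hash, packets of one flow stay in order on one path, while a changed port can move the whole flow elsewhere.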

david · February 19

I can tell, once I get a favorable low-latency wireguard connection, it usually holds onto it for an extended time. But sometimes, it seems to let go of it and switch to a worse route. Restarting wireguard, sometimes multiple times, I can reacquire the premium route.

host_c · February 19

@david

ECMP has its benefits, but a lot of websites do not like it, mostly financial sites ( banks, paypal, stripe, and so on )

We use it for customers that have the same link speed contracted from 2 different ISPs and have no BGP, and we pin the financial department to only 1 link, with fail-over to the other.

for torrent, oh boy, you can download at 2Gbps on 2 x 1 Gbps lines easily.

It will work flawlessly in 90% of the cases.

tentor · February 19

@host_c said: a lot of websites do not like it, mostly financial sites ( banks, paypal, stripe, and so on )

Can you elaborate more about this? I am genuinely interested in why financial sites do not like ECMP, given that the communication with them happens mostly over TCP.

host_c · February 19

@tentor said:

@host_c said: a lot of websites do not like it, mostly financial sites ( banks, paypal, stripe, and so on )

Can you elaborate more about this? I am genuinely interested in why financial sites do not like ECMP, given that the communication with them happens mostly over TCP.

Sure:

ECMP load-balances across output interfaces, and what it balances is connections, not traffic. Each new TCP connection from a source gets assigned to one of the interfaces ECMP is set up on.

For the sake of the example, let's assume you have wan1 and wan2 as public interfaces with different sub-nets + NAT from Inside network to the outside.

So from your internal PC, 192.168.8.5, when you access PayPal.com, your connection will go thru wan1, and it will stay on wan1 as long as there is no new TCP connection from your PC.

Depending on the vendor of the router, or how it is set up, when the connection table gets flushed ( it gets flushed from time to time ) you have a 50% chance the new connection will go thru wan2 instead of the wan1 you were originally connected thru, so the public IP that accessed the webpage will change, and you will either be disconnected/logged out from the website or get "CSRF token expired" or some other HTTP/S connection related error, as you "present" yourself with a new IP and a good "ticket".

What we saw, and others also battled with, is that financial websites ( most of them ) really do not like this, which is why, in these scenarios, we made separate rules for this group of users to always go thru a specific wan, and fall back to the other one only if it fails.

Hope I made myself understandable.

tentor · February 19

@host_c said: Depending on the vendor of the router, or how it is set up, when the connection table gets flushed ( it gets flushed from time to time ) you have a 50% chance the new connection will go thru wan2 instead of the wan1 you were originally connected thru, so the public IP that accessed the webpage will change, and you will either be disconnected/logged out from the website or get "CSRF token expired" or some other HTTP/S connection related error, as you "present" yourself with a new IP and a good "ticket".

I believe we are talking about different cases. When implemented by an ISP, ECMP will not alter the source IP address; it only affects the next hop used, so there is no noticeable difference for the resource unless the paths have different MTUs (which should not be the case either).

host_c · February 19

Yes, @tentor , I was referring to home/business use of ECMP ( where you have NAT ) as OP was talking about a home-use situation.

With ECMP + NAT, the problem is the one I described above. It is actually not a problem, rather a limitation/side-effect of the setup.

We have mostly dropped ECMP + NAT at our customers by now and gone to FAIL-OVER setups: 1 fast link and 1 slower link from another provider.

tentor · February 19

@host_c said:
Yes, @tentor , I was referring to home/business use of ECMP ( where you have NAT ) as OP was talking about a home-use situation.

With ECMP + NAT, the problem is the one I described above. It is actually not a problem, rather a limitation/side-effect of the setup.

We have mostly dropped ECMP + NAT at our customers by now and gone to FAIL-OVER setups: 1 fast link and 1 slower link from another provider.

You should've mentioned that " + NAT" part in your first message :D

Thanks for the detailed explanation btw.

host_c · February 19

@tentor said: You should've mentioned that " + NAT" part in your first message

I was writing from my phone, and well,

david · February 19

In my case, I just notice that the route from my server to my home varies. Sometimes it's routed through NTT and sometimes through Twelve99/Telia/Arelion. One of those is superior, lower latency, better speed, and the other is a bit crappy at times.

host_c · February 19

@david said:
In my case, I just notice that the route from my server to my home varies. Sometimes it's routed through NTT and sometimes through Twelve99/Telia/Arelion. One of those is superior, lower latency, better speed, and the other is a bit crappy at times.

It might be some NTT / Telia / Cogent drama going on. I remember reading something about it here on the forum.

SRY for making a mess of your thread with @tentor , but I really wanted to point out a weakness/side effect of ECMP with NAT, so others don't waste time if they encounter a situation like the one described.

@david said: And is there a way to get it to prefer one route over the other?

As a home user, no. As an ISP with 2 Net connections and 2 BGP sessions, you might, but if the problem is at the 3rd or a later ASN, mehhhh, difficult.

vsys_host · February 23

Since ECMP usually load-balances on a hash that includes the source and destination ports, and wireguard uses a random port on the client side, you can write a script that tests the latency after wireguard starts and restarts wireguard if the latency is higher than some value.

totally_not_banned · February 23

I sometimes wonder what would happen if one were to do some kind of poor man's bonding where everything gets routed into a VPN, with iptables simply sending the resulting even-numbered UDP packets over link A and odd-numbered ones over link B.

I'm not sure if wireguard/OpenVPN would care about the mismatch in source IPs on the server side but even if they do that could be fixed pretty easily by iptables rewriting everything to a common source (it would have to do more or less the same flipflopping for destination addresses on outgoing VPN packets as the client does with external interfaces anyways).

I figure such a solution would likely suffer a lot if the latencies of both links aren't more or less even (lots of massively out of order packets) but beyond that it would sidestep any kind of external dislike as everything gets unified at the VPN exit and no outside party could detect that packets originally traveled over different links.

david · February 23

@vsys_host said: Since ECMP usually load-balances on a hash that includes the source and destination ports, and wireguard uses a random port on the client side, you can write a script that tests the latency after wireguard starts and restarts wireguard if the latency is higher than some value.

Yes, this is what I've done for my pi servers at home. A script monitors the latency and restarts wireguard if it gets too high. The down side, at least for the server running asterisk, is that the audio in a phone call drops out for a second while wireguard is restarting. So you don't really want to restart it until it's bad enough to cause dropped packets and poor call quality.
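A watchdog along these lines might look like the sketch below. It is not david's actual script: the peer address, the 80 ms threshold and the wg-quick@wg0 unit name are assumptions, and the parser expects the summary line printed by Linux iputils ping.

```shell
#!/usr/bin/env bash
# Hypothetical values; adjust for your own tunnel.
PEER="${PEER:-10.0.0.1}"      # far end of the tunnel (assumed address)
LIMIT_MS="${LIMIT_MS:-80}"    # restart threshold in milliseconds (assumed)

# avg_rtt_ms: read ping output on stdin, print the average RTT in ms.
# Expects the iputils summary line, e.g.
#   rtt min/avg/max/mdev = 10.1/12.3/15.0/1.2 ms
avg_rtt_ms() {
    awk -F'/' '/^(rtt|round-trip)/ { print $5 }'
}

# too_slow AVG LIMIT: succeeds (exit status 0) if AVG is greater than LIMIT.
too_slow() {
    awk -v avg="$1" -v limit="$2" 'BEGIN { exit !(avg + 0 > limit + 0) }'
}

# check_once: measure latency and restart the tunnel if it is too high.
check_once() {
    avg=$(ping -c 5 -q "$PEER" | avg_rtt_ms)
    # No reply at all (empty result) also counts as bad.
    if [ -z "$avg" ] || too_slow "$avg" "$LIMIT_MS"; then
        systemctl restart wg-quick@wg0   # assumes a wg-quick managed tunnel
    fi
}
```

Calling check_once from cron or a sleep loop is enough. To avoid cutting calls short for a marginal gain, the threshold can be set well above the "good route" latency, as described above.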

Neoon · February 23

@david said:

@vsys_host said: Since ECMP usually load-balances on a hash that includes the source and destination ports, and wireguard uses a random port on the client side, you can write a script that tests the latency after wireguard starts and restarts wireguard if the latency is higher than some value.

Yes, this is what I've done for my pi servers at home. A script monitors the latency and restarts wireguard if it gets too high. The down side, at least for the server running asterisk, is that the audio in a phone call drops out for a second while wireguard is restarting. So you don't really want to restart it until it's bad enough to cause dropped packets and poor call quality.

You should probably have multiple links running and route traffic always over the best available link. There won't be any interruption and you could restart your links which are currently not used.
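As a rough sketch of that idea, assuming two tunnels to the same server already exist: probe each one, pick the lowest latency, and point the route at it. The interface names wg0/wg1, the peer addresses and the 10.10.0.0/24 prefix are hypothetical, and this is not how wg-mesh itself works.

```shell
#!/usr/bin/env bash
# best_link: read "interface rtt_ms" pairs on stdin and print the interface
# with the lowest RTT. Lines without a numeric RTT (dead links) are skipped.
best_link() {
    awk '$2 ~ /^[0-9.]+$/ && (best == "" || $2 + 0 < min) { min = $2 + 0; best = $1 }
         END { if (best != "") print best }'
}

# probe_links: ping the far end through each tunnel, emit "interface rtt_ms".
# Interface names and peer addresses are placeholders.
probe_links() {
    while read -r ifname peer; do
        rtt=$(ping -c 3 -q -I "$ifname" "$peer" 2>/dev/null </dev/null |
              awk -F'/' '/^rtt/ { print $5 }')
        echo "$ifname ${rtt:-down}"
    done <<EOF
wg0 10.0.0.1
wg1 10.0.1.1
EOF
}

# switch_to IFACE: send the (hypothetical) remote prefix over the chosen tunnel.
switch_to() {
    ip route replace 10.10.0.0/24 dev "$1"
}
```

A periodic job could then run `best=$(probe_links | best_link)` and call `switch_to "$best"` only when the result is non-empty. The tunnel that is not carrying traffic can be restarted freely until it lands on a good path.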

david · February 23

@Neoon said: You should probably have multiple links running and route traffic always over the best available link. There won't be any interruption and you could restart your links which are currently not used.

That's a really good idea. I hadn't thought of that.

david · February 23

I found another solution, instead of restarting wireguard, that will switch to a new route with minimal disruption. I'm now using "wg syncconf" to change the ListenPort. The different port is enough to get a different route.

On a phone call, I notice 3-4 dropped audio (RTP) packets (20ms each), which is barely perceptible.

The script first runs:

wg showconf wg0 > /blah/wireguard.conf

The configuration in this file is not exactly the same as the wg0.conf in /etc/wireguard. It contains the current, active ListenPort, for example:

[Interface]
ListenPort = 52903

Then it changes the ListenPort to 0, which seems to cause wireguard to choose a new, unused port.

sed -i.bak 's/^ListenPort = .*/ListenPort = 0/' /blah/wireguard.conf

Then, execute the sync:

wg syncconf wg0 /blah/wireguard.conf

Recheck the latency, and repeat if needed until it gets a good route.
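Put together, the steps above could be wrapped up roughly like this. The commands are the ones from the post; the retry limit, the 2-second pause and the ping-based latency check are assumptions added for the sketch.

```shell
#!/usr/bin/env bash
CONF="${CONF:-/blah/wireguard.conf}"   # scratch path used in the post above
IFACE="${IFACE:-wg0}"

# reset_listen_port FILE: rewrite the ListenPort line to 0 (keeps FILE.bak).
reset_listen_port() {
    sed -i.bak 's/^ListenPort = .*/ListenPort = 0/' "$1"
}

# rotate_port: dump the live config, zero the port, and sync it back.
# showconf output includes the private key, so keep the file private.
rotate_port() {
    ( umask 077; wg showconf "$IFACE" > "$CONF" ) &&
    reset_listen_port "$CONF" &&
    wg syncconf "$IFACE" "$CONF"
}

# find_good_route LIMIT_MS PEER [TRIES]: rotate the port until the average RTT
# to PEER is at or below LIMIT_MS, giving up after TRIES rotations (default 5).
find_good_route() {
    limit="$1"; peer="$2"; tries="${3:-5}"
    while [ "$tries" -gt 0 ]; do
        avg=$(ping -c 5 -q "$peer" | awk -F'/' '/^rtt/ { print $5 }')
        if [ -n "$avg" ] &&
           awk -v a="$avg" -v l="$limit" 'BEGIN { exit !(a + 0 <= l + 0) }'; then
            return 0
        fi
        rotate_port || return 1
        sleep 2   # arbitrary pause to let the tunnel settle on the new port
        tries=$((tries - 1))
    done
    return 1
}
```

For example, `find_good_route 60 10.0.0.1` would keep rotating until the tunnel peer (a placeholder address here) answers in 60 ms or less on average, or five attempts have been used up.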

Neoon · February 23

I wrote a tool to debug port load balancing for Wireguard.
https://github.com/Ne00n/PLLP

Basically it brings a wg tunnel up and down a few times and prints the result with the best port and best ping.

david · February 23

Interesting. I'm not convinced the same port will always be routed the same, consistently. Sometimes a good connection, with presumably no port change, just gets re-routed and needs to be reset.

But the port does seem to factor into it, perhaps for no other reason than the load balancer is detecting a new connection.
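One way to test that doubt is to trace with both UDP ports pinned, so every probe belongs to a single flow, and then repeat the trace later or with other source ports. A rough sketch, assuming an mtr build that supports -L/--localport and -P/--port; the host name and port numbers are placeholders. The probes are not real wireguard packets, so routers along the way may still treat them differently, and the return path is hashed separately.

```shell
#!/usr/bin/env bash
# trace_flow LOCALPORT REMOTEPORT HOST: run an mtr report with both UDP ports
# fixed. Set DRY_RUN=1 to print the command instead of running it.
trace_flow() {
    ${DRY_RUN:+echo} mtr -u -z -n -r -c 10 -L "$1" -P "$2" "$3"
}

# compare_ports HOST PORT...: trace the same destination from several source
# ports to see whether they take different paths. 51820 is a placeholder.
compare_ports() {
    host="$1"; shift
    for port in "$@"; do
        echo "== source port $port =="
        trace_flow "$port" 51820 "$host"
    done
}
```

Running `compare_ports home.example.net 40001 40002 40003` a few times a day and comparing the AS columns would show whether a given port keeps the same carrier or gets moved without any change on your side.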
