Cannot get OVH to respond to terrible network throughput

ralf · June 2022

@OVHcloud_james @OVH_Matt @OVH_UK - sorry for tagging you all, not sure if any of you are able to look into my ticket and advise if there's any way of escalating this. The ticket number is 5700865.

I grabbed a second KS-1 on Monday evening that was finally provisioned yesterday afternoon, to upgrade my 9-year-old KS-1 that is the same price but with half the RAM and half the disk.

It soon became apparent that there's a networking issue - throughput to EU is fine, but inbound traffic from the US is abysmal. Hitting a maximum of 30Mbps even using 8 parallel connections in iperf3. With just one, it's about 20Mbps.

I've been back and forth on a support ticket, and they keep saying it's absolutely fine because connectivity to the closest iperf server is OK, but even using OVH's US and Canadian iperf servers shows abysmal performance, when my old KS-1 in the same data centre on a different subnet gets the full 90+Mbps.

The person answering the ticket seems to be being deliberately obtuse, and I'm not sure how to deal with this. I think at this point, I'd be happy to cut my losses and just cancel entirely, but I'm not sure if that's possible or whether I can get a refund (the server was provisioned less than 24 hours ago). Ultimately, I'd prefer the networking issue to just be resolved.

This is the kind of throughput I'm seeing, but it's just as bad to BHS or Clouvider or my own server in the US, and also to Singapore. Basically, anywhere that isn't Europe is terrible, but only on this one machine, my other KS-1 is fine:

goodmachine# date && iperf3 -c hil.proof.ovh.us -i 2 -t 20 -P 5 -4 -R|grep SUM
Wed 15 Jun 11:58:31 BST 2022
[SUM]   0.00-2.00   sec  13.0 MBytes  54.4 Mbits/sec
[SUM]   2.00-4.00   sec  20.5 MBytes  86.0 Mbits/sec
[SUM]   4.00-6.00   sec  22.2 MBytes  93.2 Mbits/sec
[SUM]   6.00-8.00   sec  22.4 MBytes  94.1 Mbits/sec
[SUM]   8.00-10.00  sec  22.4 MBytes  94.1 Mbits/sec
[SUM]  10.00-12.00  sec  21.7 MBytes  90.9 Mbits/sec
[SUM]  12.00-14.00  sec  22.2 MBytes  93.0 Mbits/sec
[SUM]  14.00-16.00  sec  22.4 MBytes  94.1 Mbits/sec
[SUM]  16.00-18.00  sec  22.4 MBytes  94.1 Mbits/sec
[SUM]  18.00-20.00  sec  22.4 MBytes  94.1 Mbits/sec
[SUM]   0.00-20.14  sec   222 MBytes  92.4 Mbits/sec  322             sender
[SUM]   0.00-20.00  sec   212 MBytes  88.8 Mbits/sec                  receiver

root@rescue:~# date && iperf3 -c hil.proof.ovh.us -i 2 -t 20 -P 5 -4 -R|grep SUM
Wed Jun 15 12:58:56 CEST 2022
[SUM]   0.00-2.00   sec  2.05 MBytes  8.61 Mbits/sec
[SUM]   2.00-4.00   sec  2.51 MBytes  10.5 Mbits/sec
[SUM]   4.00-6.00   sec  2.97 MBytes  12.5 Mbits/sec
[SUM]   6.00-8.00   sec  4.12 MBytes  17.3 Mbits/sec
[SUM]   8.00-10.00  sec  4.58 MBytes  19.2 Mbits/sec
[SUM]  10.00-12.00  sec  4.44 MBytes  18.6 Mbits/sec
[SUM]  12.00-14.00  sec  4.48 MBytes  18.8 Mbits/sec
[SUM]  14.00-16.00  sec  4.12 MBytes  17.3 Mbits/sec
[SUM]  16.00-18.00  sec  4.51 MBytes  18.9 Mbits/sec
[SUM]  18.00-20.00  sec  4.03 MBytes  16.9 Mbits/sec
[SUM]   0.00-20.00  sec  40.2 MBytes  16.9 Mbits/sec  137             sender
[SUM]   0.00-20.00  sec  38.2 MBytes  16.0 Mbits/sec                  receiver
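
For anyone who wants to compare runs like the two above numerically rather than by eye, here is a minimal Python sketch. The function name `receiver_mbps` is made up for illustration, and it assumes iperf3 printed the rate in Mbits/sec as it did here; the two sample lines are the receiver summaries copied from the runs above.

```python
import re

# Matches iperf3 "[SUM]" lines, assuming the rate is printed in Mbits/sec
# (iperf3 can pick other units at other speeds; such lines are skipped).
SUM_LINE = re.compile(
    r"^\[SUM\]\s+[\d.]+-[\d.]+\s+sec\s+[\d.]+\s+\w+\s+"
    r"(?P<rate>[\d.]+)\s+Mbits/sec(?:\s+\d+)?\s*(?P<role>sender|receiver)?$"
)


def receiver_mbps(iperf_output: str) -> float:
    """Return the rate from the '[SUM] ... receiver' summary line."""
    for line in iperf_output.splitlines():
        match = SUM_LINE.match(line.strip())
        if match and match.group("role") == "receiver":
            return float(match.group("rate"))
    raise ValueError("no [SUM] receiver line found")


# Receiver summary lines copied from the two runs above.
GOOD = "[SUM]   0.00-20.00  sec   212 MBytes  88.8 Mbits/sec                  receiver"
BAD = "[SUM]   0.00-20.00  sec  38.2 MBytes  16.0 Mbits/sec                  receiver"

ratio = receiver_mbps(GOOD) / receiver_mbps(BAD)
print(f"old KS-1 receives {ratio:.2f}x faster than the new one")
```

Only the receiver line is used because that is what actually arrived at the machine under test; the sender figure includes data still in flight when the test stopped.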

ralf · June 2022

Tagging @OVH_APAC because while it's a different org and they probably can't do anything, I just realised that the 3 other accounts haven't been on LET for a very long time.

dahartigan · June 2022

Tagging @jesus because you need some love

MikeA · June 2022

I had a similar issue with an OVH server (not a cheap one) and it took 2 weeks for them to actually do anything. In the end they swapped the motherboard completely to fix it, but I assume the motherboard itself wasn't at fault and that it was related to the NIC/MAC address of the onboard ethernet, since OVH has issues with their vmac system and I had been moving vmac/IPs around when this started happening. Maybe it's not related to your case and just a coincidence, but good luck. It's a PITA to fix issues like this with them. Since yours is a server with no IPs I wouldn't think it would be related, but it seems awfully similar: the speed issues were exactly the same, and I couldn't get anything over 20Mbps or so outside of the city the DC was in (it seemed like only peered/directly connected networks were fast).

Smith42 · June 2022

I had 40 servers with bandwidth issues at OVH (each costing $120+) and it took them weeks to acknowledge and months to fix. I also pay extra monthly for premium support. I also got no compensation for the time period the servers under-performed.

I like your patience in the matter.

My opinion, cancel the KS-1 and let it go to another happy soul.

ralf · June 2022

Yeah, it's hard to know what the problem is, as I seem to have full speed even to external systems within Europe. It's just that throughput is failing when going from OVH non-EU to OVH EU. I'm pretty convinced this can only be because they rate-limited someone in the past for abuse.

I've said in the ticket I want to cancel and have a refund as it's still less than 24 hours since it was provisioned, and actually the ticket was opened less than 12 hours after it was provisioned. I'm just kind of cheesed off if I end up having to pay a month AND setup fee for this piece of trash for only a couple of hours use, when if it actually performed correctly it'd be fine.

It's also been stuck in rescue mode for the last 7 hours because they insisted I did that, but nobody has logged into it apart from me.

Since saying I want to cancel, the ticket has at least been updated to say it's been routed to the internal team, but there's no evidence of anyone actually doing anything.

TimboJones · June 2022

@ralf said: I've been back and forth on a support ticket, and they keep saying it's absolutely fine because connectivity to the closest iperf server is OK

If they can get full 100Mbps inbound into their datacenter, I can understand their position as "not our problem". Can you iperf3 between your two servers and observe the problem? That would be a slam dunk confirmation.

ralf · June 2022

You also are misunderstanding the problem. The problem only occurs on connectivity between EU and non-EU OVH networks. I get good performance to other EU networks, so obviously, I get great performance between the two local servers, they're only 2 hops and 0.5ms from each other within the same datacentre. And yes, I have tested that.

The issue is simply that from the new server to ANYWHERE outside the EU, the inbound averages about 20Mbps, sometimes peaks at 25Mbps and occasionally hits high 20s. I have been running an iperf test that lasts a few minutes every hour or so for the last 22 hours since they asked me to put it in rescue mode (but still they haven't attempted to access it). The results are consistent, regardless of time of day.

Outbound is not experiencing these problems, I can happily push at the expected speed, it's just not symmetric. However, as my use case for this extra KS-1 is backup, I need the advertised bandwidth.

Contrast that to the other KS-1 a few racks away. I've had it 9 years, and I've never experienced any significant reduction in bandwidth that entire time. Nor now. It's using the full 100Mbps fine.

There is a networking issue. It is a problem within OVH's network. It's not a temporary outage, or because of transitory issues, it's a constant reduction of service. It doesn't really matter if I can reach EU servers at full speed, there is still a problem with this machine or OVH's internal network configuration for this subnet.

PulsedMedia · June 2022

Do an MTR both ways, 1000+ packets and submit that to OVH.

ralf · June 2022

@PulsedMedia said:
Do an MTR both ways, 1000+ packets and submit that to OVH.

As they still haven't bothered logging into the box 24 hours after they asked me to boot it into rescue mode, at 9am this morning I knocked up a script that runs iperf to that server for 60 seconds, waits 30 seconds and repeats. All logged.

So far, I've seen 1 instance that actually hit 30Mbps, although that makes it sound better than it is - I think fewer than 5 were over 25Mbps. The average is a bit under 20Mbps.

When it hit 24 hours since their "internal team" started investigating, I attached a text summary of the 6 hours that it had captured. Still no response, so I'll just keep it going until they actually look at it.
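
A loop like the one described above (60 seconds of iperf, 30 seconds of waiting, everything logged) could be sketched in Python roughly as follows. This is not the actual script; the function names, the log format and the default of 5 parallel streams are assumptions for illustration. The iperf3 flags are the same ones used in the runs earlier in the thread (`-c`, `-t`, `-P`, `-4`, `-R`).

```python
import subprocess
import time
from datetime import datetime, timezone


def iperf_command(host: str, seconds: int = 60, streams: int = 5) -> list[str]:
    """Build an iperf3 client command for a reverse (inbound) IPv4 test."""
    return ["iperf3", "-c", host, "-t", str(seconds), "-P", str(streams), "-4", "-R"]


def log_runs(host, log_path, runs, pause=30, run=subprocess.run, sleep=time.sleep):
    """Run the test `runs` times, appending timestamped output to log_path.

    `run` and `sleep` are injectable so the loop can be exercised without
    a real iperf3 server or real waiting.
    """
    with open(log_path, "a") as log:
        for i in range(runs):
            result = run(iperf_command(host), capture_output=True, text=True)
            stamp = datetime.now(timezone.utc).isoformat()
            log.write(f"=== {stamp} exit={result.returncode}\n{result.stdout}\n")
            log.flush()  # keep the log usable if the loop is interrupted
            if i < runs - 1:
                sleep(pause)
```

Calling something like `log_runs("hil.proof.ovh.us", "iperf.log", runs=240)` would cover about six hours at 90 seconds per cycle. Logging the exit code matters because public iperf3 servers often refuse a connection when they are busy, and those failed runs should not be mistaken for slow ones.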

PulsedMedia · June 2022

@ralf said:

@PulsedMedia said:
Do an MTR both ways, 1000+ packets and submit that to OVH.

As they still haven't bothered logging into the box 24 hours after they asked me to boot it into rescue mode, at 9am this morning I knocked up a script that runs iperf to that server for 60 seconds, waits 30 seconds and repeats. All logged.

So far, I've seen 1 instance that actually hit 30Mbps, although that makes it sound better than it is - I think fewer than 5 were over 25Mbps. The average is a bit under 20Mbps.

When it hit 24 hours since their "internal team" started investigating, I attached a text summary of the 6 hours that it had captured. Still no response, so I'll just keep it going until they actually look at it.

Do an MTR both ways, 1000+ packets and submit that to OVH.

Arkas · June 2022

I bet if you had a proper OVH dedi that you paid $xxx for, they would reply. I've had relatively few problems with my KS-1 over the many years, but never had any support from them; they never bothered replying. Now on the larger dedi I used to have, that was an entirely different story.

ralf · June 2022

@PulsedMedia said:
Do an MTR both ways, 1000+ packets and submit that to OVH.

I'm trying out mtr. I'd be surprised if it shows up much, as it's just pings and the routing is pretty similar (although, interestingly, not identical: e.g. the route taken by packets from the new machine that has reduced bandwidth is about 2ms faster).

The only anomaly so far is from a non-OVH server in the US to both my KS-1s: there's one hop that's apparently got 86% packet loss to both servers. I'm going to try another few external servers because it sounds impossibly high, so I can only assume that it's just not responding to pings very often but still able to pass them through. In the other direction I saw 0 packet loss.

ralf · June 2022

Actually, this is quite interesting... The outbound routes, which are fine from both KS-1s, get routed to Cogent in Paris, and mostly stay within Cogent all the way to their destination.

The return path from LA to France starts in Cogent but is then routed to San Jose and changes network there, maybe to OVH. It does seem to be OVH, as when I google some of these names from rDNS they belong to OVH, but the format is all host.state.us etc.

The one with packet loss is nyc-nc1-sbb1-8k-nj.us

I think I'll have more of a look at this tomorrow, but it seems that inbound traffic to OVH from everywhere is sucked onto their network quickly and there's that one router that seems to have a lot of packet loss. That wouldn't explain why one server has slowdowns but not the other, but mtr from outside into OVH doesn't receive pings from any of the intermediate hosts.

Anyway, good recommendation!

PulsedMedia · June 2022

@ralf said:

@PulsedMedia said:
Do an MTR both ways, 1000+ packets and submit that to OVH.

I'm trying out mtr. I'd be surprised if it shows up much, as it's just pings and the routing is pretty similar (although, interestingly, not identical: e.g. the route taken by packets from the new machine that has reduced bandwidth is about 2ms faster).

The only anomaly so far is from a non-OVH server in the US to both my KS-1s: there's one hop that's apparently got 86% packet loss to both servers. I'm going to try another few external servers because it sounds impossibly high, so I can only assume that it's just not responding to pings very often but still able to pass them through. In the other direction I saw 0 packet loss.

So you had no idea even what MTR is? Yet you were bashing OVH for not giving your 5€/month (or was ks-1 less?) dedi concierge support with magic crystal balls etc.?

The onus is on you to give enough evidence of an issue for the staff to escalate / work on it for the fix. It is always with you. No evidence of issue? There is no issue.

Mind you; This is from the perspective of someone who is on the other side of aisle, and it might sound irked, but that's just the reality; No evidence -> Nothing can be done.

9/10 of these types of complaints is just some maintenance somewhere you cannot do anything about, and it goes away in a matter of days and absolutely no action should be taken as it's a 3rd party non-widespread issue.

So as someone from the other side of the aisle; Take a breather, give it a few days. Then gather evidence in the form a technician can do something about; MTR both ways (absolutely critical it is both ways) with comparisons, and formatted easy to understand. (bullet points and whatnot)

host nyc-nc1-sbb1-8k-nj.us
Host nyc-nc1-sbb1-8k-nj.us not found: 3(NXDOMAIN)

That is not evidence.

ralf · June 2022

@PulsedMedia said:
So you had no idea even what MTR is? Yet you were bashing OVH for not giving your 5€/month (or was ks-1 less?) dedi concierge support with magic crystal balls etc.?

Did you get out of bed on the wrong side today or something? After being helpful before, you've suddenly turned pretty rude.

No, I hadn't ever used mtr before. I've been using ping and traceroute since the mid 90s, but I'd never come across mtr.

@PulsedMedia said: The onus is on you to give enough evidence of an issue for the staff to escalate / work on it for the fix. It is always with you. No evidence of issue? There is no issue.

I also said that I'll continue looking into this tomorrow, because it's now late evening here and I'm sick of computers for one day.

And FWIW, I have evidence of an issue. I can clearly see that there is an issue with sustained throughput between the OVH networks, but I don't know where on the network.

I'm also fairly hamstrung on this - I only have OVH servers in France, 2 in GRA, 1 in RBG. So, realistically, I can only go one way using mtr to test the OVH network. However, I can reproduce the problem when using iperf from OVH CA to OVH France.

@PulsedMedia said: Mind you; This is from the perspective of someone who is on the other side of aisle, and it might sound irked, but that's just the reality; No evidence -> Nothing can be done.

I get that, but you're now not helping and being pretty rude as well, just because I happened not to have come across one tool that you perhaps use every day. For that, I can only humbly beg your forgiveness.

@PulsedMedia said: 9/10 of these types of complaints is just some maintenance somewhere you cannot do anything about, and it goes away in a matter of days and absolutely no action should be taken as it's a 3rd party non-widespread issue.

That's not solving the issue. That's crossing your fingers and hoping for the best.

@PulsedMedia said: So as someone from the other side of the aisle; Take a breather, give it a few days. Then gather evidence in the form a technician can do something about; MTR both ways (absolutely critical it is both ways) with comparisons, and formatted easy to understand. (bullet points and whatnot)

Yes, I will do some more digging. For now, I have extensive iperf evidence that shows a throughput issue, and in fact I opened my support ticket with this, to which they responded (clearly not having even read it all) with "we can't do anything without an iperf log".

@PulsedMedia said:
host nyc-nc1-sbb1-8k-nj.us
Host nyc-nc1-sbb1-8k-nj.us not found: 3(NXDOMAIN)

That is not evidence.

That's also not something I wrote.

I said these domains appeared to be OVH internal as the rDNS was giving back things in the form of host.state.country, and when I googled them, all the results came back with OVH status pages. However, that is still speculation at this point, because the rDNS is basically returning invalid data that can't be resolved back to an IP. Sure, I can run mtr without rDNS tomorrow, but like I said, that can wait until tomorrow now.

But regarding the "maybe something will fix itself in a few days" attitude: sure, "maybe" it will. But just as likely, if nobody complains, it won't.

And this doesn't explain why, of two machines sat in the same datacenter, with supposedly the same hardware (except RAM and disk), that differ only by subnet and are both one router away from otherwise identical routing, only one is having network issues somewhere else in OVH's network. I have logs for over 24 hours now demonstrating the issue, and equally I've had the other OVH machine in the same datacenter for 9 years and NEVER seen any significant loss of throughput EVER.

Yes, I will continue gathering data for another day (because the provider last replied to the ticket over 34 hours ago, and hasn't even logged onto the machine that they asked me to leave in rescue mode to verify my claim, which could be verified by running a 10s iperf command).

But frankly, it's already wasted too much of my time, so it's not getting any more than that. It's simply more cost effective for me to write off the set-up cost and monthly fee within the first day of use, and move to another provider. That doesn't mean that I'm not pissed off though, because I know OVH can provide a good service, because I've been experiencing it for the last 9 years.

As you can see, I'm the kind of person who keeps around an old machine for 9 years (bear in mind, the replacement KS-1 from at least 3 years ago is the same price with double RAM and disk, and for months now they've come with 4x the disk), so it's actually kind of in their interest to sort out an actual genuine problem on their network, or it will simply be an abrupt end to my business with them. I get it, I'm just one person with 3 machines, but if they don't actually resolve the issue then sooner or later, they'll lose other people too.

Also worth noting is that I've never had to contact support for anything in all those 9 years. I'm not exactly a difficult customer. Soon to be ex-customer at this rate.

But yeah, thanks for being helpful for suggesting to use a tool I'd not heard of, and then turning rude when I said thanks. Hope you have a wonderful day too.

ralf · June 2022

</PMS>

rm_ · June 2022

@ralf said: I said these domains appeared to be OVH internal as the rDNS was giving back things in the form of host.state.country, and when I googled them, all the results came back with OVH status pages. However, that is still speculation at this point, because the rDNS is basically returning invalid data that can't be resolved back to an IP. Sure, I can run mtr without rDNS tomorrow, but like I said, that can wait until tomorrow now.

Press N in mtr and it shows you the IPs instead of hostnames. Then you can just whois the IP on the same console, no need to google for the rDNS or even open a browser at all. And yes OVH is known to use these fake domains with country and state codes, where they are not even close to owning the actual domain. A more clueful approach from them would be to have everything either on .ovh.net, or under the .ovh TLD (which they own anyways) and have it resolve properly back and forth.

Btw another great option to use is mtr --aslookup.
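
To tie the mtr suggestions in this thread together (1000+ packets, IPs instead of hostnames, AS lookup), here is a small Python sketch that builds a report-mode command line. The function name and defaults are hypothetical. The options used (`--report`, `--report-wide`, `--report-cycles`, `--no-dns`, `--aslookup`) are standard mtr flags, though `--aslookup` may be missing depending on how your mtr was built.

```python
import shlex


def mtr_report_command(target: str, cycles: int = 1000, resolve_names: bool = False) -> list[str]:
    """Build an mtr command that sends `cycles` probes and prints a text report."""
    cmd = ["mtr", "--report", "--report-wide", "--report-cycles", str(cycles), "--aslookup"]
    if not resolve_names:
        # Show raw IPs so odd rDNS names cannot confuse the reader.
        cmd.append("--no-dns")
    cmd.append(target)
    return cmd


# 192.0.2.1 is a documentation address; substitute the real far end.
print(shlex.join(mtr_report_command("192.0.2.1")))
```

Running the printed command once from each end of the path gives the "both ways" pair of reports to attach to a ticket.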

yoursunny · June 2022

When I grow up I don't wanna be like Objectively Very Horrifying.

rm_ · June 2022

@PulsedMedia said: The onus is on you to give enough evidence of an issue for the staff to escalate / work on it for the fix. It is always with you. No evidence of issue? There is no issue.

To be fair, the evidence is the consistently low values of iperf (measured over time, periodically, from any location tried), i.e. the actual data transfer speeds were demonstrated to be much lower than expected. Anything further is OVH's job (to figure out and fix); the customer shouldn't be expected to be a networking expert, or to explain to OVH how their own network operates and where the issue is. (Although that could probably help ._.)

Arkas · June 2022

@yoursunny said: When I grow up I don't wanna be like Objectively Very Horrifying.

But don't you want to be One Very Hot member?

Peppery9 · June 2022

@ralf said:
The one with packet loss is nyc-nc1-sbb1-8k-nj.us

I think I'll have more of a look at this tomorrow, but it seems that inbound traffic to OVH from everywhere is sucked onto their network quickly and there's that one router that seems to have a lot of packet loss. That wouldn't explain why one server has slowdowns but not the other, but mtr from outside into OVH doesn't receive pings from any of the intermediate hosts.

This is a red herring, unfortunately. If you're only seeing dropped packets at that one router (and not continuing further along the chain), there is no problem. Routers/switches use hardware for packet forwarding, however ICMP is usually handled by the CPU and is very low priority for the router to respond to.

Highly recommend flicking through this presentation, which describes this in more detail on slides 32-36.
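
The rule of thumb above (loss at one hop only counts if it carries on through every later hop) can be written down as a short Python sketch. The function name, the 1% threshold and the example numbers other than the 86% figure from this thread are made up for illustration, and hops that never answer at all are not treated specially here.

```python
from typing import Optional


def first_persistent_loss_hop(loss_percent: list[float], threshold: float = 1.0) -> Optional[int]:
    """Return the 0-based index of the first hop from which loss persists
    all the way to the final hop, or None if the final hop is clean.
    """
    start = None
    # Walk backwards from the destination while hops still show loss.
    for i in range(len(loss_percent) - 1, -1, -1):
        if loss_percent[i] >= threshold:
            start = i
        else:
            break
    return start


# One noisy router in the middle, clean destination: a red herring.
print(first_persistent_loss_hop([0.0, 0.0, 86.0, 0.0, 0.0]))
# Loss that starts at a hop and continues to the end: worth reporting.
print(first_persistent_loss_hop([0.0, 0.0, 5.0, 6.0, 5.0]))
```

The first call returns None because the destination itself shows no loss, which matches the case described in this thread; the second returns 2, the hop where the persistent loss begins.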

ralf · June 2022

A happy ending: OVH have confirmed there is a problem with the network and refunded the server.

PulsedMedia · June 2022

@ralf said: No, I hadn't ever used mtr before. I've been using ping and traceroute since the mid 90s, but I'd never come across mtr.

MTR is a standard diagnostic tool for routing issues.

You should do both ways and submit that, so the support can immediately see if there is an issue.

Support spending hours trying to guess what your issue might be is not a good use of anyone's time when you know exactly what it is and could provide the basic diagnostics. In fact, you are the only one who can submit the basic diagnostics to start troubleshooting. With networking, it's MTR or nothing.

ralf · June 2022

As I said, I can't do both ways, as I don't have an OVH box outside EU. I can do both ways to a US machine outside the OVH network, but they only want to know about network issues in their own network, which is totally understandable.

But as they've cancelled and refunded the server, I no longer care about the poor network performance as my other OVH servers are fine. It can be someone else's problem.
