Clouvider vps lagging but no indications of the problem

bobert Member
edited February 22 in Help

I got a VPS from Clouvider a while ago. It had no issues until a week ago, when it started lagging during peak hours.

The VPS is very lightly loaded: there's no CPU steal, swapping, iowait or interrupt load to speak of. And there is no packet loss or ping jitter either.

Any ideas what's wrong?

https://i.imgur.com/3XC0eL3.mp4

Comments

  • @bobert said: And there is no packet loss or ping jitter either.

    Packet loss or ping jitter to where? 1.1.1.1 or where you are connecting from?

  • @cmeerw said: Packet loss or ping jitter to where? 1.1.1.1 or where you are connecting from?

    There's 0 packet loss or jitter to anywhere, including my own IP.

  • @bobert said:

    @cmeerw said: Packet loss or ping jitter to where? 1.1.1.1 or where you are connecting from?

    There's 0 packet loss or jitter to anywhere, including my own IP.

    Assuming you are sshing into the server: run tcpdump (restricting to port 22) on the server and on your local machine and then compare the tcpdumps to confirm that all packets sent from the server also show up on your local end.
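
For anyone who wants to try the comparison without lining the dumps up by eye, here is a minimal Python sketch. It assumes both captures were saved as text from something like `tcpdump -nn -S port 22` (`-S` prints absolute sequence numbers so both ends are comparable), that they cover a single SSH connection, and that sequence numbers do not wrap during the capture. The function names are made up for illustration. It compares covered byte ranges rather than individual packets, because segmentation offload can make the server-side capture show larger segments than the client receives.

```python
import re

# "seq START:END" appears on segments that carry data in tcpdump's text output.
SEQ_RE = re.compile(r"\bseq (\d+):(\d+)")


def data_ranges(capture_text, src_port=22):
    """Collect (start, end) sequence ranges of data sent FROM src_port."""
    marker = f".{src_port} > "  # source side of "IP a.b.c.d.22 > w.x.y.z.port:"
    ranges = []
    for line in capture_text.splitlines():
        if marker not in line:
            continue
        match = SEQ_RE.search(line)
        if match:
            ranges.append((int(match.group(1)), int(match.group(2))))
    return ranges


def merge(intervals):
    """Merge overlapping or touching intervals into a sorted list."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def missing_ranges(sent, received):
    """Byte ranges covered by `sent` but not covered by `received`."""
    recv = merge(received)
    gaps = []
    for start, end in merge(sent):
        cursor = start
        for r_start, r_end in recv:
            if r_end <= cursor:
                continue  # entirely before the part we still need
            if r_start >= end:
                break  # entirely after this sent range
            if r_start > cursor:
                gaps.append((cursor, r_start))
            cursor = max(cursor, r_end)
            if cursor >= end:
                break
        if cursor < end:
            gaps.append((cursor, end))
    return gaps


def missing_on_client(server_capture, client_capture, src_port=22):
    """Data the server sent that never shows up in the client capture."""
    return missing_ranges(
        data_ranges(server_capture, src_port),
        data_ranges(client_capture, src_port),
    )
```

An empty result means every byte the server sent was seen on the client at least once. It does not say anything about delay or retransmissions, so timestamps still need a separate look.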

  • @cmeerw said: Assuming you are sshing into the server: run tcpdump (restricting to port 22) on the server and on your local machine and then compare the tcpdumps to confirm that all packets sent from the server also show up on your local end.

    No offense, but having to line up tcpdumps like that sounds like a terrible plan.

    Fortunately the TCP switch on MTR shows the problem. There's massive packet loss and ping spikes when using TCP. Looks like I'm having the same issue as this guy. Kind of weird how Clouvider is prioritizing ICMP.

    https://lowendtalk.com/discussion/202919/clouvider-vps-dropping-traffic-and-connection-timeouts-constantly
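
If mtr is not available, the same TCP-versus-ICMP difference can be checked from the client side by timing TCP handshakes instead of pings. This is only a rough sketch: `tcp_probe` and `summarize` are made-up names, it measures end-to-end connect time rather than per-hop latency, and a refused connection is counted as a failure even though it proves the host is reachable.

```python
import socket
import time


def tcp_probe(host, port, count=10, timeout=2.0, interval=0.2):
    """Time `count` TCP handshakes to host:port.

    Returns a list of connect times in milliseconds; None marks a failure.
    """
    samples = []
    for _ in range(count):
        start = time.perf_counter()
        try:
            with socket.create_connection((host, port), timeout=timeout):
                samples.append((time.perf_counter() - start) * 1000.0)
        except OSError:
            samples.append(None)  # timeout, refusal, or unreachable
        time.sleep(interval)
    return samples


def summarize(samples):
    """Reduce probe samples to loss percentage and latency figures."""
    ok = [s for s in samples if s is not None]
    loss = 100.0 * (len(samples) - len(ok)) / len(samples) if samples else 0.0
    return {
        "sent": len(samples),
        "loss_pct": loss,
        "avg_ms": sum(ok) / len(ok) if ok else None,
        "worst_ms": max(ok) if ok else None,
    }
```

Calling `summarize(tcp_probe(host, 22))` against the VPS during the laggy hours and comparing it with a plain ping run at the same time should show whether TCP is treated differently. Note that the kernel retransmits a dropped SYN on its own, so with a timeout of a second or more, loss can show up as a large `worst_ms` rather than as a failed attempt.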

  • Yes, I'm also having the same issues on NL nodes.

  • @bobert said:

    Fortunately the TCP switch on MTR shows the problem. There's massive packet loss and ping spikes when using TCP.

    Hello, thanks for the explanation. I'm just learning about mtr right now :)

    I've also experienced some connection issues with Clouvider (NYC) to Hostbrr, but then I checked my Vultr VPS with the same region. Both had the same problem, so I guess it's a regional thing.

    Could you please post your mtr results? I want to compare them with mine:

    This is from Clouvider (NY)

    mtr --report --tcp api.openai.com
    Start: 2025-02-23T02:18:13-0500
    HOST: CLOUVIDER-NYC2              Loss%   Snt   Last   Avg  Best  Wrst StDev
      1.|-- ???                       100.0    10    0.0   0.0   0.0   0.0   0.0
      2.|-- 194.54.144.50              0.0%    10    1.0   1.2   1.0   2.0   0.3
            194.54.144.52                    
      3.|-- 162.158.61.42              0.0%    10    1.9   2.9   1.7   9.6   2.5
            10.1.10.229                      
      4.|-- 10.1.10.225                0.0%    10   38.5   7.8   1.8  38.5  11.2
            162.158.61.117                   
            162.158.61.101                   
            162.158.61.113                   
            162.158.61.221                   
      5.|-- peer-as62240.pr01.lga4.tf 10.0%    10    1.9   2.0   1.6   2.4   0.2
            172.66.0.243                     
      6.|-- 172.66.0.243               0.0%     4    1.8   2.3   1.8   2.7   0.4
            162.158.61.42 
    
    

    This is from Vultr (NY)

    mtr --report --tcp api.openai.com
    Start: 2025-02-23T07:23:59+0000
    HOST: VULTR-NYC5                  Loss%   Snt   Last   Avg  Best  Wrst StDev
      1.|-- ???                       100.0    10    0.0   0.0   0.0   0.0   0.0
      2.|-- 100.100.200.1              0.0%    10    3.5   5.9   3.5  13.5   3.5
      3.|-- 10.64.1.13                 0.0%    10    6.4   5.7   0.9  24.3   7.2
            10.64.1.9                        
      4.|-- 10.64.4.197               10.0%    10   16.7 457.2   1.1 3039. 1024.0
            10.64.4.25                       
            10.64.4.29                       
      5.|-- nyiix.as13335.net          0.0%    10   13.0  11.4   2.1  26.6  10.3
            Unassigned149.nyiix.net          
            de-cix-new-york.as13335.net      
            de-cix-new-york.as13335.net      
      6.|-- 162.158.61.109             0.0%    10    2.4   3.2   1.4  15.8   4.4
            162.158.61.117                   
            162.158.61.115                   
            162.158.61.123                   
            162.158.61.107                   
            162.158.61.103                   
            162.158.61.101                   
      7.|-- 172.66.0.243               0.0%    10    1.6   1.9   1.4   3.6   0.
    
  • amarc Veteran

    Learn how to interpret MTR: what counts as loss and what does not (for example, ICMP filtering).
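
As a rough illustration of the usual rule of thumb, here is a small Python sketch that reads the text of an `mtr --report` run and labels each hop. The rule it encodes is an assumption, not an official mtr feature: loss at a hop that does not carry through to every later hop is usually the router deprioritising its replies to probes, while loss that persists to the destination is worth investigating. `parse_hops` and `classify` are illustrative names.

```python
import re

# Matches hop lines such as "  5.|-- host.example   10.0%    10    1.9 ..."
# The "%" is optional because mtr drops it when the column is full ("100.0").
HOP_RE = re.compile(r"^\s*(\d+)\.\|--\s+(\S+)\s+(\d+(?:\.\d+)?)%?\s+(\d+)\b")


def parse_hops(report):
    """Extract hop number, host, loss percentage and sent count per hop."""
    hops = []
    for line in report.splitlines():
        match = HOP_RE.match(line)
        if match:  # skips the header and the alternate-host continuation lines
            hops.append({
                "hop": int(match.group(1)),
                "host": match.group(2),
                "loss": float(match.group(3)),
                "sent": int(match.group(4)),
            })
    return hops


def classify(hops):
    """Label each hop's loss as 'none', 'likely cosmetic' or 'possibly real'."""
    labels = []
    for i, hop in enumerate(hops):
        if hop["loss"] == 0:
            label = "none"
        elif all(later["loss"] > 0 for later in hops[i + 1:]):
            # Loss persists through every later hop (or this is the last hop).
            label = "possibly real"
        else:
            # Some later hop answers without loss, so forwarding is working.
            label = "likely cosmetic"
        labels.append((hop["hop"], hop["host"], label))
    return labels
```

With only 10 probes per hop, as in the reports in this thread, a single lost probe already reads as 10% loss, so a longer run (mtr's `-c` option) gives a more trustworthy picture before drawing conclusions.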

  • jsg Member, Resident Benchmarker

    @bobert said:
    ... it started lagging during peak hours.

    Please define

    • what is lagging (network, what exactly, disk, ...)
    • what do you mean? (as compared to what?)
    • peak hours (where?)

    @Jokull said:
    mtr --report --tcp api.openai.com

    (emphasis mine)

    In this case I would assume that the problem is target related.

  • Jokull Member
    edited February 23

    Hello @amarc and @jsg,

    Sorry if there's been a misunderstanding :)

    I'm not the one complaining about the network problem :#
    As you can see, everything is normal according to my report on the NY region.

    I just posted it to compare the result with OP, to see if the problem is a regional thing or not.
    I'm using openai as a generic target hostname because I assumed the route to openai was carrying heavy traffic at the time I ran the test.

  • @jsg said: what is lagging (network, what exactly, disk, ...)

    Look at the video I posted, it should be obvious there is lag. Why does it feel like nobody here looked at it?

    what do you mean? (as compared to what?)

    Any other vps or dedicated server I have. I have over 20 of them and none of them lag like this.

    peak hours (where?)

    From around now to 10 hours from now. Peak traffic times for EU + NA.

    Thanked by 1: jsg
  • Clouvider Member, Patron Provider
    edited February 23

    I'm personally working on a server created on the same node as yours and can see absolutely no issues. Neither does StatusCake.

    Peak consumption on our New York network has started 1 hour ago and ends in about 45 minutes.

  • @Clouvider said: I'm personally working on a server created on the same node as yours and can see absolutely no issues. Neither does StatusCake.

    There is 0 problem when the link goes over IX.

  • Clouvider Member, Patron Provider

    @bobert said:

    @Clouvider said: I'm personally working on a server created on the same node as yours and can see absolutely no issues. Neither does StatusCake.

    There is 0 problem when the link goes over IX.

    Our office is in London, there's no IX involved.

  • I've narrowed down the issue myself after spending 2 hours compiling evidence while Clouvider denies there are any issues.

    The issue occurs when they route traffic over GTT.

    They don't route traffic over GTT when using ICMP, so the only way you can diagnose this issue is to use the TCP flag on MTR.

    When you use the TCP flag, the packets are load balanced between GTT and Telia. For a few minutes it looked like they had turned off Telia, and every hop from GTT onwards had 1000+ ms ping. Now it has switched to Telia and there are 0 ping spikes.

    They said they have not changed anything and still deny there is a problem.

    Looking for a new host again :/

    Thanked by 1: ScreenReader
  • Clouvider Member, Patron Provider
    edited February 24

    We checked the connectivity using a VM on the same hypervisor in the same subnet via multiple routes using all 3 carriers and 2 internet exchanges. We explained to the customer that there is no issue that we could replicate.

    We spent a substantial number of hours at senior engineer level to assist the OP. That is despite the OP having decided to name and shame without active contact with support (ticket opened 22nd February late at night, after the previous ticket was closed at the OP's request).

    The OP has pretty much demanded that the entire network configuration be adjusted to what the OP perceives as the “good configuration”. At the moment (this seems to change through the ticket) that appears to be a preference towards being routed through Arelion specifically, or the OP will cancel and presumably continue bashing us over here.

    The issue reported by the OP couldn’t be replicated despite extensive efforts, and the OP refused to follow the suggestion to start afresh, since the issue appears to be limited to the OP's VM only.

    We are sorry we haven’t been able to assist the OP on this occasion. We wish the OP all the best in the future.

  • @Clouvider said: We checked the connectivity using a VM on the same hypervisor in the same subnet via multiple routes using all 3 carriers and 2 internet exchanges. We explained to the customer that there is no issue that we could replicate.

    I doubt this, since it was evident that you were still using ICMP six replies later, after I had said multiple times that this issue does not show up with ICMP.

    We spent a substantial number of hours at senior engineer level to assist the OP.

    I'd like to also point out that I am senior level too, and you made me waste several hours compiling multiple videos, bi-directional mtrs and repeatedly explaining myself. If your senior level engineers can't figure out this issue that I've spent several hours making obvious then you should fire them.

    The OP has pretty much demanded that the entire network configuration be adjusted to what the OP perceives as the “good configuration”. At the moment (this seems to change through the ticket) that appears to be a preference towards being routed through Arelion specifically.

    What other configuration did I suggest?

    That is despite the OP having decided to name and shame without active contact with support.

    I had already been in contact with your support 5 days ago.

    There is nothing wrong with having problems pop up with your network.

    The issue is in how you handle them. And it looks like you are still in the same mindset as when I had dedicated servers with you years ago: that the customer is usually wrong, and that anyone paying you less than 1000/month is not worth your time.
