Clouvider VPS Dropping Traffic and Connection Timeouts constantly

tavii Member
edited February 2025 in Help

Hi, ever since I bought my first Clouvider VPS I've had issues with connection timeouts and dropped traffic. I asked @Clouvider to look into it and they said everything is fine.

Between 2%-7% of outgoing DNS requests get no replies and outgoing HTTP requests periodically get an exact 5 second response time added to them or they just get timed out completely.

I moved my services from AWS, where I never had these kinds of problems. The VPS is hosted in Manchester and it's the cheapest one they offer.

I suspect their DDoS protection is causing all of this, but they denied it. I moved to Google's DoH to work around the DNS problem temporarily (it works, maybe because of a higher timeout, or because that traffic just doesn't get dropped?). I also haven't changed anything in Ubuntu or the kernel.

I really don't know what to do anymore because this is unusable and I prepaid for it. I wanted it to be good.

Comments

  • @Clouvider any idea what it could be?

  • Clouvider Member, Patron Provider
    edited February 2025

    @tavii said:
    @Clouvider any idea what it could be?

    We don’t provide support through the forum. If you can demonstrate the evidence of the issue within our network, please work with the support on a resolution.

    Thanked by 2: vr10, itachikonoha
  • @Clouvider said:

    @tavii said:
    @Clouvider any idea what it could be?

    We don’t provide support through the forum. If you can demonstrate the evidence of the issue within our network, please work with the support on a resolution.

    Okay I will try to make a ticket again. Thanks

  • Clouvider Member, Patron Provider

    @tavii said:

    @Clouvider said:

    @tavii said:
    @Clouvider any idea what it could be?

    We don’t provide support through the forum. If you can demonstrate the evidence of the issue within our network, please work with the support on a resolution.

    Okay I will try to make a ticket again. Thanks

    👍
    Once raised, please DM me the ticket ID, I will monitor it for you 😉.

  • How often do you make a http connection?

  • jsg Member, Resident Benchmarker

    @Clouvider said:

    @tavii said:
    @Clouvider any idea what it could be?

    We don’t provide support through the forum. If you can demonstrate the evidence of the issue within our network, please work with the support on a resolution.

    OK. But you might want, in your own interest, to offer us a statement on the matter. After all, you live to a large degree off your "very good network" reputation.

    Unless, of course, OP is not day-dreaming and there actually are some problems. In that case it might seem wiser to just block questions ...

    @tavii

    Between 2%-7% of outgoing DNS requests get no replies

    That in my eyes (a) is not necessarily to do with Clouvider, and (b) is not a live or die issue but rather an annoying but not critical one

    and outgoing HTTP requests periodically get an exact 5 second response time added to them or they just get timed out completely.

    That probably is concerning and it would serve Clouvider well to either prove that it's not their fault or to investigate and solve the problem, and quickly.

  • @jsg Do you know any tool I could use to troubleshoot this?

    This is what I've got for now: my service sends a POST request about every 20 seconds to port 18080. Some requests take 5 seconds longer than normal and some time out completely (possibly a 73 second timeout).

    I also tested it with curl doing different GET/POST, HTTP and HTTPS requests to different websites and saw the exact same behaviour. I uploaded some screenshots below:

    https://imgur.com/a/ZGEYA3w

    Again, my services work completely fine everywhere else including AWS.

  • @Arirang said:
    How often do you make a http connection?

    Doesn't seem to matter but currently it's every 20 seconds

  • amarc Veteran
    edited February 2025

    So you did not consider that the "service at port 18080" you are sending POST requests to might itself be "glitching"?

    Edit:

    I did not see the whole imgur page. So, what does your /etc/resolv.conf look like? Did you try to curl an IP instead of a DNS name, to rule out the DNS resolvers as the issue?

  • @amarc said:
    So you did not take into account that "service at port 18080" you are making POST request to is "glitching" ?

    Edit:

    I did not see whole imgur page.. So, how does your /etc/resolv look like ? Did you try to curl to IP instead DNS name to isolate DNS resolvers as issue ?

    I did curl http://54.84.170.143/user-agent and got the same behaviour:

    root@Alba:~/dnspyre# time curl http://54.84.170.143/user-agent
    {
    "user-agent": "curl/8.5.0"
    }
    real 0m0.177s
    user 0m0.008s
    sys 0m0.006s
    root@Alba:~/dnspyre# time curl http://54.84.170.143/user-agent
    curl: (28) Failed to connect to 54.84.170.143 port 80 after 135992 ms: Couldn't connect to server
    real 2m16.021s
    user 0m0.020s
    sys 0m0.020s

    /etc/resolv.conf looks like this:
    root@Alba:~# cat /etc/resolv.conf
    nameserver 8.8.8.8
    nameserver 8.8.4.4
    nameserver 2001:4860:4860::8888
    nameserver 2001:4860:4860::8844
    search .
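
    An aside on the recurring 5-second figure, offered as a hypothesis rather than a diagnosis: with nameservers listed directly in /etc/resolv.conf as above, the glibc resolver waits 5 seconds by default (`options timeout:5`) before moving on to the next nameserver, so a single dropped UDP query could add almost exactly 5 seconds to any request that uses a hostname. A minimal sketch for measuring the loss rate follows. It assumes `dig` is installed; the function names, the query name `example.com` and the 2-second wait are illustrative choices, not anything taken from this thread.

    ```bash
    # Percentage of lost queries, given how many were sent and how many were answered.
    loss_percent() {
        awk -v sent="$1" -v ok="$2" 'BEGIN { printf "%.1f\n", (sent - ok) * 100 / sent }'
    }

    # Send COUNT single-attempt queries to one resolver and print the loss rate.
    # +tries=1 disables dig's own retries so every unanswered query is counted.
    dns_loss() {
        local server="$1" count="$2" ok=0 i
        for ((i = 0; i < count; i++)); do
            if dig +tries=1 +time=2 +short "@$server" example.com A >/dev/null 2>&1; then
                ok=$((ok + 1))
            fi
        done
        loss_percent "$count" "$ok"
    }
    ```

    For example, `dns_loss 8.8.8.8 200` prints the percentage of unanswered queries. If it is non-zero, `options timeout:1 attempts:3` in /etc/resolv.conf would shorten the stall, but it would not remove the underlying loss.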

  • I use Google DoH through cloudflared for my DNS server but that's separate to the system DNS.

  • amarc Veteran
    edited February 2025

    Yeah, that looks like rate-limiting to me. But it's hard to believe it's on the originating side.

    Why can't you spin up Nginx on some other random provider/VPS and test this within "your environment" to confirm it's actually a Clouvider VPS network issue? Do not assume that some public service is not somehow limiting certain requests/user agents/IPs/providers.

  • @amarc said:
    Yeah, that looks like rate-limiting to me. But it's hard to believe it's on originating side of story.

    Why can't you spin up Nginx on some other random provider/VPS and test this within "your environment" to confirm it's actually Clouvider's VPS network issue. Do not rely some public service is not somehow limiting some requests/user agents/IP's/Providers

    Ok so I hosted a caddy server on my home network and curled a text file. Now I don't see the 5 second delays anymore but I still get the random 2 minute response times.

    I tried doing 4 curls in parallel doing 10req/sec each and sometimes they seem to stop at the same time but not consistently.

    It's definitely a Clouvider issue, either the Ubuntu image they provide or their network.

  • @tavii said: curl: (28) Failed to connect to 54.84.170.143 port 80 after 135992 ms: Couldn't connect to server

    I would try going to other services such as cloudflare.

    is it the same thing if you do
    time curl "https://discord.com/cdn-cgi/trace"
    or
    time curl "https://cloudflare.com/cdn-cgi/trace"

  • @nanankcornering said:

    @tavii said: curl: (28) Failed to connect to 54.84.170.143 port 80 after 135992 ms: Couldn't connect to server

    I would try going to other services such as cloudflare.

    is it the same thing if you do
    time curl "https://discord.com/cdn-cgi/trace"
    or
    time curl "https://cloudflare.com/cdn-cgi/trace"

    Tried both, same thing. Cloudflare, Google, Linode, no matter what I try same problem

  • The weirdest thing is that it's always exactly either 5 seconds or 2 minutes and 12 seconds. Even if it were a DDoS thing, it's definitely not intended.

    Also weird that it's only for outgoing traffic. Incoming DNS queries don't get dropped and incoming HTTP requests also seem fine

  • It's weird. I have Clouvider VMs in all locations except Manchester, where yours is. I have no problem making outgoing HTTP connections with curl every 10 seconds over IPv4 and IPv6.

  • Does this happen over ICMP as well?

  • @ehhthing said:
    Does this happen over ICMP as well?

    No

  • Could you test out with a few changes to --max-time and --connect-timeout (and maybe --retry-max-time) to see if things fail faster and/or more predictably?

    You could also just reboot into one of the rescue environments and use curl from there (just to rule out some Ubuntu issue).
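
    The suggestion above can be sketched as follows. This is only a sketch: the 10-second connect limit, the 15-second overall limit, the bucket thresholds and the function names `probe` and `bucket` are assumptions made for illustration. The `--write-out` variables split each request into DNS lookup, TCP connect and total time, which shows where a delay is actually added.

    ```bash
    # Print DNS, connect and total time (seconds) plus the HTTP status for one
    # request, failing fast instead of waiting for the kernel's connect timeout.
    probe() {
        curl --silent --output /dev/null \
             --connect-timeout 10 --max-time 15 \
             --write-out '%{time_namelookup} %{time_connect} %{time_total} %{http_code}\n' \
             "$1"
    }

    # Sort a total time in seconds into rough buckets (thresholds are assumptions).
    bucket() {
        awk -v t="$1" 'BEGIN {
            if (t < 1)       print "normal"
            else if (t < 10) print "delayed"
            else             print "timeout"
        }'
    }
    ```

    Running `bucket "$(probe http://54.84.170.143/user-agent | cut -d' ' -f3)"` in a loop then gives a tally of normal, delayed and timed-out requests, and the first two fields of `probe` show whether a delay sits in name resolution or in the TCP connect.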

  • Clouvider Member, Patron Provider
    edited February 2025

    @Arirang said:
    It's weired. I have Clouvider vms in all locations except Manchester you have. I have no problem to make outgoing http connections using Curl every 10 second through ipv4 and ipv6.

    We spun up a VM in Manchester on each of the hypervisors there when this thread showed up and could not replicate it, neither across a series of 1000 attempts to a known good web server outside of our network nor for DNS resolution.

    Plus it feels like something that even our StatusCake should pick up.

    Thanked by 2: jsg, Arirang
  • Andreix Member, Host Rep
    edited February 2025

    Dumb suggestion here, but have you tried an old-fashioned VM reinstall with a fresh OS (different from the one you have now)?

    If @Clouvider says they tried to replicate the issue in the same environment as yours and couldn't find any problem, I'm thinking maybe some corrupted binaries/libs on your OS.

  • JabJab Member
    edited February 2025

    It's probably time to look into tcpdump/wireshark and see what is going on there - retransmissions?
    Or start with strace to make sure your system isn't being bottlenecked by something, so that the request is never actually "sent out" at the time you think it is sent.
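
    One thing a capture could confirm or rule out, offered as a hypothesis: on Linux the default `net.ipv4.tcp_syn_retries` is 6, and with a 1-second initial retransmission timeout that doubles after every attempt, connect() gives up after roughly 127 seconds. That is in the same ballpark as the 2m12s to 2m16s failures reported in this thread, though not an exact match, and it would point to SYN packets (or their SYN-ACK replies) going missing. The sketch below does the arithmetic and wraps a tcpdump filter for SYN packets. The function names are made up for illustration, and capturing requires root.

    ```bash
    # Seconds until connect() gives up: the initial SYN plus RETRIES retransmissions,
    # each waiting twice as long as the previous one (ignores the kernel's RTO cap).
    syn_give_up_seconds() {
        local retries="$1" rto="$2" total=0 i
        for ((i = 0; i <= retries; i++)); do
            total=$((total + rto))
            rto=$((rto * 2))
        done
        echo "$total"
    }

    # Show SYN and SYN-ACK packets exchanged with one host (run as root).
    # Repeated SYNs with no SYN-ACK in between mean the handshake is being lost.
    watch_syns() {
        tcpdump -n -i any "host $1 and tcp[tcpflags] & tcp-syn != 0"
    }
    ```

    `syn_give_up_seconds 6 1` prints 127, and `sysctl net.ipv4.tcp_syn_retries` shows the value actually in effect on the VPS. Leaving `watch_syns 54.84.170.143` running in a second terminal while the curl loop runs would show whether the stalled requests are retransmitted SYNs.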

    Thanked by 1: cmeerw
  • jsg Member, Resident Benchmarker
    edited February 2025

    @tavii said:
    @jsg Do you know any tool I could use to troubleshoot this?

    This is what I got for now, my service sends a POST request every about 20 seconds to port 18080. Some requests take 5 seconds longer than normal and some completely timeout (possibly a 73 second timeout).

    I also tested it with curl doing different GET/POST, HTTP and HTTPS requests to different websites and the exact same behaviour. I uploaded some screenshots below:

    https://imgur.com/a/ZGEYA3w

    Again, my services work completely fine everywhere else including AWS.

    as well re some of @amarcs thoughts

    For a start, I got "error "Imgur is temporarily over capacity. Please try again later."". Can you maybe put it somewhere else as well? If you want on one of your servers + a PM with the URL to me.

    Reading the whole thread so far I see the following basic problems wrt your testing:

    • curl is, well, curl; that is, a program whose inner workings you very likely don't know (and using "time" makes sense as a first crude step to get a ballpark number, but not for precise timing). I wrote my own routines for connecting to (and downloading from) servers for a reason: the moment you use a library, you don't really know all its small details. And, voilà, I have seen many, many cases where targets almost invariably, and very rarely the server I tested, acted weirdly, sometimes being very fast and sometimes really slow, even to the point of timing out.
    • the internet often is a weird place and testing from or towards virtual servers one sees even more weirdness, often a spike from a node neighbour being the culprit.
    • the internet also is a complex place, at least when down in the plumbing. Your assertion for example is very easy to brush off and very hard to even properly investigate, let alone prove. One major reason being that it's in its very nature to pretty much always have (a) the source, (b) the target, and (c) a hard to really know number of diverse hops in between, some of which even are deliberately hidden (incl. and especially FWs).

    Frankly, the main reason I don't simply brush your problem off saying "oh well, the internet" is the fact that I have seen @Clouvider's nodes (all over the world) act strangely quite often, e.g. testing against their LAX test node I've seen everything in between "impressive!" and "yuck, again in between being snail slow and dead it seems". And please note that my aim is not to bash Clouvider but rather to stress that even supposedly good providers obviously can't control everything (well, that's the internet), although I've seen it more often with Clouvider than with others.

    In your place I'd (a) try to run the same test from different, preferably good quality, providers and (b) towards a few different targets, preferably owned by you and of decent quality. And I'd do that multiple times at different times of day. And I'd strongly suggest not using a "practical swiss pocket-knife" (like curl) but proper software for specific and detailed tests.
    Oh, and btw, before even starting with that I'd run multiple mtrs to/from your set of sources and targets to get a first impression and ballpark numbers.

    Finally, sorry, but my first guess would be that it's not a Clouvider (connectivity) problem but either your VPS/node or some kink in the connection, read: some hop(s) on the internet.
    Btw DNS over https? Are you joking? That's adding additional layers and potential problems on top. There's a good reason why I wrote my own DNS test routines for my monitoring software ...

  • Clouvider Member, Patron Provider

    @jsg said:
    Frankly, the main reason I don't simply brush your problem off saying "oh well, the internet" is the fact that I have seen @Clouvider's nodes (all over the world) strangely quite often, e.g. testing against their LAX test node I've seen everything in between "impressive!" and "yuck, again in between being snail slow and dead it seems". And please note that my aim is not to bash Clouvider but rather to stress that even supposedly good providers obviously can't control everything (well, that's the internet), although I've seen it more often with Clouvider than with others.

    Are we talking iperf nodes ? Often changing the port helps - iperf servers tend to bug out and the process needs restarting from time to time (different port = different process). We have it automated but you might be unlucky with your timing. You will appreciate our iperf servers are very popular, being used in many benchmark scripts, so the 10Gbps shared across up to 10 test users certainly doesn’t help. This doesn’t mean there’s any network issue either, nor is it affecting any services. LA has plenty of capacity, same 100% Juniper network and uses premium providers as in every other PoP of ours.

  • jsg Member, Resident Benchmarker

    @Clouvider said:

    @jsg said:
    Frankly, the main reason I don't simply brush your problem off saying "oh well, the internet" is the fact that I have seen @Clouvider's nodes (all over the world) strangely quite often, e.g. testing against their LAX test node I've seen everything in between "impressive!" and "yuck, again in between being snail slow and dead it seems". And please note that my aim is not to bash Clouvider but rather to stress that even supposedly good providers obviously can't control everything (well, that's the internet), although I've seen it more often with Clouvider than with others.

    Are we talking iperf nodes ? Often changing the port helps - iperf servers tend to bug out and the process needs restarting from time to time (different port = different process). We have it automated but you might be unlucky with your timing. You will appreciate our iperf servers are very popular, being used in many benchmark scripts, so the 10Gbps shared across up to 10 test users certainly doesn’t help. This doesn’t mean there’s any network issue either, nor is it affecting any services. LA has plenty of capacity, same 100% Juniper network and uses premium providers as in every other PoP of ours.

    Nope, I was talking about your "download x M|GB test file" speedtest servers. I generally try to avoid Iperf crap as far as at all possible.

    As for "culprit(s)" I clearly said that I have no basis to presume it's your fault, I merely said that I experience it often with your speedtest servers, whatever the reason for that may be. As I already often stated the whole internet consists, exaggerating somewhat but I guess my point gets clear, of "culprits" by its very nature.

    Thanked by 1: Clouvider
  • Jokull Member
    edited February 2025

    @tavii said:
    I did curl http://54.84.170.143/user-agent and got the same behaviour:

    root@Alba:~/dnspyre# time curl http://54.84.170.143/user-agent
    {
    "user-agent": "curl/8.5.0"
    }
    real 0m0.177s
    user 0m0.008s
    sys 0m0.006s
    root@Alba:~/dnspyre# time curl http://54.84.170.143/user-agent
    curl: (28) Failed to connect to 54.84.170.143 port 80 after 135992 ms: Couldn't connect to server
    real 2m16.021s
    user 0m0.020s
    sys 0m0.020s

    In my opinion, this is not a problem on @Clouvider's side, especially not a DDoS or some kind of hacker attack.

    I have edited the script and run it across all my VPS on Clouvider, and all ran flawlessly with responses under 500ms:

    for i in {1..100}; do time curl http://54.84.170.143/user-agent ; sleep 0.5 ; done
    

    The problem might lie in Cloudflare or DoH, if the packets are going through them.

    OP, please edit your post, you almost gave me a heart attack.

    Thanked by 1: Clouvider
  • If you need to check a network for packet loss, the easiest approach is to use MTR.

    The following command runs 20 test cycles against Google's DNS server and tells you what packet loss you're seeing at each hop:

    mtr --report-cycles=20 8.8.8.8

    That IP should be available from pretty much anywhere, but you can replace 8.8.8.8 with any other IP or domain
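
    Since ICMP was reported as unaffected earlier in this thread, a TCP-mode run may be more telling than the default ICMP probes. A minimal sketch, assuming an mtr build with TCP support; the function names are illustrative, and the target host and port are whatever is failing for you.

    ```bash
    # Report-mode mtr using TCP SYN probes to a given port instead of ICMP echo.
    mtr_tcp() {
        mtr --report --report-wide --no-dns --report-cycles=20 --tcp --port "$2" "$1"
    }

    # Read an mtr report on stdin and print the Loss% column of the final hop.
    # Loss that shows up only on intermediate hops is often just rate limiting
    # of the routers' replies, so the last hop is the number that matters most.
    final_hop_loss() {
        awk '/^ *[0-9]+[.][|]--/ { loss = $3 } END { print loss }'
    }
    ```

    For example, `mtr_tcp 54.84.170.143 80 | final_hop_loss` prints the loss seen at the destination itself for TCP probes to port 80.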

  • @tavii said:

    @amarc said:
    So you did not take into account that "service at port 18080" you are making POST request to is "glitching" ?

    Edit:

    I did not see whole imgur page.. So, how does your /etc/resolv look like ? Did you try to curl to IP instead DNS name to isolate DNS resolvers as issue ?

    I did curl http://54.84.170.143/user-agent and got the same behaviour:

    root@Alba:~/dnspyre# time curl http://54.84.170.143/user-agent
    {
    "user-agent": "curl/8.5.0"
    }
    real 0m0.177s
    user 0m0.008s
    sys 0m0.006s
    root@Alba:~/dnspyre# time curl http://54.84.170.143/user-agent
    curl: (28) Failed to connect to 54.84.170.143 port 80 after 135992 ms: Couldn't connect to server
    real 2m16.021s
    user 0m0.020s
    sys 0m0.020s

    /etc/resolv.conf looks like this:
    root@Alba:~# cat /etc/resolv.conf
    nameserver 8.8.8.8
    nameserver 8.8.4.4
    nameserver 2001:4860:4860::8888
    nameserver 2001:4860:4860::8844
    search .

    You're using the IP address directly. All this DNS shit is a red herring since it's not involved in the above at all.

  • bobert Member
    edited February 2025

    I'm having the same issues.

    Using mtr with the TCP flag shows massive packet loss over their transit links (ICMP and IX traffic is unaffected). My VPS with them is in NJ. This is a very strange problem.

    https://lowendtalk.com/discussion/202952/clouvider-vps-lagging-but-no-indications-of-the-problem#latest
