Help understanding Racknerd reducing my cores from 4 to 1.
I'm relatively new to server administration, and I've been managing a couple of VMs with Racknerd. However, this morning, I received a notification about downtime. Upon checking my emails, I learned that my CPU core count had been reduced to 1 due to overutilization.
When I opened Grafana and reviewed the host information, I came across this screenshot: https://imgur.com/a/rZyCaq5. Around midnight there was a sudden spike in CPU steal, reaching 40%. I believe this might have been caused by Watchtower updating containers, after which usage quickly returned to normal levels. Over the past two weeks, my load average stayed around 15%, significantly below the maximum 30% utilization mentioned in the email.
I'm unsure whether I'm unintentionally straining their system or if this was a momentary excessive usage. Could this situation have arisen because other users sharing the same machine also experienced usage spikes simultaneously, making it challenging to pinpoint which VM was primarily responsible? Any advice on how to avoid this in the future? I asked support, who just enabled my cores again without giving me much help on avoiding this, other than telling me to monitor my system, which I am already doing.
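For anyone wanting to cross-check the two numbers discussed here: on Linux, steal is time the hypervisor spent running something else while this VM was waiting for CPU, so it points at contention on the host rather than at the guest's own work. Below is a minimal sketch, assuming a Linux guest with `/proc/stat` available, that separates the guest's own busy time from steal between two samples. The function names and the 5-second interval are illustrative assumptions, not anything taken from RackNerd, Grafana or node exporter.

```python
import time

# Order of the first eight counters on the aggregate "cpu" line of /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Turn the aggregate 'cpu' line of /proc/stat into a dict of tick counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return dict(zip(FIELDS, map(int, parts[1:1 + len(FIELDS)])))


def cpu_percentages(before, after):
    """Return busy and steal percentages for the interval between two samples."""
    deltas = {name: after[name] - before[name] for name in FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        return {"busy": 0.0, "steal": 0.0}
    # Busy = everything the guest itself did; idle, iowait and steal are excluded.
    busy = total - deltas["idle"] - deltas["iowait"] - deltas["steal"]
    return {
        "busy": 100.0 * busy / total,
        "steal": 100.0 * deltas["steal"] / total,
    }


def sample_host(interval=5.0, path="/proc/stat"):
    """Hypothetical helper: read /proc/stat twice and compare the samples."""
    with open(path) as handle:
        first = parse_cpu_line(handle.readline())
    time.sleep(interval)
    with open(path) as handle:
        second = parse_cpu_line(handle.readline())
    return cpu_percentages(first, second)
```

Calling `sample_host()` on the VM returns something like `{"busy": ..., "steal": ...}`. A high busy figure means the guest itself is using the CPU; a high steal figure with low busy time suggests the host is contended.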
Thanks
Comments
It's just been suspended again. Grafana node exporter is reporting 80% usage with 7.5% steal. Tickets #HG06098 and #XM36353. Could you help me, @dustinc?
Has your bandwidth been doubled?
it has
Has your cores been quattroed?
They re-enabled the cores again, but they don't seem to want to say anything more than "use htop". I am trying that to see whether node exporter is not doing a good enough job of reporting for some reason.
Use the top command to see what is using all that CPU. htop will also help with that.
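If you'd rather log this than watch top interactively, here is a rough sketch, assuming a Linux guest, that ranks processes by the CPU ticks they consumed between two snapshots of `/proc/<pid>/stat`. The helper names are made up for illustration, and this is not how top or htop work internally; it only reads the same kernel counters.

```python
import os
import time


def parse_pid_stat(text):
    """Return (pid, name, cpu_ticks) from the contents of /proc/<pid>/stat."""
    # The process name sits in parentheses and may itself contain spaces or ')'.
    left = text.index("(")
    right = text.rindex(")")
    pid = int(text[:left])
    name = text[left + 1:right]
    rest = text[right + 1:].split()
    # rest[0] is field 3 (state); utime and stime are fields 14 and 15.
    return pid, name, int(rest[11]) + int(rest[12])


def snapshot(proc_root="/proc"):
    """Map pid -> (name, cpu_ticks) for every process readable right now."""
    result = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, name, ticks = parse_pid_stat(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or the entry was unreadable
        result[pid] = (name, ticks)
    return result


def busiest(before, after, limit=5):
    """Rank processes present in both snapshots by CPU ticks used in between."""
    usage = []
    for pid, (name, ticks) in after.items():
        if pid in before:
            usage.append((ticks - before[pid][1], pid, name))
    usage.sort(reverse=True)
    return [(pid, name, delta) for delta, pid, name in usage[:limit]]


def watch(interval=10.0):
    """Hypothetical helper: take two snapshots and return the busiest processes."""
    first = snapshot()
    time.sleep(interval)
    return busiest(first, snapshot())
```

Running `watch()` from cron around the time of the spike and writing the result to a file would show which process (a container update, for example) was responsible.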
Hi RackNerd support agent.
I'd like to see that email. Never heard of a company physically reducing something vs throttling.
You may want to consider VPS with dedicated vCores
The email:
Yeah, I have been looking for hybrid solutions on ServerHunter to try to find an inexpensive VPS with dedicated CPUs, but I'm not having much luck. Any recommendations? After being restricted twice in a few hours and only getting canned responses that don't directly answer my questions, I am getting itchy feet. I have otherwise had a very good experience with Racknerd, and I'm 10 months in.
annoying but fair, imo
im based anyways so yeah dont listen to me
@Kmaid
How much is your budget
I have had that happen before: they will limit your CPU usage to 30% if you're using more than the fair-share CPU usage policy allows. My advice is to see what you can do to reduce the CPU usage. Maybe there's a runaway program eating up too many resources, etc.
That's crazy. I understand them needing to throttle it but to physically change something (which some configurations may depend on) seems extremely heavy handed.
I don't use Racknerd for squat, won't ever use Racknerd for squat.
Recommendation wise, Crunchbits has a great lineup of VDS. What you see is what you get, no questions asked. I recently switched my dedicated from another provider over to a Crunchbits VDS and it's been perfect.
https://crunchbits.com/vds#Plans
Appreciate the mention. The VDS lineup is specifically exactly for your current issues @Kmaid, although the usage stats you are showing would also be fine on our SSD or NVMe VPSes. There is still a live coupon for -30% on the entire Xeon VDS lineup.
Hi @Kmaid -- First off, thank you so much for choosing RackNerd as your provider (for almost a year by now). We sincerely appreciate your business and trust.
In the spirit of complete transparency and to provide additional background here, I've got to tell you that incidents like these where we have to act on a VPS’s CPU utilization are quite rare in general. At the same time, it's important for us to ensure that one client's activities, whether on a single or multiple VMs, do not adversely affect the experience of their neighbors on the same physical server. While such instances are rare, we take immense pride in the level of service we provide, and our goal is to offer an optimal experience for every customer. We swing into action only when our monitoring systems trigger a high-load alert on a specific node. From there, we dig in to identify which VPS is drawing excessive, sustained CPU utilization on that particular node over a certain period. I also want to emphasize that given how rare this is to begin with - we don't rely on automation for any of this. Each case is manually reviewed by one of our team members at the KVM VPS host node level. We’ve found that helps eliminate false positives, which is a benefit given the rare nature of these events.
About the htop recommendation from our support team: it's a constraint of the nature of unmanaged services more than anything else. We can't look inside your VPS to see what's gobbling up CPU without your root password. That's why our advice is often high-level, focusing on what we can see from the host node level. In other words, from our perspective we can’t see the individual processes that you’re running within your VPS; we can only see a total CPU % utilized based on the KVM VM ID. Your situation here, based on what I’ve read, sounds like it could be due to some unintended factors, such as an errant process or a misbehaving application, and it's tough for us to pinpoint without that inside look. Also, as an example, we've seen instances in the past where end-users were unknowingly running such rogue processes (which indicates a possibly compromised VM).
Since you've been with us for more than 10 months without any hiccups, my thinking so far is that this is likely one of those isolated incidents mentioned above, perhaps a runaway process or something along those lines. With that being said, as a proposed resolution path (despite the unmanaged nature of your service), I'm willing to go the extra mile. Shoot me an email at [email protected], and we can arrange for temporary access to your VPS. I will have one of our senior systems administrators take a look. From there, we will have a better idea of what's going on and can provide more tailored advice.
I greatly appreciate your time and look forward to sorting this out with you, thanks!
Why change the configuration of the VM instead of just capping the CPU to whatever your FUP states? Seems extremely heavy handed.
What Dustin sent above is the exact reason why I will stick with Racknerd. Good job; you have repeatedly gone above and beyond for your customers. I hope the OP takes advantage of their free help.
Hi @fluffernutter -- just to clarify, this isn't a daily routine for us (as mentioned - quite rare for a situation like this to surface to begin with). Therefore, we acknowledge that our current process may not be 100% perfect and are open to suggestions for future improvement. At the moment - everything is done on a case by case basis, and we usually notify clients first before anything, though that may not always be the case especially if the activity is particularly disruptive. In either case, we always communicate with our end-user (as we did here with the OP).
SolusVM with KVM virtualization doesn't have an efficient way to only limit frequency. We opt for a temporary core reduction (with communication) as a last resort -- we’ve found that this is still better than a suspension or stopping the VPS entirely. Then when the end user is in communication with us and expresses intent to resolve the matter - we restore back to the original cores right away.
Hi @ivlad -- thank you so much for the kind words and your continued business. We’re not perfect, but what we do aim for is solid service, competitive pricing, and human customer support that's always available and reachable 24x7. We'll continue to refine our processes and do our best for our customers.
We look forward to continuing to work together for many years to come!