CPU Abuse Notices from VirMach

MasonR Community Contributor

Hey all,

I'll preface this post by saying that VirMach is a terrific provider and this thread is not meant to throw shade at them; it's just intended to spark a healthy discussion about a possibly faulty system.

tl;dr: got a CPU abuse message, but VPS is mostly idle. Support insisted there is no chance their abuse detection system is wrong. Anyone else have same experience?


Long version:

I posted over on HostBalls about receiving a CPU abuse notice from VirMach and believing it to be a mistake on their part. To my surprise, I got quite a few replies from others saying they've gotten similar notices erroneously.

Here's a little background - this morning I received an automated email saying that one of my VPSes had been using 240% CPU for many hours and that I needed to reduce CPU usage or my service would be temporarily turned off. I monitor my array of VMs regularly (via live server stats) and was surprised to get this message. I immediately logged into the VPS that I was notified about and found the usage/load -

[Screenshot: htop output]

Low CPU usage and a 0.00 load avg, as expected. I checked the access logs and there weren't any suspicious logins. Root login is disabled and fail2ban is also installed. Really the only thing this VPS runs is a small TeamSpeak server; otherwise it's idle, as seen in the screencap above.

I replied to the ticket that there must be a mistake and that my VPS is barely using any resources. Level 3 support replied that, "this is a system generated message so it won't be wrong," then instructed me how to use the task manager to monitor CPU usage. I replied that I'm using Linux and attached the screenshot above, then got instructions on how to use top/htop to monitor CPU usage (even though the screenshot was an htop cap).

Side note - how does one reach 240% CPU usage with only 2 vCPUs?

How many of you guys have encountered similar issues with VirMach or other providers? Anyone have their VPS temporarily suspended even though they weren't using lots of resources?

Thanked by 1Aidan

Comments

  • qtwrk Member

    customer’s services cannot burst to 95-100% usage for more than 5 minutes so if your service may use more than that or needs longer periods of burst then you will need this add-on.
    According to our terms and conditions, customer’s services cannot average higher than 50% usage within a 2 hour period

    Well, maybe you hit 100% for over 5 minutes. They have a VERY strict CPU policy.

    I don't have VirMach, but I hit CPU limits on another provider where I was compiling something that took hours. Afterwards my VPS CPU was limited to 400 MHz or something; luckily I didn't get suspended.
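The two limits quoted above (no more than 5 minutes at 95-100%, and no more than a 50% average within a 2 hour period) can be checked against your own usage history. The following is only a minimal sketch: it assumes evenly spaced samples expressed as a percentage of the plan's CPU allocation, and the function name, parameters, and sample data are hypothetical. How the provider actually samples or evaluates usage is not stated in this thread.

```python
from typing import Dict, Sequence


def violates_policy(samples: Sequence[float], interval_s: int = 60,
                    burst_pct: float = 95.0, burst_limit_s: int = 300,
                    avg_pct: float = 50.0, avg_window_s: int = 7200) -> Dict[str, bool]:
    """Check evenly spaced CPU samples against the two quoted limits.

    Assumption: each sample is a percentage of the plan's CPU allocation,
    taken every `interval_s` seconds.
    """
    # Longest consecutive run of samples at or above the burst threshold.
    longest = run = 0
    for s in samples:
        run = run + 1 if s >= burst_pct else 0
        longest = max(longest, run)
    burst_violation = longest * interval_s > burst_limit_s

    # Rolling average over any full window of avg_window_s seconds.
    n = max(1, avg_window_s // interval_s)
    avg_violation = False
    if len(samples) >= n:
        window_sum = sum(samples[:n])
        avg_violation = window_sum / n > avg_pct
        for i in range(n, len(samples)):
            window_sum += samples[i] - samples[i - n]
            if window_sum / n > avg_pct:
                avg_violation = True
                break
    return {"burst": burst_violation, "average": avg_violation}


# Hypothetical one-minute samples: an idle box with a six-minute spike.
spike = [2.0] * 100 + [100.0] * 6 + [2.0] * 100
print(violates_policy(spike))
```

With the hypothetical data above, the six-minute spike trips the burst rule while the two-hour average stays far below 50%.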

    Thanked by 1MasonR
  • saibal Member

    I have received a notice for 114.8% CPU usage for multiple hours on a single vCPU VPS.

    Thanked by 1MasonR
  • MasonR Community Contributor

    @qtwrk said:
    Well, maybe you hit 100% for over 5 minutes. They have a VERY strict CPU policy.

    The message states that I was using "239.6% CPU for multiple hours." There's a 0% chance that a small TeamSpeak server was using that kind of CPU at 6am.

  • @saibal said:
    I have received a notice for 114.8% CPU usage for multiple hours on a single vCPU VPS.

    “Yes sir you were using 2 cores on a one core VPS”

    Thanked by 2MasonR yomero
  • Harambe Member, Host Rep

    Had 1 VPS suspended for the same 'CPU abuse' reason - completely idle box. I have a feeling their monitoring is using system load, not CPU usage.

    So if someone else hammers the I/O or whatever and it causes the load to spike in your VM - they'll auto suspend/send a notice.
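To see why load and CPU usage can disagree, here is a rough sketch that computes both from the standard Linux `/proc/stat` and `/proc/loadavg` formats. On Linux the load average also counts tasks in uninterruptible sleep (typically waiting on I/O), so it can be high while the CPU is nearly idle. The sample strings are made up for illustration, and whether the provider's monitoring really keys off load is only the guess made above.

```python
def cpu_busy_percent(stat_before: str, stat_after: str) -> float:
    """Busy CPU percentage between two aggregate 'cpu' lines from /proc/stat."""
    def parse(line: str):
        # Fields: user nice system idle iowait irq softirq steal
        # (guest time is already included in user, so it is skipped here).
        fields = [int(x) for x in line.split()[1:9]]
        idle = fields[3] + (fields[4] if len(fields) > 4 else 0)  # idle + iowait
        return idle, sum(fields)

    idle0, total0 = parse(stat_before)
    idle1, total1 = parse(stat_after)
    delta_total = total1 - total0
    if delta_total <= 0:
        return 0.0
    return 100.0 * (delta_total - (idle1 - idle0)) / delta_total


def load_1min(loadavg_text: str) -> float:
    """One-minute load average from the contents of /proc/loadavg."""
    return float(loadavg_text.split()[0])


# Made-up samples: lots of iowait, very little actual CPU work.
BEFORE = "cpu  1000 0 500 98000 500 0 0 0 0 0"
AFTER = "cpu  1010 0 505 98900 1085 0 0 0 0 0"
LOADAVG = "2.35 1.90 1.20 1/150 4321"

print(f"CPU busy: {cpu_busy_percent(BEFORE, AFTER):.1f}%")
print(f"1-min load: {load_1min(LOADAVG):.2f}")
```

In this invented example the CPU is about 1% busy while the one-minute load sits above 2, which is the kind of mismatch described above.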

    Thanked by 2MasonR emgh
  • AlyssaD Member
    edited July 2018

    Have you asked for logs of this high cpu usage?

    Also, why not spin up an instance of LibreNMS, Observium, or something else? That way you have logs you can counter with. This is a huge reason why I monitor all my VMs with LibreNMS. I know if something is amiss, not working, or running weirdly.
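If a full monitoring stack is more than you want on a small box, even a tiny script that appends load samples to a file gives you history to point at in a ticket. This is only a sketch, not a replacement for the tools named above; the file name `load_history.csv` and the function names are hypothetical.

```python
import os
import time


def format_row(unix_time: int, loads) -> str:
    """Format one CSV row: timestamp plus the 1/5/15 minute load averages."""
    load1, load5, load15 = loads
    return f"{unix_time},{load1:.2f},{load5:.2f},{load15:.2f}"


def append_sample(path: str = "load_history.csv") -> None:
    """Append the current load averages to a CSV file (hypothetical path)."""
    with open(path, "a") as fh:
        fh.write(format_row(int(time.time()), os.getloadavg()) + "\n")


# Example of what a stored row looks like for an idle machine.
print(format_row(1530000000, (0.0, 0.01, 0.05)))
```

Calling `append_sample()` once a minute from a scheduler such as cron would build up a timestamped record covering any period a notice refers to.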

    Thanked by 2MasonR maverickp
  • MasonR Community Contributor

    @AlyssaD said:
    Have you asked for logs of this high cpu usage?

    I have not. I doubt they have them since their system is always right, though :P

    Control panel only has graphs for network traffic and i/o usage.

  • MasonR said: "this is a system generated message so it won't be wrong,"

    We are never wrong!

    Thanked by 1MasonR
  • @MasonR said:

    @AlyssaD said:
    Have you asked for logs of this high cpu usage?

    I have not. I doubt they have them since their system is always right, though :P

    Control panel only has graphs for network traffic and i/o usage.

    Ask them for logs, to prove it to you. Tell them you found zero things that would cause that high load for hours. Say you need extra info to help figure out what is going on.

    Thanked by 1MasonR
  • defkev Member
    edited July 2018

    Same here with a SSD1G

    First a "warning" on last Sunday

    119.8% CPU for multiple hours

    Today a shutdown, again (coincidence?)

    119.8% CPU for multiple hours

    I do monitor all my boxes, with a high load trigger at 5% over 1/5/15 mins, and I have absolutely no idea what they are on about, especially not with the "multiple hours" part.

    Either their anti-abuse system is flawed or they have oversold their stuff and are trying to get rid of people.

    Whatever it is, it's getting quite annoying, especially since the box has been up for half a year (set up and forget) and never gave me any problems.

    Thanked by 1MasonR
  • More than likely it is a misconfigured server that is stuck on something.

  • AlyssaD Member
    edited July 2018

    Wait Wait Wait! How long is your /var/log/auth.log?

    fail2ban searches through it every now and then to verify, add, and remove entries.

    https://github.com/fail2ban/fail2ban/issues/1339

  • MasonR Community Contributor

    @AlyssaD said:
    Wait Wait Wait! How long is your /var/log/auth.log?

    fail2ban searches through it every now and then to verify, add, and remove entries.

    https://github.com/fail2ban/fail2ban/issues/1339

    Couple Megs. Hmm. Auth and fail2ban logs look normal around the time I got the message, though. And I've never experienced this issue (CPU abuse notice) on any other VPS I have, including LES machines, all running fail2ban. I am running Debian 8 (jessie) if that matters.

  • Chuck Member

    The computer is never wrong. That's why I would use a self-driving car. I would believe my domestic robot when it said that I was wrong.

  • @MasonR said:

    @AlyssaD said:
    Wait Wait Wait! How long is your /var/log/auth.log?

    fail2ban searches through it every now and then to verify, add, and remove entries.

    https://github.com/fail2ban/fail2ban/issues/1339

    Couple Megs. Hmm. Auth and fail2ban logs look normal around the time I got the message, though. And I've never experienced this issue (CPU abuse notice) on any other VPS I have, including LES machines, all running fail2ban. I am running Debian 8 (jessie) if that matters.

    The thing is, if your auth.log is long, it has to search through every entry during certain times. It has pegged my CPU in the past.

    Thanked by 1MasonR
  • MikeA Member, Patron Provider
    edited July 2018

    Flawed system. Exact reason I manually check things before suspending KVM.

  • AnthonySmith Member, Patron Provider

    My guess.

    They are getting the info from top themselves, which is completely unreliable for monitoring actual guest usage, and one of the following is true:

    1) You don't have a virtio disk

    2) You don't have a virtio net adapter

    3) Your CPU is QEMU-CPU

    Which will result in them seeing the emulation overhead for your VPS in top. That is not your VPS using the CPU; it is their host node using it on behalf of your qemu process.

    How do you get 240% from a 2 vCPU system, you ask? You have a broken monitoring system with bad logic, that is how.

    Not sure if they have a rep here, but if they do, they should seriously look into forcing cpu-passthrough and exact match along with virtio drivers when possible, and monitor through something like atopsar or virt-top with scripted aggregation for actual use.

    Thanked by 3MikeA MasonR emgh
  • Yura Member

    And nobody tagged him still.

    @virmach

  • Harambe Member, Host Rep

    @AlyssaD said:

    The thing is, if your auth.log is long, it has to search through every entry during certain times. It has pegged my CPU in the past.

    I've only had that for a short period at boot or after restarting fail2ban. Even with a gigantic auth.log you're still only talking minutes, not hours.

    Thanked by 1Shazan
  • defkev Member

    @AlyssaD said:
    Wait Wait Wait! How long is your /var/log/auth.log?

    No PasswordAuthentication, no fail2ban, no exposed services.

  • defkev Member

    Please update us with the root password for your VPS, we will have a look

    Just got this from a "Tier 3 Technical Support Agent" after responding to the Shutdown ticket with another monitoring graph showing 99% avg idle over three days...

  • When using VPS, how do you aggregate stats for your analysis purposes?

  • MasonR Community Contributor

    @greattomeetyou said:
    When using VPS, how do you aggregate stats for your analysis purposes?

    From the provider side or the client side?

    I use https://github.com/BotoX/ServerStatus to get live server stats and plop it all on a status page. But you can also use HetrixTools or LibreNMS or something similar if you want to collect and store the usage history.

  • Don't buy cheap and use cheap.

  • @MasonR CPU usage is usually calculated based on historical data from cpu time, this is divided and factored based on the amount of virtual machine CPUs, hostnode CPUs and their respective times. If your VM was to use massive amounts of CPU (i.e. a stress test) you can easily make those calculations return values of around 115% for a single core VM. A core in virtualisation doesn't literally mean it can only use 100%, that's simply sort of what's to be expected by the cpu time split based on the VMs cores.
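As a rough illustration of the CPU-time calculation described above, the sketch below divides the CPU seconds consumed over an interval by the wall-clock length of that interval. It assumes the measurement is taken on the host against the whole VM process, so time spent in emulation or I/O threads is added to the vCPU time. The function name and all numbers are hypothetical, and this is not a description of any provider's actual formula.

```python
def host_view_percent(vcpu_seconds: float, overhead_seconds: float,
                      wall_seconds: float) -> float:
    """CPU usage as a host-side monitor might compute it for one VM process.

    vcpu_seconds:     CPU time consumed by the guest's vCPU threads
    overhead_seconds: CPU time consumed by emulation / I/O threads (assumed)
    wall_seconds:     length of the measurement interval
    """
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return 100.0 * (vcpu_seconds + overhead_seconds) / wall_seconds


# Hypothetical single-vCPU guest pegged for 60 s, plus 9 s of host-side overhead.
print(f"{host_view_percent(60.0, 9.0, 60.0):.1f}%")
```

With these invented inputs, a single-vCPU guest running flat out shows up at 115%, which lines up with the "around 115%" figure mentioned above.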

    Thanked by 2MasonR JohnMiller92
  • JohnMiller92 Member
    edited July 2018

    Noob question, but how does CPU usage go above 100%? For example, I'm on a i7 2600 and if I use more cores+threads, I will never get past 100%

    edit: I just saw Anthony's answer btw, does help a bit thanks.

  • MasonR Community Contributor

    @JohnMiller92 said:
    Noob question, but how does CPU usage go above 100%? For example, I'm on a i7 2600 and if I use more cores+threads, I will never get past 100%

    I just read Anthony's answer btw, does help a bit thanks.

    Typically it'd be 100% per fully utilized core on your VPS. So running one core at max - 100%, two cores at max - 200%, etc. Which would also correspond to your load average, one core at max 1.00 load, two cores at max 2.00 load, etc (assuming little to no disk usage).

    @florianb's answer might be closer to the truth, but what I said above is probably a simplification of that.
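Under the simplified "100% per fully utilized core" view, the most a guest can account for by itself is 100% times its vCPU count. A quick sanity check against the figures reported in this thread might look like the sketch below; the function name is hypothetical.

```python
def exceeds_guest_ceiling(reported_percent: float, vcpus: int) -> bool:
    """True if a reported figure is above 100% per vCPU, i.e. more than the
    guest could show from inside even with every core fully busy."""
    return reported_percent > 100.0 * vcpus


# Figures quoted in this thread.
print(exceeds_guest_ceiling(239.6, 2))  # notice on a 2 vCPU VPS
print(exceeds_guest_ceiling(114.8, 1))  # notice on a single vCPU VPS
```

Both reported figures are above that ceiling, so under this simplified model they cannot come from guest-visible usage alone.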

    Thanked by 1JohnMiller92
  • JohnMiller92 Member
    edited July 2018

    @MasonR said:

    @JohnMiller92 said:
    Noob question, but how does CPU usage go above 100%? For example, I'm on a i7 2600 and if I use more cores+threads, I will never get past 100%

    I just read Anthony's answer btw, does help a bit thanks.

    Typically it'd be 100% per fully utilized core on your VPS. So running one core at max - 100%, two cores at max - 200%, etc. Which would also correspond to your load average, one core at max 1.00 load, two cores at max 2.00 load, etc (assuming little to no disk usage).

    @florianb's answer might be closer to the truth, but what I said above is probably a simplification of that.

    I see, so it adds them up per core (all having each up to 100%), then accumulates their usages? For example, if you have a 3 vCore box, your "max" CPU usage is essentially 300%, not 100%? If I got that right

    Thanked by 1angstrom
  • There might be some flaws in their system somewhere. In my case it was a bandwidth spike, even though I hadn't used the VPS since running a benchmark. They are still a good provider, but after a long conversation over a series of tickets they are not going to fit my taste.

  • edited July 2018

    MasonR said: using 240% CPU for many hours

    They claimed many hours, so you just have to prove otherwise? Do you happen to have logs or stats that show otherwise?
