CPU steal times with various providers

atomi · November 2020

Most of us run servers with several providers, so I decided to write a slightly different kind of review. I monitor my servers with several tools, one of which is a self-hosted Zabbix. It reports many details, but here I'm mostly talking about CPU steal. Here is a short description for those who are not familiar with the term:

CPU Steal = The percentage of time spent in involuntary wait by the virtual CPU while the hypervisor is servicing another virtual processor.

That makes it one indicator of whether a hosting provider is overselling its host nodes, which is why I thought people would be interested in these stats. You can see CPU steal on your own server with 'top' (the 'st' value on the %Cpu(s) line).
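If you'd rather log steal yourself than eyeball it in top, here is a minimal Python sketch. It assumes a Linux guest with a kernel recent enough to report the steal counter as the eighth value on the aggregate "cpu" line of /proc/stat. The function names and the 5 second interval are my own choices for illustration, not anything Zabbix or top uses internally.

```python
import time


def parse_cpu_line(line):
    """Return the counters from the aggregate 'cpu' line of /proc/stat as ints."""
    fields = line.split()
    if fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(v) for v in fields[1:]]


def steal_percent(before, after):
    """Steal share (in %) of all CPU time that elapsed between two samples."""
    # Field order: user nice system idle iowait irq softirq steal guest guest_nice
    # guest and guest_nice are already counted inside user and nice, so only
    # the first eight counters are summed to avoid double counting.
    deltas = [b - a for a, b in zip(before[:8], after[:8])]
    total = sum(deltas)
    return 100.0 * deltas[7] / total if total else 0.0


def sample_steal(interval=5.0, path="/proc/stat"):
    """Read /proc/stat twice, `interval` seconds apart, and return steal in %."""
    with open(path) as f:
        first = parse_cpu_line(f.readline())
    time.sleep(interval)  # arbitrary sampling interval
    with open(path) as f:
        second = parse_cpu_line(f.readline())
    return steal_percent(first, second)
```

Calling `sample_steal()` in a loop and storing the results gives you the raw series that an average like the ones below would be computed from.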

These stats are from the last 7 days, and all of these virtual servers are low-end packages with a single vCore and around 1 GB of RAM. Some of my servers run actual services written in PHP, so they are basic LEMP environments. The services are pretty similar, so there are no big differences in load, and some servers show solid CPU steal even when idling. Pink is CPU steal in these graphs.

Netcup VPS G7 (average 12.07%)

BuyVM LU (average 7.99%)

Cloudcone LA (average 14.34%)

Scaleway Start1 (average 5.62%)

Virmach ATL (idle, average 3.22%)

Towwwer.host (idle, average 9.25%)

Even though I highlighted some providers here for CPU steal, it can vary depending on the host node. For example, I have two very similar VPSs with BuyVM LU and one of them doesn't have any steal. Some providers, like Cloudcone or Scaleway, show a great amount of steal even after you are moved to a different host node.
There are also providers that show no CPU steal at all, even though I expected some: Hostsolutions.ro, Alphavps, OVHcloud...
Over the years I have already cancelled some VPSs with quite bad results; the worst was almost 50%.

One thing I noticed is that almost none of these highlighted servers have iowait, which can also be an indicator of overselling. I have a couple of servers with high iowait and LEMP usage similar to the servers above. Maybe I will do a separate review of those later.

Feel free to share more experiences with your CPU steal times

AC_Fan · November 2020

For NetCup, G7 is 2 generations old, and a VPS with them does not have guaranteed CPU. The RS line has unlimited CPU usage and a guarantee of steal being below 3%, and the 2000 G9 is one of the best options right now, considering CPU performance per dollar.

BuyVM does have unlimited CPU on most plans, so (ab)use is practically guaranteed. CloudCone, Scaleway and VirMach seem to be well-known for their poor CPU performance (somebody provided a screenshot of steal of over 50% on Scaleway too).

atomi · November 2020

I know that G7 is not the latest, but the RS line costs almost as much per month as some of these servers cost per year.

Also Php-friends and Avoro are solid options when looking for dedicated CPUs

AC_Fan · November 2020

@atomi said:
I know that G7 is not the latest, but the RS line costs almost as much per month as some of these servers cost per year.

Also Php-friends and Avoro are solid options when looking for dedicated CPUs

Yup, it goes something like this (assuming well balanced host nodes)
CPU: Avoro ~ NetCup > PHP-Friends > Hetzner AX
RAM: PHP-Friends > NetCup > Hetzner > Avoro
Disk: NetCup > Hetzner > PHP-Friends > Avoro.
Generous and decent bandwidth on all systems.

On average, the best system is NetCup, then Hetzner or PHP-Friends, and Avoro at the end.

seriesn · November 2020

I know @Francisco won't be stealing CPUs in the future. LU is getting Ryzened up.

About Ryzen, we haven't had a case of stolen CPU for quite some time either.

Zerpy · November 2020

CPU steal doesn't always mean an oversold hypervisor; it can also be the hypervisor throttling the VM because its average usage is too high (e.g. try using up your burst allocation on Lightsail and you'll see 85-90% steal).

With that said - nice metrics

I'll add a few just because:

mvps.net:

e2enetworks:

first-root:

heficed (this one had an issue which caused a bunch of downtime):

DigitalFyre:

Ginernet:

flowvps:

Skylonhost:

Providers with absolutely zero iowait (or less than 1% basically):

OVH
ArubaCloud
Hetzner
DigitalOcean
iwstack.com
ServeTheWorld
NetCup Root Server (G8)

yokowasis · November 2020

@AC_Fan said:

Yup, it goes something like this (assuming well balanced host nodes)
CPU: Avoro ~ NetCup > PHP-Friends > Hetzner AX
RAM: PHP-Friends > NetCup > Hetzner > Avoro
Disk: NetCup > Hetzner > PHP-Friends > Avoro.
Generous and decent bandwidth on all systems.

On average, the best system is NetCup, then Hetzner or PHP-Friends, and Avoro at the end.

Isn't the Hetzner AX line pretty much the best bang for the buck?

I mean, for 40 EUR you get a Ryzen dedicated server with a 15k PassMark score.

AC_Fan · November 2020

@yokowasis said:

Isn't the Hetzner AX line pretty much the best bang for the buck?

I mean, for 40 EUR you get a Ryzen dedicated server with a 15k PassMark score.

Depends on the workload. Single-threaded? Yup, the 3600 is your best bet. But multi-threaded tasks, like most server workloads? The 15.6 euro Root Server has an average GB5 of 3142, while the AX41-NVMe (7194 GB5 average) starts at 34 euro, plus an additional 5 euro for ECC and a 39 euro setup fee. Bottom line: the RS gets you roughly 1.18x the GB5 score per euro.
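For anyone who wants to check the arithmetic, here is a rough Python sketch. The post doesn't say how the setup fee was counted, so spreading it over 12 months is purely an assumption on my part (it happens to land on the same 1.18x figure). The function name `score_per_euro` is made up for this example.

```python
def score_per_euro(gb5_score, monthly_price, setup_fee=0.0, months=12):
    """GB5 points per euro of effective monthly cost.

    Any setup fee is spread evenly over `months` (an assumption, see above).
    """
    effective_monthly = monthly_price + setup_fee / months
    return gb5_score / effective_monthly


# Figures quoted in the post: NetCup Root Server vs. Hetzner AX41-NVMe with ECC.
rs = score_per_euro(3142, 15.6)
ax41 = score_per_euro(7194, 34 + 5, setup_fee=39, months=12)
ratio = rs / ax41

print(f"RS: {rs:.1f}  AX41-NVMe: {ax41:.1f}  ratio: {ratio:.2f}")
# prints: RS: 201.4  AX41-NVMe: 170.3  ratio: 1.18
```

If you leave the setup fee out entirely, the gap shrinks to about 1.09x, so the result is fairly sensitive to how long you plan to keep the server.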

Francisco · November 2020

@AC_Fan said: BuyVM does have unlimited CPU on most plans, so (ab)use is practically guaranteed. CloudCone, Scaleway and VirMach seem to be well-known for their poor CPU performance (somebody provided a screenshot of steal of over 50% on Scaleway too).

Not quite, we cap people that rim their shared cores non stop.

The E3's are harder to manage for. We put out the shared plans with good intentions (basically a cheap way for people to get a taste) but since we don't do automatic capping it can get a bit busy.

The Ryzens provide a mountain of buffer for us, with most nodes running at around 40% utilization. Some new Stallion updates cut down node-side system interrupts a lot as well.

Francisco

AC_Fan · November 2020

@Francisco said:

@AC_Fan said: BuyVM does have unlimited CPU on most plans, so (ab)use is practically guaranteed. CloudCone, Scaleway and VirMach seem to be well-known for their poor CPU performance (somebody provided a screenshot of steal of over 50% on Scaleway too).

Not quite, we cap people that rim their shared cores non stop.

The E3's are harder to manage for. We put out the shared plans with good intentions (basically a cheap way for people to get a taste) but since we don't do automatic capping it can get a bit busy.

The Ryzen's provide a mountain of buffer for us with most nodes running around 40% utilization. Some new stallion updates cuts down node side system interrupts a lot as well

Francisco

Yeah, I assumed it was because it was an older node and you prefer showing leniency. I'm sure your Ryzen nodes are lovely, if a bit hard to come by.

Francisco · November 2020

@AC_Fan said: Yeah, I assumed it was because it was an older node and you prefer showing leniency. I'm sure your Ryzen nodes are lovely, if a bit hard to come by.

They aren't in LU yet; I'm loading the pallet today.

I was supposed to ship it like 3 weeks ago but we kept siphoning off nodes to feed LV & NY. We have now put any new nodes for those locations on hold until LU gets shipped.

Slabs are also going out on this shipment! Couldn't be more excited for that. We've been promising them in LU for at least a year at this point.

Francisco

jahrinc · November 2020

@Zerpy said:
CPU steal doesn't always mean oversold hypervisors, but can also be the hypervisor throttling the VM due to too high average (e.g. try use your burst allocation on lightsail and see 85-90% steal).

With that said - nice metrics

I'll add a few just because:

mvps.net:

e2enetworks:

first-root:

heficed (this one had an issue which caused a bunch of downtime):

DigitalFyre:

Ginernet:

flowvps:

Skylonhost:

Providers with absolutely zero iowait (or less than 1% basically):

OVH

ArubaCloud

Hetzner

DigitalOcean

iwstack.com

ServeTheWorld

NetCup Root Server (G8)

If you don't mind me asking, what app are you using to get these handsome charts?

Zerpy · November 2020

@jahrinc said:
If you don't mind me asking, what app are you using to get these handsome charts?

Stats are gathered using netdata (does per second metrics) - then Prometheus scrapes the endpoints every 10 seconds. Data is then visualized in Grafana.

jahrinc · November 2020

@Zerpy said: Stats are gathered using netdata (does per second metrics) - then Prometheus scrapes the endpoints every 10 seconds. Data is then visualized in Grafana.

Awesome, I already use Netdata, so I might have to research how to set up Prometheus and Grafana!

Did you follow any guides? Pointers would be appreciated!

Thanks

eva2000 · November 2020

@atomi said: Feel free to share more experiences with your CPU steal times

Nice, I always love seeing stats like this. Here are 2 of my UpCloud servers with Centmin Mod's native cminfo sar-cpu output.

The 7 day average for CPU steal is 0.02%, the max is 0.43%, the 95th percentile is 0.09% and the 99th percentile is 0.20%.

lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                1
On-line CPU(s) list:   0
Thread(s) per core:    1
Core(s) per socket:    1
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 85
Model name:            Intel(R) Xeon(R) Gold 6136 CPU @ 3.00GHz
Stepping:              4
CPU MHz:               2992.968
BogoMIPS:              5985.93
Hypervisor vendor:     KVM
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              4096K
L3 cache:              16384K
NUMA node0 CPU(s):     0
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology eagerfpu pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 arat umip pku ospke spec_ctrl

cminfo sar-cpu

------------------------------------------------------------------
 CPU Utilisation % Last 7 days (1 CPU Threads):
------------------------------------------------------------------
%CPU  min:  %user:  2.26   %nice:  0.00  %system:  4.06   %iowait:  0.00   %steal:  0.00  %idle:  0.00
%CPU  avg:  %user:  2.95   %nice:  0.02  %system:  4.88   %iowait:  0.03   %steal:  0.02  %idle:  92.10
%CPU  max:  %user:  77.56  %nice:  4.79  %system:  62.05  %iowait:  11.55  %steal:  0.43  %idle:  93.64
%CPU  50%:  %user:  2.51   %nice:  0.00  %system:  4.60   %iowait:  0.01   %steal:  0.01  %idle:  92.87
%CPU  75%:  %user:  2.62   %nice:  0.00  %system:  4.87   %iowait:  0.01   %steal:  0.02  %idle:  93.12
%CPU  90%:  %user:  2.72   %nice:  0.00  %system:  5.11   %iowait:  0.01   %steal:  0.06  %idle:  93.29
%CPU  95%:  %user:  2.84   %nice:  0.00  %system:  5.26   %iowait:  0.02   %steal:  0.09  %idle:  93.35
%CPU  99%:  %user:  7.39   %nice:  0.02  %system:  6.07   %iowait:  0.29   %steal:  0.20  %idle:  93.47

------------------------------------------------------------------
 CPU Utilisation % Daily Last 7 days (1 CPU Threads):
------------------------------------------------------------------
Nov 13 2020 %CPU
%CPU  min:  %user:  2.26   %nice:  0.00  %system:  4.06   %iowait:  0.00   %steal:  0.00  %idle:  0.00
%CPU  avg:  %user:  5.49   %nice:  0.02  %system:  6.27   %iowait:  0.13   %steal:  0.01  %idle:  88.08
%CPU  max:  %user:  77.56  %nice:  4.48  %system:  62.05  %iowait:  11.55  %steal:  0.24  %idle:  93.63
%CPU  95%:  %user:  24.15  %nice:  0.00  %system:  15.51  %iowait:  0.17   %steal:  0.04  %idle:  93.46
Nov 12 2020 %CPU
%CPU  min:  %user:  2.30  %nice:  0.00  %system:  4.24  %iowait:  0.00  %steal:  0.00  %idle:  85.74
%CPU  avg:  %user:  2.57  %nice:  0.02  %system:  4.69  %iowait:  0.02  %steal:  0.02  %idle:  92.68
%CPU  max:  %user:  7.39  %nice:  4.49  %system:  6.08  %iowait:  1.16  %steal:  0.25  %idle:  93.44
%CPU  95%:  %user:  2.76  %nice:  0.00  %system:  5.17  %iowait:  0.01  %steal:  0.08  %idle:  93.32
Nov 11 2020 %CPU
%CPU  min:  %user:  2.29  %nice:  0.00  %system:  4.09  %iowait:  0.00  %steal:  0.00  %idle:  85.48
%CPU  avg:  %user:  2.61  %nice:  0.02  %system:  4.74  %iowait:  0.02  %steal:  0.03  %idle:  92.59
%CPU  max:  %user:  7.55  %nice:  4.68  %system:  6.28  %iowait:  1.18  %steal:  0.43  %idle:  93.62
%CPU  95%:  %user:  2.82  %nice:  0.00  %system:  5.29  %iowait:  0.01  %steal:  0.15  %idle:  93.35
Nov 10 2020 %CPU
%CPU  min:  %user:  2.29  %nice:  0.00  %system:  4.15  %iowait:  0.00  %steal:  0.00  %idle:  85.36
%CPU  avg:  %user:  2.61  %nice:  0.02  %system:  4.73  %iowait:  0.02  %steal:  0.03  %idle:  92.60
%CPU  max:  %user:  7.42  %nice:  4.65  %system:  6.28  %iowait:  1.35  %steal:  0.32  %idle:  93.52
%CPU  95%:  %user:  2.76  %nice:  0.00  %system:  5.18  %iowait:  0.01  %steal:  0.08  %idle:  93.32
Nov 09 2020 %CPU
%CPU  min:  %user:  2.32  %nice:  0.00  %system:  4.24  %iowait:  0.00  %steal:  0.00  %idle:  85.67
%CPU  avg:  %user:  2.65  %nice:  0.02  %system:  4.81  %iowait:  0.02  %steal:  0.03  %idle:  92.47
%CPU  max:  %user:  7.66  %nice:  4.45  %system:  6.25  %iowait:  1.34  %steal:  0.16  %idle:  93.42
%CPU  95%:  %user:  2.83  %nice:  0.00  %system:  5.28  %iowait:  0.02  %steal:  0.08  %idle:  93.23
Nov 08 2020 %CPU
%CPU  min:  %user:  2.26  %nice:  0.00  %system:  4.19  %iowait:  0.00  %steal:  0.00  %idle:  84.70
%CPU  avg:  %user:  2.58  %nice:  0.02  %system:  4.54  %iowait:  0.02  %steal:  0.01  %idle:  92.83
%CPU  max:  %user:  7.03  %nice:  4.69  %system:  5.93  %iowait:  1.90  %steal:  0.09  %idle:  93.46
%CPU  95%:  %user:  2.81  %nice:  0.00  %system:  4.87  %iowait:  0.02  %steal:  0.02  %idle:  93.34
Nov 07 2020 %CPU
%CPU  min:  %user:  2.26  %nice:  0.00  %system:  4.07  %iowait:  0.00  %steal:  0.00  %idle:  85.42
%CPU  avg:  %user:  2.51  %nice:  0.02  %system:  4.52  %iowait:  0.01  %steal:  0.01  %idle:  92.93
%CPU  max:  %user:  7.10  %nice:  4.68  %system:  5.98  %iowait:  1.46  %steal:  0.06  %idle:  93.64
%CPU  95%:  %user:  2.92  %nice:  0.00  %system:  4.85  %iowait:  0.02  %steal:  0.02  %idle:  93.39
Nov 06 2020 %CPU
%CPU  min:  %user:  2.26  %nice:  0.00  %system:  4.10  %iowait:  0.00  %steal:  0.00  %idle:  84.96
%CPU  avg:  %user:  2.59  %nice:  0.02  %system:  4.77  %iowait:  0.01  %steal:  0.05  %idle:  92.56
%CPU  max:  %user:  7.09  %nice:  4.79  %system:  6.00  %iowait:  1.60  %steal:  0.43  %idle:  93.63
%CPU  95%:  %user:  2.83  %nice:  0.00  %system:  5.29  %iowait:  0.01  %steal:  0.17  %idle:  93.32

The 7 day average for CPU steal is 0.00%, the max is 0.01%, the 95th percentile is 0.00% and the 99th percentile is 0.00%.

lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                1
On-line CPU(s) list:   0
Thread(s) per core:    1
Core(s) per socket:    1
Socket(s):             1
NUMA node(s):          1
Vendor ID:             AuthenticAMD
CPU family:            23
Model:                 49
Model name:            AMD EPYC 7542 32-Core Processor
Stepping:              0
CPU MHz:               2894.560
BogoMIPS:              5789.12
Hypervisor vendor:     KVM
Virtualization type:   full
NUMA node0 CPU(s):     0
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm art rep_good nopl extd_apicid eagerfpu pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw topoext perfctr_core retpoline_amd ssbd ibrs ibpb vmmcall fsgsbase tsc_adjust bmi1 avx2 smep bmi2 rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 arat umip spec_ctrl

cminfo sar-cpu

------------------------------------------------------------------
 CPU Utilisation % Last 7 days (1 CPU Threads):
------------------------------------------------------------------
%CPU  min:  %user:  0.03  %nice:  0.00  %system:  0.09  %iowait:  0.00  %steal:  0.00  %idle:  98.01
%CPU  avg:  %user:  0.07  %nice:  0.00  %system:  0.13  %iowait:  0.00  %steal:  0.00  %idle:  99.80
%CPU  max:  %user:  1.63  %nice:  0.07  %system:  0.35  %iowait:  0.20  %steal:  0.01  %idle:  99.87
%CPU  50%:  %user:  0.05  %nice:  0.00  %system:  0.12  %iowait:  0.00  %steal:  0.00  %idle:  99.82
%CPU  75%:  %user:  0.06  %nice:  0.00  %system:  0.13  %iowait:  0.00  %steal:  0.00  %idle:  99.83
%CPU  90%:  %user:  0.07  %nice:  0.00  %system:  0.15  %iowait:  0.00  %steal:  0.00  %idle:  99.84
%CPU  95%:  %user:  0.09  %nice:  0.00  %system:  0.16  %iowait:  0.00  %steal:  0.00  %idle:  99.85
%CPU  99%:  %user:  0.41  %nice:  0.00  %system:  0.25  %iowait:  0.04  %steal:  0.00  %idle:  99.86

------------------------------------------------------------------
 CPU Utilisation % Daily Last 7 days (1 CPU Threads):
------------------------------------------------------------------
Nov 13 2020 %CPU
%CPU  min:  %user:  0.03  %nice:  0.00  %system:  0.09  %iowait:  0.00  %steal:  0.00  %idle:  99.39
%CPU  avg:  %user:  0.06  %nice:  0.00  %system:  0.12  %iowait:  0.00  %steal:  0.00  %idle:  99.82
%CPU  max:  %user:  0.36  %nice:  0.06  %system:  0.27  %iowait:  0.07  %steal:  0.00  %idle:  99.87
%CPU  95%:  %user:  0.09  %nice:  0.00  %system:  0.15  %iowait:  0.00  %steal:  0.00  %idle:  99.85
Nov 12 2020 %CPU
%CPU  min:  %user:  0.04  %nice:  0.00  %system:  0.10  %iowait:  0.00  %steal:  0.00  %idle:  99.37
%CPU  avg:  %user:  0.06  %nice:  0.00  %system:  0.13  %iowait:  0.00  %steal:  0.00  %idle:  99.81
%CPU  max:  %user:  0.39  %nice:  0.05  %system:  0.26  %iowait:  0.18  %steal:  0.00  %idle:  99.86
%CPU  95%:  %user:  0.07  %nice:  0.00  %system:  0.15  %iowait:  0.00  %steal:  0.00  %idle:  99.85
Nov 11 2020 %CPU
%CPU  min:  %user:  0.03  %nice:  0.00  %system:  0.09  %iowait:  0.00  %steal:  0.00  %idle:  98.01
%CPU  avg:  %user:  0.13  %nice:  0.00  %system:  0.14  %iowait:  0.00  %steal:  0.00  %idle:  99.72
%CPU  max:  %user:  1.63  %nice:  0.06  %system:  0.34  %iowait:  0.20  %steal:  0.00  %idle:  99.87
%CPU  95%:  %user:  1.07  %nice:  0.00  %system:  0.27  %iowait:  0.01  %steal:  0.00  %idle:  99.84
Nov 10 2020 %CPU
%CPU  min:  %user:  0.04  %nice:  0.00  %system:  0.09  %iowait:  0.00  %steal:  0.00  %idle:  99.14
%CPU  avg:  %user:  0.06  %nice:  0.00  %system:  0.13  %iowait:  0.00  %steal:  0.00  %idle:  99.81
%CPU  max:  %user:  0.45  %nice:  0.06  %system:  0.35  %iowait:  0.08  %steal:  0.01  %idle:  99.87
%CPU  95%:  %user:  0.08  %nice:  0.00  %system:  0.15  %iowait:  0.00  %steal:  0.00  %idle:  99.85
Nov 09 2020 %CPU
%CPU  min:  %user:  0.03  %nice:  0.00  %system:  0.09  %iowait:  0.00  %steal:  0.00  %idle:  99.35
%CPU  avg:  %user:  0.06  %nice:  0.00  %system:  0.13  %iowait:  0.00  %steal:  0.00  %idle:  99.81
%CPU  max:  %user:  0.38  %nice:  0.07  %system:  0.24  %iowait:  0.14  %steal:  0.01  %idle:  99.87
%CPU  95%:  %user:  0.07  %nice:  0.00  %system:  0.15  %iowait:  0.00  %steal:  0.00  %idle:  99.85
Nov 08 2020 %CPU
%CPU  min:  %user:  0.03  %nice:  0.00  %system:  0.09  %iowait:  0.00  %steal:  0.00  %idle:  99.42
%CPU  avg:  %user:  0.06  %nice:  0.00  %system:  0.12  %iowait:  0.00  %steal:  0.00  %idle:  99.82
%CPU  max:  %user:  0.35  %nice:  0.06  %system:  0.27  %iowait:  0.09  %steal:  0.00  %idle:  99.87
%CPU  95%:  %user:  0.07  %nice:  0.00  %system:  0.15  %iowait:  0.01  %steal:  0.00  %idle:  99.85
Nov 07 2020 %CPU
%CPU  min:  %user:  0.04  %nice:  0.00  %system:  0.10  %iowait:  0.00  %steal:  0.00  %idle:  99.19
%CPU  avg:  %user:  0.07  %nice:  0.00  %system:  0.14  %iowait:  0.00  %steal:  0.00  %idle:  99.80
%CPU  max:  %user:  0.53  %nice:  0.06  %system:  0.24  %iowait:  0.15  %steal:  0.00  %idle:  99.85
%CPU  95%:  %user:  0.08  %nice:  0.00  %system:  0.16  %iowait:  0.00  %steal:  0.00  %idle:  99.83
Nov 06 2020 %CPU
%CPU  min:  %user:  0.04  %nice:  0.00  %system:  0.10  %iowait:  0.00  %steal:  0.00  %idle:  99.34
%CPU  avg:  %user:  0.06  %nice:  0.00  %system:  0.14  %iowait:  0.00  %steal:  0.00  %idle:  99.80
%CPU  max:  %user:  0.40  %nice:  0.06  %system:  0.29  %iowait:  0.05  %steal:  0.00  %idle:  99.86
%CPU  95%:  %user:  0.09  %nice:  0.00  %system:  0.18  %iowait:  0.00  %steal:  0.00  %idle:  99.84
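The min/avg/max and percentile rows above can be reproduced from any list of steal samples. This is only a sketch in Python: I don't know which percentile method cminfo uses, so the nearest-rank method here is an assumption, and the function names are invented for the example.

```python
import math


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an ascending list (pct between 0 and 100)."""
    if not sorted_values:
        raise ValueError("no samples")
    # Nearest rank: the smallest value with at least pct% of samples at or below it.
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def summarize(samples):
    """Return min, avg, max and a few percentiles for a list of %steal samples."""
    ordered = sorted(samples)
    return {
        "min": ordered[0],
        "avg": sum(ordered) / len(ordered),
        "max": ordered[-1],
        "50%": percentile(ordered, 50),
        "95%": percentile(ordered, 95),
        "99%": percentile(ordered, 99),
    }
```

The percentiles are worth reporting next to the average because a short burst of steal barely moves a 7 day mean but shows up clearly in the max and 99% rows.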

VirMach · November 2020

@AC_Fan said: CloudCone, Scaleway and VirMach seem to be well-known for their poor CPU performance (somebody provided a screenshot of steal of over 50% on Scaleway too).

If anyone ever reports a high CPU steal, we investigate the matter and always at the very least offer free migration. We have automation in that there are maximum thresholds for CPU usage before a node is locked off. We constantly improve this and have recently added more specific scenarios for accuracy. Right now, we have 7 nodes out of hundreds that are sub-optimal in terms of CPU usage where some abnormal bursts of CPU steal may occur, but they are all already locked off and they're not anywhere close to being "overloaded." Actually, right after typing that I decided to spin up an SSD256 on our highest CPU usage server.

First, I'll provide the CPU usage & load levels.

For reference, the second-worst server is 5% lower CPU usage than this. Definitely not optimal, but again, this is our worst server. We miscalculated the quantity of OpenVZ VMs that got moved here because we didn't correctly anticipate who would be using it at what time since we were also moving dozens of other servers, and there was also a glitch with SolusVM (recently patched) that caused high idle CPU usage per VM running certain operating systems (and this server ended up with a lot of those.)

Now for steal time/other, arbitrarily grabbing it every few seconds --

%Cpu(s):  0.3 us,  0.0 sy,  0.0 ni, 97.7 id,  0.0 wa,  0.0 hi,  0.0 si,  2.0 st
%Cpu(s):  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu(s):  0.3 us,  0.0 sy,  0.0 ni, 98.7 id,  0.0 wa,  0.0 hi,  0.0 si,  1.0 st
%Cpu(s):  0.3 us,  0.0 sy,  0.0 ni, 98.7 id,  0.0 wa,  0.0 hi,  0.0 si,  1.0 st
%Cpu(s):  0.0 us,  0.0 sy,  0.0 ni, 99.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.7 st
%Cpu(s):  0.3 us,  0.0 sy,  0.0 ni, 99.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.3 st
%Cpu(s):  0.0 us,  0.0 sy,  0.0 ni, 99.0 id,  0.0 wa,  0.0 hi,  0.0 si,  1.0 st
%Cpu(s):  0.0 us,  0.3 sy,  0.0 ni, 99.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.7 st
%Cpu(s):  0.0 us,  0.0 sy,  0.0 ni, 99.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.7 st
%Cpu(s):  0.0 us,  0.0 sy,  0.0 ni, 98.7 id,  0.0 wa,  0.0 hi,  0.0 si,  1.3 st
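To turn a pile of lines like these into numbers, something along these lines should work. It is a small Python sketch that assumes the procps-style summary format shown above, where the steal value sits right before "st". The function name is made up, and the sample data is just the ten lines pasted above.

```python
import re

# Matches the number directly in front of the "st" (steal) column.
ST_RE = re.compile(r"([\d.]+)\s*st\b")


def steal_from_top(lines):
    """Extract the steal value from each '%Cpu(s):' summary line of top output."""
    values = []
    for line in lines:
        if not line.startswith("%Cpu"):
            continue  # skip anything that is not a CPU summary line
        match = ST_RE.search(line)
        if match:
            values.append(float(match.group(1)))
    return values


# The ten idle samples pasted above.
IDLE_SAMPLES = """\
%Cpu(s):  0.3 us,  0.0 sy,  0.0 ni, 97.7 id,  0.0 wa,  0.0 hi,  0.0 si,  2.0 st
%Cpu(s):  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu(s):  0.3 us,  0.0 sy,  0.0 ni, 98.7 id,  0.0 wa,  0.0 hi,  0.0 si,  1.0 st
%Cpu(s):  0.3 us,  0.0 sy,  0.0 ni, 98.7 id,  0.0 wa,  0.0 hi,  0.0 si,  1.0 st
%Cpu(s):  0.0 us,  0.0 sy,  0.0 ni, 99.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.7 st
%Cpu(s):  0.3 us,  0.0 sy,  0.0 ni, 99.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.3 st
%Cpu(s):  0.0 us,  0.0 sy,  0.0 ni, 99.0 id,  0.0 wa,  0.0 hi,  0.0 si,  1.0 st
%Cpu(s):  0.0 us,  0.3 sy,  0.0 ni, 99.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.7 st
%Cpu(s):  0.0 us,  0.0 sy,  0.0 ni, 99.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.7 st
%Cpu(s):  0.0 us,  0.0 sy,  0.0 ni, 98.7 id,  0.0 wa,  0.0 hi,  0.0 si,  1.3 st
"""

steal = steal_from_top(IDLE_SAMPLES.splitlines())
average = sum(steal) / len(steal)
peak = max(steal)
```

For these ten samples the average works out to 0.87% with a peak of 2.0%. Ten readings taken a few seconds apart are a tiny window, so treat that as a spot check rather than a 7 day figure.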

Bonus -- bench.sh

----------------------------------------------------------------------
 CPU Model             : Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
 CPU Cores             : 1
 CPU Frequency         : 2499.998 MHz
 CPU Cache             : 4096 KB
 Total Disk            : 9.6 GB (0.9 GB Used)
 Total Mem             : 242 MB (83 MB Used)
 Total Swap            : 255 MB (0 MB Used)
 System uptime         : 0 days, 0 hour 6 min
 Load average          : 0.00, 0.00, 0.00
 OS                    : Debian GNU/Linux 8
 Arch                  : x86_64 (64 Bit)
 Kernel                : 3.16.0-4-amd64
 TCP CC                : cubic
 Virtualization        : KVM
 Organization          : AS36352 ColoCrossing
 Location              : Paris / US
 Region                : Maine
----------------------------------------------------------------------
 I/O Speed(1st run)    : 612 MB/s
 I/O Speed(2nd run)    : 696 MB/s
 I/O Speed(3rd run)    : 583 MB/s
 Average I/O speed     : 630.3 MB/s
----------------------------------------------------------------------
 Node Name        Upload Speed      Download Speed      Latency
 Speedtest.net    397.99 Mbps       1315.76 Mbps        81.42 ms
 Beijing    CU    0.22 Mbps         94.43 Mbps          213.16 ms
 Shanghai   CT    0.53 Mbps         874.67 Mbps         143.71 ms
 Guangzhou  CT    0.31 Mbps         528.52 Mbps         181.81 ms
 Shenzhen   CM    35.76 Mbps        223.17 Mbps         168.27 ms
 Hongkong   CN    3.29 Mbps         470.51 Mbps         251.34 ms
 Singapore  SG    205.52 Mbps       818.65 Mbps         188.49 ms
 Tokyo      JP    64.49 Mbps        88.04 Mbps          98.63 ms
----------------------------------------------------------------------

Bonus -- Steal percentage/other while a partial Geekbench run is in progress

%Cpu(s):  6.5 us,  1.8 sy,  0.0 ni, 85.7 id,  2.8 wa,  0.0 hi,  0.2 si,  3.0 st
%Cpu(s):  2.6 us, 12.5 sy,  0.0 ni,  0.0 id, 77.0 wa,  0.0 hi,  0.0 si,  8.0 st
%Cpu(s):  4.8 us, 13.2 sy,  0.0 ni,  0.0 id, 72.0 wa,  0.0 hi,  0.0 si, 10.0 st
%Cpu(s): 60.9 us,  2.0 sy,  0.0 ni, 32.8 id,  2.7 wa,  0.0 hi,  0.0 si,  1.7 st
%Cpu(s): 97.7 us,  1.3 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  1.0 st
%Cpu(s): 96.0 us,  0.7 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  3.3 st
%Cpu(s): 57.5 us,  1.0 sy,  0.0 ni, 33.2 id,  2.0 wa,  0.0 hi,  0.0 si,  6.3 st
%Cpu(s): 83.7 us,  0.7 sy,  0.0 ni, 15.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.3 st
%Cpu(s): 87.2 us,  7.5 sy,  0.0 ni,  0.0 id,  1.3 wa,  0.0 hi,  0.0 si,  3.9 st
%Cpu(s): 97.7 us,  0.3 sy,  0.0 ni,  0.0 id,  1.0 wa,  0.0 hi,  0.0 si,  1.0 st
%Cpu(s): 61.1 us,  1.3 sy,  0.0 ni, 33.2 id,  2.3 wa,  0.0 hi,  0.0 si,  2.0 st
%Cpu(s): 97.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  3.0 st
%Cpu(s): 93.4 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  6.6 st
%Cpu(s): 70.4 us,  0.0 sy,  0.0 ni, 28.6 id,  0.0 wa,  0.0 hi,  0.0 si,  1.0 st
%Cpu(s):  4.3 us,  0.0 sy,  0.0 ni, 32.9 id, 61.2 wa,  0.0 hi,  0.0 si,  1.6 st
%Cpu(s): 78.6 us,  8.8 sy,  0.0 ni,  0.0 id,  9.1 wa,  0.3 hi,  0.0 si,  3.2 st
%Cpu(s): 39.7 us,  0.7 sy,  0.0 ni, 58.0 id,  1.3 wa,  0.0 hi,  0.0 si,  0.3 st
%Cpu(s):  0.0 us,  0.3 sy,  0.0 ni, 99.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.3 st
%Cpu(s):  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st

Bonus -- Geekbench5 but with 1GB RAM (due to error with 256MB.)
https://browser.geekbench.com/v5/cpu/4707640

Then again, if you feel like there's a problem, go ahead and check it out and if you see high steal time or other objective issues, create a ticket with your findings and there's usually no reason for us not to move you to another node. If the server's actually higher than we'd like it to be, we sometimes reach out and request some people to move voluntarily, it's a win-win situation.

AC_Fan · November 2020

@VirMach My apologies, I mixed you in with CloudCone and Scaleway by mistake. Your performance isn't necessarily poor, but rather restricted. You have your limits quite explicitly defined in your FUP/AUP, which is certainly different from the norm. Certainly haven't heard bad things about you overall, just that your CPU policy is quite unusual.

VirMach · November 2020

@AC_Fan said:
@VirMach My apologies, I mixed you in with CloudCone and Scaleway by mistake. Your performance isn't necessarily poor, but rather restricted. You have your limits quite explicitly defined in your FUP/AUP, which is certainly different from the norm. Certainly haven't heard bad things about you overall, just that your CPU policy is quite unusual.

Our intention with being clear wasn't to be strict or restrictive, but rather set a line so we cannot arbitrarily take action underneath that. This is to try to avoid going by a policy that says "whatever we deem to be abuse" because then customers have no idea what that could end up being.

We always try to be lenient where possible.

Outside of some rare edge cases, this generally means for CPU, as long as it's available and you are not bursting to maximum usage for multiple hours, you should be fine. We have almost entirely switched to an automated system which is much more lenient and gives ample time and explanation in every situation for the customer to decide to either lower their usage (and a lot of times with assistance, it ends up being that they had unknown malware or software acting up) or to upgrade their service.

Under 0.04% of VMs are flagged for evaluation per day, on most days.
Out of those, ~70% are thrown out and forgiven.
The other ~30% receive an initial notice, with no powerdowns or suspensions.
About one server or fewer gets powered down every day or two for multiple notices.
Zero suspensions in sight, going back weeks.

We've gotten very good at balancing out nodes and being lenient where it matters and generally keeping everyone happy.

Perhaps we'll consider changing our AUP to be more in line with what we actually do since we're much more lenient. It would just make it more difficult to take action when required and we wouldn't want to invoke the "whatever we deem abuse" rule as much as other providers do since it's vague.

Zerpy · November 2020

@jahrinc said:
Any guides you followed? will be appreciated!

Well, when I first set up Prometheus I followed a guide to get the config right, but Netdata has documentation for a Prometheus setup as well. After all, Netdata simply exposes a Prometheus-friendly scrape endpoint.

Graphs are graphs, and you configure them as you see fit in Grafana - I guess we all have our different ways of doing the same thing.
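To give an idea of what a scrape endpoint like that returns, here is a small Python sketch that parses Prometheus text-format output and picks out the steal samples. The assumption that the steal series carries a `dimension="steal"` label, and the metric name used in the test data, are from memory of Netdata's output and may differ between versions, so check them against your own endpoint. The function names are invented for the example.

```python
import re

# One sample line of the Prometheus text format: name{labels} value [timestamp]
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)'
    r'(?:\{(?P<labels>.*)\})?'
    r'\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Yield (metric_name, labels_dict, value) for each sample line in `text`."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # not a sample line this simple parser understands
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        yield match.group("name"), labels, float(match.group("value"))


def steal_values(text):
    """Return the values of all samples labelled dimension="steal" (assumed label)."""
    return [
        value
        for _name, labels, value in parse_samples(text)
        if labels.get("dimension") == "steal"
    ]
```

In a real setup Prometheus does this parsing for you on every scrape; the sketch is only meant to show that the endpoint is plain text you can inspect before wiring up Grafana.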

serv_ee · November 2020

@AC_Fan said:

Depends on the workload. Single threaded? Yup, the 3600 is your best bet. But multi threaded tasks, like most server workloads? The 15.6 euro Root Server has an average GB5 of 3142, while the AX41-NVMe (7194 GB5 average) starts at 34 euro, with ECC an additional 5 euro, and a 39 euro setup fee. Bottom line is, the RS has a roughly 1.18x GB5 per dollar.

Cause ECC RAM and CPU steal have...what in common exactly?

AC_Fan · November 2020

@serv_ee said:

Cause ECC RAM and CPU steal have...what in common exactly?

Nothing. Because that specific discussion was about CPU performance per dollar, and it was (IMO, validly) assumed that ECC was a requirement. So, we need to add that cost to the base price for a valid comparison. I genuinely worry about the attention span of the average LET reader, sometimes.
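
For anyone who wants to check the arithmetic behind the "roughly 1.18x" figure, here is a minimal sketch using the prices and GB5 averages quoted above. Spreading the 39 euro setup fee over 12 months is an assumption on my part (the post does not say how the fee was counted), but it does reproduce the number. The function name is just illustrative.

```python
def points_per_euro(gb5_score, monthly_eur, setup_eur=0.0, months=12):
    """Benchmark points per euro of effective monthly cost.

    The setup fee is amortized over `months`; 12 is an assumed value,
    not something stated in the thread.
    """
    effective_monthly = monthly_eur + setup_eur / months
    return gb5_score / effective_monthly


# Figures as quoted in the thread.
rs = points_per_euro(3142, 15.6)                     # Root Server, no setup fee mentioned
ax = points_per_euro(7194, 34 + 5, setup_eur=39)     # AX41-NVMe plus ECC option

print(f"RS:              {rs:.1f} GB5 per euro")
print(f"AX41-NVMe + ECC: {ax:.1f} GB5 per euro")
print(f"ratio:           {rs / ax:.2f}x")
```

Without the setup fee the ratio drops to about 1.09x, so the conclusion leans on how long you keep the server.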

TimboJones · November 2020

@VirMach said:

@AC_Fan said: CloudCone, Scaleway and VirMach seem to be well-known for their poor CPU performance (somebody provided a screenshot of steal of over 50% on Scaleway too).

If anyone ever reports a high CPU steal, we investigate the matter and always at the very least offer free migration.

What is considered a high CPU steal? Unfortunately, I had two servers in New Relic with the same hostname, so I don't have long-term stats, but watching it for the last hour it's been between 3% and 10% with bursts to 15% on SEAKVM14. CPU load average reports 0.05, 0.17, 0.12. On SEAKVM5, I haven't seen the steal above 0.5% for the same watched period.

Also, for BF specials, I don't think people feel free to ticket asking for less congested servers, they'll just take it and bitch about it being overloaded (and half expect that to be the case, anyway). I paid $6.99 for 1 CPU and keep that in mind for expectations.

Bonus -- Geekbench5 but with 1GB RAM (due to error with 256MB.)
https://browser.geekbench.com/v5/cpu/4707640

Then again, if you feel like there's a problem, go ahead and check it out and if you see high steal time or other objective issues, create a ticket with your findings and there's usually no reason for us not to move you to another node. If the server's actually higher than we'd like it to be, we sometimes reach out and request some people to move voluntarily, it's a win-win situation.

FYI, I don't think a GB5 score of 334 is anything to boast about. I've got several VirMach VPSes with GB5 scores in the 300s, whereas every other server I have except one (a 3TB storage server with 6% average steal) scores 500-636, all low end. The VirMach CPU is an E5-2660 v2 @ 2.20GHz, while another VPS provider with the older-generation E5-2660 @ 2.20GHz gets GB5 scores of 554 and 565. So my general impression is less CPU per dollar with VirMach than with other providers, especially when a 4-core VirMach server has a GB5 multi-core score of 1062 and a 2-core server from another provider scores 1070.

Edit: last time I updated my spreadsheet, I had two servers with E5-2660 v2 @ 2.20GHz and now both have Intel Xeon E312xx (Sandy Bridge), so that sucks for me. One hopes to get better CPUs with each change, not worse ones.

p.s. billing.virmach.com (despite 14ms ping from me), is hella slow (25 seconds to load pages) in responding atm.

eva2000 · November 2020

@TimboJones said: What is considered a high CPU steal?

It's relative, but CPU steal is a concern if it limits how much of its CPU your VPS/guest server can actually use, or if it's consistent over time (which is why I measure not just its average but also min, max and percentile metrics for daily and weekly averages from the cminfo sar-cpu output above). For example, when 0% CPU idle is reported (the CPU is fully in use) and you see CPU steal, or when you see CPU steal and you aren't able to use 100% of your CPU.
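
To illustrate why min, max and percentiles say more than the average alone, here is a rough sketch that summarizes a list of steal readings. The sample values are made up, the function names are hypothetical, and the nearest-rank percentile used here is just one common definition; it is not necessarily what cminfo or sar compute.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize_steal(samples):
    """Return min, average, max and 95th percentile of steal readings (in %)."""
    return {
        "min": min(samples),
        "avg": sum(samples) / len(samples),
        "max": max(samples),
        "p95": percentile(samples, 95),
    }


# Hypothetical steal readings in percent, e.g. one per sampling interval.
samples = [0.5, 1.0, 3.0, 4.0, 6.0, 7.0, 8.0, 9.0, 10.0, 15.0]
summary = summarize_steal(samples)
for name, value in summary.items():
    print(f"{name}: {value:.2f}%")
```

An average of a few percent can hide bursts several times higher, which is exactly what the max and p95 columns expose.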

doughnet · November 2020

Is anyone able to run a test on a Contabo SSD VPS? They offer great bang for the buck.

Daniel15 · December 2020

@TimboJones said: What is considered a high CPU steal?

Depends on your use case, but I consider anything above 10% to be very high.

I have three VPSes with BuyVM: One "Slice 4096" (which has "dedicated CPU usage") in Las Vegas, and two "Slice 512" (which has "fair share" CPU usage) in Luxembourg and New York. All three have relatively low CPU steal percentages compared to other providers. You'd think that the Slice 4096 would have lower CPU steal, but in reality the small New York one has consistently lower CPU steal, at least up until the past day.

(green is Luxembourg, red is Las Vegas, blue is New York)

I spoke to Francisco about it and he said there's some known issues with some config thing that may be causing issues with CPU steal, so I'll check it again in a few weeks.

Some providers I use have very good CPU steal percentages, even though they don't have dedicated CPU. My @VirMach is almost always below 1.0%:

And CPU steal on two of my other VPSes (@QuantumCore, and a BudgetNode @Ishaq storage VPS) is so small that it barely registers on a graph. Usually less than 0.1%.

Note that some providers patch their kernel to not report stolen CPU time... In those cases you usually just see 0.0% all the time. I've seen BinaryLane do that.
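
If you want to check what your own guest kernel reports, steal is the eighth counter on the aggregate `cpu` line of `/proc/stat` on Linux. Below is a minimal sketch that turns two snapshots of that line into a steal percentage. The snapshot values are made up and the function names are hypothetical; on a real server you would read the first line of `/proc/stat` twice, a few seconds apart.

```python
def parse_cpu_line(line):
    """Return (total_ticks, steal_ticks) from the aggregate 'cpu' line of /proc/stat."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in fields[1:]]
    # Field order: user nice system idle iowait irq softirq steal guest guest_nice.
    # guest and guest_nice are already included in user and nice,
    # so only the first eight counters are summed for the total.
    total = sum(values[:8])
    steal = values[7] if len(values) > 7 else 0
    return total, steal


def steal_percent(before, after):
    """Steal as a percentage of all CPU time elapsed between two snapshots."""
    total_before, steal_before = parse_cpu_line(before)
    total_after, steal_after = parse_cpu_line(after)
    elapsed = total_after - total_before
    if elapsed <= 0:
        return 0.0
    return 100.0 * (steal_after - steal_before) / elapsed


# Made-up snapshots, for illustration only.
before = "cpu  1000 0 500 8000 100 0 50 350 0 0"
after = "cpu  1100 0 550 8700 110 0 55 485 0 0"
print(f"steal: {steal_percent(before, after):.1f}%")
```

If the steal counter never moves from 0 while the server is clearly sluggish, that may be a hint the value is not being reported, as described above.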
