High CPU Usage on KVM SolusVM

I got a dedicated server from OVH: 16 cores, 32 threads, 6TB of HDD (SOFT/JBOD), 128GB of RAM and gigabit network speed. I installed SolusVM on it for the KVM platform, and now I am facing the following issues.

(I contacted SolusVM before contacting you and they said:
'This will be related to CPU, memory and Disk array on the server. You can check the hardware with your DC')

I have 9 VPS Installed and running

  1. High CPU Load on Main Node.
    top - 00:09:36 up 4 days, 19:31, 2 users, load average: 27.42, 19.99, 18.51
    Cpu(s): 4.6%us, 3.9%sy, 0.0%ni, 88.5%id, 3.0%wa, 0.0%hi, 0.0%si, 0.0%st

All VPSes are currently idle; no client whatsoever is connected or using resources.

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
14173 qemu 20 0 10.8g 451m 5484 S 36.2 0.4 15:13.68 qemu-kvm
14524 qemu 20 0 8647m 447m 5476 S 19.9 0.3 14:04.47 qemu-kvm
14085 qemu 20 0 10.8g 756m 5484 S 19.6 0.6 11:51.64 qemu-kvm
14238 qemu 20 0 10.5g 454m 5484 S 19.6 0.4 15:52.27 qemu-kvm
14664 qemu 20 0 10.5g 448m 5484 S 19.6 0.3 15:08.18 qemu-kvm
14425 qemu 20 0 10.3g 776m 5484 S 18.9 0.6 12:28.16 qemu-kvm
13601 qemu 20 0 10.8g 1.8g 5484 S 18.6 1.4 15:17.75 qemu-kvm
14361 qemu 20 0 10.6g 449m 5484 S 18.2 0.3 14:40.62 qemu-kvm
14568 qemu 20 0 8647m 481m 5440 S 17.6 0.4 13:41.01 qemu-kvm

I have tried giving them all 32 cores, and also lowering the count, but nothing helped.

On an idle VPS, the load is as follows:
load average: 1.03, 0.78, 0.75
Cpu(s): 0.6%us, 0.7%sy, 0.0%ni, 95.6%id, 5.2%wa, 0.0%hi, 0.0%si, 0.0%st

Meanwhile, a friend of mine hosted my custom ISO on a VPS on his KVM node (he has 30 VPSes running on that node) and saw the following load:
load average: 0.09, 0.31, 0.38
Cpu(s): 0.4%us, 0.4%sy, 0.0%ni, 97.1%id, 1.9%wa, 0.1%hi, 0.0%si, 0.0%st

What can be the issue?
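
One way to read the figures quoted above: on Linux, the load average counts tasks in uninterruptible sleep (typically waiting on disk) as well as runnable ones, so a high load next to a mostly idle CPU tends to point at I/O rather than compute. The sketch below only parses the two `top` lines from the post. The function names are made up for illustration, and the thread count of 32 is taken from the server description.

```python
import re

def parse_load(line):
    """Return the (1, 5, 15)-minute load averages from a top/uptime line."""
    m = re.search(r"load average:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)", line)
    if m is None:
        raise ValueError("no load average found")
    return tuple(float(x) for x in m.groups())

def parse_cpu(line):
    """Return a dict such as {'us': 4.6, 'id': 88.5, 'wa': 3.0} from a 'Cpu(s):' line."""
    return {name: float(value) for value, name in re.findall(r"([\d.]+)%(\w+)", line)}

# The two node lines exactly as posted.
node_load = parse_load("top - 00:09:36 up 4 days, 19:31, 2 users, load average: 27.42, 19.99, 18.51")
node_cpu = parse_cpu("Cpu(s): 4.6%us, 3.9%sy, 0.0%ni, 88.5%id, 3.0%wa, 0.0%hi, 0.0%si, 0.0%st")

threads = 32  # from the hardware description in the post
print(f"1-min load per thread: {node_load[0] / threads:.2f}")
print(f"CPU idle: {node_cpu['id']}%, iowait: {node_cpu['wa']}%")
```

The same two helpers can be pointed at the guest figures to compare the `wa` column between the two hosts.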

Comments

  • letbox Member, Patron Provider
    edited April 2016

    The wa seems pretty high for unused services. Did you check your drives? You are doing software what? RAID 1?

  • On the node settings, what do you have set for Disk Cache?

    Can you give us the full output of "top"? If you can, a screenshot may be easier to format within LET.

  • frequency007 Member
    edited April 2016

    Disk Cache is Node Default, while the disk driver is Virtio.
    Probably RAID 5, which is what OVH provides.

    Left is Node while right is a box. Partial active now

  • You have high I/O wait (%wa) on what looks to be RAID 5, so you're going to have a high load during this time.

    I would also suggest setting the Disk Cache to none when using KVM & LVM: change it on the node, then on any VMs you have already built, and restart the VMs.

    What does output of "cat /proc/mdstat" show?
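
For readers following along, the things to look for in `/proc/mdstat` are a member-status string containing `_` (a missing or failed member) and a `resync`/`recovery` progress line. Below is a minimal sketch of that check. The function name and the sample text are invented for illustration and are not the output from this server, and inactive arrays are not handled.

```python
import re

def summarize_mdstat(text):
    """Return one dict per md array: name, level, member status, sync activity."""
    arrays = []
    current = None
    for line in text.splitlines():
        head = re.match(r"^(md\d+)\s*:\s*(\w+)\s+(\S+)", line)
        if head:
            current = {"name": head.group(1), "state": head.group(2),
                       "level": head.group(3), "members": None,
                       "degraded": False, "syncing": None}
            arrays.append(current)
            continue
        if current is None:
            continue
        # e.g. "[3/2] [UU_]": three slots, two working, last member missing
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if status:
            current["members"] = status.group(3)
            current["degraded"] = "_" in status.group(3)
        sync = re.search(r"(resync|recovery|reshape|check)\s*=\s*([\d.]+)%", line)
        if sync:
            current["syncing"] = (sync.group(1), float(sync.group(2)))
    return arrays

# Hypothetical sample, shaped like /proc/mdstat for a rebuilding 3-disk RAID 5.
SAMPLE = """Personalities : [raid1] [raid6] [raid5] [raid4]
md2 : active raid5 sdc2[2] sdb2[1] sda2[0]
      3906764800 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/2] [UU_]
      [==>..................]  recovery = 12.6% (246000000/1953382400) finish=200.1min speed=140000K/sec

unused devices: <none>
"""

for array in summarize_mdstat(SAMPLE):
    print(array)
```

On a real node the text would come from `open("/proc/mdstat").read()`.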

  • gestiondbi Member, Patron Provider

    Seems to be an HDD issue. Check them to make sure none is defective and the RAID isn't syncing.

  • frequency007 Member
    edited April 2016

    output of "cat /proc/mdstat"

  • RAID looks fine. Disable the cache as above and restart all the VMs. Also check for any heavy I/O within the VMs.

    If you install iotop (yum install epel-release -y && yum install iotop -y) on the node, you can use it to check how much read/write I/O each VM is creating.
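
If iotop is not available, similar per-process counters can be read from `/proc/<pid>/io`, which exposes cumulative `read_bytes` and `write_bytes` for each process (root is needed to read them for the qemu-kvm processes). The sketch below is a rough stand-in rather than a replacement for iotop. The helper names and the sample numbers are invented for illustration.

```python
def parse_proc_io(text):
    """Parse the contents of /proc/<pid>/io into {field: int}."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            counters[key.strip()] = int(value)
    return counters

def io_rate(before, after, seconds):
    """Bytes per second read and written between two samples of one process."""
    return {
        "read_Bps": (after["read_bytes"] - before["read_bytes"]) / seconds,
        "write_Bps": (after["write_bytes"] - before["write_bytes"]) / seconds,
    }

def read_proc_io(pid):
    """Read the live counters for one PID (needs root for other users' processes)."""
    with open(f"/proc/{pid}/io") as handle:
        return parse_proc_io(handle.read())

# Invented sample values, shaped like two reads of /proc/<pid>/io taken 10 s apart.
first = parse_proc_io("rchar: 1000\nwchar: 2000\nread_bytes: 4096\nwrite_bytes: 8192\n")
second = parse_proc_io("rchar: 1500\nwchar: 9000\nread_bytes: 4096\nwrite_bytes: 49152\n")
print(io_rate(first, second, 10))
```

Sampling each qemu-kvm PID from the `top` listing this way would show which guest is generating the writes.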

  • What's "personalities [faulty]"? I've never noticed that before.

  • iotop

    iostat

  • Seems kvm103 is doing what are probably random writes, which cause I/O wait on RAID 5 and push the load up.

    Do you need the capacity of RAID 5, or could you make do with RAID 1?

    Once you have a few VMs doing random reads & writes, you soon start to see I/O wait on a RAID 5 setup that has no RAID controller with a cache.
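
The reasoning above can be put into rough numbers with the textbook small-write model: a small random write on RAID 5 costs four disk operations (read old data, read old parity, write new data, write new parity), and the reads have to finish before the writes can start, whereas RAID 1 just writes the block to each mirror. The sketch below assumes 3 disks for RAID 5, 2 for RAID 1 and about 100 IOPS per disk. None of these figures come from the thread, and real md arrays (stripe cache, full-stripe writes) will deviate from this model.

```python
def backend_ios_per_write(level, disks):
    """Disk operations needed for one small random write (textbook model)."""
    if level == "raid1":
        return disks          # the block is written to every mirror
    if level == "raid5":
        return 4              # read old data + old parity, write new data + new parity
    raise ValueError(f"level not modelled: {level}")

def random_write_iops(level, disks, iops_per_disk):
    """Rough ceiling on small random writes per second for the whole array."""
    return disks * iops_per_disk / backend_ios_per_write(level, disks)

# Assumed figures, not taken from the thread: about 100 IOPS per spinning disk.
for level, disks in (("raid5", 3), ("raid1", 2)):
    print(level, disks, "disks ->", random_write_iops(level, disks, 100), "write IOPS")
```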

  • My friend is actually running the normal VM on RAID 1, which works fine. RAID 1 would do the job.

  • Yes, RAID 1 removes the large latency of RAID 5 that is causing the pain you're seeing in the above config.

    Glad your problem is resolved; however, I would still recommend changing Disk Cache to none as the default for all VMs.
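
To confirm which cache mode each guest disk actually ended up with, one option is to inspect the domain XML, assuming the guests are managed through libvirt so that `virsh dumpxml <name>` is available on the node. In libvirt's domain XML the mode is the `cache` attribute on each disk's `<driver>` element. The sketch below is an illustration only: the function name, the guest XML and the source path are invented.

```python
import xml.etree.ElementTree as ET

def disk_cache_modes(domain_xml):
    """Map each disk target (e.g. 'vda') to its cache mode, or 'default' if unset."""
    root = ET.fromstring(domain_xml)
    modes = {}
    for disk in root.findall("./devices/disk"):
        target = disk.find("target")
        driver = disk.find("driver")
        dev = target.get("dev") if target is not None else "?"
        modes[dev] = driver.get("cache", "default") if driver is not None else "default"
    return modes

# Hypothetical guest definition; names and paths are placeholders.
SAMPLE_XML = """
<domain type='kvm'>
  <name>kvm103</name>
  <devices>
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source dev='/dev/example_vg/kvm103_img'/>
      <target dev='vda' bus='virtio'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <target dev='hdc' bus='ide'/>
    </disk>
  </devices>
</domain>
"""

print(disk_cache_modes(SAMPLE_XML))
```

A disk reported as `default` here has no explicit mode set, which is the case the advice above is aimed at.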

  • The problem isn't quite resolved yet, because my associate who handles this work is out of town. How can I move it to RAID 1? All VMs are set to Disk Cache None and have been rebooted.

  • You need to back up any data/VMs and reinstall via OVH with RAID 1.

    You can't move from RAID 5 to RAID 1 live without data loss.

  • Right. Thank you so much. I will let you know how it goes. Thanks a lot.

  • OK, I reinstalled as RAID 1. Space got almost halved, but the servers still had a somewhat high load. Disk Cache was at default on all boxes; I set it to none, since it was double caching, and now the load is a lot lower. I have one question now. The following is from the SolusVM slave node:

    hdparm -W /dev/sda

    /dev/sda:
    write-caching = 1 (on)

    Should the write cache be on or off on all hard drives?

  • That is a decision you need to make. You will get a small performance boost with it enabled; however, if the server suddenly loses power, you risk corrupting any files/data that were held in the cache at that moment.
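
To check the setting across every drive rather than one at a time, the `hdparm -W` output shown in the question can be parsed. This is a minimal sketch with a made-up function name. It only reads the reported state, and changing it would be a separate step (`hdparm -W0 <device>` turns the drive's write cache off, `-W1` turns it on).

```python
import re

def write_cache_enabled(hdparm_output):
    """Return True/False from `hdparm -W <dev>` output, or None if not reported."""
    m = re.search(r"write-caching\s*=\s*(\d)\s*\((on|off)\)", hdparm_output)
    return None if m is None else m.group(1) == "1"

# Output as pasted in the question above.
posted = "/dev/sda:\n write-caching = 1 (on)\n"
print("/dev/sda write cache on:", write_cache_enabled(posted))
```

On the node itself, the text for each drive could be collected with `subprocess.run(["hdparm", "-W", dev], capture_output=True, text=True).stdout`, run as root.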

  • Thank you Ashley. You have resolved my problem. Thanks a lot. I am running almost 17 boxes right now and everything seems stable.

  • @frequency007 said:
    Thank you Ashley. You have resolved my problem. Thanks a lot. I am running almost 17 boxes right now and everything seems stable.

    Great to hear, and no problems.

  • dedicados Member
    edited April 2016

    damn,

    @frequenzy007 you should invite @AshleyUk for a dinner or a beer.

    *correction

  • frequency007 Member
    edited April 2016

    @dedicados said:
    damn,

    frequenzy007 you should invite AshleyUk to the movies or for a dinner

    :o*

    Yes, sure why not? I am gonna invite him but where are you taking us? :D

  • @dedicados said:
    damn,

    frequenzy007 you should invite AshleyUk to the movies or for a dinner

    :o*

    Don't be jealous ;), I'm a guy btw :)

  • I've corrected it, let's go for a beer

  • sman Member
    edited April 2016

    Is the problem KVM guest running a swap file by any chance? I have seen this problem when people set up large swap files on their KVM guests to try to get around memory limits. Just one more thing I don't like about KVM. Customers can't do that on OpenVZ.
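
A quick way to test the swap theory inside a guest is to compare `SwapTotal` and `SwapFree` in `/proc/meminfo`. The sketch below uses an invented function name and invented sample values. A large amount of used swap is only a hint, since it does not show whether the guest is actively swapping right now.

```python
def swap_usage_kib(meminfo_text):
    """Return (total, used) swap in KiB from the contents of /proc/meminfo."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and rest.split():
            fields[key] = int(rest.split()[0])
    total = fields.get("SwapTotal", 0)
    return total, total - fields.get("SwapFree", 0)

# Invented sample: a small guest with a large, mostly used swap file.
sample = "MemTotal:        1020000 kB\nSwapTotal:       8388604 kB\nSwapFree:        2097148 kB\n"
total, used = swap_usage_kib(sample)
print(f"swap used: {used} of {total} KiB")
```

Inside a guest the text would come from `open("/proc/meminfo").read()`.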
