OpenVZ Node Slow?

shovenose Member, Host Rep
edited January 2013 in General

Having some trouble here. SolusVM says the following about my node "openvz1":
18:11:44 up 37 days, 5:24, 1 user, load average: 11.91, 13.58, 12.76
Obviously that is not acceptable, so I checked with iotop to see what is going on, and... almost nothing!
Same with top, even though I had 49.9%wa?
Here's the RAM info for the node: 2.93 GB of 15.57 GB used / 12.64 GB free


Comments

  • Hi, even though it is probably only showing a few hundred K/s, are there any high percentages in the IO> column? That often gets missed.

    Ben

  • AlexBarakov Patron Provider, Veteran

    Is the node with Software RAID?

  • shovenose Member, Host Rep
    edited January 2013

    @BenND I'm not on my computer anymore, so I'll have to SSH in from my phone to look; it might take me a minute.
    @Alex_LiquidHost yes software RAID 1

  • @shovenose said: yes software RAID 1

    Is the raid healthy? (cat /proc/mdstat)

  • @shovenose said: even though I had 49.9%wa?

    This says it all. 49.9% of the CPU time is spent waiting for disk I/O to complete. Another way to put it: every other second is spent waiting for disk I/O, or 30 seconds out of every minute.
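
    A quick way to see which processes and which disk are behind that wait, assuming the iotop and sysstat packages are installed (package names can vary by distro):

    iotop -oP      # only processes currently doing I/O, including the IO> (time spent waiting) column
    iostat -x 1 5  # per-disk utilization and await times, one report per second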

  • shovenose Member, Host Rep

    @MiguelQ what should it say?

  • @shovenose said: what should it say?

    Something like this:
    md0 : active raid1 sda1[0] sdb1[1]
          131060 blocks super 1.2 [2/2] [UU]
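
    If a disk had dropped out, you would instead see a failed member and a hole in the [UU], roughly:

    md0 : active raid1 sda1[0] sdb1[1](F)
          131060 blocks super 1.2 [2/1] [U_]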

  • shovenose Member, Host Rep

    yeah it says that twice.

  • @shovenose said: yeah it says that twice.

    Do you have smartmontools installed? If so, check both disks for errors. While you are at it, check dmesg output for errors as well.
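
    For example (assuming the disks are /dev/sda and /dev/sdb):

    smartctl -a /dev/sda | grep -iE 'reallocated|pending|uncorrect'
    smartctl -a /dev/sdb | grep -iE 'reallocated|pending|uncorrect'
    dmesg | grep -iE 'ata|error|fail'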

  • shovenose Member, Host Rep

    I don't know. When I'm at home I'll find and install those things.
    Or I could just call OVH; I'm sure they could fix it.

  • @shovenose said: Or I could just call OVH; I'm sure they could fix it.

    If it's a default OVH installation, smartctl is probably blocked (no permissions), so you'll have to fix that first.

  • shovenose Member, Host Rep

    OK, well, I rebooted it from the OVH Manager, and now my load is at 6, which is more manageable than 15 IMO. But there is still something wrong... it looks to me like it's rebuilding the RAID array? Does that mean the drive is bad?

  • Try doing:

    mdadm --detail /dev/md2

    It will give you information on what it's doing.

  • shovenose Member, Host Rep

    Got the error three times before I realized I was typing it into the wrong PuTTY window, the one for the new ipxcore VPS I just got. lol...

    [root@openvz1 ~]# mdadm --detail /dev/md2
    /dev/md2:
    Version : 0.90
    Creation Time : Sun Dec 23 12:33:14 2012
    Raid Level : raid1
    Array Size : 1932012480 (1842.51 GiB 1978.38 GB)
    Used Dev Size : 1932012480 (1842.51 GiB 1978.38 GB)
    Raid Devices : 2
    Total Devices : 2
    Preferred Minor : 2
    Persistence : Superblock is persistent

    Update Time : Tue Jan 29 21:12:32 2013
          State : active, resyncing
    

    Active Devices : 2
    Working Devices : 2
    Failed Devices : 0
    Spare Devices : 0

    Resync Status : 4% complete

           UUID : 641aaab6:e4182967:a4d2adc2:26fd5302
         Events : 0.90
    
    Number   Major   Minor   RaidDevice State
       0       8        2        0      active sync   /dev/sda2
       1       8       18        1      active sync   /dev/sdb2
    

    [root@openvz1 ~]#

  • I don't blame you for asking for help... but you have 49.9% wa and you're asking what's wrong...

  • shovenose Member, Host Rep

    @Corey: I assumed that was abnormal; we're just trying to find the root cause of it so my clients can be happy. Do you know if my RAID issue might have anything to do with it? Thank you.

  • Resync Status : 4% complete

    I thought that was pretty obvious?

  • shovenose Member, Host Rep

    Yes, but what I'm trying to understand is: should I get OVH to put in a new disk, or is it normal for it to rebuild?

  • If the rebuild succeeds, the disks should be fine. If you rebooted uncleanly (not from the console), I'm not surprised it's resyncing.

    Either way, you just have to wait out the resync. Probably an hour's wait, that's about it. If iowait is still high after that, your RAID 1 isn't cutting it.
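
    You can keep an eye on the progress with either of:

    watch -n 30 cat /proc/mdstat
    mdadm --detail /dev/md2 | grep -i resync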

  • NateN34 Member
    edited January 2013

    You might want to increase your resyncing speed.

    You're currently resyncing at 11 MB/s. At that rate, it says it will finish in 2 days lol.
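
    For reference, the md resync rate is capped by two sysctls (values are in KB/s); raising the minimum forces a faster rebuild at the cost of more I/O load on the node:

    sysctl dev.raid.speed_limit_min            # default 1000 (about 1 MB/s guaranteed)
    sysctl dev.raid.speed_limit_max            # default 200000 (about 200 MB/s ceiling)
    sysctl -w dev.raid.speed_limit_min=50000   # example value only; tune to what the disks can handle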

  • shovenose Member, Host Rep

    That's OK, it can take two days, I don't care :)

  • shovenose Member, Host Rep

    I rebooted from OVH Manager.

  • shovenose Member, Host Rep

    Well, I guess SMART is fine, because:

    [root@openvz1 ~]# smartctl -H /dev/sda
    smartctl 5.42 2011-10-20 r3458 [x86_64-linux-2.6.32-042stab068.8] (local build)
    Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

    === START OF READ SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED

    [root@openvz1 ~]# smartctl -H /dev/sdb
    smartctl 5.42 2011-10-20 r3458 [x86_64-linux-2.6.32-042stab068.8] (local build)
    Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

    === START OF READ SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED

  • mdadm resyncing is 'normal', but it's annoying, and it will strike at the worst times.

    Here's the top header for a server that we're considering full:

    Tasks: 4881 total,   3 running, 4869 sleeping,   1 stopped,   8 zombie
    Cpu(s): 14.6%us,  3.8%sy,  0.0%ni, 80.3%id,  0.5%wa,  0.0%hi,  0.7%si,  0.0%st
    Mem:  32783676k total, 32553456k used,   230220k free,  5851044k buffers
    Swap:  2086904k total,   637188k used,  1449716k free, 16399948k cached
    

    This server is running a really good hardware RAID card, the LSI MegaRAID SAS 9260-4i. You may want to consider including a hardware RAID card in your next server; it's the difference between night and day.

  • shovenose Member, Host Rep

    @Damian: yeah, the VPS I have with you is performing very well :)
    Does that RAID card need a BBU?

  • Damian Member
    edited January 2013

    @shovenose said: Does that RAID card need a BBU?

    Yes. Newer cards, such as the 9266, can use either a standard BBU or the "CacheVault", a slightly more expensive cache that uses flash memory instead of a BBU to keep the DRAM alive: http://www.lsi.com/channel/products/storagecomponents/Pages/MegaRAIDSAS9266-4i.aspx

    The CacheVault is more expensive initially, but pays for itself when you don't have to replace BBUs anymore.

  • shovenose Member, Host Rep
    edited January 2013

    @Damian What are the rest of the specs on that server? I'm guessing Xeon E5 with 32 or 64 GB RAM?
    Also, what types of disks do you prefer?
    I mean, honestly, right now I'm stuck with software RAID 1 or RAID 0 (which I would never use) on OVH. I could do software RAID 10 with DataShack or WholeSaleInternet... but then I'd have to increase my prices.

  • @NateN34 said: You might want to increase your resyncing speed.

    Why on earth would he speed up the rebuild rate? It's already causing performance issues; speeding it up would just make performance worse for end users. Either take the node down and let it rebuild at full speed, or let it take as long as it takes and minimize the impact.
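
    And if the goal is to minimize the impact while the VMs stay up, the same knob works the other way, e.g.:

    sysctl -w dev.raid.speed_limit_max=5000   # cap the resync at roughly 5 MB/s while guests are live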

  • shovenose Member, Host Rep

    I took down a couple of VPSes that were using the most resources for a short while, and the resync jumped ahead about 5%, nicely. But I booted them back up and I'm waiting it out for now. If anybody complains about performance I'll happily give them some account credit or something.

  • shovenose Member, Host Rep

    Shut down most of the VPSes on the node, the rebuild progressed significantly more quickly, and now everybody's back up with great performance.
    1073741824 bytes (1.1 GB) copied, 7.81418 s, 137 MB/s
    Thanks to everybody who contributed to the thread, and thanks to my customers for waiting through the period of poor performance.
