unexplained high load on openvz hostnode
hi lowendtalk,
I have an issue with one of my openvz hostnodes. I figured someone here could help me out as I've been scratching my head over this.
the dedicated server has 2 drives in SW RAID1 (mdadm), and atop is reporting them as constantly busy, causing load averages that fluctuate and sometimes reach 20-30. There is no high r/w throughput while this is happening, which is why I'm confused. smartctl reports both disks as PASSED.
load average: 32.37, 21.31, 14.76
There is a raid check going on but this also happens when there are no checks:
Personalities : [raid1]
md0 : active raid1 sdb1[2] sda1[3]
511936 blocks super 1.0 [2/2] [UU]
md2 : active raid1 sdb3[2] sda3[3]
1944481792 blocks super 1.1 [2/2] [UU]
[========>............] check = 44.8% (872383040/1944481792) finish=26473.8min speed=674K/sec
bitmap: 9/15 pages [36KB], 65536KB chunk
md1 : active raid1 sdb2[2] sda2[3]
8380416 blocks super 1.1 [2/2] [UU]
dd if=/dev/zero of=test bs=64k count=16k conv=fdatasync; unlink test
16384+0 records in
16384+0 records out
1073741824 bytes (1.1 GB) copied, 47.2936 s, 22.7 MB/s
DSK | sdb | busy 97% | read 738 | write 996 | avio 5.49 ms |
DSK | sda | busy 95% | read 717 | write 1007 | avio 5.41 ms |
/dev/sda:
Timing buffered disk reads: 60 MB in 3.18 seconds = 18.89 MB/sec
/dev/sdb:
Timing buffered disk reads: 52 MB in 3.14 seconds = 16.56 MB/sec
The drives are 2TB western digital RE4s.
iotop:
Total DISK READ: 0.00 B/s | Total DISK WRITE: 1048.89 K/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
341821 idle root 0.00 B 64.00 K 0.00 % 44.91 % [jbd2/ploop23591]
1004348 be/3 root 0.00 B 44.00 K 0.00 % 31.19 % [jbd2/p~op29098]
5229 idle root 0.00 B 44.00 K 0.00 % 20.07 % [jbd2/ploop31426]
5194 idle root 0.00 B 8.00 K 0.00 % 15.73 % [jbd2/ploop13867]
972 be/3 root 0.00 B 116.00 K 0.00 % 15.48 % [jbd2/md2-8]
2292 be/3 root 0.00 B 68.00 K 0.00 % 7.82 % auditd
543312 be/3 root 0.00 B 44.00 K 0.00 % 7.54 % [jbd2/ploop47686]
13863 be/3 root 0.00 B 4.00 K 0.00 % 6.52 % [jbd2/ploop52010]
4520 be/3 root 0.00 B 0.00 B 0.00 % 6.23 % [jbd2/ploop45534]
729122 be/4 7796 0.00 B 64.00 K 0.00 % 5.87 % qmail-send
4195 idle root 0.00 B 92.00 K 0.00 % 5.74 % [jbd2/ploop56464]
5114 be/3 root 0.00 B 44.00 K 0.00 % 5.74 % [jbd2/ploop58038]
353581 be/4 110 8.00 K 24.00 K 0.00 % 5.71 % mysqld -~ort=3306
618746 be/3 root 0.00 B 0.00 B 0.00 % 5.23 % [jbd2/ploop17859]
top:
Cpu(s): 8.3%us, 2.9%sy, 0.0%ni, 70.4%id, 18.1%wa, 0.0%hi, 0.3%si, 0.0%s
is this normal? any pointers or assistance is appreciated.


Comments
Sounds like a disk problem? Install sysstat, then run iostat (iostat -k -h -n 5 maybe?). Or try running htop and filtering processes by the state "D" at top.
Could be numerous things.
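The state "D" filter suggested above can also be done without htop. Below is a minimal sketch, assuming a procps-style ps; the helper name d_state_tasks is made up for illustration. Tasks in uninterruptible sleep (state D) count toward the Linux load average even when the CPUs are idle, so listing them a few times in a row tends to show which processes are stuck waiting on the disks.

```shell
# List tasks in uninterruptible sleep (state D).
# Reads "STATE PID COMMAND" lines on stdin, prints "PID COMMAND" for D-state tasks.
# d_state_tasks is a hypothetical helper name, not part of any package.
d_state_tasks() {
    awk '$1 ~ /^D/ { print $2, $3 }'
}
```

On the host node it could be fed with something like `ps -eo state=,pid=,comm= | d_state_tasks`, repeated every few seconds. The resulting PIDs can then be passed to vzpid to see which container they belong to.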
thanks for the reply. yeah I presumed it was disk. what i'm trying to find out is why i'm having this issue only on this machine and not others.
Can you please run htop?
sure. what are you looking for specifically?
there's only about 3-4 processes in D state
If you have htop compiled with OpenVZ support, you should be able to see container-specific usage. Just one additional step for troubleshooting.
I've attempted to compile the latest htop tarball with the required dev tools but with no luck. How did you manage it?
yum/apt-get install htop
On CentOS you might need epel-release installed.
@MikeA
I'm aware of installing the binary from repo, I was referring to compiling it with openvz support as suggested by @seriesn
Oh sorry, I should read. (I had a feeling I was answering a question that was too obvious)
no worries.
still looking for suggestions on how to determine the cause of this high load issue.
This can sometimes be down to a single VM with for example 1 core assigned maxing out CPU to the point their VM load is in the 30's, and with OpenVZ this passes through to host.
Have you checked the per-VM load when the host is sitting in the 30's?
these are the 3 highest containers:
doesn't seem like anything that would cause 20-30 load avg on the host
plus the CPUs are 80% idle as shown in the OP
if anyone is interested this is the command I used to sort the containers by highest load and number of processes:
vzlist -o vpsid,laverage,numproc -s -laverage
Have you tried pausing / cancelling the RAID check, waiting around 30 minutes, and then comparing the VM load values alongside the host's?
Have you checked RAM use? If the VMs are eating heavily into swap, that will increase the load, especially during heavy I/O from the RAID check.
Have you tried a full reboot of the node as a last resort? Obviously during a scheduled window.
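For the suggestion above to pause or cancel the RAID check, here is a minimal sketch using the md sysfs interface, where writing "idle" to sync_action interrupts a running check. It assumes root on the host node and that the array is md2, as in the mdstat output in the original post. The function name and the optional second argument (a sysfs root override, there only so the function can be tried against a scratch directory) are made up for illustration.

```shell
# Interrupt a running md consistency check by writing "idle" to sync_action.
# $1 = md device name (e.g. md2)
# $2 = sysfs root, defaults to /sys/block (hypothetical override for dry runs)
pause_md_check() {
    action_file="${2:-/sys/block}/$1/md/sync_action"
    current=$(cat "$action_file") || return 1
    if [ "$current" = "check" ]; then
        echo idle > "$action_file"
        echo "$1: check interrupted"
    else
        echo "$1: no check running (sync_action=$current)"
    fi
}
```

Usage would be `pause_md_check md2`, followed by watching the load average and the atop DSK lines for a while before drawing conclusions.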
It's purely a coincidence that i'm posting while the raid check is going on, I've seen it at 10-20 load avg while the raid array was healthy without a check running.
ram is mostly cached, and doesn't look like swap is fully utilized:
after a reboot the loads are 100+ until all the containers are booted then it's back to 10-20. right now it's at 11-12.
Try Sysdig. Also run "perf top"; you might be able to recognize the system calls taking up CPU.
I highly doubt it's CPU causing this as the CPUs are 70-80% idle. This looks like iowait to me, and I cannot find the cause of it.
I know. You typically want to track down the kernel code path that is causing problems and identify the device driver: is it your RAID driver, ext4, etc.? Most kernels are sufficiently instrumented that you can guess the problem from the system call trace taken over time. If it's still hard to pin down, record some traces and generate a flame graph.
seems to me more like a container or process is generating a huge amount of IOPS which is why mdadm is crawling along and your sequential speed is so low.
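The IOPS argument above can be checked roughly against the atop DSK lines in the original post. This is a back-of-the-envelope sketch, assuming atop was running with its default 10 second interval (the post does not say) and that the read/write columns are request counts for that interval. The function name is made up for illustration.

```shell
# Estimate IOPS and busy percentage from atop per-interval request counts.
# $1 = reads, $2 = writes, $3 = avio in ms, $4 = interval in seconds (assumed 10)
disk_busy_estimate() {
    awk -v r="$1" -v w="$2" -v avio="$3" -v secs="$4" 'BEGIN {
        ops = r + w
        printf "%.0f IOPS, %.0f%% busy\n", ops / secs, ops * avio / (secs * 1000) * 100
    }'
}

disk_busy_estimate 738 996 5.49 10    # sdb figures from the original post
disk_busy_estimate 717 1007 5.41 10   # sda figures from the original post
```

Under that assumption this works out to roughly 170 requests per second per disk and a busy figure in the mid 90s, close to what atop reported. That is about what a single 7200 RPM drive can usually sustain for small random I/O, which would explain saturated disks with very little throughput.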
yeah, this is a good assumption. it does look like one container might be causing a lot of writes or reads from the array.
do you know the best way of tracking the number of IOPS on a per container basis?
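One rough way to approach the per-container question is to compare two snapshots of /proc/diskstats and rank devices by completed requests per second. This is only a sketch: it assumes the ploop devices backing the containers are listed in /proc/diskstats on this kernel, and the function name is made up for illustration. The fields used are the standard diskstats ones (device name, reads completed, writes completed).

```shell
# Print per-device IOPS from two /proc/diskstats snapshots, busiest first.
# $1 = first snapshot file, $2 = second snapshot file, $3 = seconds between them
# Fields: $3 = device name, $4 = reads completed, $8 = writes completed
diskstats_iops() {
    awk -v secs="$3" '
        NR == FNR { ops[$3] = $4 + $8; next }   # remember totals from first snapshot
        ($3 in ops) { printf "%s %.1f\n", $3, ($4 + $8 - ops[$3]) / secs }
    ' "$1" "$2" | sort -k2,2nr
}
```

On the host node this could be used as `cat /proc/diskstats > /tmp/a; sleep 10; cat /proc/diskstats > /tmp/b; diskstats_iops /tmp/a /tmp/b 10 | grep ploop`. The mount table should then show which /vz/root/CTID each busy ploop device is mounted on.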
sounds a bit too complicated for me unfortunately, any guides on how to operate these tools to get the results i'm looking for?
atop -d and vzpid is a good start
the binary htop should have openvz support, you just need to learn to add columns (like CTID and disk) and sort.
atop looks cool too, you can do the -d switch as @anthonysmith said or just hit 'd' once you are in the tool.
RAID check is finished:
Load:
load average: 11.15, 13.31, 14.14
I have been using these already but no particular process seems abusive.
yep, I have been using this already coupled with vzpid to find the CTID of the process.
Just added CTID and disk R/W columns but nothing over 1MB/s is coming up after sorting it by I/O.
Install perf (for Ubuntu, I think it's the 'linux-tools' package).
Run perf top. Watch the results for some time, maybe you will see a pattern.
I haven't used Sysdig, but it has a GUI and container support. The installation is a bit intrusive and installs DKMS (Dynamic Kernel Module Support).
tried installing the centos perf binary, and running it just spews errors all over the place
Maybe a version mismatch between kernel and perf. Perf source code is part of the linux kernel repo.
Does CentOS have multiple (versioned) packages for perf?
EDIT:
perf is unstable for Linux kernel 2.6.x and CentOS 6.x