unexplained high load on openvz hostnode

ztk · October 2016

hi lowendtalk,

I have an issue with one of my openvz hostnodes. I figured someone here could help me out as I've been scratching my head over this.

the dedicated server has 2 drives in software RAID1 (mdadm), and atop reports them as constantly busy, with load averages that fluctuate up to 20-30. There is no high read/write throughput while this is happening, which is why I'm confused. smartctl reports both disks as PASSED.

load average: 32.37, 21.31, 14.76

There is a raid check going on but this also happens when there are no checks:

Personalities : [raid1]
md0 : active raid1 sdb1[2] sda1[3]
      511936 blocks super 1.0 [2/2] [UU]

md2 : active raid1 sdb3[2] sda3[3]
      1944481792 blocks super 1.1 [2/2] [UU]
      [========>............]  check = 44.8% (872383040/1944481792) finish=26473.8min speed=674K/sec
      bitmap: 9/15 pages [36KB], 65536KB chunk

md1 : active raid1 sdb2[2] sda2[3]
      8380416 blocks super 1.1 [2/2] [UU]
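
For reference, the check figures in the md2 line can be reproduced with simple arithmetic. This is only a sketch: it assumes the block counts in /proc/mdstat are 1 KiB units and that "K/sec" means KiB per second. The function name is made up for illustration.

```python
def check_progress(done_blocks, total_blocks, speed_kib_s):
    """Return (percent done, minutes remaining) for an md check.

    Assumes mdstat block counts are 1 KiB each and speed is in KiB/s.
    """
    remaining = total_blocks - done_blocks
    percent = 100.0 * done_blocks / total_blocks
    eta_minutes = remaining / speed_kib_s / 60.0
    return percent, eta_minutes


if __name__ == "__main__":
    # Numbers copied from the md2 line in the post.
    pct, eta = check_progress(872383040, 1944481792, 674)
    print(f"{pct:.1f}% done, about {eta:.0f} min ({eta / 1440:.1f} days) left")
```

At 674 KiB/s the remaining half of the array would take well over two weeks, so the check is being squeezed hard by other I/O. The kernel throttles check/resync traffic in favour of regular I/O; the limits live in /proc/sys/dev/raid/speed_limit_min and speed_limit_max.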


dd if=/dev/zero of=test bs=64k count=16k conv=fdatasync; unlink test
16384+0 records in
16384+0 records out
1073741824 bytes (1.1 GB) copied, 47.2936 s, 22.7 MB/s




DSK |          sdb | busy     97% | read     738 | write    996 | avio 5.49 ms |
DSK |          sda | busy     95% | read     717 | write   1007 | avio 5.41 ms |
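
A quick sanity check on the two atop lines above: if the read and write columns are request counts per sampling interval, then requests multiplied by avio should roughly equal the time the disk was busy. The sketch below assumes atop's default 10-second interval, which the post does not state, so treat the results as an estimate.

```python
# ASSUMPTION: atop was sampling every 10 seconds (its default), so the
# read/write columns are request counts per 10 s. The post does not say.
INTERVAL_S = 10.0


def busy_percent(reads, writes, avio_ms, interval_s=INTERVAL_S):
    """Estimate busy % as (number of requests * avg time per request) / interval."""
    busy_ms = (reads + writes) * avio_ms
    return 100.0 * busy_ms / (interval_s * 1000.0)


def iops(reads, writes, interval_s=INTERVAL_S):
    """Requests per second over the sampling interval."""
    return (reads + writes) / interval_s


if __name__ == "__main__":
    # Values copied from the DSK lines in the post.
    for name, r, w, avio in (("sdb", 738, 996, 5.49), ("sda", 717, 1007, 5.41)):
        print(f"{name}: {iops(r, w):.0f} req/s, ~{busy_percent(r, w, avio):.0f}% busy")
```

Under that assumption each disk is serving only around 170 requests per second yet is almost fully busy, which would fit a workload of many small scattered I/Os rather than high throughput.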



/dev/sda:
 Timing buffered disk reads:  60 MB in  3.18 seconds =  18.89 MB/sec

/dev/sdb:
 Timing buffered disk reads:  52 MB in  3.14 seconds =  16.56 MB/sec

The drives are 2TB western digital RE4s.

SDA: http://termbin.com/2z5g

SDB: http://termbin.com/phcam

iotop:

Total DISK READ: 0.00 B/s | Total DISK WRITE: 1048.89 K/s
TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND
341821 idle root          0.00 B     64.00 K  0.00 % 44.91 % [jbd2/ploop23591]
1004348 be/3 root          0.00 B     44.00 K  0.00 % 31.19 % [jbd2/p~op29098]
 5229 idle root          0.00 B     44.00 K  0.00 % 20.07 % [jbd2/ploop31426]
 5194 idle root          0.00 B      8.00 K  0.00 % 15.73 % [jbd2/ploop13867]
  972 be/3 root          0.00 B    116.00 K  0.00 % 15.48 % [jbd2/md2-8]
 2292 be/3 root          0.00 B     68.00 K  0.00 %  7.82 % auditd
543312 be/3 root          0.00 B     44.00 K  0.00 %  7.54 % [jbd2/ploop47686]
13863 be/3 root          0.00 B      4.00 K  0.00 %  6.52 % [jbd2/ploop52010]
 4520 be/3 root          0.00 B      0.00 B  0.00 %  6.23 % [jbd2/ploop45534]
729122 be/4 7796          0.00 B     64.00 K  0.00 %  5.87 % qmail-send
 4195 idle root          0.00 B     92.00 K  0.00 %  5.74 % [jbd2/ploop56464]
 5114 be/3 root          0.00 B     44.00 K  0.00 %  5.74 % [jbd2/ploop58038]
353581 be/4 110           8.00 K     24.00 K  0.00 %  5.71 % mysqld -~ort=3306
618746 be/3 root          0.00 B      0.00 B  0.00 %  5.23 % [jbd2/ploop17859]

top:

Cpu(s): 8.3%us, 2.9%sy, 0.0%ni, 70.4%id, 18.1%wa, 0.0%hi, 0.3%si, 0.0%s

is this normal? any pointers or assistance is appreciated.

MikeA · October 2016

Sounds like a disk problem? Install sysstat, then run iostat (iostat -k -h -n 5 maybe?). Or try running htop and filtering processes by the state "D" at top.

Could be numerous things.
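
Some background on why state "D" matters: on Linux the load average counts tasks that are runnable (R) and tasks in uninterruptible sleep (D, usually waiting on disk), so a box can show a load of 20-30 with mostly idle CPUs. A minimal sketch for counting task states straight from /proc, including threads, is below. The function names are made up for illustration.

```python
import glob
from collections import Counter


def task_state(stat_line):
    """Return the one-letter state from a /proc/<pid>/stat line.

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split after the LAST ')'.
    """
    return stat_line.rsplit(")", 1)[1].split()[0]


def count_states(pattern="/proc/[0-9]*/task/[0-9]*/stat"):
    """Count every thread on the system by its current state letter."""
    counts = Counter()
    for path in glob.glob(pattern):
        try:
            with open(path) as fh:
                counts[task_state(fh.read())] += 1
        except (OSError, IndexError):
            continue  # the task exited while we were reading it
    return counts


if __name__ == "__main__":
    states = count_states()
    print(dict(states))
    print("R + D tasks right now:", states["R"] + states["D"])
```

This is a single snapshot, while the load average is a smoothed value, so it is worth running it several times before drawing conclusions.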

ztk · October 2016

@MikeA said:
Sounds like a disk problem? Install sysstat, then run iostat (iostat -k -h -n 5 maybe?). Or try running htop and filtering processes by the state "D" at top.

Could be numerous things.

thanks for the reply. yeah I presumed it was disk. what i'm trying to find out is why i'm having this issue only on this machine and not others.

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           9.12    0.00    3.55    5.32    0.00   82.01

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda             177.00   111.00  679.00   53.00 112776.00  1426.00   156.01     5.17    7.12    5.03   33.85   1.21  88.70
sdb             192.00   117.00  666.00   47.00 113152.00  1426.00   160.70    12.07   16.99   15.26   41.53   1.40 100.10
md1               0.00     0.00    1.00    0.00     8.00     0.00     8.00     0.00    0.00    0.00    0.00   0.00   0.00
md2               0.00     0.00    0.00  158.00     0.00  1392.00     8.81     0.00    0.00    0.00    0.00   0.00   0.00
md0               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           8.99    0.00    4.35    4.81    0.00   81.84

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda             245.00    79.00  388.00   52.00 81152.00  1064.00   186.85     5.93   13.64   12.53   21.94   2.08  91.60
sdb             290.00    87.00  340.00   44.00 80768.00  1064.00   213.10     7.14   18.78   17.01   32.43   2.60  99.90
md1               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
md2               0.00     0.00    0.00  245.00     0.00  2056.00     8.39     0.00    0.00    0.00    0.00   0.00   0.00
md0               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
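
A few numbers can be derived from the iostat samples above. The sketch assumes rsec/s and wsec/s are 512-byte sectors and await is in milliseconds; the helper names are made up. By Little's law, request rate times await should land close to the reported avgqu-sz.

```python
SECTOR_BYTES = 512  # assumed size of the sectors iostat reports


def mb_per_s(sectors_per_s):
    """Convert sectors/s to decimal megabytes per second."""
    return sectors_per_s * SECTOR_BYTES / 1e6


def expected_queue(r_per_s, w_per_s, await_ms):
    """Little's law: average queue length = arrival rate * time in system."""
    return (r_per_s + w_per_s) * await_ms / 1000.0


def avg_write_kib(wsec_per_s, w_per_s):
    """Average size of one write request in KiB."""
    return wsec_per_s * SECTOR_BYTES / w_per_s / 1024.0


if __name__ == "__main__":
    # First sample from the post.
    print(f"sda reads: {mb_per_s(112776.0):.1f} MB/s")
    print(f"sda queue: {expected_queue(679.0, 53.0, 7.12):.2f} (iostat: 5.17)")
    print(f"sdb queue: {expected_queue(666.0, 47.0, 16.99):.2f} (iostat: 12.07)")
    print(f"md2 average write: {avg_write_kib(1392.0, 158.0):.1f} KiB")
```

Two things stand out, with the caveat that this is only two samples. The member disks are reading tens of MB/s while md2 itself shows no reads, which would be consistent with the RAID check reading the members directly. And the writes arriving at md2 average only about 4 KiB each, which fits the journal (jbd2) activity in the iotop output.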

seriesn · October 2016

Can you please run htop?

ztk · October 2016

@seriesn said:
Can you please run htop?

sure. what are you looking for specifically?

there's only about 3-4 processes in D state

seriesn · October 2016

@ztk said:
sure. what are you looking for specifically?

there's only about 3-4 processes in D state

If you have compiled htop with OpenVZ support, you should be able to see container-specific usage. Just one additional step for troubleshooting.

ztk · October 2016

@seriesn said:
If you have compiled htop with OpenVZ support, you should be able to see container-specific usage. Just one additional step for troubleshooting.

I've attempted to compile the latest htop tarball with the required dev tools but with no luck. How did you manage it?

MikeA · October 2016

@ztk said:
I've attempted to compile the latest htop tarball with the required dev tools but with no luck. How did you manage it?

yum/apt-get install htop

On CentOS you might need epel-release installed.

ztk · October 2016

@MikeA

I'm aware of installing the binary from repo, I was referring to compiling it with openvz support as suggested by @seriesn

MikeA · October 2016

@ztk said:
@MikeA

I'm aware of installing the binary from repo, I was referring to compiling it with openvz support as suggested by @seriesn

Oh sorry, I should read. (I had a feeling I was answering a question that was too obvious)

ztk · October 2016

@MikeA said: Oh sorry, I should read. (I had a feeling I was answering a question that was too obvious)

no worries.

still looking for suggestions on how to determine the cause of this high load issue.

AshleyUk · October 2016

This can sometimes be down to a single VM with, for example, 1 core assigned maxing out its CPU to the point that the VM's load is in the 30s, and with OpenVZ this passes through to the host.

Have you checked the per-VM load when the host is sitting in the 30s?

ztk · October 2016

@AshleyUk said:
This can sometimes be down to a single VM with, for example, 1 core assigned maxing out its CPU to the point that the VM's load is in the 30s, and with OpenVZ this passes through to the host.

Have you checked the per-VM load when the host is sitting in the 30s?

these are the 3 highest containers:

CTID       LAVERAGE      NPROC
519 1.86/2.06/2.09        417
401 1.38/0.95/0.83         62
496 1.07/0.78/0.79        179

doesn't seem like anything that would cause 20-30 load avg on the host

plus the CPUs are 80% idle as shown in the OP

ztk · October 2016

if anyone is interested this is the command I used to sort the containers by highest load and number of processes:

vzlist -o vpsid,laverage,numproc -s -laverage
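
To compare container loads against the host, the output of that command can be summed. A minimal sketch follows; the parser name is made up, and the sample is just the three rows posted earlier, so the total here covers only those three containers rather than the whole node.

```python
def parse_laverage(vzlist_text):
    """Parse `vzlist -o vpsid,laverage,numproc` output.

    Returns a list of (ctid, load1, load5, load15, nproc) tuples and
    skips the header or any line that does not match the layout.
    """
    rows = []
    for line in vzlist_text.splitlines():
        parts = line.split()
        if len(parts) != 3 or "/" not in parts[1]:
            continue
        try:
            load1, load5, load15 = (float(x) for x in parts[1].split("/"))
            rows.append((int(parts[0]), load1, load5, load15, int(parts[2])))
        except ValueError:
            continue
    return rows


# The three rows quoted in the thread.
SAMPLE = """\
CTID       LAVERAGE      NPROC
519 1.86/2.06/2.09        417
401 1.38/0.95/0.83         62
496 1.07/0.78/0.79        179
"""

if __name__ == "__main__":
    rows = parse_laverage(SAMPLE)
    total = sum(row[1] for row in rows)
    print(f"{len(rows)} containers, summed 1-minute load {total:.2f}")
```

If the sum over all containers stays far below the host's load average, the remainder is probably coming from tasks the containers do not account for themselves, such as host-side kernel threads blocked on the disks.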

AshleyUk · October 2016

Have you tried pausing or cancelling the RAID check, waiting around 30 minutes, and then comparing the VM load values alongside the host's?

Have you checked RAM use? If the VMs are heavily eating into swap, that will increase the load, especially during heavy I/O from the RAID check.

Have you tried a full reboot of the node as a last resort? Obviously during a scheduled window.

ztk · October 2016

@AshleyUk said: Have you tried pausing or cancelling the RAID check, waiting around 30 minutes, and then comparing the VM load values alongside the host's?

It's purely a coincidence that i'm posting while the raid check is going on, I've seen it at 10-20 load avg while the raid array was healthy without a check running.

AshleyUk said: Have you checked RAM use? If the VMs are heavily eating into swap, that will increase the load, especially during heavy I/O from the RAID check.

ram is mostly cached, and doesn't look like swap is fully utilized:

             total       used       free     shared    buffers     cached
Mem:           70G        69G       1.1G       380M       9.8G        49G
-/+ buffers/cache:        10G        60G
Swap:         8.0G       5.8G       2.2G
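
Swap being 5.8G full does not by itself say whether the node is actively swapping; what adds disk load is pages moving in and out right now. A rough way to check is to sample the cumulative pswpin and pswpout counters in /proc/vmstat twice and look at the difference. This is a sketch with made-up function names, not a replacement for vmstat or sar.

```python
import os
import time


def swap_counters(vmstat_text):
    """Extract the cumulative pswpin/pswpout page counts from /proc/vmstat text."""
    wanted = {"pswpin", "pswpout"}
    found = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in wanted:
            found[parts[0]] = int(parts[1])
    return found


def swap_rate(before, after, seconds):
    """Pages swapped in and out per second between two samples."""
    return {key: (after[key] - before[key]) / seconds for key in before}


def read_vmstat(path="/proc/vmstat"):
    with open(path) as fh:
        return swap_counters(fh.read())


if __name__ == "__main__":
    if os.path.exists("/proc/vmstat"):
        first = read_vmstat()
        time.sleep(1)
        print(swap_rate(first, read_vmstat(), 1.0))
```

Rates near zero would suggest the swapped-out pages are just sitting there, in which case swap is unlikely to explain the disk utilisation.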

AshleyUk said: Have you tried a full reboot of the node as a last resort? Obviously during a scheduled window.

after a reboot the loads are 100+ until all the containers are booted then it's back to 10-20. right now it's at 11-12.

rincewind · October 2016

Try Sysdig. Also run "perf top"; you might be able to recognize the system calls taking up CPU.

ztk · October 2016

@rincewind said:
Try Sysdig. Also run "perf top"; you might be able to recognize the system calls taking up CPU.

I highly doubt it's CPU causing this as the CPUs are 70-80% idle. This looks like iowait to me, and I cannot find the cause of it.

rincewind · October 2016

I know. You typically want to track down the kernel code path that is causing problems and identify the driver involved: is it your RAID driver, ext4, etc.? Most kernels are sufficiently instrumented that you can guess the problem from the system call trace taken over time. If it's still hard to pin down, record some traces and generate a flame graph.
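
If perf is not an option, a much cruder approximation of the same idea is to look at where blocked tasks are waiting inside the kernel. The sketch below tallies the wchan (wait channel) of every task currently in D state. It assumes /proc/<pid>/task/<tid>/wchan is readable and populated on the kernel in use, which is not guaranteed, and the function name is made up.

```python
import glob
import os
from collections import Counter


def blocked_wchans(proc="/proc"):
    """Tally the kernel wait channel of every task in uninterruptible sleep."""
    tally = Counter()
    pattern = os.path.join(proc, "[0-9]*", "task", "[0-9]*", "stat")
    for stat_path in glob.glob(pattern):
        task_dir = os.path.dirname(stat_path)
        try:
            with open(stat_path) as fh:
                # The state letter follows the last ')' of the command name.
                state = fh.read().rsplit(")", 1)[1].split()[0]
            if state != "D":
                continue
            with open(os.path.join(task_dir, "wchan")) as fh:
                tally[fh.read().strip() or "?"] += 1
        except (OSError, IndexError):
            continue  # task vanished or the file was unreadable
    return tally


if __name__ == "__main__":
    for symbol, count in blocked_wchans().most_common(10):
        print(f"{count:4d}  {symbol}")
```

Run repeatedly over a few minutes, the same symbol showing up again and again (a journal commit, an md wait, and so on) gives a hint about which layer the tasks are stuck in. It is a snapshot tool, not a profiler.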

AnthonySmith · October 2016

seems to me more like a container or process is generating a huge amount of IOPS which is why mdadm is crawling along and your sequential speed is so low.

ztk · October 2016

@AnthonySmith said:
seems to me more like a container or process is generating a huge amount of IOPS which is why mdadm is crawling along and your sequential speed is so low.

yeah, this is a good assumption. it does look like one container might be causing a lot of writes or reads from the array.

do you know the best way of tracking the number of IOPS on a per container basis?
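
One possible approach, offered as an assumption to verify rather than a known answer: OpenVZ kernels of this generation are understood to expose per-container I/O accounting under /proc/bc/<CTID>/ioacct as "name value" lines, with read and write being byte counters. If that file exists on the node, sampling it twice and ranking containers by the difference gives a per-container rate. The path, the field names and all function names below should be treated as unconfirmed.

```python
import glob
import os
import time


def parse_counters(text):
    """Parse whitespace-separated 'name value' lines into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def snapshot(pattern="/proc/bc/[0-9]*/ioacct"):
    """Return {ctid: counters} for every accounting file found (assumed path)."""
    snap = {}
    for path in glob.glob(pattern):
        ctid = os.path.basename(os.path.dirname(path))
        try:
            with open(path) as fh:
                snap[ctid] = parse_counters(fh.read())
        except OSError:
            continue
    return snap


def rank_by_rate(before, after, seconds, key="write"):
    """Rank containers by growth of one counter per second, highest first."""
    rates = []
    for ctid, now in after.items():
        prev = before.get(ctid, {})
        if key in now and key in prev:
            rates.append((ctid, (now[key] - prev[key]) / seconds))
    return sorted(rates, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    first = snapshot()
    if first:
        time.sleep(5)
        for ctid, rate in rank_by_rate(first, snapshot(), 5.0)[:10]:
            print(f"CT {ctid}: {rate / 1024:.1f} KiB/s written")
    else:
        print("no /proc/bc/*/ioacct files found (the path is an assumption)")
```

Note that this measures bytes, not IOPS. A container doing many small synchronous writes can keep the disks busy while moving very little data, so a low byte rate does not clear a container on its own.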

ztk · October 2016

@rincewind said:
I know. You typically want to track down the kernel code path that is causing problems and identify the driver involved: is it your RAID driver, ext4, etc.? Most kernels are sufficiently instrumented that you can guess the problem from the system call trace taken over time. If it's still hard to pin down, record some traces and generate a flame graph.

sounds a bit too complicated for me unfortunately, any guides on how to operate these tools to get the results i'm looking for?

AnthonySmith · October 2016

atop -d and vzpid is a good start

miamiconsultant · October 2016

ztk said: sounds a bit too complicated for me unfortunately, any guides on how to operate these tools to get the results i'm looking for?

the binary htop should have openvz support, you just need to learn to add columns (like CTID and disk) and sort.

atop looks cool too, you can do the -d switch as @anthonysmith said or just hit 'd' once you are in the tool.

ztk · October 2016

RAID check is finished:

Personalities : [raid1]
md0 : active raid1 sdb1[2] sda1[3]
      511936 blocks super 1.0 [2/2] [UU]

md2 : active raid1 sdb3[2] sda3[3]
      1944481792 blocks super 1.1 [2/2] [UU]
      bitmap: 8/15 pages [32KB], 65536KB chunk

md1 : active raid1 sdb2[2] sda2[3]
      8380416 blocks super 1.1 [2/2] [UU]

unused devices: <none>

Load:

load average: 11.15, 13.31, 14.14

ztk · October 2016

@AnthonySmith said:
atop -d and vzpid is a good start

I have been using these already but no particular process seems abusive.

@miamiconsultant said: atop looks cool too, you can do the -d switch as @anthonysmith said or just hit 'd' once you are in the tool.

yep, I have been using this already coupled with vzpid to find the CTID of the process.

ztk · October 2016

@miamiconsultant said: the binary htop should have openvz support, you just need to learn to add columns (like CTID and disk) and sort.

Just added CTID and disk R/W columns but nothing over 1MB/s is coming up after sorting it by I/O.

rincewind · October 2016

Install perf (for Ubuntu, I think it's the 'linux-tools' package).
Run perf top. Watch the results for some time; maybe you will see a pattern.

I haven't used Sysdig, but it has a GUI and container support. The installation is a bit intrusive and installs DKMS (dynamic kernel modules).

ztk · October 2016

@rincewind said:
Install perf (for Ubuntu, I think it's the 'linux-tools' package).
Run perf top. Watch the results for some time; maybe you will see a pattern.

I haven't used Sysdig, but it has a GUI and container support. The installation is a bit intrusive and installs DKMS (dynamic kernel modules).

tried installing the centos perf binary and running it is just spewing errors all over the place

rincewind · October 2016

Maybe a version mismatch between kernel and perf. Perf source code is part of the linux kernel repo.

Does CentOS have multiple (versioned) packages for perf?

EDIT: perf is unstable for Linux kernel 2.6.x and CentOS 6.x
