
weird MariaDB slowness and high IOwait on htop output

qtwrk Member

Hi,

I have a weird issue I don't understand; hoping I can get some wisdom and enlightenment.

I have a Hetzner AX102 running, and yesterday I grabbed an AX52.


And here comes the problem:

On the AX102, when I do mysqldump and mysql < dump.sql (or mydumper and myloader) to export/import a sample 35 MB database, it takes like 1-2 seconds.

On the AX52, the same operation takes like 15-20 seconds to import. I would expect it to be a little slower given the hardware difference, but this is way off.

I also have a Netcup vServer 2000; on that server, with the same setup as the AX52, the import takes like 4 seconds, which is about what I'd expect.

When I run htop with IO shown for the CPU meters, I can see at least one core at like 80-90% IOWait during the import.

YABS IO on AX102:

fio Disk Speed Tests (Mixed R/W 50/50) (Partition /dev/md1):
---------------------------------
Block Size | 4k            (IOPS) | 64k           (IOPS)
  ------   | ---            ----  | ----           ----
Read       | 506.13 MB/s (126.5k) | 2.22 GB/s    (34.6k)
Write      | 507.47 MB/s (126.8k) | 2.23 GB/s    (34.8k)
Total      | 1.01 GB/s   (253.4k) | 4.45 GB/s    (69.5k)
           |                      |
Block Size | 512k          (IOPS) | 1m            (IOPS)
  ------   | ---            ----  | ----           ----
Read       | 2.50 GB/s     (4.8k) | 2.47 GB/s     (2.4k)
Write      | 2.63 GB/s     (5.1k) | 2.63 GB/s     (2.5k)
Total      | 5.14 GB/s    (10.0k) | 5.11 GB/s     (4.9k)

YABS IO on AX52:

fio Disk Speed Tests (Mixed R/W 50/50) (Partition /dev/md2):
---------------------------------
Block Size | 4k            (IOPS) | 64k           (IOPS)
  ------   | ---            ----  | ----           ----
Read       | 644.62 MB/s (161.1k) | 2.01 GB/s    (31.4k)
Write      | 646.32 MB/s (161.5k) | 2.02 GB/s    (31.6k)
Total      | 1.29 GB/s   (322.7k) | 4.04 GB/s    (63.1k)
           |                      |
Block Size | 512k          (IOPS) | 1m            (IOPS)
  ------   | ---            ----  | ----           ----
Read       | 2.50 GB/s     (4.8k) | 2.58 GB/s     (2.5k)
Write      | 2.63 GB/s     (5.1k) | 2.75 GB/s     (2.6k)
Total      | 5.14 GB/s    (10.0k) | 5.33 GB/s     (5.2k)

Both are on software RAID 1.

I tried applying the exact same conf, copy-pasted from my old server to the new one: no change.

Also tried a reinstall; nothing changed. Just yum update, then install the database and run the import/export test: it all takes 15-20 seconds.

Tried different versions of MariaDB like 10.3, 10.6, 11.4, etc.; still the same.

I thought it might be the RAID sync after the initial system installation, but I waited overnight for it to complete; still the same.

That high IOWait is just so weird on NVMe for a 35 MB database import though...

Maybe a faulty disk? But the YABS output looks alright-ish.

nvme-cli shows like 7-8% life used, so the drives should still be considered healthy enough.

What else can I try? Or what could cause this?

Thanks in advance


Comments

  • egoror Member
    edited October 27

    Just my five cents off the top of my head.
    Try something like iostat -x 1 (yum install sysstat) to monitor I/O, and iotop (yum install iotop) or the second tab of htop to see what is loading the I/O, then compare the two servers.
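
    For example (hypothetical invocation; package names assume a RHEL/Alma-style box):

    yum install -y sysstat iotop
    iostat -x 1       # extended per-device stats, including utilisation and latency
    iotop -oPa        # only show processes actually doing I/O, with accumulated totals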

  • emgh Member, BF Ambassador

    @qtwrk said: what else can I try ?

    May not provide anything, but I'd try importing a 35 MB PostgreSQL database on both machines.

    That way, you can isolate if it's hardware/OS-related or directly related to a combination of the hardware/OS with MariaDB.
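
    Something like this would do for a quick comparison (untested sketch; assumes PostgreSQL is installed and your user can create databases, all names are made up):

    createdb bench
    pgbench -i -s 2 bench                 # scale 2 is roughly 30 MB of test data
    pg_dump bench > bench.sql
    createdb bench_restore
    time psql bench_restore < bench.sql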

  • qtwrk Member

    Thanks for the hint, iotop puts me in the right direction: it shows high IO on jbd2/md2-8, now let me google what's up with it...

  • emgh Member, BF Ambassador
    edited October 27

    @qtwrk said:
    Thanks for the hint, iotop puts me in the right direction: it shows high IO on jbd2/md2-8, now let me google what's up with it...

    TIL: jbd2 is "a kernel thread that updates the filesystem journal", and the "md2" part refers to the Linux software RAID device ("md" = "multiple device", originally "Mirror Disk").

    Source: https://forums.sabnzbd.org/viewtopic.php?t=24179

    How did you install the OS on the machines? Hetzner’s ’auto installer’, rescue or some other way? Did you do it differently on the different servers? What filesystem are they running?
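
    A quick way to check what you're actually running (informational commands only; assuming the data sits on /dev/md2 as your iotop output suggests):

    findmnt -no FSTYPE,OPTIONS /                          # filesystem type and mount options
    cat /proc/mdstat                                      # state of the md arrays
    dumpe2fs -h /dev/md2 | grep -iE 'features|journal'    # ext4 superblock/journal info, if it is ext4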

  • qtwrk Member

    The system is installed via Hetzner rescue -> installimage. I only changed the hostname and the partition sizes (commented out the swap since this is a 64 GB server, no real need for SWAP), then installed it; nothing else was touched.

    And I absolutely have no idea about filesystem stuff, I've never touched anything there.

    Let me try the automated Alma 8 installation, see how it goes...

  • emgh Member, BF Ambassador
    edited October 27

    @qtwrk said:
    The system is installed via Hetzner rescue -> installimage. I only changed the hostname and the partition sizes (commented out the swap since this is a 64 GB server, no real need for SWAP), then installed it; nothing else was touched.

    And I absolutely have no idea about filesystem stuff, I've never touched anything there.

    Let me try the automated Alma 8 installation, see how it goes...

    I don’t know about RHEL stuff, but you can change filesystem in the InstallImage config. I’d try XFS on the slow server since it’s probably EXT4 and see if that changes anything

    Edit: But yeah, try the auto installer as well
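
    For reference, the filesystem is picked in the installimage partition lines, roughly like this (from memory, double-check against Hetzner's docs):

    SWRAID 1
    SWRAIDLEVEL 1
    PART /boot ext3 1024M
    PART /     xfs  all        # change ext4 -> xfs here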

  • qtwrk Member

    Thanks for the hint, let me play around a bit with both the autoinstall and XFS, see how it goes.

  • BackboneDirect Member, Host Rep

    If that still doesn't help, check whether there's a resync or anything else running that could be bottlenecking you as well:

    cat /proc/mdstat

    Also, log into mariadb using 'mysql' and then do

    show full processlist

  • qtwrk Member

    @BackboneDirect said:
    If that still doesn't help, check whether there's a resync or anything else running that could be bottlenecking you as well:

    cat /proc/mdstat

    Also, log into mariadb using 'mysql' and then do

    show full processlist

    I've checked mdstat; it behaves the same both while it was syncing and now that it has finished syncing.

    For the 2nd one... I just reinstalled the system, so I need a little time to reinstall the database to test.

  • emgh Member, BF Ambassador

    @qtwrk said:

    @BackboneDirect said:
    If that still doesn't help, check whether there's a resync or anything else running that could be bottlenecking you as well:

    cat /proc/mdstat

    Also, log into mariadb using 'mysql' and then do

    show full processlist

    I've checked mdstat; it behaves the same both while it was syncing and now that it has finished syncing.

    For the 2nd one... I just reinstalled the system, so I need a little time to reinstall the database to test.

    While you’re waiting on that, ever considered the EX-series? I’m very happy with the EX44, the price/performance is amazing.

  • qtwrk Member

    @emgh said:

    @qtwrk said:

    @BackboneDirect said:
    If that still doesn't help, check whether there's a resync or anything else running that could be bottlenecking you as well:

    cat /proc/mdstat

    Also, log into mariadb using 'mysql' and then do

    show full processlist

    I've checked mdstat; it behaves the same both while it was syncing and now that it has finished syncing.

    For the 2nd one... I just reinstalled the system, so I need a little time to reinstall the database to test.

    While you’re waiting on that, ever considered the EX-series? I’m very happy with the EX44, the price/performance is amazing.

    I need at least about 1 TB of disk... so if I order 2x 1 TB, it will be like 80-ish, pretty much the same price as the AX52...

  • darkimmortal Member
    edited October 27

    SSD models? Non-enterprise SSDs can be very slow for database workloads / synced writes. YABS does not test for that.

  • qtwrk Member

    @darkimmortal said:
    SSD models? Non-enterprise SSDs can be very slow for database workloads / synced writes

    Based on the model number, it should be a Samsung PM9A1.

    I know it may not be as good as a "Datacenter Edition", but this is really way off, I think.

  • @darkimmortal said:
    SSD models? Non enterprise SSDs can be very slow for database workloads / synced writes. YABS does not test for that

    I've wasted countless hours troubleshooting a very similar issue to the OP's, and this was the answer. Apparently, the SSDs used in the server I was troubleshooting were consumer-grade Microns. They were well-rated desktop SSDs at the time, but an absolute disaster for database server performance. When dealing with concurrent operations that needed IO, they were useless. Moving to Intel datacenter SSDs fixed the problem.

    I don't think Hetzner uses consumer-grade SSDs, but I would still focus on the storage as the most likely culprit here.

  • qtwrk Member
    edited October 27

    @emgh said:
    I don’t know about RHEL stuff, but you can change filesystem in the InstallImage config. I’d try XFS on the slow server since it’s probably EXT4 and see if that changes anything

    Well, things still don't add up on an XFS system:

    [root@alma8 ~]# time myloader -u ... -p "..." --purge-mode DROP -e -B xxx -o --directory /xxx
    
    real    0m18.383s
    user    0m0.062s
    sys 0m0.065s
    

    Still taking way too long. Or:

    [root@alma8 ~]# time mysql -u ... -p"..." xxx < xxx.sql
    
    real    0m18.412s
    user    0m0.064s
    sys 0m0.014s
    

    Now I don't see any weird kernel thread, only mariadb itself showing up in the iotop output...

  • darkimmortal Member
    edited October 27

    You can try testing worst-case synchronised 4k writes with fio:

    fio --name=x --ioengine=posixaio --rw=write --bs=4k --iodepth=1 --size=512MB --fsync=1 --filename=test.img

    Generally, good consumer NVMes such as the Samsung model you mentioned will be around 300-500 IOPS; enterprise drives are around 10x faster.
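
    Back-of-the-envelope: ~400 fsynced 4 KiB writes per second is only about 1.6 MB/s of durable write throughput, so an import that syncs on every statement or transaction becomes fsync-bound long before raw bandwidth matters.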

    Once you take parallelism and queuing out of the equation, and expect every write to be coherent across power loss, consumer NVMes aren't much better than 7200rpm hard drives. You have to pay the extra for an enterprise NVMe to see an improvement in this kind of worst-case workload

    Of course you can trade power loss safety for better performance - plenty of knobs to tweak for that

  • egoror Member

    Will mount -o remount,commit=60 /dev/md2 help?
    If innodb_flush_log_at_trx_commit is set to 1 in mariadb config file, try setting it to 2.
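
    On the MariaDB side that would be something like the following in whichever my.cnf / conf.d file actually gets read (sketch; the trade-off is that you can lose up to about a second of committed transactions on a crash or power loss):

    [mysqld]
    # flush the InnoDB redo log to disk roughly once per second
    # instead of on every commit
    innodb_flush_log_at_trx_commit = 2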

  • qtwrk Member

    @darkimmortal said:
    You can try testing worst-case synchronised 4k writes with fio:

    fio --name=x --ioengine=posixaio --rw=write --bs=4k --iodepth=1 --size=512MB --fsync=1 --filename=test.img

    Generally, good consumer NVMes such as the Samsung model you mentioned will be around 300-500 IOPS; enterprise drives are around 10x faster.

    Once you take parallelism and queuing out of the equation, and expect every write to be coherent across power loss, consumer NVMes aren't much better than 7200rpm hard drives. You have to pay the extra for an enterprise NVMe to see an improvement in this kind of worst-case workload

    Of course you can trade power loss safety for better performance - plenty of knobs to tweak for that

    on AX102:

    iops : min=14428, max=16140, avg=15414.06, stdev=556.25, samples=16

    on AX52:

    iops : min= 56, max= 164, avg=118.71, stdev=24.77, samples=2205

    Okay, this really solves the whole puzzle.

  • egoror Member

    @qtwrk said: iops : min= 56, max= 164, avg=118.71, stdev=24.77, samples=2205

    That's too low, you might be running with write cache disabled.

  • qtwrk Member

    @egoror said:

    @qtwrk said: iops : min= 56, max= 164, avg=118.71, stdev=24.77, samples=2205

    That's too low, you might be running with write cache disabled.

    [root@Alma-8-latest-amd64-base ~]# cat /sys/block/nvme0n1/queue/write_cache
    write back
    [root@Alma-8-latest-amd64-base ~]# cat /sys/block/nvme1n1/queue/write_cache
    write back
    

    emmm???

  • @qtwrk said: emmm???

    Interesting.
    Try this stuff then

    @egoror said:
    Will mount -o remount,commit=60 /dev/md2 help?
    If innodb_flush_log_at_trx_commit is set to 1 in mariadb config file, try setting it to 2.

  • qtwrk Member

    mount -o remount,commit=60 /dev/md3
    innodb_flush_log_at_trx_commit = 2

    The mount option makes it a bit slower; the innodb param makes it a little faster.

    But still nowhere close to AX102 performance. I guess it's fundamentally a disk issue...

  • lowprofile Member

    Check the disk wearout - you will have to install some packages.

    sudo apt-get install nvme-cli
    sudo nvme smart-log /dev/nvme0n1

  • qtwrk Member

    @lowprofile said:
    Check the disk wearout - you will have to install some packages.

    sudo apt-get install nvme-cli
    sudo nvme smart-log /dev/nvme0n1

    Already checked; life used was like 7-8% on both.

  • lowprofile Member

    Your my.cnf is the same on both servers?

  • qtwrk Member

    @lowprofile said:
    Your my.cnf is the same on both servers?

    Yeah, tried the same conf, the default conf, but ultimately I guess it's a disk thing...

  • rdes Member

    Maybe try fstrim?
    Had a similar problem with high iowait on NVMe some time ago and this helped.

  • emgh Member, BF Ambassador
    edited October 27

    Can you skip RAID fully and try the two drives separately?
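
    If you reinstall through installimage, disabling the software RAID should be enough to get the two drives as separate devices (from memory, double-check the config):

    SWRAID 0        # don't build an md array; installimage then installs to the first drive only

    Then run the same fsync fio test on each drive and compare.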

  • qtwrk Member

    @rdes said:
    Maybe try fstrim?
    Had a similar problem with high iowait on NVMe some time ago and this helped.

    I am not exactly familiar with this command, could you please give a quick example?

    @emgh said:
    Can you skip RAID fully and try the two drives separately?

    You were about 1 second early; I was just about to click the cancellation button, lol.

    Let me try that scenario.

  • rdes Member
    edited October 27

    @rdes said:
    Maybe try fstrim?
    Had a similar problem with high iowait on NVMe some time ago and this helped.

    @qtwrk said:
    I am not exactly familiar with this command, could you please give a quick example?

    fstrim --verbose --all

    In my case I had around 500 GB of unused blocks and it really slowed the drive down during intensive tasks like backups.

    More about that: https://www.blunix.com/blog/increasing-ssd-write-performance-on-linux-with-fstrim.html
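
    If it does help, you probably want trimming to happen regularly; on systemd-based distros that is usually just:

    systemctl enable --now fstrim.timer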

This discussion has been closed.